Software

The Digital Projects Unit uses a variety of digitization software in daily operations.  The tools listed below are generally available and are not tied to a single scanner or processing platform.

Imaging Software

Software Name:  ACDSee  
Software Manufacturer:  ACD Systems 
How we use it:  We use ACDSee for image quality control and batch renaming.

Software Name:  Adobe Photoshop  
Software Manufacturer:  Adobe
How we use it:  We use Adobe Photoshop as a general-purpose imaging tool and as a platform for quality control and batch processing.  Each of our workstations is equipped with this software, and it is used in almost every imaging workflow.

Software Name:  ImageMagick 
Software Manufacturer:  Open Source Software
How we use it:  ImageMagick and its individual tools, such as identify, convert, and mogrify, are used throughout the Digital Projects Unit workflows as general-purpose command-line tools for processing, verifying, and manipulating image files.
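
As an illustration of the verification use mentioned above, the following is a minimal Python sketch that calls identify to read an image's format and pixel dimensions. It is not the unit's actual workflow: the function names, the file name, and the minimum dimensions are hypothetical, and the sketch assumes that ImageMagick's identify is on the PATH (ImageMagick 7 installations may expose it as "magick identify" instead).

```python
import subprocess


def build_identify_command(image_path):
    """Build an identify call that prints format, width, and height."""
    # -format accepts percent escapes: %m = image format, %w = width, %h = height.
    # The "[0]" suffix limits the query to the first frame of a multi-page file.
    return ["identify", "-format", "%m %w %h", f"{image_path}[0]"]


def parse_identify_output(output):
    """Turn output such as 'TIFF 4800 6000' into ('TIFF', 4800, 6000)."""
    fmt, width, height = output.split()
    return fmt, int(width), int(height)


def check_image(image_path, min_width, min_height):
    """Return True if identify can read the file and it meets the size floor.

    Raises FileNotFoundError if the identify executable is not installed.
    The thresholds are hypothetical and are supplied by the caller.
    """
    result = subprocess.run(
        build_identify_command(image_path),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # identify exits non-zero for unreadable or corrupt files.
        return False
    _, width, height = parse_identify_output(result.stdout)
    return width >= min_width and height >= min_height
```

A call such as check_image("page_0001.tif", 2000, 3000) would then flag files that identify cannot read or that fall below the chosen dimensions.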

Software Name:  Scan Tailor 
Software Manufacturer:  Open Source Software
How we use it:  We use Scan Tailor to deskew, clean up, crop, and resize scanned pages during post-processing.

OCR Software

Software Name:  ABBYY Recognition Server 
Software Manufacturer:  ABBYY
How we use it:  We use the ABBYY OCR software primarily for our newspaper projects.  We use the standard languages as well as the Old German and Chinese, Korean, and Japanese expansion languages.  We typically operate with automatic segmentation and save the resulting ABBYY XML OCR files.  We supplement the internal dictionaries with local dictionaries of common names and locations from around Texas.  Because a single newspaper page may take up to three minutes to OCR, we operate the ABBYY Recognition Server in a multi-node OCR cluster with 48 working cores, which allows us to OCR 48 newspaper pages at once.
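
The throughput implied by those figures can be worked out with a short calculation. The Python sketch below is a rough estimate rather than a measured result: it assumes one page per core, that every page takes the full three minutes, and that there is no queueing or file-transfer overhead. The function names and the 9,600-page batch are hypothetical.

```python
def worst_case_pages_per_hour(cores, minutes_per_page):
    """Pages per hour if every page takes the full per-page time."""
    pages_per_core_per_hour = 60 / minutes_per_page
    return cores * pages_per_core_per_hour


def hours_for_batch(pages, cores, minutes_per_page):
    """Hours needed to OCR a batch at the worst-case rate."""
    return pages / worst_case_pages_per_hour(cores, minutes_per_page)


# 48 cores at 3 minutes per page: 48 * (60 / 3) = 960 pages per hour.
print(worst_case_pages_per_hour(48, 3))  # 960.0

# A hypothetical batch of 9,600 pages would take 10 hours at that rate.
print(hours_for_batch(9600, 48, 3))  # 10.0
```

Because three minutes is the upper bound per page, 960 pages per hour is a floor under these assumptions; pages that finish faster raise the rate.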

Software Name:  PrimeOCR
Software Manufacturer:  Prime Recognition
How we use it:  We use PrimeOCR as a general-purpose OCR engine.  We typically operate with automatic segmentation and save the resulting .pro format OCR files.  We supplement the internal dictionaries with local dictionaries built from common names and locations from around Texas.

Didn’t Find What You Need?

For information about software infrastructure used in our digital collections, see “About the Technology” in The Portal to Texas History or the UNT Digital Library.

For information about Web archiving software, see About Web Archiving.