Software
The Digital Projects Unit uses a variety of digitization software in daily operations. The tools listed below are generally available and are not tied to a single scanner or processing platform.
Imaging Software
Software Name: ACDSee
Software Manufacturer: ACD Systems
How we use it: We use ACDSee for image quality control and batch
renaming.
Software Name: Adobe Photoshop
Software Manufacturer: Adobe
How we use it: We use Adobe Photoshop as a general purpose imaging
tool as well as a platform for performing quality control and batch
processing operations. Each of our workstations is equipped with this
software, and it is used in almost every imaging workflow.
Software Name: ImageMagick
Software Manufacturer: Open Source Software
How we use it: ImageMagick and its individual tools such as
identify, convert, and mogrify are used throughout the Digital Projects
Unit workflows as general purpose command line tools for processing,
verifying, and manipulating image files.
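As an illustration of the kind of verification step described above, here is a minimal Python sketch that calls ImageMagick's identify tool and reads back an image's pixel dimensions and format. It is not the unit's actual script: the function names and the sample file name are hypothetical, and it assumes the identify executable is on the PATH (ImageMagick 7 installations may expose it as "magick identify" instead).

```python
import subprocess
from typing import List, Tuple


def build_identify_command(path: str) -> List[str]:
    """Build an identify call that prints width, height, and format."""
    # %w = width in pixels, %h = height in pixels, %m = image format.
    return ["identify", "-format", "%w %h %m\n", path]


def parse_identify_output(output: str) -> Tuple[int, int, str]:
    """Parse the first line of identify output into (width, height, format)."""
    # Multi-frame files (for example multi-page TIFFs) yield one line per
    # frame; this sketch only looks at the first frame.
    first_line = output.strip().splitlines()[0]
    width, height, fmt = first_line.split()
    return int(width), int(height), fmt


def image_info(path: str) -> Tuple[int, int, str]:
    """Run identify on a file; raises CalledProcessError on unreadable images."""
    result = subprocess.run(
        build_identify_command(path),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_identify_output(result.stdout)


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    print(image_info("page_0001.tif"))
```

Keeping the command construction and the output parsing in separate functions makes the parsing logic testable on machines where ImageMagick is not installed.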
Software Name: Scan Tailor
Software Manufacturer: Open Source Software
How we use it: We use Scan Tailor to deskew, clean up, crop, and
resize scanned pages during post-processing.
OCR Software
Software Name: ABBYY Recognition Server
Software Manufacturer: ABBYY
How we use it: We primarily use the ABBYY OCR software with our
newspaper projects. We use the standard languages as well as the Old
German, Chinese, Korean, and Japanese expansion languages. We typically
operate with automatic segmentation and save the resulting ABBYY XML OCR
files. We supplement the internal dictionaries with local dictionaries
of common names and locations from around Texas. Because a single
newspaper page may take up to three minutes to OCR, we operate
the ABBYY Recognition Server in a multi-node OCR cluster with 48 working
cores, allowing us to OCR 48 newspaper pages at once.
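The cluster figures above imply a rough lower bound on throughput. The sketch below works through that arithmetic in Python, assuming one page per core at a time, the worst-case three minutes per page, and no scheduling or I/O overhead. The function names and the 9,600-page batch size are hypothetical, chosen only for illustration.

```python
def pages_per_hour(cores: int, minutes_per_page: float) -> float:
    """Pages finished per hour if every core processes one page at a time."""
    return cores * 60 / minutes_per_page


def hours_for_batch(pages: int, cores: int, minutes_per_page: float) -> float:
    """Wall-clock hours to OCR a batch at the rate computed above."""
    return pages / pages_per_hour(cores, minutes_per_page)


if __name__ == "__main__":
    # 48 cores at up to 3 minutes per page, the figures given in the text.
    print(pages_per_hour(48, 3))         # 960.0
    # Hypothetical batch of 9,600 newspaper pages.
    print(hours_for_batch(9600, 48, 3))  # 10.0
```

Since three minutes is the stated upper limit per page, actual throughput would generally be higher than this estimate.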
Software Name: PrimeOCR
Software Manufacturer: Prime Recognition
How we use it: We use PrimeOCR as a general purpose OCR engine. We
typically operate with automatic segmentation and save the resulting
.pro format OCR files. We supplement the internal dictionaries with
local dictionaries built from common names and locations from around
Texas.
Didn’t Find What You Need?
For information about software infrastructure used in our digital collections, see “About the Technology” in The Portal to Texas History or the UNT Digital Library.
For information about Web archiving software, see About Web Archiving.