Tesseract Studio is a Windows application for creating, reviewing and correcting OCR data and creating searchable PDF files.
It provides a graphical user interface in the .NET environment for the open source Tesseract OCR library, a modern neural-network-based optical character recognition engine that is actively developed and supported by Google engineers and other contributors.
TessStudio is designed for ad-hoc OCR and can process one file at a time. There are two licenses available for TessStudio:
| Feature | Community | Registered |
|---|---|---|
| Support TIFF, JPEG, PNG images and multi-page PDF files, with or without prior OCR data. | ||
| For multi-page files, multiple instances of the tesseract engine run in parallel for improved performance. | ||
| Support OCR languages, including complex documents that use multiple languages. | ||
| The built-in spell checker automatically tags words not found in the dictionary. | ||
| Display OCR words on a faded background of the image with visible boundaries. | ||
| Edit OCR mistakes, add missing words, split, merge, delete or move recognized words. | ||
| Support any number of Undo and Redo operations. | ||
| Display a sortable list of recognized words with Tesseract-assigned confidence factors. | ||
| Preserve existing non-OCR text in PDF pages and limit OCR to embedded graphical objects. | ||
| Save the OCR data as text hidden behind images in searchable PDF format. | ||
| Apply OCR to a single page, specific pages, or all pages of multi-page source documents. | ||
| Pick a tesseract page segmentation mode for specialized layout analysis. | ||
| Support fixed threshold or dynamic algorithms for conversion to binary. | ||
| Perform image processing and cleanup (deskew, despeckle, grid and line removal, correct inverted blocks). | ||
| Debugging option to capture intermediate images and full recognition data. | ||
| Define regular expression rules for post-processing OCR data. | ||
| Save entire document or specified pages from the document as a new PDF. | ||
| Save as Vector PDF, Searchable PDF, Raster Image or Text Only PDF. Use specific image formats, resolutions or fonts. | ||
| Save as an encrypted PDF or in a PDF/A compliant format. | ||
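
To give a feel for what regular expression rules for post-processing OCR data can look like, here is a minimal Python sketch. It is not Tesseract Studio's rule format or implementation: the `RULES` list and the `apply_rules` function are hypothetical names, and the three patterns are only assumed examples of common OCR confusions (a letter O or l read in place of a digit, and runs of extra spaces).

```python
import re

# Hypothetical rule set: each rule is (compiled pattern, replacement).
# These are illustrative assumptions, not rules shipped with any product.
RULES = [
    # A letter O sitting between two digits is most likely a zero.
    (re.compile(r"(?<=\d)[oO](?=\d)"), "0"),
    # A lowercase l or uppercase I between two digits is most likely a one.
    (re.compile(r"(?<=\d)[lI](?=\d)"), "1"),
    # Collapse runs of spaces or tabs into a single space.
    (re.compile(r"[ \t]{2,}"), " "),
]


def apply_rules(text, rules=RULES):
    """Apply each rule in order and return the corrected text."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(apply_rules("Invoice 1O5  total 2l0"))
```

The rules run in order, so a later rule sees the output of the earlier ones. The lookbehind and lookahead assertions limit each substitution to characters surrounded by digits, which leaves ordinary words untouched.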