Optical Character Recognition (OCR) Function of ABBYY FineReader for ScanSnap

This section explains the OCR function of ABBYY FineReader for ScanSnap.

Overview of ABBYY FineReader for ScanSnap

ABBYY FineReader for ScanSnap is an application used exclusively with the ScanSnap. This program can perform text recognition only for PDF files created by using the ScanSnap. It cannot perform text recognition for files created using Adobe Acrobat or other applications.

Features of OCR Function

The OCR function has the following features. Before performing text recognition, check whether the documents are suitable for text recognition by referring to the following guidelines:

ABBYY Scan to Word
  Suitable for text recognition:
    • Documents with simple layouts consisting of single or double columns
  Not suitable for text recognition:
    • Documents with complex layouts containing a mixture of diagrams, tables, and text (such as brochures, magazines, and newspapers)

ABBYY Scan to Excel(R)
  Suitable for text recognition:
    • Documents containing simple tables with no merged cells
  Not suitable for text recognition:
    • Documents containing:
        • Tables with no solid border lines
        • Tables with complicated cell formats
        • Complex tables containing sub-tables
        • Diagrams
        • Graphs
        • Photographs
        • Vertical text

ABBYY Scan to PowerPoint(R)
  Suitable for text recognition:
    • Documents containing text and simple diagrams/tables on a white or light monocolor background
  Not suitable for text recognition:
    • Documents with complex layouts containing text mixed with diagrams or illustrations
    • Documents containing photographs or patterns set as the background
    • Documents with light colored text on a deep colored background

Information That Cannot Be Reproduced as in the Original Document

The following formatting attributes may not be reproduced exactly as they appear in the original document. Check the results of the text recognition in Word, Excel, or PowerPoint and, if necessary, edit the data.

  • Character font and size
  • Character and line spacing
  • Underlined, bold, and italic characters
  • Superscript/subscript

Documents That Cannot Be Recognized Correctly

The following types of documents may not be recognized correctly. Better results in text recognition may be achieved by changing the color mode or increasing the resolution.

  • Documents including handwritten characters
  • Documents containing small characters (smaller than a font size of 10)
  • Skewed documents
  • Documents written in languages other than the specified language
  • Documents with characters on an unevenly colored background
    Example: Shaded characters
  • Documents with many decorated characters
    Example: Decorated characters (embossed/outlined)
  • Documents with characters on a patterned background
    Example: Characters overlapping illustrations and diagrams
  • Documents with many characters touching underlines or borders
  • Documents with a complex layout and documents with a large amount of image noise
    (It may take extra time to process text recognition for these documents.)
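
The relationship between small characters and resolution noted above can be illustrated with simple arithmetic. The following is a minimal sketch, not part of ABBYY FineReader for ScanSnap or the ScanSnap software. It assumes that font size is measured in points (1 point = 1/72 inch), and the function name and the resolutions shown are illustrative only.

```python
POINTS_PER_INCH = 72  # typographic points per inch


def glyph_height_px(font_size_pt: float, dpi: int) -> float:
    """Approximate height, in pixels, of a font's nominal size at a scan resolution."""
    return font_size_pt / POINTS_PER_INCH * dpi


# Illustrative resolutions only; they are not taken from the ScanSnap settings.
for dpi in (150, 300, 600):
    print(f"{dpi} dpi: about {glyph_height_px(10, dpi):.1f} pixels for 10-point text")
```

Doubling the resolution doubles the number of pixels available for each character, which is one reason why increasing the resolution may improve results for small text.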

Other Considerations

  • When you convert a document that is longer than the maximum size allowed by Word, the maximum paper size available for Word may be used.
  • When you convert a document to an Excel file, if the recognition result exceeds 65,536 rows, the results beyond that limit are not saved.
  • When you convert a document to an Excel file, information about the layout of the entire document, diagrams, and the length/width of graphs and tables is not reproduced. Only tables and character strings are reproduced.
  • A converted PowerPoint document will not have the original background color and patterns.
  • Documents placed upside down or in landscape orientation cannot be recognized correctly. Follow the procedure in "Rotating a Scanned Image to Its Correct Orientation", or place documents in the correct orientation.
  • If bleed-through reduction is enabled, the recognition rate may be lower. In that case, disable it using the following procedure.

    Select [Scan Button Settings] → [Scanning] tab → [Option] button from the Right-Click Menu to show the [Scanning mode option] window. Clear the [Reduce bleed-through] checkbox (for SV600, the [Reduce bleed-through] checkbox is located in the [Image quality] tab on the [Scanning mode option] window).
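
The effect of the Excel limit of 65,536 described above can be sketched as follows. This example does not call ABBYY FineReader for ScanSnap or Excel. It only models the limit on a hypothetical list of recognized lines, and the function name and sample data are assumptions made for illustration.

```python
EXCEL_ROW_LIMIT = 65_536  # limit stated in this section


def split_at_row_limit(rows, limit=EXCEL_ROW_LIMIT):
    """Return (saved, dropped): rows within the limit and rows beyond it."""
    return rows[:limit], rows[limit:]


# Hypothetical recognition result containing 70,000 lines.
recognized = [f"line {i}" for i in range(1, 70_001)]
saved, dropped = split_at_row_limit(recognized)
print(len(saved), len(dropped))  # 65536 4464
```

If a document is likely to produce more results than the limit allows, scanning it in smaller batches keeps each converted file within the limit.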