Setting Keywords in PDF Files

If the document is black and white, character strings (such as a page heading or the title) can be set as keywords and used to search for the PDF file.

To set keywords for PDF files, mark a character string to be set as a keyword with a water-based highlight pen so that the character string is completely covered. When you perform a scan, the marked character strings are recognized and set as keywords for the PDF file.

For details about marking a character string with a water-based highlight pen, refer to How to Mark Character Strings.

  1. In the ScanSnap setup window, select [PDF (*.pdf)] from the [File format] drop-down list in the [File option] tab.
    ScanSnap Setup Window
    HINT

    It is recommended that you select [Better] or [Best] for [Image quality] in the [Scanning] tab of the ScanSnap setup window.

  2. Select the [Set the marked text as a keyword for the PDF file] checkbox.
    ScanSnap Setup Window
    ATTENTION

    This checkbox is disabled when an unsupported language is specified for [Language].

    The following languages are supported:

    Japanese/English/French/German/Italian/Spanish/Chinese (simplified)/Chinese (traditional)/Korean/Russian/Portuguese

    • The following message appears.
      ScanSnap Manager
  3. Click the [OK] button to close the message.
  4. Specify [Select OCR] and [OCR options].
    ScanSnap Setup Window
    ATTENTION

    Select [All marked sections] when the text orientation of your document is vertical.

    HINT

    The [First marked section] option in [Select OCR] is used as follows:

    • Select this button to set a character string, such as the title of a document, as a keyword for the PDF file.
      Example: When only the title of a document is marked, the marked character string is set as a keyword for the PDF file, and the PDF file becomes searchable by the title character string.
      First Marked Section 1
    • When multiple marked sections exist in line, the marked character string closest to the top of the document is set as a keyword.
      Example: In the following case, the character string in marked section B, which is higher than marked section A, is set as a keyword.
      First Marked Section 2
  5. Click the [OK] button to close the ScanSnap setup window.
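The [First marked section] rule described in step 4 can be pictured with a short sketch. This is only an illustration of the "closest to the top" rule, not ScanSnap's actual implementation; the `MarkedSection` type and its `text` and `top` fields are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MarkedSection:
    # Hypothetical model of one highlighted region on a scanned page.
    text: str    # character string recognized inside the marking
    top: float   # distance of the marking from the top edge of the page


def first_marked_section(sections: List[MarkedSection]) -> Optional[str]:
    """Return the text of the marked section closest to the top of the page.

    Returns None when the page has no marked sections.
    """
    if not sections:
        return None
    # A smaller `top` value means the marking is higher on the page.
    return min(sections, key=lambda s: s.top).text


# Marked section B sits slightly higher than marked section A,
# so B is the one that would become the keyword.
sections = [MarkedSection("A", top=42.0), MarkedSection("B", top=40.5)]
print(first_marked_section(sections))
```

With [All marked sections] selected instead, every marked character string would be used, so no such selection step applies.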
ATTENTION
  • When you select the [Set the marked text as a keyword for the PDF file] checkbox, text recognition may take extra time, depending on your computer system environment.
  • The following types of documents (characters) may not be recognized correctly.

    In that case, specifying a higher resolution in [Image quality] may improve the text recognition results.

    • Documents including handwritten characters
    • Documents with small characters scanned at a low resolution
    • Skewed documents
    • Documents written in languages other than the specified language
    • Documents including text written in italics
    • Documents containing characters with superscripts/subscripts and complicated mathematical expressions
    • Documents with characters on an unevenly colored background
      Example: Shaded characters
    • Documents with many decorated characters
      Example: Decorated characters (embossed/outlined)
    • Documents with characters on a patterned background
      Example: Characters overlapping illustrations and diagrams
    • Documents with many characters touching underlines or borders
  • It may take extra time to perform text recognition on the following documents:
    • Documents with complex layouts
    • Documents with information other than text
      Example: Text on a shaded background
  • If bleed-through reduction is enabled, the recognition rate may drop because the marking may be erased or lightened. In that case, disable it using the following procedure.

    Select [Scan Button Settings] → [Scanning] tab → [Option] button from the Right-Click Menu to show the [Scanning mode option] window. Clear the [Reduce bleed-through] checkbox (for SV600, the [Reduce bleed-through] checkbox is located in the [Image quality] tab on the [Scanning mode option] window).

  • If the same character string is marked several times in the document, the same keyword is added to the PDF file multiple times.
  • The total length of all keywords can be up to 255 characters, including the punctuation marks between keywords.
  • When you check keywords in Adobe Acrobat or Adobe Reader, the added keywords may be displayed with a set of quotation marks (for example, "ABC").
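
The 255-character limit and the quotation-mark display noted above can be checked with a little arithmetic. The following is a minimal sketch, assuming a single comma as the punctuation mark between keywords; this section does not state which separator is actually used, so `separator` is a parameter, and all function names here are hypothetical.

```python
from typing import List

MAX_KEYWORDS_LENGTH = 255  # limit stated in this section


def keywords_length(keywords: List[str], separator: str = ",") -> int:
    """Total length of the keywords, including the separators between them.

    The default separator is an assumption made for illustration only.
    """
    return len(separator.join(keywords))


def fits_limit(keywords: List[str], separator: str = ",") -> bool:
    """True when all keywords together stay within the 255-character limit."""
    return keywords_length(keywords, separator) <= MAX_KEYWORDS_LENGTH


def strip_quotes(keyword: str) -> str:
    """Remove one surrounding pair of double quotation marks, if present.

    Useful when comparing a keyword as shown in a PDF viewer ("ABC")
    with the character string that was marked (ABC).
    """
    if len(keyword) >= 2 and keyword[0] == '"' and keyword[-1] == '"':
        return keyword[1:-1]
    return keyword


marked = ["Quarterly Report", "Invoice", "Invoice"]  # duplicates count too
print(keywords_length(marked), fits_limit(marked))
print(strip_quotes('"ABC"'))
```

Because a character string marked several times is added several times, duplicates count toward the total, which is why the example list keeps the repeated entry.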