How to use this service

  1. Create a document image with an image scanner, screen capture software, etc. A typical scanning resolution is 200 to 400 dpi.
  2. Clip a text block from the scanned image.
    You may skip this step if the server supports automatic layout analysis.
  3. Convert the image into a binary (black-and-white) image if possible. A clear B/W image is preferable.
    You should skip this step if the OCR engine used on this server has a special preprocessor for color and/or grayscale images.
  4. Save the image in TIFF, PNG, or PBM/PGM/PPM format, and set the quality to 100% if your software offers a quality setting.
  5. Select the image/PDF file on the WeOCR server's top page and press the "GO - Extract Text" button. The character recognition results will appear in your Web browser.
    The processing time varies depending on the load and speed of the server.
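
Steps 2 to 4 can also be done programmatically. The following is only a minimal sketch in plain Python, not the preprocessing used by any WeOCR server. It assumes the image is already available as a list of rows of grayscale values (0 = black, 255 = white). The function names and the fixed threshold of 128 are hypothetical choices made for illustration. The output is a plain (ASCII) PBM image, one of the formats listed in step 4.

```python
def crop(gray, left, top, right, bottom):
    """Step 2: keep rows top..bottom-1 and columns left..right-1."""
    return [row[left:right] for row in gray[top:bottom]]


def binarize(gray, threshold=128):
    """Step 3: map grayscale values to PBM bits (1 = black ink, 0 = white).

    The threshold of 128 is an assumed default; real scans may need tuning.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def to_plain_pbm(bits):
    """Step 4: serialize a bit matrix as a plain ('P1') PBM image."""
    height = len(bits)
    width = len(bits[0]) if height else 0
    out = ["P1", f"{width} {height}"]
    for row in bits:
        digits = "".join(str(b) for b in row)
        # The plain PBM format asks for lines of at most 70 characters.
        out += [digits[i:i + 70] for i in range(0, len(digits), 70)]
    return "\n".join(out) + "\n"


# A tiny 4x3 "scan" with a dark text block in the middle.
gray = [
    [255, 255, 255, 255],
    [255,  10, 200, 255],
    [255,  30,  20, 255],
]
block = crop(gray, 1, 1, 3, 3)
pbm_text = to_plain_pbm(binarize(block))
```

Writing pbm_text to a file with a .pbm extension gives a lossless binary image that can then be uploaded as in step 5.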

TIPS

  • If you cannot upload your files using MS-Windows and Internet Explorer, try moving the files to another directory with a simple path name and upload them from there. For example, non-ASCII path names might cause problems.
  • Although JPEG (JFIF) files take less time to upload, lossy image compression degrades OCR accuracy.
  • If the OCR results are poor, try enlarging the document image. Text captured from computer screens is often too small.
  • Many OCR engines cannot handle white letters on a black background correctly. If you think something is wrong with the output, try inverting the image.
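
The last two tips (enlarging and inverting) can be sketched in plain Python as well. As an assumption for illustration, the image is again a list of rows of grayscale values (0 = black, 255 = white), and the function names are hypothetical. The enlargement here simply repeats pixels (nearest-neighbor scaling); an image editor with smoother interpolation may give better OCR results.

```python
def invert(gray):
    """Swap dark and light: white-on-black text becomes black-on-white."""
    return [[255 - px for px in row] for row in gray]


def enlarge(gray, factor):
    """Scale the image up by an integer factor by repeating each pixel."""
    out = []
    for row in gray:
        # Repeat every pixel horizontally...
        wide = [px for px in row for _ in range(factor)]
        # ...then repeat the widened row vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


# White-on-black sample: one bright pixel next to one dark pixel.
sample = [[255, 0]]
fixed = enlarge(invert(sample), 2)
```

Here fixed is a 4x2 image in which the formerly white pixel is now black, which is the polarity most OCR engines expect for text.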