Scanning

Paul Koning pkoning at equallogic.com
Thu Sep 22 10:16:30 CDT 2005


>>>>> "Barry" == Barry Watzman <Watzman at neo.rr.com> writes:

 Barry> Re:
 >> 1) Things that are mostly schematics.
 >> 
 >> 2) Things with a large text component that I might want to try to
 >> OCR.  Questions:
 >> 
 >> a) What resolution should I use for these two things?

 Barry> For OCR, it may depend on the OCR software, but for almost
 Barry> everything I have found that 300 dpi and gray-scale (8 bits
 Barry> per pixel) JPEG works best.  Grayscale is definitely better
 Barry> than black-and-white (1 bit per pixel) for reproduction that
 Barry> is indistinguishable from the original, but for some OCR
 Barry> software, black-and-white line art (1 bit per pixel) may work
 Barry> better.

I agree with you that grayscale is going to work better with many OCR
implementations (because it effectively provides anti-aliasing).
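To make the anti-aliasing point concrete, here is a minimal sketch. It assumes a simplified scanner model in which each pixel's value depends only on the fraction of the pixel covered by ink; the function name scan_pixel and the sample coverage values are made up for illustration.

```python
def scan_pixel(ink_coverage: float, bits: int) -> int:
    """Return the stored sample (0 = black, 255 = white) for one pixel.

    ink_coverage is the fraction of the pixel area covered by ink (0.0 to 1.0).
    This is a simplified model, not how any particular scanner works.
    """
    if not 0.0 <= ink_coverage <= 1.0:
        raise ValueError("ink_coverage must be between 0.0 and 1.0")
    if bits == 1:
        # Bilevel: every pixel snaps to pure black or pure white.
        return 0 if ink_coverage >= 0.5 else 255
    if bits == 8:
        # Grayscale: partial coverage is kept as an intermediate shade.
        return round(255 * (1.0 - ink_coverage))
    raise ValueError("only 1-bit and 8-bit samples are modelled here")


# Coverage across the edge of a pen stroke, from blank paper to solid ink.
edge = [0.0, 0.1, 0.4, 0.6, 0.9, 1.0]
gray = [scan_pixel(c, 8) for c in edge]
bilevel = [scan_pixel(c, 1) for c in edge]

if __name__ == "__main__":
    print("8-bit:", gray)
    print("1-bit:", bilevel)
```

In this model the 8-bit samples record where the edge falls inside a pixel, while the 1-bit samples only say which side of the threshold each pixel landed on.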

But avoid JPEG like the plague.  JPEG is a lossy format designed ONLY
for continuous-tone photographic images, and is ONLY fit to be used
for those.  It WILL butcher any image with "hard edged" content,
whether text or line graphics.  You need a lossless format for that:
TIFF or PNG (or even GIF).  Not only will that protect the quality of
your scans, but it will also generally compress better (smaller output
files) than JPEG.
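As a rough sketch of the lossless side of this, the following uses only the Python standard library to write a synthetic "hard edged" grayscale page as a minimal PNG and read it back. The function names and the synthetic page are assumptions for illustration; the decoder only understands the files this encoder writes (8-bit grayscale, filter type 0), and no JPEG comparison is attempted because the standard library has no JPEG encoder.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def encode_gray_png(rows: list) -> bytes:
    """Encode equal-length rows of 8-bit grayscale samples as a minimal PNG."""
    height, width = len(rows), len(rows[0])
    # Bit depth 8, color type 0 (grayscale), default compression/filter, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in rows)
    return (PNG_SIGNATURE
            + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"IDAT", zlib.compress(raw, 9))
            + png_chunk(b"IEND", b""))


def decode_gray_png(png: bytes) -> list:
    """Decode a PNG written by encode_gray_png back into rows of samples."""
    pos = len(PNG_SIGNATURE)
    width = height = 0
    idat = b""
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        if kind == b"IHDR":
            width, height = struct.unpack(">II", data[:8])
        elif kind == b"IDAT":
            idat += data
        pos += 12 + length  # length field + type + data + CRC
    raw = zlib.decompress(idat)
    stride = width + 1  # one filter byte per scanline
    return [raw[y * stride + 1:(y + 1) * stride] for y in range(height)]


def synthetic_page(width: int = 200, height: int = 100) -> list:
    """White page with bands of black vertical strokes, standing in for text."""
    rows = []
    for y in range(height):
        row = bytearray([255] * width)
        if y % 10 < 6:  # six scanlines of "text", four of blank leading
            for x in range(0, width, 8):
                row[x:x + 3] = b"\x00\x00\x00"
        rows.append(bytes(row))
    return rows


if __name__ == "__main__":
    page = synthetic_page()
    png = encode_gray_png(page)
    print("raw bytes:", sum(len(r) for r in page), "png bytes:", len(png))
    print("exact round trip:", decode_gray_png(png) == page)
```

Every pixel comes back exactly as it went in, and a page that is mostly flat white with sharp black strokes is very repetitive, which is the kind of data the DEFLATE compression inside PNG handles well.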

       paul



