julesrichardsonuk at yahoo.co.uk
Thu Nov 30 08:27:15 CST 2006
> In article <456D4EB3.7000106 at yahoo.co.uk>,
> Jules Richardson <julesrichardsonuk at yahoo.co.uk> writes:
>> It's OK if you have top-quality documentation. But lots of computer docs out
>> there are old, faded, dirty, creased, well-thumbed etc. and unless someone's
>> prepared to visually check every scanned page, there's a chance that the
>> bi-level algorithm in use will corrupt the data and it'll go unnoticed.
> I check every scanned page as I scan it. You have to anyway, because
> if it doesn't scan right you have to rescan it to get it right.
The main problem I find with that is that it's time-consuming to check every
page at the scanned resolution (i.e. a 1:1 mapping between on-screen pixels and
scanned dots). At a typical "fit page to window" zoom level, on the other hand,
it's easy to make sure that the page was scanned straight etc., but just as
easy to miss things which might hinder some future OCR process.
No process is going to be perfect, of course, but we are maybe at a point in
terms of storage availability and transmission speed where we can handle a
quality improvement for the really hard-to-find stuff.