If you OCR, always archive the bitmaps too - Re: Regarding Manuals

couryhouse couryhouse at aol.com
Sun Sep 27 14:09:29 CDT 2015

We keep the graphics files as an archive and to print from for displays. ... To read and search, the PDF with inlaid OCR is for reference.   Ed# www.smecc.org


-------- Original message --------
From: Toby Thain <toby at telegraphics.com.au> 
Date: 09/27/2015  11:07 AM  (GMT-07:00) 
To: General at classiccmp.org, "Discussion at classiccmp.org:On-Topic and Off-Topic Posts" <cctalk at classiccmp.org> 
Subject: Re: If you OCR, always archive the bitmaps too - Re: Regarding Manuals 

On 2015-09-27 12:22 PM, Pontus Pihlgren wrote:
> On Sun, Sep 27, 2015 at 04:08:07PM +0200, Johnny Billquist wrote:
>> I don't have problems reading the current scans, as such. But when
>> having ten of these open at the same time, and scrolling through
>> them, it becomes obvious that the bitmaps are heavy. It can take a
>> while for the screen to be updated. Not to mention the problems you
>> sometimes hit with searching...
> It seems to me that a better tool could solve the issue. One that
> could display the OCR:ed content only and the scanned content
> only when desired, for instance when you suspect an error.
> Is there such a reader? Is the content organised to make it
> possible?
> /P

Right, if the bitmaps aren't available, then it's not an acceptable archive.

Personally I never, ever, want to see the OCR'd version. But that may be 
coloured by a career as typographer and finished artist. No software can 
apply the judgment that humans did in the print edition; it's only more 
or less degrading steps from that point on.

And to be clear I'm not at loggerheads with Johnny because I am indeed 
talking about acceptable archiving practice, not some conversion of a 
particular text which might be useful for a particular person on a 
particular day.
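
Pontus's question above, about a reader that shows the OCR text by default and the scan only on request, could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing tool: the `Page` type, the `search` and `view` helpers, the sample text, and the `scans/` paths are all hypothetical. The point is that every page keeps a pointer to its archived bitmap, so the OCR layer stays a convenience and the scan stays the reference.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Page:
    """One scanned page (hypothetical layout, for illustration only)."""
    number: int
    ocr_text: str   # searchable OCR output, may contain misreads
    bitmap: Path    # archived scan, treated as the authoritative copy


def search(pages, term):
    """Return the pages whose OCR text contains term, ignoring case."""
    needle = term.lower()
    return [p for p in pages if needle in p.ocr_text.lower()]


def view(page, show_scan=False):
    """Return the OCR text by default, or the bitmap path when asked."""
    return page.bitmap if show_scan else page.ocr_text


# Made-up sample data; page 2 carries a typical OCR misread ("l" for "I").
pages = [
    Page(1, "Installation Manual", Path("scans/page-0001.tif")),
    Page(2, "Chapter 1: lntroduction", Path("scans/page-0002.tif")),
]
```

A search for "Introduction" finds nothing here because of the misread on page 2, which is the case where falling back to `view(page, show_scan=True)` and the archived bitmap matters.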

