Another option: if you print it out and have a copier at work, enlarge the
printout a bit to make it more legible, then go over it with a pen and rescan it.
Sent from my iPhone
On Nov 30, 2025, at 14:25, Wayne S <wayne.sudol(a)hotmail.com> wrote:
Like they say, garbage in, garbage out.
Pre-processing will make things much faster and cleaner.
You might want to acquire one of those cheap HP desktop scanners with a sheet feeder,
then print out the patent code on 8 1/2 x 11 paper and go over it with a pen.
Then rescan it and see if it makes a difference.
On Nov 30, 2025, at 14:19, Adrian Godwin <artgodwin(a)gmail.com> wrote:
No, it's a scanned listing from a patent. I think it's on the Australian HP museum
site. But that does make me wonder whether preprocessing the input image files would be
faster than postprocessing the text output files.
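
As a rough illustration of what preprocessing the image files could look like, here is a minimal sketch that binarizes a greyscale scan with Otsu's method, so faint print becomes solid black on white before OCR. It assumes the page is already loaded as rows of 0-255 grey levels; the `Gray` type and the function names are hypothetical, and whether this actually helps on the patent listing is untested.

```python
from typing import List, Optional

Gray = List[List[int]]  # rows of 0-255 grey levels (assumed in-memory format)


def otsu_threshold(image: Gray) -> int:
    """Return the grey level that best separates dark ink from light paper."""
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_level, best_variance = 0, -1.0
    weight_dark = 0  # number of pixels at or below the candidate level
    sum_dark = 0     # sum of grey levels of those pixels
    for level in range(256):
        weight_dark += hist[level]
        if weight_dark == 0:
            continue
        weight_light = total - weight_dark
        if weight_light == 0:
            break
        sum_dark += level * hist[level]
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        # Between-class variance; the first level that maximises it wins.
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_variance, best_level = variance, level
    return best_level


def binarize(image: Gray, threshold: Optional[int] = None) -> Gray:
    """Map every pixel to pure black (0) or pure white (255)."""
    if threshold is None:
        threshold = otsu_threshold(image)
    return [[0 if value <= threshold else 255 for value in row] for row in image]
```

A single global threshold is the simplest choice; unevenly lit or stained scans may need a per-region threshold instead.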
On Sun, 30 Nov 2025, 21:53 Wayne S <wayne.sudol@hotmail.com> wrote:
If you have the paper copy, you can help the OCR by using a pen to lightly trace over
faint or missing characters.
It helps quite a bit and is fast to do.
On Nov 30, 2025, at 13:42, Adrian Godwin via cctalk
<cctalk@classiccmp.org> wrote:
I've been trying to OCR the lineprinter listing of the 9815 calculator. I
tried a few OCR tools and the most accurate was onlineocr.net.
Its output is still full of errors and the service is limited to 15 pages per day,
but it may be useful for some needs.
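
For the postprocessing side, a minimal sketch of one common cleanup is shown here: repairing look-alike characters in tokens that are meant to be numeric. It assumes, purely for illustration, that the listing has decimal or octal fields; the substitution table and function names are hypothetical, and it would be wrong for hex fields (where `B` is a legitimate digit).

```python
import re

# Assumed look-alike substitutions; adjust to the errors actually seen.
LOOKALIKES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)


def fix_numeric_token(token: str) -> str:
    """Repair a token only if it has a digit already and becomes all digits."""
    if not any(ch.isdigit() for ch in token):
        return token  # mnemonics, labels and whitespace are left alone
    candidate = token.translate(LOOKALIKES)
    return candidate if candidate.isdigit() else token


def fix_line(line: str) -> str:
    """Apply the token repair across a line, keeping the original spacing."""
    # The capturing group keeps the whitespace runs, so columns stay aligned.
    return "".join(fix_numeric_token(part) for part in re.split(r"(\s+)", line))
```

The rule is deliberately conservative: a token is changed only when the substitutions turn it into pure digits, so ordinary words pass through untouched.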
> On Sat, Nov 29, 2025 at 6:51 AM bostjan spetic via cctalk
> <cctalk@classiccmp.org> wrote:
>
> I use a simple script that uploads each file to ChatGPT and collects the
> transcript. It works shockingly well compared to all OCR solutions.
>
>
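
The quoted message does not include its script, so the following is only a sketch of the general batch pattern it describes: walk the page images, hand each one to a transcription step, and save one text file per page. The `transcribe` callable is a hypothetical stand-in for whatever upload call is actually used; no particular service API is assumed here.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def transcribe_pages(
    pages: Iterable[Path],
    transcribe: Callable[[bytes], str],  # hypothetical: image bytes in, text out
    out_dir: Path,
) -> Dict[str, Path]:
    """Transcribe each page once and return a map of page name to text file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: Dict[str, Path] = {}
    for page in sorted(pages):
        target = out_dir / (page.stem + ".txt")
        if not target.exists():  # skip pages finished on an earlier run
            text = transcribe(page.read_bytes())
            target.write_text(text, encoding="utf-8")
        written[page.name] = target
    return written
```

Skipping pages that already have a transcript means an interrupted run, or one stopped by a daily page limit, can be resumed without repeating uploads.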