[personal profile] pauamma
So earlier today, I emailed the following to one of the bitsavers.org maintainers:
Subject: Home for OCRed and proofread manuals?

So I snarfed http://bitsavers.trailing-edge.com/pdf/ibm/704/24-6661-2_704_Manual_1955.pdf and I'm now OCRing it and proofreading the result. I read the http://bitsavers.trailing-edge.com/ bit that says: "Documents here are kept in a minimal subset of PDF format, just using it as a container for lossless Group 4 fax compression (ITU-T recommendation T.6) images. Contributions are normally post-processed by tools to put them in exactly this format, so that all of the documents here are the same and can be burst at some point in the future when OCR technology is mature enough to do a good job of recognition." which seems to imply you're not interested in providing a subset of those documents as OCR'd images+text searchable PDFs. But since I'm going to do it anyway, I'd like to share it with others. If you can't or won't host it on your own servers, do you know of another organization that could?
Within 10 minutes, I got this answer:
I need to update that. I have been OCRing documents for several years now.
(I answered thanking him, and asking about adding an alt= to the harvesting blocker img for the email address. More when I know more.)

You Fail At Accessibility