Press F1
 
Thread ID: 99911 2009-05-19 07:07:00 Lifting text from scanned doc in PDF form? supersi (8401) Press F1
Post ID Timestamp Content User
775104 2009-05-19 07:07:00 Does anyone know of a program which can select only the text from a PDF document that consists of a scanned page from a book?

I don't want to chew through laser printer toner by reprinting every shadow and block of black from the scanned image. See the pic to see what I'm talking about. Attached file: Untitled1242713243.jpg (www.imagef1.net.nz) (70 KB)
supersi (8401)
775105 2009-05-19 07:09:00 Just scan it with some graphics program, then erase the black borders. Only thing is, it'll be a JPEG, not text. Speedy Gonzales (78)
775106 2009-05-19 07:11:00 I don't have access to the original book. I only have the PDF. supersi (8401)
775107 2009-05-19 07:17:00 Try the Text Viewer tool in Foxit. I think that does what you want. rumpty (2863)
775108 2009-05-19 07:19:00 Try this:
www.a-pdf.com

Or you can use the select tool in Foxit or Adobe Reader, then copy and paste into Word.

Blam
Blam (54)
775109 2009-05-19 09:00:00 You could try it online at OCR Terminal (ocrterminal.com/), but it requires you to sign up. It might be useful if you have several scanned PDFs to extract text from. A summary of how to use OCR is at Digital Inspiration (www.labnol.org). More options with links/info for PDF to Text/Word, etc. here... (search.labnol.org) kahawai chaser (3545)
775110 2009-05-19 10:16:00 hint

Microsoft Office will let you edit text scanned in as *.tif (OCR). Look in the tools area of the Start.. Programs.. MS Office submenu.
beama (111)
775111 2009-05-19 10:21:00 Also see what options are available in OpenOffice. beama (111)