
Evince Blows my Mind!

A couple of days ago I was emailed a scanned invoice as a PDF. I was planning to just print it and file it, as the tax office here still requires dead tree records for 7 years, last time I checked. Before printing it on 100% post-consumer waste recycled paper, I opened it in evince. Nothing spectacular in any of that.

Then it happened: I accidentally clicked and dragged on the page. All of a sudden evince was highlighting the printed text on the page. This was a bitmap embedded in a PDF. Evince was using OCR to highlight the contents of the page.

Every so often there are moments when I am amazed by the features talented hackers add to FOSS. This was one of those moments. I will never look at evince the same way again.

evince showing the scanned page

evince showing the scanned page with highlighted text using OCR

I had a similar reaction when properly using the awesomebar in Firefox 3 for the first time.

Update: After seeing the comment below from Mr X, I checked evince with a few more PDFs and, unfortunately, evince wasn't doing OCR in real time. The text is embedded in the PDF. Maybe one day this will be possible. Any evince developers reading, please consider this a feature request.

I am still impressed with evince, just a little less impressed than I was.
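
For anyone who wants to nose through a PDF the way Mr X suggests, here is a rough Python sketch of one way to check for an embedded text layer. It is a heuristic, not how evince or poppler actually do it: it looks for text-showing operators (Tj and TJ) in any stream that is not an image. The function name has_text_layer and the file name in the usage comment are made up for illustration. It only understands uncompressed and Flate-compressed streams, so PDFs that use other filters or object streams may be misreported.

```python
import re
import sys
import zlib

# One indirect object that carries a stream: header text, then the stream data.
STREAM_OBJ_RE = re.compile(
    rb"\d+\s+\d+\s+obj\b((?:(?!endobj).)*?)stream\r?\n(.*?)endstream",
    re.DOTALL,
)
# A string or array operand followed by a text-showing operator (Tj or TJ).
SHOW_TEXT_RE = re.compile(rb"[)>\]]\s*T[jJ]\b")
# Marks an image XObject, e.g. the scanned page bitmap.
IMAGE_RE = re.compile(rb"/Subtype\s*/Image\b")


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristically report whether any non-image stream shows text."""
    for header, raw in STREAM_OBJ_RE.findall(pdf_bytes):
        if IMAGE_RE.search(header):
            continue  # skip the bitmap itself; random bytes could look like operators
        try:
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            data = raw  # not Flate-compressed; inspect the bytes as they are
        if SHOW_TEXT_RE.search(data):
            return True
    return False


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python check_text_layer.py scanned-invoice.pdf
    with open(sys.argv[1], "rb") as handle:
        found = has_text_layer(handle.read())
    print("text layer found" if found else "no text layer found")
```

If this reports a text layer for a scanned document, the scanner software most likely ran OCR and stored the result in the file, which is what the viewer then lets you select.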

Smart scanning, I think

Mr X wrote:

Hey Dave, I believe scanners with smart software on the host system will produce pdfs with the results from OCR embedded in them these days... try nosing through the pdf to find out.

Added Sat, 2008-06-07 06:36

Mr X is likely right

sgauna wrote:

I wanted to test this, so I created a jpeg in gimp with some text, opened up OpenOffice.org Writer, wrote some text with that, and then inserted the jpg image. I exported to PDF, opened it with evince, and was unable to select the text from the jpeg image. :/

Whatever created the PDF is probably responsible for the OCR translations, and not evince.

My versions:
gimp - GNU Image Manipulation Program version 2.4.5
oowriter - OpenOffice.org 2.4.1
evince - Evince Document Viewer 2.22.2 (using poppler 0.6.4 (cairo))

Added Sun, 2008-08-10 13:13