- dotTech - http://dottech.org -
[Windows] Convert scanned PDF books and documents into electronic text files with PDF OCR
Posted By dotTech Staff On January 2, 2013 @ 12:00 AM In Windows | 15 Comments
[1]As you all know, converting scanned documents into electronic files is not the easiest process in the world. There are a few programs out there that claim to do this, but most of them fall well short of completing this goal. PDF OCR promises to be different from the others. So let’s find out how it does!
Main Functionality
PDF OCR is based on OCR (Optical Character Recognition) technology. The idea is for this program to convert scanned PDF files (paper books, documents, etc.) into editable electronic text files. PDF OCR comes with a built-in text editor, which allows you to edit the OCR results without using MS Word. The program also supports a batch mode that converts all pages of a PDF file to text in one go. The program comes with a Scanned Image To PDF Converter as well, which means you can create your own scanned PDF books.
Perfect program for editing PDF files that were created using a Scan-to-PDF function that many scanners offer.
Pros
Cons
Discussion
[2]There are a few different programs out there that claim to use OCR technology. However, if you have ever tried this technology, you will know that most of them don’t work very well. The whole idea behind this technology is for the program to read non-editable text (whether from a scanned document or from a PDF file).
The technology works a lot better in theory than it does in practice. That being said, I did find that PDF OCR worked better than some other free OCR programs. However, in the end, I am still not sure that the program is worth the price tag.
Let’s start off with what it does right. It has an easy-to-understand interface, so everyone can use it. Also, you can use the program as a standard PDF viewer; however, most of us already have programs for that anyway. The program will let you create editable text documents from scanned documents and PDF files. If you are converting easy-to-read text, the program works most of the time. It is when you start working with images that things get a bit…odd.
Now let’s get to the problems I had with the program. As I just mentioned, the program works alright when dealing with just text. However, you are still going to need to proofread the document that it creates for you. Nine times out of ten, you are still going to find a few mistakes. If you want to pull text from images…you might as well forget about it. Sure, the program says it can, but I was not able to pull text from any image and have it come out readable. Maybe I was using the wrong kind of pictures, but if I purchased this program, I wouldn’t want to be limited in which pictures it can actually pull text from.
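If you want to put a rough number on those mistakes before buying any OCR program, one option is to type out a sample page by hand and compare it against what the program produces. The sketch below is only an illustration using Python's standard `difflib` module; it is not part of PDF OCR, and the function names `ocr_accuracy` and `ocr_mistakes` are made up for this example.

```python
import difflib


def ocr_accuracy(reference: str, ocr_output: str) -> float:
    """Return a similarity ratio (0.0 to 1.0) between a hand-typed
    reference text and the text an OCR program produced."""
    matcher = difflib.SequenceMatcher(None, reference, ocr_output, autojunk=False)
    return matcher.ratio()


def ocr_mistakes(reference: str, ocr_output: str) -> list[tuple[str, str]]:
    """Return (expected, got) word groups wherever the OCR text
    differs from the reference text."""
    ref_words = reference.split()
    ocr_words = ocr_output.split()
    matcher = difflib.SequenceMatcher(None, ref_words, ocr_words, autojunk=False)
    mistakes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Record the words that were replaced, dropped, or inserted.
            mistakes.append((" ".join(ref_words[i1:i2]), " ".join(ocr_words[j1:j2])))
    return mistakes


if __name__ == "__main__":
    typed = "The quick brown fox"
    scanned = "The qu1ck brown fox"  # hypothetical OCR output with one misread letter
    print(f"similarity: {ocr_accuracy(typed, scanned):.2f}")
    for expected, got in ocr_mistakes(typed, scanned):
        print(f"expected {expected!r}, got {got!r}")
```

A check like this only tells you where the output differs from a page you already typed, so it is a way to judge a trial version on a sample, not a replacement for proofreading.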
I wish the bad news ended there, but it doesn’t. When this program is converting anything, it becomes extremely resource hungry. As in, it was using over 80% of my computer’s resources during a standard conversion. Not only that, but the program does not seem to read or understand page breaks at all. This causes a lot of format errors even when no images are put into the mix.
If this program were free, I might say it is worth the download. However, with a price tag of $39.95 (and that is the sale price, mind you; the regular price is higher), I just can’t recommend this program to anyone. This is not the worst OCR program I have ever used, but it is far from the best, and the mediocre output quality is not worth $39.95.
Anyone looking for a free OCR solution will be hard pressed to find one, simply because good OCR is difficult to do and good OCR programs typically cost a lot of money. The best free OCR program I know of is gImageReader [3], which uses an open source OCR engine, but even gImageReader has its quirks. If anyone knows of good OCR programs (free or paid), do let us know in the comments [4] below.
Price: Free to try, $39.95 to buy
Version reviewed: 4.1
Supported OS: Windows 2000 / XP / 2003 / Vista / 7
Download size: 13.7MB
VirusTotal malware scan results: 1/46 [5]
Is it portable? No
PDF OCR homepage [6]
URL to article: http://dottech.org/91691/windows-review-pdf-ocr/
URLs in this post:
[1] Image: http://cdn.dottech.org/media/2013/01/PDF-OCR.png
[2] Image: http://cdn.dottech.org/media/2013/01/PDF-OCR-Screenshot.png
[3] gImageReader: http://dottech.org/21372/gimagereader-open-source-google-powered-ocr-optical-character-recognition-program-that-actually-works/
[4] comments: #comments
[5] 1/46: https://www.virustotal.com/file/16de9d37b85757e0747779ea972ad06657b698395e6a17441c49daa65ec27134/analysis/
[6] PDF OCR homepage: http://www.pdfocr.net/
© 2008-2012 dotTech.org | All content is the property of its rightful owner.