Joined: Mar 2008
Posts: 1,049
Thanks:
4
I gave it a try just now on a couple of full pages of text, and it did remarkably well in my opinion. I "scanned" the pages by photographing them with my digital camera, so the images were quite noisy. Without any involvement on my part, it found all the text and recognized it with about 95% accuracy. Furigana seemed to confuse it a little, though. This really beats typing everything out by hand!
Joined: Mar 2007
Posts: 3,851
Thanks:
0
Wonder if it'll work under Wine (CrossOver).
Joined: Mar 2008
Posts: 1,049
Thanks:
4
Press the 3rd button on the toolbar to open a file. The little dropdown beside the button lets you choose the source: file, folder, TWAIN device, or clipboard.
So anyway, you click that button and load a source image for it to OCR.
The image should come up now, with a box drawn around each part it recognizes as text, each labeled with a yellow number. At this point there are a ton of options you can play around with, and I don't understand them all that well myself... but you can draw your own boxes around text that you want it to try to recognize.
The far right side shows what it recognized. If you notice something is wrong, you can right-click it, and I think it displays a list of alternatives. The dropdown box on this window lets you change the layout.
Finally, click the save button to save the text to a file. I prefer saving as plain text files; otherwise it tries to do weird things with the formatting.
Joined: Jun 2008
Posts: 2,009
Thanks:
1
Thank you. I managed to get as far as opening a file, but got lost at first.
Do you know if it will accept more than one page of a PDF? I opened one and only a seemingly random page showed. Perhaps I'm missing how to cycle through the pages...
Joined: Mar 2008
Posts: 1,049
Thanks:
4
Sorry, dunno.
A workaround might be to screenshot each page of the PDF and work with those images instead.
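If screenshotting every page by hand gets tedious, here is a rough sketch of the same workaround done with a script. This is not something from the OCR software itself: it assumes you have poppler's `pdftoppm` command-line tool installed, and the function names, folder names, and the 300 dpi default are just ones I made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, out_dir, dpi=300):
    """Return the pdftoppm argument list that renders every page to PNG.

    Assumption: poppler's pdftoppm is the renderer. It writes one file per
    page, named <prefix>-<page number>.png.
    """
    pdf_path = Path(pdf_path)
    prefix = Path(out_dir) / pdf_path.stem
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(prefix)]


def pdf_to_page_images(pdf_path, out_dir, dpi=300):
    """Render each page of the PDF to its own PNG and return the image paths."""
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm not found; install poppler first")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_pdftoppm_command(pdf_path, out, dpi), check=True)
    # pdftoppm zero-pads page numbers, so a plain sort keeps page order.
    return sorted(out.glob(f"{pdf_path_stem(pdf_path)}-*.png"))


def pdf_path_stem(pdf_path):
    """File name of the PDF without its extension (hypothetical helper)."""
    return Path(pdf_path).stem
```

Calling something like `pdf_to_page_images("book.pdf", "pages")` should leave one image per page in the `pages` folder, which you could then load through the "folder" choice in the open button's dropdown.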
Edited: 2009-02-01, 8:04 pm
Joined: Mar 2008
Posts: 1,049
Thanks:
4
I checked out the RealReader Lite software; it seems to be a continuation of SmartOCR Lite.
I couldn't get version 8 to work right though, and version 7 just seems like SmartOCR with different colors in the interface.
In any case, does anyone know if there is a way to batch process files in this software? I am trying to OCR some DVD subtitles. It's very accurate, but I'm currently doing them one at a time, which is taking me forever.
I can load all of the files in at once, but I can't find a way to batch process them and save the results as text.