(2010-11-28, 8:10 pm) cb4960 Wrote:
nest0r Wrote:
I wrote a quick standalone program so that I could experiment with using OCR on Vobsubs:
cb4960 Wrote:
Nice! Seems like it would allow for consistent, optimal accuracy (within the OCR's current limitations).
nest0r Wrote:
It would be cool though if there was some kind of Anki OCR plugin. Hmmm...
After writing Capture2Text, I now have some OCR experience. Maybe I will include OCR in subs2srs itself in some future update.
Edit: I did a couple minutes of testing with the Tiger and Dragon subtitles and it worked very well.
vobsub2text is a utility that uses OCR technology to automatically convert
VOBSUB subtitles (.idx/.sub) to SubRip (.srt) subtitles.
Japanese, Chinese and English language vobsubs are supported.
Download vobsub2text v0.50 via MediaFire (source code is included)
Requires the .NET Framework and Windows 2000/XP/Vista/7.
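For anyone unfamiliar with the output format, here is a rough Python sketch of how recognized subtitle text could be written out as SubRip cues. It is not taken from the vobsub2text source: the `Cue` class and both function names are made up for illustration, and it assumes the .idx/.sub parsing and the OCR step have already produced start/end times in milliseconds plus the recognized text.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Cue:
    """Hypothetical container for one OCR'd subtitle (not from vobsub2text)."""
    start_ms: int  # display start time in milliseconds
    end_ms: int    # display end time in milliseconds
    text: str      # text recognized by the OCR engine


def format_timestamp(ms: int) -> str:
    """Format milliseconds as the SubRip timestamp HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def cues_to_srt(cues: Iterable[Cue]) -> str:
    """Render cues as .srt text: a 1-based index, a time range, the text,
    and a blank line between consecutive entries."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        time_range = f"{format_timestamp(cue.start_ms)} --> {format_timestamp(cue.end_ms)}"
        blocks.append(f"{index}\n{time_range}\n{cue.text}\n")
    return "\n".join(blocks)
```

The result of `cues_to_srt(...)` can be written to a file with UTF-8 encoding, which matters for Japanese and Chinese text.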
----
After some more experimentation I plan to integrate OCR into subs2srs for use with VOBSUB subtitles.
cb4960
Any plans to put this on GitHub?
I had trouble running this on Windows 10, so I compiled a newer version of nhocr (v0.21) from https://github.com/matthewn4444/win-nhocr, which doesn't use Cygwin, and recompiled vobsub2text with VS2015. That worked.
Source + Binaries here: https://github.com/glebm/win-compiled/issues/3
Unfortunately, nhocr doesn't seem to recognize characters in italics at all, and it often gets complex kanji wrong. I also found that dark gray text works better than pure black.
Edited: 2016-04-23, 4:44 am

![[Image: vobsub2text_v0.50.png]](http://subs2srs.sourceforge.net/vobsub2text/vobsub2text_v0.50.png)