Use subs2srs to Create Anki Decks Based on Your Favorite Movie or Show

(2010-11-28, 8:10 pm)cb4960 Wrote:
nest0r Wrote:
cb4960 Wrote:
nest0r Wrote: It would be cool though if there was some kind of Anki OCR plugin. Hmmm...
After writing Capture2Text, I now have some OCR experience. Maybe I will include OCR in subs2srs itself in some future update.

Edit: I did a couple minutes of testing with the Tiger and Dragon subtitles and it worked very well.
Nice! Seems like it would allow for consistent, optimal accuracy (within the OCR's current limitations).
I wrote a quick standalone program so that I could experiment with using OCR on Vobsubs:

vobsub2text is a utility that uses OCR technology to automatically convert
VOBSUB subtitles (.idx/.sub) to SubRip (.srt) subtitles.

Japanese, Chinese and English language vobsubs are supported.
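
For readers unfamiliar with the .srt target format, here is a minimal sketch of how recognized lines could be written out as SubRip cues. It is not taken from the vobsub2text source; the function names and the (start, end, text) tuple layout are assumptions made for illustration only.

```python
def format_timestamp(ms):
    """Format milliseconds as a SubRip timestamp, HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def format_cue(index, start_ms, end_ms, text):
    """Build one cue: index line, timing line, then the subtitle text."""
    timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
    return f"{index}\n{timing}\n{text}\n"


def to_srt(cues):
    """Join (start_ms, end_ms, text) tuples into .srt content.

    Cues are numbered from 1 and separated by a blank line.
    The tuple layout is hypothetical, chosen for this sketch.
    """
    return "\n".join(
        format_cue(i, start, end, text)
        for i, (start, end, text) in enumerate(cues, start=1)
    )
```

In a real converter the text of each cue would come from the OCR step and the timings from the .idx file; here both are simply passed in.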

[Image: vobsub2text_v0.50.png]

Download vobsub2text v0.50 via MediaFire (source code is included)

Requires the .NET Framework and Windows 2000/XP/Vista/7.

----

After some more experimentation I plan to integrate OCR into subs2srs for use with VOBSUB subtitles.

cb4960

Any plans to put this on GitHub?

I had trouble running this on Windows 10, so I compiled a newer version of nhocr, v0.21, from https://github.com/matthewn4444/win-nhocr (which doesn't use Cygwin) and recompiled vobsub2text with VS2015. That worked.

Source + Binaries here: https://github.com/glebm/win-compiled/issues/3

Unfortunately, nhocr doesn't seem to recognize characters in italics at all. It also seems to get complex kanji wrong often. I also found that dark gray text color works better than pure black.
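
In case anyone wants to try the dark gray trick, here is a rough sketch of the idea, assuming the subtitle bitmap is already loaded as rows of 8-bit grayscale values. The function name, the threshold of 16 and the gray level of 64 are all made up for illustration; none of this comes from nhocr or vobsub2text, and the values would need tuning.

```python
def soften_black(rows, threshold=16, gray=64):
    """Replace near-black pixels with dark gray.

    rows:      list of rows, each a list of 0-255 grayscale values
               (hypothetical in-memory layout for this sketch)
    threshold: pixels at or below this value count as "pure black"
    gray:      the value they are replaced with
    Returns a new list of rows; the input is not modified.
    """
    return [
        [gray if px <= threshold else px for px in row]
        for row in rows
    ]
```

For example, `soften_black([[0, 10, 200]])` leaves the light pixel alone and lifts the two dark ones to 64.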
Edited: 2016-04-23, 4:44 am