Japanese VOB sub to plain text? - Printable Version
+- kanji koohii FORUM (http://forum.koohii.com)
+-- Forum: Learning Japanese (http://forum.koohii.com/forum-4.html)
+--- Forum: Learning resources (http://forum.koohii.com/forum-9.html)
+--- Thread: Japanese VOB sub to plain text? (/thread-10634.html)
Japanese VOB sub to plain text? - kodorakun - 2013-03-17

Hi All,

Is there a tried, true, and ideally free way to go from ripped DVD VOBSUB subtitles to plain text? Now that I live in Japan and can rent DVDs, there's a goldmine of subtitles waiting to be got. If anyone can help me out, I'll rent a couple of DVDs in your name and rip the subs for you (if they exist -- but almost all DVDs have subtitles).

Cheers,
K.

Japanese VOB sub to plain text? - Zarxrax - 2013-03-17

The only thing I have had any serious luck with was SmartOCR ( http://forum.koohii.com/showthread.php?tid=2480 ). I used it on a full movie, but had to OCR each image individually. It had something like 95% accuracy and took me several hours to do one movie. (You need to read each line eventually anyway, so doing it during the OCR step isn't REALLY all that much extra work. It can be part of the learning process.)

Japanese VOB sub to plain text? - toshiromiballza - 2013-03-17

Here's something from another forum:

Quote: I looked around for OCR software to process that sequence of BMPs generated by SubRip. I successfully tested Abbyy Finereader 10. It is expensive, but the 15-day trial version does the job (for 15 days anyway). Here is what I did:

Japanese VOB sub to plain text? - Oniichan - 2013-03-17

Maybe this... http://forum.koohii.com/showthread.php?pid=120666#pid120666

Japanese VOB sub to plain text? - kodorakun - 2013-03-18

Oniichan Wrote: Maybe this...

Thanks oniichan, I think this is probably the easiest method. I tried it out and it worked fine and relatively fast, with minimal management too. I have to run a virtual box with Windows, but that's not so bad. If you have any subtitle requests, let me know and I'll see if Tsutaya has anything in stock.

k.

Japanese VOB sub to plain text? - Zarxrax - 2013-03-18

That is actually working for you? When I tried it on a movie, the accuracy was somewhere around 30-50% or so. Completely unusable for me. Maybe it was just the movie I had, I guess.
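The accuracy figures quoted so far (about 95% versus 30-50%) are rough impressions. If you have hand-corrected a sample of lines, one way to put a number on it is a character-level comparison. The sketch below is only an illustration using Python's standard `difflib`; the function name `char_accuracy` is hypothetical and is not part of SmartOCR, SubRip, or any tool mentioned in this thread.

```python
from difflib import SequenceMatcher


def char_accuracy(ocr_text: str, corrected_text: str) -> float:
    """Fraction of the corrected text's characters that the OCR output matched.

    Hypothetical helper for illustration only; not part of any OCR tool.
    """
    if not corrected_text:
        # Nothing to match: perfect only if the OCR output is empty as well.
        return 1.0 if not ocr_text else 0.0
    # autojunk=False keeps difflib from ignoring very frequent characters
    # (kana such as "の" repeat a lot in long subtitle files).
    matcher = SequenceMatcher(None, ocr_text, corrected_text, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(corrected_text)


if __name__ == "__main__":
    # OCR read the final "?" as "7"; the other six characters match.
    print(char_accuracy("どこに行くの7", "どこに行くの?"))
```

Running this over a few dozen corrected lines from one disc would show whether that disc is closer to the 95% case or the 30-50% case before you commit to a whole movie.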
Japanese VOB sub to plain text? - Rayath - 2013-03-18

kodorakun, sorry for going a little off-topic, but you got me interested. So most of the dramas and movies on DVD in Japan have subtitles (even older ones)? Is Tsutaya the best place to rent? How much do they charge for rentals?

Japanese VOB sub to plain text? - kodorakun - 2013-03-18

Zarxrax: No problems at all for me. I ripped the subs to an .idx file and loaded them up, and everything went fine. The subs themselves (in plaintext form) had some glitches here and there, but after eye-scanning them they looked more or less reliable and worthwhile. As far as I could tell, the glitches were caused by weird symbols like parentheses, question marks, or Japanese quotation marks.

Rayath: Yeah, dude! I don't know why people on this forum haven't gone Tsutaya crazy, but there are HUGE selections of Western films, Western dramas, Japanese films, and Japanese dramas, and they all have Japanese subtitles (though the J subtitles often don't match the J-dub on foreign films). Some movies don't come with subtitles, to be fair... Unfortunately for anime fans, anime seems to be the one category that almost uniformly does NOT include subtitles. The more popular animations do sometimes have subtitles, but long series (e.g. One Piece) do not.

It's 100 yen for ONE WEEK to rent DVDs that aren't new releases. And you can also rent Japanese music CDs... As far as I can tell, Tsutaya is a jackpot for media content. New releases are something like 300 or 400 yen for a night or two; I can't remember... I usually rent Aibou or some older content, as there is so much available and it's cheap.

K

Japanese VOB sub to plain text? - Rayath - 2013-03-18

Wow, sounds really good. So I've read that if you rip subs from DVDs, they come in an image format. Can you use them like that with AVI files later, or do you need to convert them to text somehow?

Japanese VOB sub to plain text? - tokyostyle - 2013-03-18

kodorakun Wrote: Thanks oniichan, I think this is probably the easiest method. I tried it out and it worked fine and relatively fast, with minimal management too. I have to run a virtual box with Windows, but that's not so bad.

subs2srs and vob2text both work perfectly fine in Wine and CrossOver. For CrossOver specifically, all you need to do is create a bottle and install the Microsoft .NET 3.5 SP1 package, and it will sort out all of the dependencies for you.

Zarxrax Wrote: That is actually working for you?

Just use the .SRT file from vob2text as the L2 and the .IDX from the vobsub as the L1, and then the original subtitles will be on the back of your card for corrections.

Japanese VOB sub to plain text? - kodorakun - 2013-03-19

Rayath Wrote: Wow, sounds really good. So I've read that if you rip subs from DVDs, they come in an image format. Can you use them like that with AVI files later, or do you need to convert them to text somehow?

Well, they come out in VOBSUB format, which is kind of like an image. If you use subs2srs to process the vobsub files, they get inserted into Anki decks as PNG files (images) after some processing. After the OCR processing you get an .srt file, which is plaintext.

In any case, skipping all that, you can rip DVDs with Handbrake and select all the subtitle tracks to be soft-coded (not permanently displayed) into the ripped mkv/mp4 file. You can extract these subs to a file if you want, or just leave them embedded in the video file so you can turn them on and off or swap between E and J (if both E and J are available). Or you can even manually insert E subs you find online during the encoding. Look up Handbrake, it's pretty sweet.

K.

Japanese VOB sub to plain text? - eslang - 2013-04-30

Zarxrax Wrote: That is actually working for you?

It works pretty well (about 80% accuracy) on some movie (idx+sub) files, but not on others (almost 90% gibberish)... If I'm not mistaken, it may depend on the background color, font color, font border, etc.; basically, the colors and transparency of the subtitle file have to suit the OCR for it to work properly.

Personally, I find it easier to transcribe from the image files. On average, it takes about 3-4 hours for a 90-minute (Japanese) movie.
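Since the OCR output seems to be either mostly usable or mostly gibberish depending on the disc, a quick triage of the resulting .srt can show which case you have before you start correcting by hand. The following is a minimal sketch, assuming a standard SRT layout (index line, timing line, text lines, blank line). The function names and the 0.6 threshold are hypothetical choices for illustration, not anything taken from vob2text or subs2srs.

```python
import re

# CJK punctuation, hiragana, katakana, common kanji, and fullwidth forms.
JAPANESE_CHAR = re.compile(r"[\u3000-\u30ff\u4e00-\u9fff\uff00-\uffef]")


def parse_srt(text: str):
    """Yield (index, timing, subtitle_text) for each cue in an SRT string."""
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # A well-formed cue has an index line, a timing line, and some text.
        if len(lines) >= 3 and "-->" in lines[1]:
            yield lines[0].strip(), lines[1].strip(), "".join(lines[2:])


def japanese_ratio(subtitle: str) -> float:
    """Share of non-space characters that fall in the Japanese ranges above."""
    chars = [c for c in subtitle if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if JAPANESE_CHAR.match(c)) / len(chars)


def flag_suspect_cues(srt_text: str, threshold: float = 0.6):
    """Return (index, text) for cues that look like OCR gibberish.

    The threshold is an arbitrary assumption; tune it on your own files.
    """
    return [
        (index, subtitle)
        for index, _timing, subtitle in parse_srt(srt_text)
        if japanese_ratio(subtitle) < threshold
    ]


if __name__ == "__main__":
    sample = (
        "1\n00:00:01,000 --> 00:00:03,000\nどこに行くの\n\n"
        "2\n00:00:04,000 --> 00:00:06,000\nl'|_ }[ ;:\n"
    )
    for index, subtitle in flag_suspect_cues(sample):
        print(index, subtitle)
```

If most cues get flagged, the disc is probably one of the gibberish cases and transcribing from the images may be faster. If only a few do, those are the lines to check against the original subtitle images first. Note that a heuristic like this will also flag legitimate lines that are mostly Latin text or digits.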