sarenya Wrote: Hello! I have read this whole thread, though not recently, so I apologize if this information is present somewhere and I've missed it.
I'm currently living in Japan and so have access to all sorts of fun Japanese movie content (Lord of the Rings dubbed, say). I'm wondering what the best way of going all the way from a physical DVD all the way to Anki cards is. It's my understanding that this is currently to manually convert the DVD to a movie, then convert the DVD subtitle tracks to external subtitle files, then feed that collection of files into subs2srs. If this is the case, that's fine, but if there are shortcuts available I'd be eager to exploit them.
In any event, this is an incredible tool and an amazing use of technology to take some of the gruntwork out of getting language resources.
I have done The Dark Knight, Men in Black, Casino Royale, etc.
I even picked straight from streaming for long shows like Gintama when I couldn't find all the material (I came up with 3 different strategies just for streaming), so by now I'm pretty experienced with this stuff. I think I've tried just about every possible strategy, and I even wrote a spreadsheet for personal use in order to compare them. And as I've been doing this for months, I know how each of them really works in the long term (at least for me).
So take my word for it: unless you're a die-hard fan of the movie (I did this flick too, by the way) or you already have perfectly timed Japanese subtitles, do NOT mistake quantity for quality by importing the whole goddamn movie. It's time-consuming, mind-confusing, and fun-spoiling.
I must go with Khatz on this one: it's sentence picking, not media mining.
Remark: It would take too long to go into details, so I'll go straight to my point.
I'll distinguish between net time and raw time: net time means how much time you REALLY spend on a task, while raw time is the total elapsed time. For example, extracting a video can take up to 5 hours depending on what you're looking for, but you can do something else during that time.
Favorite way:
- Just extract the Japanese video: DVDFab is the best mix of effectiveness and simplicity. You can get it quicker or in better quality with other combinations of tools (see remark).
  Net time: 5 minutes. Raw time: 3-4 hours.
- Demux the audio: VirtualDub is the most basic tool (see remark). Net time: 2 minutes.
- From then on, watch the movie whenever you feel like it.
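To make the net-time versus raw-time split concrete, here is a small Python sketch that totals both for the two preparation steps above. The `Step` class and step names are only an illustration. The figures are the ones quoted in this post, with two assumptions: the 3-4 hours of raw extraction time is taken at its upper bound, and the demux step is assumed to involve no extra waiting (the post only gives its net time).

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    net_minutes: int  # time you actually spend at the keyboard
    raw_minutes: int  # total elapsed time, including unattended waiting


# Figures from the post. Raw extraction time uses the upper bound (4 hours).
# The raw time for demuxing is an assumption (set equal to its net time).
STEPS = [
    Step("extract video", net_minutes=5, raw_minutes=4 * 60),
    Step("demux audio", net_minutes=2, raw_minutes=2),
]


def totals(steps):
    """Return (total net minutes, total raw minutes)."""
    net = sum(s.net_minutes for s in steps)
    raw = sum(s.raw_minutes for s in steps)
    return net, raw


if __name__ == "__main__":
    net, raw = totals(STEPS)
    print(f"net: {net} min, raw: {raw} min")
```

Under those assumptions the hands-on part is a few minutes even though the whole thing spans hours, which is the point of separating the two numbers.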
There are 2 different ways.
The first time you watch the media (movie, anime, drama, etc.):
Watch with the timer visible (so keep the mouse at the bottom of the screen) and take snapshots to avoid breaking the flow. Zsscreen is the best. Here too there are several ways, but just go for screenshots (see remark).
Avoid taking more than 30 shots; otherwise it gets tedious. When I take more than 30 shots, I just keep the first 30 and everything else goes into a stock folder.
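The 30-shot rule is easy to automate. The following is a minimal Python sketch, not a feature of any screenshot tool: `cap_screenshots` is a hypothetical helper, the folder names are up to you, and it assumes the screenshot tool names its files so that sorting by name matches the order in which they were taken (numbered or timestamped names).

```python
import shutil
from pathlib import Path


def cap_screenshots(shots_dir, stock_dir, limit=30):
    """Keep the first `limit` screenshots and move the rest to `stock_dir`.

    Returns the names of the files that were moved.
    """
    shots_dir, stock_dir = Path(shots_dir), Path(stock_dir)
    # Sorting by name assumes numbered or timestamped file names.
    shots = sorted(p for p in shots_dir.iterdir() if p.is_file())
    extra = shots[limit:]
    if extra:
        stock_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in extra:
        shutil.move(str(path), str(stock_dir / path.name))
        moved.append(path.name)
    return moved
```

Calling `cap_screenshots("shots", "stock")` after a viewing session would leave at most 30 files to time and park the rest for later.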
From there you do the timing:
Load the audio in Aegisub (you can do it from the video, but it takes much longer; Aegisub can also extract the audio from the video, but that is still longer). Net time: 2 minutes.
The most efficient way for me is to use "on top" with Aegisub, so that I can see the screenshot with its timer in the background while Aegisub stays in the foreground. This way I skip straight to the scene I'm interested in. Once you have mastered the shortcuts it takes about 5 minutes for 20 sentences and 10 for 50 sentences (I get quicker once I'm in the mood). I don't like typing in Aegisub: it's awesome for timing if you know the shortcuts, but it's very awkward for typing no matter how good you are. I import straight away into Anki and do the typing there. This way you can review right on the spot. NEVER MAKE IT A CHORE. That's where sentence mining goes wrong.
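For the import step, one possible route is to save the timed lines from Aegisub as an .srt file and turn them into a tab-separated text file that Anki can import. The sketch below assumes that route; it is not necessarily how the import described above is done. The function names, the `clip_001.mp3` naming scheme, and the three-column layout are all hypothetical, and the script does not cut the audio itself. The sentence column may be empty, since the typing happens in Anki afterwards.

```python
import re

# One SRT timing line looks like: 00:01:02,500 --> 00:01:04,000
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h, m, s, ms):
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, sentence) from SRT text."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the (possibly empty) text.
                sentence = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((_seconds(*g[:4]), _seconds(*g[4:]), sentence))
                break
    return cues


def to_anki_rows(cues, prefix="clip"):
    """Build tab-separated rows: sentence, [sound:...] tag, start time."""
    rows = []
    for n, (start, _end, sentence) in enumerate(cues, 1):
        # The clip file name is a placeholder; the audio must be cut separately.
        rows.append(f"{sentence}\t[sound:{prefix}_{n:03d}.mp3]\t{start:.3f}")
    return "\n".join(rows)
```

Writing the result of `to_anki_rows(parse_srt(text))` to a UTF-8 text file gives one note per timed line, with the start time kept so the scene can be found again.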
When you already know the movie: just watch it any time you want. It works even better, as you have a grasp of the whole movie and so you know what is most interesting to you. Watch 4-5 scenes, or a single one, or change movies if you want. Just have fun. Repeat the same procedure as mentioned above, except this time there is no need for snapshots: you just stop and input the sentence as it is into Aegisub.
The more movies you have, the better it works. So a size of 700 MB to 1.2 GB per movie is plenty: I have about 20 movies on my hard disk right now and I couldn't afford 6.6 GB apiece (not to mention the HD behemoths). High-quality pictures are wrong on every count: wrong for movie extraction, wrong for movie storage, wrong for Anki storage and online backup, wrong for Anki review.
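A quick back-of-the-envelope check of the storage argument, as a Python sketch. It uses the figures from this post (about 20 movies, 700 MB to 1.2 GB for a compressed rip, 6.6 GB for a full-size one) and assumes decimal units, so 1 GB is counted as 1000 MB; `library_size_mb` is just an illustrative helper.

```python
def library_size_mb(movie_count, size_per_movie_mb):
    """Total disk space in MB for `movie_count` movies of the same size."""
    return movie_count * size_per_movie_mb


MOVIES = 20  # "about 20 movies" in the post

small_low = library_size_mb(MOVIES, 700)    # compressed rips, lower bound
small_high = library_size_mb(MOVIES, 1200)  # compressed rips, upper bound
full_size = library_size_mb(MOVIES, 6600)   # full-size rips

if __name__ == "__main__":
    print(f"compressed: {small_low / 1000:.0f}-{small_high / 1000:.0f} GB")
    print(f"full size:  {full_size / 1000:.0f} GB")
```

With these numbers the compressed library comes to 14-24 GB against 132 GB for full-size rips.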
N.B.: I forgot to mention that I use premade subtitles or scripts once in a while (I got the Kenshin 追憶編 text file today), because it would be plain stupid not to take advantage of them. But since that restricts the choice of media, I don't make it a requirement, and most of the time I just watch what I want and write the subs myself. More fun.
N.B.2: There are obviously plenty of other strategies and technical details I glossed over. For instance, I never talked about extracting DVD subtitles for the timing, as that clearly goes into media mining and I'm against it. I also didn't talk about VBRflux for the audio, or about splitting extraction and compression, as I would be compelled to talk about tools. As I said earlier, there's so much ground to cover that I just talked about my favorite strategy: less time-consuming, better results, more fun.
Edited: 2009-10-21, 12:37 am