
Use subs2srs to Create Anki Decks Based on Your Favorite Movie or Show

#1
Hello,

subs2srs allows you to create import files for Anki or other Spaced Repetition Systems (SRS) based on your favorite foreign language movies and TV shows to aid in the language learning process.

Download the latest version of subs2srs via SourceForge

View the documentation for subs2srs (includes screenshots)

You will need Windows (XP/Vista/7) with .NET Framework v3.5 installed.

Have Fun!
cb4960
Edited: 2014-06-21, 9:43 am
#2
Well done, cb4960! I must say this program looks wonderful! I haven't experimented much with putting audio, pictures, etc. into Anki yet, but this program makes me want to try it!

I've seen a ton of shows and movies that would have some excellent lines to know, but I would usually just add them as plain text with no other accompanying items (besides a reference to where it came from). However, this seems like a much better way to do that, and quite simple too!

Thanks a lot! As soon as I get a chance (not soon) I'll try it out, but I'm sure someone will beat me to that.
#3
Very nicely done. It's too bad Japanese subs are still hard to come by.
I'll certainly be using this where I can.
#4
I'd love to see a Linux port of this. Cool stuff.
#5
stoked Wrote: I'd love to see a Linux port of this. Cool stuff.
Ooo, yes, I forgot to say that. A Linux port would be amazing.
#6
Thanks! This looks a lot easier.
#7
If you can, just virtualize Windows with VMware or VirtualBox. It works incredibly well.

I'm going to try that program very soon. :D

Thanks a lot for making it simpler for mere mortals. ;)
#8
cb4960 Wrote: Hello,

I have created a new version of subs2srs. See the usage file (attached below) to get an idea of what you can do with this utility.
This is fully awesome! Will you ever make the source code available? [An interested developer]
#9
cb4960, this is just too awesome. Seriously, my hat's off to you, pal.
#10
Just want to say I'm 100 cards into this deck already and the file works brilliantly. It's such a great tool, with maybe the best potential of any I've used. It provides natural native sound files with full Japanese subs, so you can look up words and grammar you don't know, all offered in sentence-by-sentence chunks that you can SRS. It's a goldmine. Today I rewatched the first 15 minutes of the anime, and all the little bits I couldn't figure out before are clear as day. Awesome, thanks. Unfortunately I work on a Mac, so I'll have to track down a sympathetic friend to create other decks with this program and share them. Many, many thanks.
Edited: 2009-02-01, 3:07 am
#11
GD! That's cool!
#12
This is just amazing. You are very talented to create such an application and very generous to share it with others.
#13
First, great respect and thanks for this program! I have a problem/question concerning snapshots:
I downloaded AviSynth 2.5 and chose the option to create snapshots.
subs2srs created audio files and a text file; however, the snapshots were not created (the media directory did not include them).
I took a screenshot of the snapshot-ripping process, and it reads like this:

Source :
*Filename:"d:\....avs"
*Fourcc: None (RGB32)
*Frames: 240
*Resolution: 1480x342
*Frame rate: 24.000 FPS
Compressor:
* No Recompression
Destination:
*Format: Null

I did not change the standard framerate and other settings in subs2srs.

Also, while the sub file I have seems pretty consistent in timing, I found out that the subs are a little bit behind the sound, so 你好 only produces 好 sound-wise. So maybe an adjustment option would be good to have. (The people who sub stuff probably each go by their own definition of what "correct" timing is.)
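
Until something like that exists in the tool, one workaround is to shift the subtitle timestamps before feeding the file to subs2srs. The following is only a minimal sketch, assuming the subtitle file is in SRT format with `HH:MM:SS,mmm` timestamps. The names `shift_srt` and `shift_timestamp` are hypothetical helpers made up for this example and are not part of subs2srs.

```python
import re

# Matches SRT-style timestamps such as 00:01:02,345 (assumed input format).
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(match, offset_ms):
    """Return one timestamp shifted by offset_ms, clamped so it never goes below zero."""
    h, m, s, ms = (int(g) for g in match.groups())
    total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
    total = max(total, 0)
    h, rest = divmod(total, 3_600_000)
    m, rest = divmod(rest, 60_000)
    s, ms = divmod(rest, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def shift_srt(text, offset_ms):
    """Shift every timestamp in the SRT text; a negative offset moves the subs earlier."""
    return TIMESTAMP.sub(lambda mt: shift_timestamp(mt, offset_ms), text)


if __name__ == "__main__":
    sample = "1\n00:00:01,500 --> 00:00:03,000\n你好\n"
    # Subs that lag the audio need a negative offset (here 300 ms earlier).
    print(shift_srt(sample, -300))
```

Since the subs here lag behind the audio, a negative offset is the direction to try. The right number of milliseconds has to be found by ear.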
Edited: 2009-02-01, 7:00 am
#14
Thanks m8t, with the sound too? This will be a great help. I'm far from using it, but I already downloaded it and bookmarked the page for instructions. Maybe there will be a better version by the time I need it, but it's better this way. Thanks a bunch!
#15
Won't the import files be quite enormous? I mean, just look at a random subtitle file: it's generally several hundred lines of dialogue. Getting through just one movie would probably take you quite a while, and most of the dialogue will probably be stuff you already understand.

While the program is really cool, I'm wondering if it's such a good idea to use it. To bring up Khatz, he usually says that you need to learn 10 000 sentences as fast as possible and make them count by picking out the ones you really need. Taking every line in a whole movie isn't really picking the important ones.
#16
I agree with Tobberoth, but if there were an "input editor" of some kind, you could skim through the lines and check or uncheck them depending on whether they are important for you.
The possibilities are still huge!
#17
Tobberoth Wrote: Won't the import files be quite enormous? I mean, just look at a random subtitle file: it's generally several hundred lines of dialogue. Getting through just one movie would probably take you quite a while, and most of the dialogue will probably be stuff you already understand.

While the program is really cool, I'm wondering if it's such a good idea to use it. To bring up Khatz, he usually says that you need to learn 10 000 sentences as fast as possible and make them count by picking out the ones you really need. Taking every line in a whole movie isn't really picking the important ones.
This software makes it easier to develop decks based both on what individuals want and what they need, through analysis and manipulation of the data once it's collected. It gives us more control, allowing for diverse, distributable, user-specific corpora. There are all kinds of possibilities with this; I'm sure people who are more database/list savvy than I am can offer and develop more concrete examples.

And I mean that in addition to the interface/filtering stuff for culling excess words/sentences from these specific decks. (I guess you could do some kind of check against import files you already have, eliminating redundant lines?) Just imagine, for example, analyzing the decks created by different users, cross-referencing them based on different taxonomies/genres/themes (to create, for example, frequency lists), and creating condensed decks from those that a person can select depending on whatever they want or need to study.
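
The check against existing import files could look roughly like this. It is a sketch under assumptions: the import files are taken to be tab-separated text with the sentence in one known column, and `expression_of` and `drop_known_lines` are hypothetical helper names, not anything subs2srs or Anki provides.

```python
def expression_of(line, column=0):
    """Return the sentence field of one tab-separated import line ('' if the column is missing)."""
    fields = line.rstrip("\n").split("\t")
    return fields[column].strip() if column < len(fields) else ""


def drop_known_lines(new_lines, known_lines, column=0):
    """Keep only lines whose sentence is absent from known_lines and not yet kept."""
    seen = {expression_of(line, column) for line in known_lines}
    kept = []
    for line in new_lines:
        expr = expression_of(line, column)
        if expr and expr not in seen:
            kept.append(line)
            seen.add(expr)  # also drops repeats within the new file itself
    return kept


if __name__ == "__main__":
    known = ["こんにちは\thello\n"]
    new = ["こんにちは\thello there\n", "ありがとう\tthanks\n"]
    print(drop_known_lines(new, known))
```

Matching on the exact sentence text is the crudest possible test. Lines that differ only in punctuation would still count as different.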

Is there someplace to use as a 'headquarters' now that ajatt.pseudosphere is gone? Bad timing with that.
Edited: 2009-02-01, 8:15 am
#18
nest0r Wrote: Is there someplace to use as a 'headquarters' now that ajatt.pseudosphere is gone? Bad timing with that.
I wonder about the copyright issues with distributing/sharing the decks. ajatt.pseudosphere gave access to sentences from books only to those with proof of ownership, but I remember they freely posted links to subs. What would be acceptable for distributing the decks?
#19
Just from looking at this and seeing websites like Reading Tutor (http://language.tiu.ac.jp/index_e.html) and iKnow, it's easy to see what can happen:

User creates file from a movie (sadly, all sorts of copyright issues, but bear with me). User uploads files onto iKnow so it has image, audio, sentence, and translation (maybe in multiple languages). iKnow parses inputted sentences so now any and all words used in the movie are connected to the files. An upgraded version of iKnow lets you automatically load all sentences from that file with new words not yet studied by you (I say new version as this is not possible yet).

Users who don't like using iKnow then use Anki to download that generated list, so now they have only the sentences from the movie file that contain words they have not learned.
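
The filtering step in that scenario can be sketched in a few lines. This assumes each sentence already comes with its words segmented (splitting Japanese into words needs a morphological analyzer, which is out of scope here), and `sentences_with_new_words` is a hypothetical name for illustration, not an iKnow or Anki feature.

```python
def sentences_with_new_words(cards, known_words):
    """Select sentences that contain at least one word the learner has not studied.

    cards: iterable of (sentence, words) pairs, with words already segmented.
    Returns a list of (sentence, new_words) pairs in the original order.
    """
    known = set(known_words)
    selected = []
    for sentence, words in cards:
        new = [w for w in words if w not in known]
        if new:
            selected.append((sentence, new))
            # Treat these words as queued for study so that later sentences
            # are not selected for the same word again.
            known.update(new)
    return selected


if __name__ == "__main__":
    cards = [
        ("猫が好きです", ["猫", "が", "好き", "です"]),
        ("犬が好きです", ["犬", "が", "好き", "です"]),
        ("猫です", ["猫", "です"]),
    ]
    for sentence, new in sentences_with_new_words(cards, {"が", "です", "好き"}):
        print(sentence, new)
```

Marking a word as known once one sentence covers it is a design choice. Dropping the `known.update(new)` line would keep every sentence that has any unstudied word.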

Yeah, a good ways in the future. However, looking at upgrades to iKnow, Anki, Reading Tutor, and this recent program, it's easy to see there's much that can be done to streamline the self study process.
#20
Tobberoth,
what you say is true, but it's still much faster to remove the cards you don't need than manually adding the ones you need, isn't it?
#21
So how would one go about ripping subtitles from a movie into a subtitle file that can be put to good use with this program? If OCR is the only way, I think I'll pass (had bad experiences with using OCR back in the day).
#22
Kreva, there is one option, but you may not like it: create a hard-sub video with Japanese kanji (I use Xilisoft DVD to DivX to make mine). Then run subs2srs with an English sub file for appropriate parsing.

Thing is, though, it's going to suck, as the English sub timing is not going to match up well with the Japanese. Like I said, it's an option you may not like.

The only other option is to just use sub files that already exist, like the ones from drama addict forums. In truth, a one-hour show should net you 700 "cards", which should last you a bit. Granted, every new show you add should have fewer and fewer useful cards, as in new vocabulary or phrases.
#23
Tobberoth, I set up a shortcut in Anki to suspend (or delete) a fact, so as I'm learning new cards I just hit that key every time I completely understand a line. It takes like 2 seconds each time, and I basically get to watch the movie as I'm doing it (the cards are presented in order).
#24
Tobberoth Wrote: Won't the import files be quite enormous? I mean, just look at a random subtitle file: it's generally several hundred lines of dialogue. Getting through just one movie would probably take you quite a while, and most of the dialogue will probably be stuff you already understand.

While the program is really cool, I'm wondering if it's such a good idea to use it. To bring up Khatz, he usually says that you need to learn 10 000 sentences as fast as possible and make them count by picking out the ones you really need. Taking every line in a whole movie isn't really picking the important ones.
I just suspend the cards I don't need. It's much faster this way than cutting and pasting a thousand sound files into Anki, which is a great way to learn yet very time consuming - I've tried it.

Without hijacking the thread for more Khatz commentary (he has his own thread going hot and heavy right now), he also said, and correctly I believe, that you need to enjoy what you're learning. I love this movie, and I'm having a great time working through the sound files. For other people, that might not be so. OK, find a movie you love. Or don't do it at all. Doesn't matter to me.

Also, for shadowing purposes, movie and TV clips have the most natural Japanese, and using sound files in Anki makes this really easy. I don't shadow the main character, since she's a teenage girl, but her guy friends are fair game. It's much better than stopping and rewinding a CD, and the voices are very natural in this movie.
#25
Yes, suspending cards is easy enough, though I do like the idea of integrating a level checker with some kind of overarching user database. I just keep thinking about someone who's learning Japanese, and is in a mode where they want to learn Japanese 'in general', but want, say, real-world 'business Japanese' lessons, or to know enough Japanese to watch plenty of current anime of any specific type (scifi, slice of life, shounen, whatever), and can just check out a file made from the most frequent words of 50 different sources of that type, eliminating redundancies with a level checker, et cetera.
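
As a rough illustration of that idea, here is a sketch that counts words across several sources and skips the ones a learner already knows. It assumes each source has already been turned into a list of segmented words; `frequent_unknown_words` is a hypothetical name, and the plain set of known words stands in for whatever a real level checker would supply.

```python
from collections import Counter


def frequent_unknown_words(sources, known_words, top_n=10):
    """Return the top_n most frequent words across sources that are not in known_words.

    sources: iterable of word lists, one list per movie or episode.
    """
    known = set(known_words)
    counts = Counter()
    for words in sources:
        counts.update(w for w in words if w not in known)
    return counts.most_common(top_n)


if __name__ == "__main__":
    sources = [["宇宙", "船", "宇宙"], ["宇宙", "船", "星"]]
    print(frequent_unknown_words(sources, {"星"}, top_n=2))
```

This counts raw occurrences. Counting the number of sources each word appears in would favor words that are common across a genre over words repeated within a single show.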

Also, I've actually got a large number of redundant cards from iKnow, but I keep them for the speaking practice--I either suspend a card or just grade it on how well I can accurately reproduce the basic pitch and flow. It really helps with my 'ear training' and development of subvocal/articulatory rehearsal (and speaking ;p).

I guess someone needs to hijack a Somalian pirate ship and set up wifi there to resolve our copyright dilemma. Our motto: "Avast ye landlubbers, no int'l copyright laws be keelhaulin' arrr language-learning, yarrr! Yo ho ho, どうもありがとう!"
Edited: 2009-02-01, 5:00 pm