subs2srs - How are you using it? What are you using with it?

NickT Member
From: London Registered: 2006-12-16 Posts: 109

Ghinzdra: I just used Aegisub. It worked perfectly for me; I didn't have any problems with it turning my Japanese subs into gibberish.

What language is your system set to? Do you use Japanese Windows?

ghinzdra Member
From: japan Registered: 2008-01-07 Posts: 499

I think you misunderstood my post: Aegisub works great.
It's just that to make any change, Aegisub must convert your subtitle file to UTF-8, as that is Aegisub's standard format. It's a conversion, not a corruption.
Unfortunately MPC, one of the most widespread media players out there, has trouble using this format for its softsub function. This drawback is even clearly identified on the Aegisub web page. MPC is the culprit, not Aegisub. And I did activate the Japanese IME, but it has nothing to do with this problem.
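
For what it's worth, the conversion described above boils down to decoding the file from its old encoding and re-encoding it. Here is a minimal Python sketch of that idea. The function name `reencode_subtitle` is made up for illustration, and the source encoding (Shift_JIS) is only an assumption about what a Japanese subtitle file might use; nothing here is taken from Aegisub itself, and this thread does not confirm which target encoding MPC handles best.

```python
def reencode_subtitle(data: bytes,
                      src_encoding: str = "shift_jis",
                      dst_encoding: str = "utf-8-sig") -> bytes:
    """Decode subtitle bytes from one encoding and re-encode them in another.

    "utf-8-sig" writes UTF-8 with a byte order mark at the start of the file.
    The defaults are assumptions for illustration, not a recommendation.
    """
    text = data.decode(src_encoding)   # raises UnicodeDecodeError on a wrong guess
    return text.encode(dst_encoding)


if __name__ == "__main__":
    original = "勉強してる".encode("shift_jis")
    converted = reencode_subtitle(original)
    print(converted.decode("utf-8-sig"))
```

A wrong guess for the source encoding raises an error instead of silently writing gibberish, which matches the point made above: a correct conversion is not a corruption.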

Anyway, it's no longer a problem, as zodiac tipped me off about Aegisub's play-line feature, which is even better for checking the result.
Thanks for caring though .

Last edited by ghinzdra (2009 February 20, 11:05 am)

NickT Member
From: London Registered: 2006-12-16 Posts: 109

No, I understood. I used Aegisub, and if by "MPC" you mean Media Player Classic, I use that too. It all worked perfectly. I can watch re-timed and converted subs in MPC without any issues.

When I asked if you use Japanese Windows, I wasn't referring to IME (which can be used on any version of Windows), but the actual base language of the operating system itself. The Japanese version of Windows handles non-Unicode text and programs differently than the English version, which is the only reason I can think of as to why it works on my system and not yours.

Reply #29 - 2009 March 08, 6:52 am
ghinzdra Member
From: japan Registered: 2008-01-07 Posts: 499

I've been doing a lot of tests lately with subs2srs and I was wondering how everyone else is doing.

1st question: on using "plain" subs without really editing them in a sub editor
From what I've read, it seems that most people take the subs as they are, without any manipulation except time shifting. At the beginning there were many complaints about useless lines and the time lost deleting them one by one, but this issue has been settled with the advanced functions, the time limit, etc.
Still, I keep wondering how it's possible to use the subs this way. I tried twice, and so many lines make no sense unless you merge, trim, and shape them in a sub editor like Aegisub to fit your needs that I just don't get how they can be used "plain". Subs are designed to translate an ongoing process; what we are looking for is a single unit of meaning. So not too long, lest you be unable to remember and write/mimic it (I reached the conclusion that an extract longer than 10 seconds should be avoided as much as possible). And not too short, or you're deprived of the background needed to get the meaning. It's a really delicate exercise, so it requires real analysis.

2nd question: on what is really useless
For some of us it also seems that as soon as a sentence is understood, it's useless. Nevertheless, I think a well-understood sentence can still be worth putting into your SRS. For those who agree with the AJATT philosophy, it seems clear that if we endeavour to get as much input as possible, it's to improve our output. And 90% of output is not brilliant conversation but rather meaningless chitchat that we're just copying... at first glance, worthless first-grader talk. You don't ponder the meaning of what you're saying, it just pops out. It's casual interaction.
If I take Death Note 01 for instance, つまねんな やんでるな まじで あんなのみにいたの 勉強してるとこごめね are easy to understand, but am I THAT confident that I know them so well that I'll be able to USE them automatically? I think one benefit of using real Japanese material is not only that it sparks more interest than formal material (so you carry on studying complex sentences and rare kanji without being bored to death); it's also that you get all the "worthless" stuff that is actually the really worthy thing, what makes for real fluency. And you need to get a LOT of that to achieve real fluency. As far as I'm concerned, I take both sentences with unknown kanji and extremely casual chitchat (as long as I deem that I'm really likely to use it, that I can envision myself saying it).



3rd question: on the pace
Do you feel you're going through this faster than you used to when you were putting grammar book examples into Anki?

Last edited by ghinzdra (2009 March 08, 7:09 am)

Reply #30 - 2009 March 08, 2:58 pm
Nukemarine Member
From: 神奈川 Registered: 2007-07-15 Posts: 2347

I think it depends on what level of Japanese you are at. It's a source of sentence mining early on, looking for words you don't know. In that, yes, if you understand the sentence, it's not needed to push your learning.

As you said, it's a source for shadowing. Shadowing longer sentences is difficult, but great for shorter ones. Shadowing idle banter is good also as that's a learned trait. Better than always saying "Eee, toooeee" all the time. Not sure if there's a refined process, but I think I'll go through a movie once, separate out the dialogue for my gender (using easy and hard buttons for that) and delete the rest. When you such an Anki file, it'll copy over only the sound files that you kept, right?

I was thinking of using the above to make a shadowing .mp3 with Audacity. It shouldn't be a stretch to add in blank spaces equal to the audio length of each clip.

Lastly, one can go through the subtitle files and merge timings on long conversations up to a minute or more. This is really easy with Aegisub. These files can be put in Anki for training listening skills.
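
As a rough illustration of merging timings, the Python sketch below joins consecutive subtitle lines whenever the pause between them is short and the merged block stays under a maximum length. This is not what Aegisub does internally; the function name `merge_cues`, the 1.5-second gap and the tuple layout are assumptions for illustration, and the one-minute cap simply mirrors the figure mentioned above.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (start_seconds, end_seconds, text)


def merge_cues(cues: List[Cue], max_gap: float = 1.5, max_total: float = 60.0) -> List[Cue]:
    """Merge consecutive cues separated by short pauses into longer blocks."""
    merged: List[Cue] = []
    for start, end, text in sorted(cues):
        if merged:
            prev_start, prev_end, prev_text = merged[-1]
            close_enough = start - prev_end <= max_gap      # short pause between lines
            short_enough = end - prev_start <= max_total    # block stays under the cap
            if close_enough and short_enough:
                merged[-1] = (prev_start, max(prev_end, end), prev_text + " " + text)
                continue
        merged.append((start, end, text))
    return merged
```

The merged start and end times could then be used as the timings of a single long listening card.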

Of course, I was going to do something similar early on: I'd use Anki for an episode. It shows cards in order, so the sentences come in order. When I'm done with the episode, I can reschedule all the cards for 4 days later. They will all come due again in 4 days, and I'll have Anki show them to me in order due, i.e. story order.

To be fair, I haven't been using subs2srs completely yet. I've been messing around with it ensuring it works, and thinking about how I can use it. Recent ideas such as timing edits for long conversation cards, rescheduling entire episodes (at further and further intervals), or using it to help make shadowing .mp3's with chosen lines are just recent thoughts.

Reply #31 - 2009 March 08, 4:39 pm
ghinzdra Member
From: japan Registered: 2008-01-07 Posts: 499

I've been thinking about the different ways of selecting interesting lines and deleting irrelevant ones.
1st option: THE PLAIN METHOD / ANKI
- Put the whole thing into Anki and, once it's in, delete the lines manually.
Advantages: at first it feels like you have more time to actually deal with the file; you take it one bit at a time and do the review and the edit at the same time...
Drawbacks: 1. It actually takes far too long: you spend your time on the delete button, and what's more, your earlier reviews get in the way of the process.
2. It leaves unused audio samples behind in your media folder. As your Anki file is meant to stay with you for some years, I think one should be wary of the media folder's size. Take only what is necessary.
3. It gives you lines the way they're intended to be displayed on screen as subtitles, not the way you need them for Anki. As I said before, the logic is not the same: what fits the size of your screen and your reading pace doesn't fit your shadowing or comprehension needs. You do need to edit them.
I did that for my second try and I definitely did not like it, for the reasons above.
As a side effect, if you agree that it's ineffective to use subs2srs this way, it also calls the batch function into question: there's little point trying to save time on the processing if you have enough time to edit several sub files (which takes much longer).

2nd option: THE EDIT METHOD / AEGISUB
- Edit the sub file with Aegisub (merge, delete) and THEN import into Anki.
Advantages: you deal with most of the previous issues this way, as you take only what you need.
Drawbacks: you still have to go through the whole file.
It's my standard method right now.

3rd option: THE POINTING METHOD / SUBS2SRS
- Watch the media (anime, drama, movie, etc.) and note the time or a specific word. When the show is over, use the advanced functions of subs2srs to take only the lines in a specific time range or including those words.
Advantages: more fun (you really watch the media) and quicker (instead of deleting, you select).
Drawbacks: - It's less accurate than the previous one. You'll have a bit of editing to do afterwards.
- It doesn't resolve the format issue: you'll still have some sentences butchered by the subtitle logic.
- It entails that what you take is far smaller than what you drop. If by the end of the show you have something like 70 lines, I don't think you're meant to use this method.
I think it's a method for someone at a really advanced level who only needs some very specific lines.
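
The selection step of the pointing method can be sketched in a few lines of Python. This is only an illustration of the idea, not how subs2srs implements its advanced functions: the function name `select_cues`, the 5-second window around each noted time and the tuple layout are all assumptions.

```python
from typing import Iterable, List, Tuple

Cue = Tuple[float, float, str]  # (start_seconds, end_seconds, text)


def select_cues(cues: Iterable[Cue],
                marks: Iterable[float] = (),
                words: Iterable[str] = (),
                window: float = 5.0) -> List[Cue]:
    """Keep cues near a noted timestamp or containing a noted word."""
    marks = list(marks)
    words = list(words)
    selected = []
    for start, end, text in cues:
        # A mark counts if it falls inside the cue, padded by `window` seconds.
        near_mark = any(start - window <= t <= end + window for t in marks)
        has_word = any(w in text for w in words)
        if near_mark or has_word:
            selected.append((start, end, text))
    return selected
```

The padding window is there because a time noted while watching is rarely exact, which is also why some editing is still needed afterwards.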




About your story method and shadowing mp3 method



I also had some thoughts about your "story method". After a while I discarded it. For one thing, every SRS is designed to treat items of different difficulty differently: if you take the whole story at a "4 days, 4 weeks, 4 years" review level, then it seems to me you're going against the very philosophy of the SRS. Some sentences are just harder than others; that's the whole point of the Leitner box/SRS. For another, I realized that the deeper your knowledge of Japanese, the scarcer the selected lines, which means that one way or another the story is broken up. The figures about useless lines are overestimated in my opinion, but useless lines do exist, and their share will grow as you get better. How can you really make sense of a story with a 5-minute gap every four lines?

The "shadowing mp3 method" makes more sense. But my journey into Japanese has taught me a lot of things, among them the value of time. Don't waste your time. There is a lot of great and cool material out there, but if it takes too much time to process, 99% of the time it's better to take something else that you can use more quickly. I definitely could rip the audio of the online news for the sentences I'm interested in... except that I would have to 1. locate the mms address, 2. download the video file, 3. extract the sound, 4. cut the file to get the one sentence I'm interested in... far too time-consuming.
So as this shadowing method requires additional manipulation (cut, trim, merge, add blanks, etc.), I think I'll go on with my ripping policy. Like Khatz, I just rip the audio track of the media (which takes me 5 minutes flat, 4 of which I spend on something else) and put it on my iPod. If something catches my attention and I feel like shadowing, I just have to rewind. That said, I don't think your shadowing method is fatally flawed.

Last edited by ghinzdra (2009 March 08, 5:12 pm)

Reply #32 - 2009 March 08, 6:40 pm
Nukemarine Member
From: 神奈川 Registered: 2007-07-15 Posts: 2347

You're just looking at it differently than me.

For the story method I mentioned, it's just using Anki as a pseudo scheduler. You're not "grading" anything, just a way to line by line read the entire script of a TV show or movie. Heck, you can easily just suspend the entire show and wait till you want to read it again. Glowingfaceman suggested something similar when he uses Anki with his music selection. However, with those, the more he likes it, the lower he scores the music so music he doesn't like he hears less. So this is not about memorization, just a different way to schedule stuff.

For the shadowing thing, again, I think there's a way with Audacity or another program to automatically make a shadowing .mp3 so long as you have the lines to give it. You're not having to cut lines since the subs2srs or iKnow or other files are already 1 or 2 lines of dialogue long. If it's a pain in the butt to do, I won't do it (part of the reason I bite at mining sentences is it's too much effort).

For what it's worth, the thread on people providing audio samples of themselves helped remind me of the strength of shadowing. We have to get used to speaking, but we don't want it to only be when we're talking to other people (ie being creative), cause that builds mistakes over time. I thought back to my time with Pimsleur (which I bad mouth now) and do realize I was speaking a lot. Now, it's only when I do my SRS, but I have more free time in the world.

Anyway, if I make shadowing .mp3's from iKnow sentences I'll try to post them. They'll be about 3:30 long, with even amounts of speech and silence. That should be roughly 20 sentences each, so 10 .mp3's per unit.

Reply #33 - 2009 March 08, 6:48 pm
Nukemarine Member
From: 神奈川 Registered: 2007-07-15 Posts: 2347

http://www.snapvine.com/sb/096596c40c3b … 30485c71be

The above link is a sample shadowing mp3 I made using iKnow audio clips. I did it using Audacity.

Steps:

1. Under Projects, import audio (I think 40 files is enough)
2. Go to the last file, press F5 (time tool), and "pull" the audio to a time that is twice the length of the audio file above it (ie, if the clip above it is 3 seconds, put the cursor at 6 seconds). Make it longer if you think more silence is needed for shadowing.
3. Press F1 (select tool)
4. Partially select audio from the bottom track (that you just aligned) and the track above it with your mouse.
5. Under Projects: Quick Mix them.
6. Wash, rinse, repeat till they're all merged.

If anyone has a much quicker way of doing this, I'm all ears.
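
One possible quicker route is to script the same layout: each clip followed by a stretch of silence as long as the clip itself. The Python sketch below uses only the standard-library `wave` module, so it assumes the clips have first been converted to WAV files that share one format (16-bit PCM, same sample rate and channel count). The function name `build_shadowing_wav` and the `silence_ratio` parameter are made up for illustration; the result would still need converting to .mp3 with another tool.

```python
import wave
from typing import Iterable


def build_shadowing_wav(clip_paths: Iterable[str], out_path: str,
                        silence_ratio: float = 1.0) -> None:
    """Write every clip followed by silence lasting silence_ratio * clip length."""
    params = None
    chunks = []
    for path in clip_paths:
        with wave.open(path, "rb") as clip:
            fmt = (clip.getnchannels(), clip.getsampwidth(), clip.getframerate())
            if params is None:
                params = clip.getparams()
            elif fmt != (params.nchannels, params.sampwidth, params.framerate):
                raise ValueError(f"{path} does not match the first clip's format")
            nframes = clip.getnframes()
            frames = clip.readframes(nframes)
        bytes_per_frame = params.nchannels * params.sampwidth
        silent_frames = int(nframes * silence_ratio)
        chunks.append(frames)
        # Zero bytes are silence for signed 16-bit PCM (an assumption about the input).
        chunks.append(b"\x00" * (silent_frames * bytes_per_frame))
    if params is None:
        raise ValueError("no clips given")
    with wave.open(out_path, "wb") as out:
        out.setnchannels(params.nchannels)
        out.setsampwidth(params.sampwidth)
        out.setframerate(params.framerate)
        out.writeframes(b"".join(chunks))
```

With `silence_ratio=1.0` the silence after each clip matches the clip length, the same spacing as step 2 above; a larger value leaves more room for repeating the line.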

Last edited by Nukemarine (2009 March 08, 7:13 pm)

Reply #34 - 2009 March 09, 5:42 am
nac_est Member
From: Italy Registered: 2006-12-12 Posts: 617 Website

ghinzdra wrote:

2 it leaves behind unused audio samples in your media folder. As your Anki file is meant to stay with you for some years, I think one should be wary about the media folder size. Take only what is necessary.

Anki has a function to automatically check for unused media (or cards with missing media) and handle/delete them. Just thought you'd like to know :)
It's in tools/advanced.
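
To show what such a check amounts to, here is a small Python sketch that compares the media referenced in card text against the files in a folder. It does not reproduce Anki's own implementation; the function name `unused_media` is hypothetical, and the sketch assumes audio is referenced with `[sound:...]` tags and images with `<img src=...>` in the card fields.

```python
import re
from typing import Iterable, List

SOUND = re.compile(r"\[sound:([^\]]+)\]")
IMG = re.compile(r"<img[^>]*src=[\"']?([^\"'> ]+)", re.IGNORECASE)


def unused_media(card_fields: Iterable[str], media_files: Iterable[str]) -> List[str]:
    """Return media file names that no card field refers to."""
    referenced = set()
    for field in card_fields:
        referenced.update(SOUND.findall(field))
        referenced.update(IMG.findall(field))
    return sorted(set(media_files) - referenced)
```

For a real deck, the built-in check is the safer choice, since it knows where the cards and media actually live.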

Reply #35 - 2009 May 23, 10:56 pm
Nukemarine Member
From: 神奈川 Registered: 2007-07-15 Posts: 2347

Well, I started using subs2srs with "Zettai Kareshi" ep 01. Not quite sure how to describe the way I'm mining it. It's not about getting a sentence with a new word or phrase, otherwise I'd have stuck with iKnow's Core 6k. For a lot of the sentences I know all the words and sort of what the sentence means. It's that 'sort of' that determines if I use the sentence.

Pretty much it's: Look and listen to the sentence to see if it's ok to use. Go over what's good or new about it. Use the Kenkyusha e-dictionary for J-E definitions if necessary. Since I know many of the words, I'm using the nifty Kojien e-dictionary to add J-J definitions of words. Finally, I write out the sentence (the ONLY time I'll write it out) before giving it a "three" score.

Right now, I'm pretty much using every sentence. As I get used to simple dialogue, I'll probably become more selective. After Zettai Kareshi episode 1, I plan to do Rookies ep 01, Last Friends ep 02 then Hana Yori Dango ep 01, on the idea that this'll spread out the type of dialogue I'll be experiencing (still pretty female heavy though). The reason I'm using the above shows is that the subs for them are text based, so it's easy to copy/paste/edit sentences.

Way different than the initial thought of just shadowing. Now to see if I stick with it.

Last edited by Nukemarine (2009 May 23, 10:57 pm)

Reply #36 - 2009 May 24, 12:36 am
kazelee Rater Mode
From: ohlrite Registered: 2008-06-18 Posts: 2132 Website

@nukemarine

Last Friends, LOL, that was one of the funniest shows I've ever watched. I know it wasn't intended, but the music that played whenever they were showing that one guy's shoes threw me over the edge.... with laughter. Though it soothed my inner sadist, I got rid of the series after I watched the last episode, やっぱり.

I pretty much mine the same way, cept I write sparsely and don't hesitate to delete cards with simple things like やっぱり, 何で, もういいよ, ええ, and はぁぁぁぁい, on them.

Reply #37 - 2009 May 24, 4:57 pm
Nukemarine Member
From: 神奈川 Registered: 2007-07-15 Posts: 2347

kazelee wrote:

@nukemarine

Last Friends, LOL, that was one of the funniest shows I've ever watched. I know it wasn't intended, but the music that played whenever they were showing that one guy's shoes threw me over the edge.... with laughter. Though it soothed my inner sadist, I got rid of the series after I watched the last episode, やっぱり.

I pretty much mine the same way, cept I write sparsely and don't hesitate to delete cards with simple things like やっぱり, 何で, もういいよ, ええ, and はぁぁぁぁい, on them.

It's like Rookies, it starts off pretty well then heads downhill by episode 4 or 5. However, I did enjoy it, so it's a useful show for mining. The dialogue feels real (though I'm a poor judge of that).

Things I'm noting (after only 4 days): takes about an hour to add 20 cards. As I go through them, the plot begins to make sense. Odd considering I've seen it twice and read the script once already. Makes me realize the mind makes connections that are not there in absence of understanding.

Reply #38 - 2009 June 06, 11:59 am
Nukemarine Member
From: 神奈川 Registered: 2007-07-15 Posts: 2347

Ok, pretty much finished Episode 01 of Zettai Kareshi. As it's only been 2 weeks into this, there's not much to report. However some observations:

1. I now have 4 decks: One for Kanji (mainly RTK), one for sentences for Japanese grammar (mainly Tae Kim's), one for sentences for Japanese vocabulary (mainly iKnow), and now one for Sentence method.
2. Using J-J definitions in the answer slot for words you already know is cool.
3. In my vocabulary deck, I suspended all the Core 6000 sentences. Then as words pop up in my sentence mining, I'll activate them in the vocabulary deck if they're there. So far, only 80 words or phrases activated.
4. It was REALLY FRACKING COOL to watch that episode without subtitles and follow along.
5. The question side shows: picture, audio, kanji sentence and kana sentence.
6. The answer side has definitions if needed. Plus I type out the sentence. I stopped writing out the sentence initially as it seemed pointless. I get enough writing practice from my vocabulary deck.
7. I'll merge some cards, which works well. There's sometimes some overlap on the audio, but you get better context for the sentence, which is great for one word replies you'd normally delete.

Anyway, I just need to finish up the last 60 sentences and move on to Rookies ep 01.

And you know what, I'm looking forward to it cause it seems like fun.

Last edited by Nukemarine (2009 June 06, 12:01 pm)

Reply #39 - 2009 June 06, 12:05 pm
radical_tyro Member
Registered: 2005-11-19 Posts: 272

Hey Nuke, why do you show the kanji and kana sentences on the question side? It would improve your listening ability if you put them on the answer side.

Reply #40 - 2009 June 06, 12:20 pm
Nukemarine Member
From: 神奈川 Registered: 2007-07-15 Posts: 2347

Bit of a personal preference here. On my vocabulary deck, I have audio question with the answer written. With this, I'm not sure, it feels like I'm wanting to train my reading, but using dialogue to do it. If I did audio only, it takes longer to dictate (even with typing).

Anybody that's done audio on question side with subs2srs care to comment on their experiences?

Reply #41 - 2009 June 07, 5:40 am
albion Member
From: England Registered: 2008-05-25 Posts: 383 Website

Taken from the other subs2srs thread:

magamo wrote:

As for Code Geass, one of the members from the UK had been working on the series and some Japanese members helped him a while ago. I haven't seen him for a couple of months, though.

That was me, actually. I had stopped while I got around to watching R2 (so I could look things up without getting it spoiled for me), and lately I've been busy with exams (which end in about 2-3 weeks).

I watched R2 raw, but I have .mkv files (Japanese DVD rip, fansubbed) for the first season. Is there a way to extract the subtitle files from them (then replace the English subs with Japanese)?

Reply #42 - 2009 June 07, 5:55 am
Tobberoth Member
From: Sweden Registered: 2008-08-25 Posts: 3364

albion wrote:

Taken from the other subs2srs thread:

magamo wrote:

As for Code Geass, one of the members from the UK had been working on the series and some Japanese members helped him a while ago. I haven't seen him for a couple of months, though.

That was me, actually. I had stopped while I got around to watching R2 (so I could look things up without getting it spoiled for me), and lately I've been busy with exams (which end in about 2-3 weeks).

I watched R2 raw, but I have .mkv files (Japanese DVD rip, fansubbed) for the first season. Is there a way to extract the subtitle files from them (then replace the English subs with Japanese)?

http://www.bunkus.org/videotools/mkvtoolnix/

MkvMerge in that package can remove/add subtitles to .mkv files.

Reply #43 - 2009 June 08, 7:16 am
bombpersons Member
From: UK Registered: 2008-10-08 Posts: 907 Website

Is anyone still interested in making a site for downloading premade decks?
Maybe we could have a thread with links to the decks (rapidshare etc). I don't know the legal ins and outs though...

I've uploaded a deck for FF: Advent Children, is it ok to post it here?

Reply #44 - 2009 June 08, 8:05 am
Nii87 Member
From: Australia Registered: 2009-03-27 Posts: 371

Similarly, I just finished making a deck for Death Note Ep 1 if anyone wants.

More importantly, what's on the answer side for you guys? I have audio, video, image, kanji expression... But no English translation. What do I do?

Reply #45 - 2009 June 08, 8:21 am
bombpersons Member
From: UK Registered: 2008-10-08 Posts: 907 Website

What I do is add in word definitions / readings as I go through them, if I already know all the words then I delete the card.

Reply #46 - 2009 June 08, 10:18 am
Nukemarine Member
From: 神奈川 Registered: 2007-07-15 Posts: 2347

Seriously, unless your computer is incapable of creating your own deck, I don't see a pressing need to share these types of cards. They seem too easy to make. 

Nii87, I actually deleted the block for English translations on my cards. It made it difficult for many of the sentences, but it prevented the false comprehension that comes with having your main language there.

Now my job is to comprehend the sentence. So just knowing all the words in it is not enough, I have to know what the meaning is as a whole. It's almost like translating the entire episode, but not really. But if the English translation (ie somebody else's hard work) is there, I don't feel the need to put in the hard work myself (figuring out the sentence). Does that make sense, cause I just confused myself.

Reply #47 - 2009 June 08, 7:00 pm
Nii87 Member
From: Australia Registered: 2009-03-27 Posts: 371

But isn't translating between languages which you don't understand perfectly a problem in the long run? I think I read that on AJATT, but I think he was talking about English->Japanese. Feel free to enlighten me though.

Reply #48 - 2009 June 11, 7:23 pm
cjswanson1355 Member
Registered: 2008-07-29 Posts: 54

So I am trying to get a deck of Mononoke Hime set up, but I am having a lot of trouble with the subtitles.  I downloaded the subtitles located at http://kitsunekko.net/JapaneseAnimeSubtitles.aspx , but it's in the .sub format, and subs2srs will not recognize it.  I have tried some converting programs, and all of them either do not recognize the kanji or simply ruin the encoding. Does anyone know how to fix this problem?

Reply #49 - 2009 June 11, 8:15 pm
ahibba Member
Registered: 2008-09-04 Posts: 528 Website

cjswanson1355 wrote:

but it's in the .sub format, and subs2srs will not recognize it.  I have tried some converting programs, and all of them either do not recognize the kanji or simply ruin the encoding. Does anyone know how to fix this problem?

You must OCR this type of subtitle. Check this:

http://www.d-addicts.com/forum/viewtopic.php?t=16017

Reply #50 - 2009 June 11, 8:29 pm
ahibba Member
Registered: 2008-09-04 Posts: 528 Website