
subs2srs video clips in anki

#1
So I figured this topic deserved a better title than 'wait a second'. You can have videos in Anki. What do you think about it? Have you tried it? How are you using it? I'm interested.

Here are some ideas: http://forum.koohii.com/showthread.php?p...8#pid67968

I think instead of using subs2srs for audio and images, I'm just going to turn episodes into a series of videos and use the videos for dictation* before watching episodes--that is to say, video on the question side and text/meaning on the answer side, to 'prime' myself for the episodes, and less often I'll make 'call-and-response' (double-sided, as IceCream would say) cards if while watching something I see cool dialogues I want to use to practice output.

But in that case I might have video on both front and back as described in the comment linked above, because both 'video' and 'audio' facts in Anki can play videos. Replay/F5 replays them both, so you have less control, but it's in order (Question then Answer), so it's like it recombines the dialogue, and that's not too bad.

If I do the 'priming' thing, I'll probably also extract the audio to tinker around with later, though I think instead of splicing the audio and listening to it before watching like Nukemarine, I'll listen to it afterwards, when the cards have matured in my brain. ^_^ Or maybe I won't bother extracting audio, since my old iPod and just about every other device can play videos...

I'll still use subs2srs decks with image/audio/text, but mostly as a corpus, a database to find cards for cool words I come across.

Anyway, this week I'm transitioning from Phase 1 to Phase 2, so that's why I've been brainstorming like crazy. I tend to think in a weird way, so feel free to say something practical and obvious that makes me go 'Orz'.

*Not the best word here, I don't mean always copying the text down. Could be just for listening practice, in which case the text would just be there if I wanted/needed it.
Edited: 2009-08-26, 7:44 pm
#2
I was quite excited about the whole subs2srs corpus discussion we had. But since then I've actually yet to find a word I needed in my corpus. Perhaps the problem stems from the fact that the corpus lacks diversity (it's just Death Note and FMA at the moment, but it's all of the episodes of each, so there is probably 100+ hours of material there). Or perhaps from the fact that I'm working through JLPT1 material, and not likely to find those sorts of words in most common forms of media.

As for videos, I don't find them that exciting to be honest. I usually prefer to make do with audio/picture and smaller file sizes. The clips generated by subs2srs are a bit short for video clips to be very fulfilling to watch, imo.
#3
Nothing to add from our previous discussions on the subject. Just be sure to let me know when you find out it isn't effective :P
#4
blackmacros Wrote:I was quite excited about the whole subs2srs corpus discussion we had. But since then I've actually yet to find a word I needed in my corpus. Perhaps the problem stems from the fact that the corpus lacks diversity (it's just Death Note and FMA at the moment, but it's all of the episodes of each, so there is probably 100+ hours of material there). Or perhaps from the fact that I'm working through JLPT1 material, and not likely to find those sorts of words in most common forms of media.

As for videos, I don't find them that exciting to be honest. I usually prefer to make do with audio/picture and smaller file sizes. The clips generated by subs2srs are a bit short for video clips to be very fulfilling to watch, imo.
There's more than DN and FMA, but yes, we need to keep adding to it by making more decks. I'm hoping to eventually make/upload around 50 films, depending on how lazy and generous I'm feeling. Having the programmer-gods constantly contributing learning software/materials makes me feel like reciprocating, though.

True dat, the video decks are kind of big--one episode of Code Geass was 70 MB, though I'll likely prune that down. That's why I'll probably use them sparingly, an episode or movie a week or every couple of weeks.

I'm not sure what you mean by fulfilling? I just prefer to have moving pictures integrated with the sounds instead of an image and the audio, ie the study material to be the same format as the material to be watched. (If I'm studying sentences so I can read a novel, I want text w/ audio so I know I'm subvocalizing it correctly. If I'm studying sentences for watching a video, I want video w/ text so I can make sure I'm listening correctly.)
#5
Tobberoth Wrote:Nothing to add from our previous discussions on the subject. Just be sure to let me know when you find out it isn't effective :P
You'll be the first to know, Tobbs. What's funny is, one day my decks will likely look like yours. But not till I'm at a higher level...
#6
Sorry I meant that *my* corpus was only DN and FMA right now. I haven't gotten around to adding more stuff yet.

By fulfilling I meant that the video clips are usually so short that they end too quickly and I don't find it worth the effort/worth watching. That doesn't really make sense, does it...? Never mind I'm just weird...
#7
blackmacros Wrote:Sorry I meant that *my* corpus was only DN and FMA right now. I haven't gotten around to adding more stuff yet.

By fulfilling I meant that the video clips are usually so short that they end too quickly and I don't find it worth the effort/worth watching. That doesn't really make sense, does it...? Never mind I'm just weird...
That's a good point, actually. I mean even if it's very brief, I still prefer having video, but I was thinking I'll probably try and find a sweet spot for the padding, perhaps even extending it so that the [previous line] idea (from IceCream's thread and cb4960's comment) kind of comes into play--ie effectively the previous line/current cue are in the video clip on the Question side...

Oh and of course you can tell subs2srs to only use lines with kanji, of a minimum length, et cetera, and can prune super-short lines. Plus now that I check, the video deck isn't that much bigger than image/audio decks...
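The kind of line filter mentioned here (only lines with kanji, of a minimum length) can be sketched roughly as below. This is not how subs2srs itself is implemented; the function name `keep_line`, the default threshold, and the choice of Unicode range are all assumptions made for illustration.

```python
import re

# Assumption: "kanji" means the CJK Unified Ideographs block (U+4E00-U+9FFF).
# Rarer extension blocks and the iteration mark 々 are not covered.
KANJI_RE = re.compile(r"[\u4e00-\u9fff]")


def keep_line(text, min_chars=4, require_kanji=True):
    """Return True if a subtitle line is worth turning into a card."""
    # Ignore whitespace so padding does not count toward the length.
    stripped = "".join(text.split())
    if len(stripped) < min_chars:
        return False
    if require_kanji and not KANJI_RE.search(stripped):
        return False
    return True


lines = ["ああ", "そうですね", "僕は新世界の神となる"]
kept = [line for line in lines if keep_line(line)]
```

With the defaults, only the last sample line survives: the first is too short and the second contains no kanji.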
Edited: 2009-08-26, 7:50 pm
#8
Ruiner Wrote:...use the videos for dictation* before watching episodes...
You might want to try watching the show first to see what you can understand and infer. Perhaps repeat part of it a couple of times before checking out the subtitles for that part. I have no idea why that's supposed to be good, but that's what they say.
#9
Thora Wrote:
Ruiner Wrote:...use the videos for dictation* before watching episodes...
You might want to try watching the show first to see what you can understand and infer. Perhaps repeat part of it a couple of times before checking out the subtitles for that part. I have no idea why that's supposed to be good, but that's what they say.
Yes, definitely, I would watch them first, to start off with a good conception of the flow of the story, but mostly because I'm neurotic about spoilers. (Edit: I mean watch them first w/ subs, then 'raw' after priming.)
Edited: 2009-08-26, 8:33 pm
#10
IceCream Wrote:yeah, i agree with everything blackmacros has said there. Videos for short lines kinda annoyed me actually. Even though, theoretically, video should work better for catching non-verbal cues, when the clips are short, they are kind of hard to watch, i dunno why. i guess audio + picture are enough to place me in the context. If you were doing longer samples like you and Thora were discussing, it would be worth it, i think...

Also, i don't think that Sub2SRS corpuses are really worthwhile, i think i mentioned it in the other post. I think if you're looking for a specific word, there are much better resources out there. If you pick a sentence for a word rather than the sentence itself, from watching the context, etc, you're missing out on the good stuff about having it in the first place. You're giving up the benefits of clear audio and clear sentences that you can get from smartfm or KO sentences, and not getting much back in return.

I definitely think you should watch videos first as well. It allows you to hear things properly the first few times around. If you watch it more than one time, on each watching, you distinguish more words that you should have heard before, that you already know. So, the video just becomes clearer to you in so many ways, which, i guess allows your brain to process other parts of what you're seeing too...

If you prefer short video clips though, you can use them in exactly the same way anyone else uses picture + audio...

(oh, and call and response does sound better than double sided, but i think it kind of overemphasises the kind of learned response / outputting kind of use of it, but i guess it doesn't really matter...)
So far, I only have issues with clips shorter than two seconds, so I'll probably just eliminate those lines when generating clips in subs2srs. I'll probably add a bit of padding too since the way Anki plays them, it's a popup window, so there's a bit of a disconnect that could use some filler space.
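A minimal sketch of that idea, dropping cues shorter than two seconds and padding the rest, could look like the following. It assumes plain SRT subtitles as input and is independent of subs2srs, which has its own settings for this; the names `Cue`, `parse_srt` and `select_clips`, and the 0.5-second pad, are made up for the example.

```python
import re
from dataclasses import dataclass

# Matches SRT-style timestamps such as 00:01:02,500
TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def to_seconds(stamp):
    """Convert the first timestamp found in `stamp` to seconds."""
    h, m, s, ms = map(int, TIME_RE.search(stamp).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(srt_text):
    """Parse SRT text into a list of Cue objects."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        timing = next((i for i, l in enumerate(lines) if "-->" in l), None)
        if timing is None:
            continue  # not a cue block
        start, end = lines[timing].split("-->")
        text = " ".join(l.strip() for l in lines[timing + 1:])
        cues.append(Cue(to_seconds(start), to_seconds(end), text))
    return cues


def select_clips(cues, min_len=2.0, pad=0.5):
    """Drop cues shorter than min_len seconds, then pad both ends."""
    clips = []
    for cue in cues:
        if cue.end - cue.start < min_len:
            continue
        clips.append(Cue(max(0.0, cue.start - pad), cue.end + pad, cue.text))
    return clips


SAMPLE = """1
00:00:00,200 --> 00:00:03,000
First line

2
00:00:04,000 --> 00:00:05,500
Too short

3
00:00:06,000 --> 00:00:09,000
Third line
"""

clips = select_clips(parse_srt(SAMPLE))
```

The padding is applied after the length check, so a short line cannot sneak through just because the pad made it longer. Padded clips may overlap their neighbours, which is arguably what you want if the goal is to catch a bit of the previous line.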

As for subs2srs corpora:

Well, I think the point is that learners can have the option to access native materials in a more efficient way, and they can look at the breadth of the language in its natural media context and choose what resonates with them, what they find clear and easy or challenging and striking, as they see fit, without having someone mediate and devise sentences for them. It's more empowering for the user in their self-study pursuits. They can always look to linguistic experts and natives to collaborate and advise them on useful structures and create resources, but it won't be a forced, prescriptive scenario.

My main use will be: I'm reading/watching something, I think "Whoa, I want to learn that!" and if I want to SRS it I can either sample it directly, or look for an example of it in prefab materials, either subs2srs or otherwise. If I could start over, I'd much rather have the option, when I was learning kanji and their readings, to quickly add sentences from sources besides KO2001 and smart.fm.

Once I reach Tobberoth's level waaay in the future, I'll probably just write down the word or something and stick it in Anki by itself... superminimal...
Edited: 2009-08-26, 9:30 pm
#11
Aww man. I went to Uni for a while, my ideas gestated, and I was ready to come back and present a clear and lucid explanation of what I meant about video clips...but it looks like you guys figured it out anyway :(

About the subs2srs corpus

IceCream Wrote:If you pick a sentence for a word rather than the sentence itself, from watching the context, etc, you're missing out on the good stuff about having it in the first place.
I think you will probably only run into problems if you use a subs2srs corpus of media you haven't actually watched yet. For stuff you have seen before (and particularly stuff you've seen a lot) it strikes me as a great resource for sentences containing not only your target word, but also imbued with a rich context that you're already familiar with. Unfortunately (for me) there aren't a lot of decks made from stuff I'm familiar with at the moment; hence the reason I've just got Death Note and FMA. Once I'm through with all my JLPT stuff though I will look to heavily expand that.
#12
I would love you (or anyone) FOREVER if you made a Tiger & Dragon, IWGP or Trick deck. I've watched those shows to death. I think it might be hard to find Jsubs though...I'm still building up my immersion/media database so those are the only drama I've gotten into so far.
#13
blackmacros Wrote:I would love you (or anyone) FOREVER if you made a Tiger & Dragon, IWGP or Trick deck. I've watched those shows to death. I think it might be hard to find Jsubs though...I'm still building up my immersion/media database so those are the only drama I've gotten into so far.
There aren't jsubs online for the series you mentioned, but there are the scenario books...I have those, but it's too boring to type all of them out on the PC. I suggest you buy the books for T&D and IWGP.
#14
Uninformed relative newbie question - where are you guys sharing these decks, if at all? NOT in the Anki -> Download Shared Deck area, right?
#15
@icecream
I love IWGP, although I haven't watched it for a while (I'm working through my anime back-catalogue from my pre-Japanese days, and hiding the English subs). I haven't watched it since before KO2001 in fact. It might be cool to go back and see how much my understanding has jumped actually...

Trying to transcribe it is a great idea I think, and will probably do wonders for your Japanese. That's what Magamo does/did for English dubs of anime [here: http://www1.atwiki.com/animetranscripts/ ] if I'm not mistaken, and look how good it made his English!

@cescoz I heard that the scenario books are, much like most jsubs, very very imprecise and inexact compared to the actual script. Is this the case?

@rswarsaw http://learnanylanguage.wikia.com/wiki/Sub2srs_decks :)
Edited: 2009-08-27, 7:49 am
#16
I bet by the end of the year we'll have a collaborative subs2srs-KO2001 deck, the grammar streamlined (perhaps even categorized?), some other metadata tags related to tropes, genres, politeness levels, perhaps, in addition to previous/next lines for people that have trouble with sentences that don't have extra information about the source context (I've never had that problem, though, and I suspect it's rare and being projected as a larger issue than it is. Although it is useful for dialogue-specific cards. Plus I have been obsessively honing my critical analysis skills with Japanese materials [and otherwise] for years, so perhaps I've got that cultural grammar internalized and am blinded by the curse of knowledge).

Well, it'd be a start, since once you have the ability to structure huge swaths of sentences based on what interests you, there's not really much point in sticking to frequency--perhaps the new research about 'information value' will replace our frequency obsession, once it's easily applied in regex or whatever. (http://forum.koohii.com/showthread.php?tid=3792)
Edited: 2009-08-27, 8:14 am
#17
Ruiner, I still stick to sentences though. Not because I need them (well, sometimes you do, it depends on the word. 99% of the time, nouns do not need any context to learn) but because it makes the process easier. I don't think it's faster to study single words than short sentences, but short sentences definitely make the process more fun and give extra exposure.
#18
blackmacros Wrote:@cescoz I heard that the scenario books are, much like most jsubs, very very imprecise and inexact compared to the actual script. Is this the case?
No, the books have all the dialogue as spoken in quotation marks, plus a little explanation of the situation. These books are pretty coool 8)
#19
I do have the raws, although they are quite low quality. I think I got them from [EDIT: Link redacted cos I know Fabrice doesn't like to have to deal with copyright stuff.] I got them from d-addicts, I searched for IWGP with the "raw" filter on. There are 9 seeders right now for the bundle with all the episodes, you'll be able to get it from them much faster than I could upload them somewhere (especially with my pathetically bad Australian internet).

@cescoz that sounds great! I might have to look into finding a copy of them.
Edited: 2009-08-27, 6:06 pm
#20
Isn't there an mkv? That's raw, minus whatever they had to hardsub.
#21
Holy kanji! I just realized, with video clips in Anki, there's something simple I can do with all those jdorama transcripts I DL'd via wearethecats/jdrama.cc... just make clips and copy/paste. Can't believe I didn't think of that sooner... Orz
Edited: 2009-08-29, 7:09 pm
#22
IceCream Wrote:which ones do you have?? any good ones to share we can't find on dramanotes?
Actually, that's weird, looks like they're all on dramanote now. They weren't when I first DL'd them, I swear! But I kind of forgot about them anyway, since I was too lazy to time them. Odd how quickly stuff evolves on the web.

BTW, anyone have a big archive of Dramanote's stuff or know whether the site's at archive.org's Wayback Machine? I'm paranoid about it going offline like jdrama.cc, and archive.org is currently down for me.
Edited: 2009-08-29, 7:36 pm
#23
Don't you still have to generate the correct timing somehow, to do that? Like, subs2srs needs something to time against to generate clips doesn't it? Maybe using English subs (because they're much more common).
#24
blackmacros Wrote:Don't you still have to generate the correct timing somehow, to do that? Like, subs2srs needs something to time against to generate clips doesn't it? Maybe using English subs (because they're much more common).
I didn't mean to use subs2srs to create the clips (though if I did, then as you said, I'd use English subtitles, for the gist + timing). I just meant like when I'm watching a drama and know there's a transcript but not Japanese subtitles, I can be like "Oh I'll just cut this clip out and toss the transcription for it into Anki alongside it..." (When I see something worthwhile.)

But I guess I could've done that w/ the audio anyway. But hey videos are better, I feel. And hmm why didn't I just generate clips sooner using Eng subs and then search for the lines in Anki. Where's my head been at, for real.
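For cutting a single clip by hand, one option is ffmpeg. The sketch below only builds the command line; it assumes ffmpeg is installed and on the PATH, and the helper name `build_cut_command` and the file names are hypothetical. It uses ffmpeg's `-ss` (seek to start) before `-i` and `-t` (duration) after it, and lets ffmpeg re-encode with its defaults rather than stream-copying, which is slower but tends to cut more accurately.

```python
def build_cut_command(source, start, end, output):
    """Build an ffmpeg argument list that cuts [start, end] seconds from source."""
    if end <= start:
        raise ValueError("end must be greater than start")
    duration = end - start
    return [
        "ffmpeg",
        "-ss", f"{start:.3f}",   # seek to the clip start (input option)
        "-i", source,
        "-t", f"{duration:.3f}", # keep this many seconds
        output,
    ]


cmd = build_cut_command("episode01.mkv", 12.5, 15.0, "clip.mp4")
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

The transcript line for the clip would then be pasted into the card by hand, as described above.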
Edited: 2009-08-29, 7:39 pm
#25
IceCream Wrote:which ones do you have?? any good ones to share we can't find on dramanotes?
Hardly seems worth the bandwidth, but here's Voice 01-02 transcripts, I didn't see them elsewhere (just Sexy Voice and Robot at Dramanote).

http://www.megaupload.com/?d=Y0T67U6X