smart.fm core 10000 - Printable Version
kanji koohii FORUM (http://forum.koohii.com)
Forum: Learning Japanese, Learning resources
Thread: smart.fm core 10000 (/thread-3924.html)

smart.fm core 10000 - ruiner - 2009-09-04
Even though it's one of those things where once you reach that point, you realize you don't really need it, as a completionist I'd be tempted to go through it anyway... anyone know anything about when this is coming out? Last I checked, it was supposed to be out last year.

smart.fm core 10000 - cracky - 2009-09-04
I was wondering if it's still coming out also. I was pretty sure they said they were working on the audio for it a long time ago, don't know what happened.

smart.fm core 10000 - radical_tyro - 2009-09-04
I disagree about not really needing it. I think it's a useful way to learn more vocab, and more vocab is always good. I'm hoping it's ready by the time I finish core 6000 (I'm on lesson 7 now).

smart.fm core 10000 - Nuriko - 2009-09-04
There are a lot of words that I look up on iKnow and hear and/or read on a daily basis, and they're not part of the core 6,000. I feel like I've done this so many times. I do these searches to get audio clips, and I always hear the boy-going-through-puberty robot lady voice. Anyway, I feel like there's a lot not being covered by the 6,000 list, although it is indeed a huge list. So that extra 4,000 is probably something everyone should study, I think.

smart.fm core 10000 - Nukemarine - 2009-09-05
Here's been my experience over the last four months. I suspended all the Core 6000 cards. Then during sentence mining if a new word comes up, I check if the word is an entry in the Core 6000 deck. If it's an entry, I unsuspend the card. If it's not an entry, but used relevantly in a sentence, I type the word into the vocab portion next to the word the entry is there for and unsuspend the card. On top of that, if a word in my Tae Kim deck or even in an unsuspended Core 6000 card is its own entry, I unsuspend those cards too.

So, I've unsuspended about 600 words (15%) in the last 3 months (it seems slow, but don't forget I'm also sentence mining). I looked at the list recently and I found: Of the 500 entries for Steps 1 and 2 in Core 6000, I only have activated 120 words (20%). That's telling me that there are words that are useful but at the end of the Core series. There are words that are useful but not in the series at all (one of the reasons I added Tanuki entries).

For fuller disclosure, I did try to do the Core 6000 entries straight through. I quit at entry 80 at the sheer difficulty it presented (you know, rarely used words outside of economic or political discussions). I'm pretty sure if someone posted subtitles to "Change" (about a guy becoming Prime Minister) then these words would get added naturally should I mine that show. It just didn't seem "fun" just adding them and missing them so much with no reference to them outside of the sentence itself.

Not saying to not go through word by word and study Core 6000. However, it may be more enjoyable if you go another route, then unsuspend as necessary. One could be reading "Norwegian Wood" and activating entries for new words you come across there. I just do it with dramas I'm mining/dissecting.

In addition, I'll add the Core 10,000 deck if it ever comes out. But something tells me I'll get less use out of it.
However, what will be useful will be there for me to grab onto.

smart.fm core 10000 - MeNoSavvy - 2009-09-06
Thanks for the report Nukemarine. Definitely interesting observations. Do you by any chance know what % of the Core 2000 you have unsuspended? Some steps in the Core 2000 seem to have a lot of business and political terms also, although generally the vocabulary appears to be more useful.

While the resources provided in Core 6000 are definitely helpful, I too wonder if the frequency list they used was not the best. It is obviously derived from news reports or similar, hence the large number of business and political terms. Perhaps later you could share your list of cards that you have unsuspended, once you have done some additional sentence mining.

I wonder if there are any better frequency lists available or indeed other lists that are more suitable for learners than the core 6000; one could then generate one's own deck using the Cerego (sp?) resources. Personally I'm not sure I can be bothered going through the whole of the Core 6000.

By the way, does anyone have any good information about the Anki API? I can't find anywhere that explains how to create decks, add cards etc. programmatically. Also, are there any good books (in English) or good websites that explain the different standards used for Japanese characters etc.? All the different standards are kind of confusing. I want to start doing Japanese text processing at some point. There are so many good resources these days, it would be useful to be able to grab a list of words from one source, grab a sentence and audio from another, and perhaps some additional information from yet another etc., but to do it all programmatically.

smart.fm core 10000 - Codexus - 2009-09-06
There are some stupid words in the core lists.
For example 大蔵省 is very near the beginning of core 6000 and not only is it rather obscure but apparently there is no longer a 大蔵省 in the current Japanese government (it's called something else now).

But overall, if you put those few exceptions aside, most words in the smart.fm lists are useful. I used to easily dismiss them as being too business/finance related until I noticed a lot are used in a casual context too.

As for not needing more than core 6000, that's ridiculous. The majority of the words I add to my SRS are not in the core 6000 list. And I don't find them in obscure sources, as I mostly read manga and short novels targeted at children.

My current goal for a vocabulary that should allow me to understand most Japanese is around 20'000 words. Yes, it's a made-up number, but based on some articles I read on language acquisition and statistics of the number of words in some texts it seems reasonable. And it just helps to have a goal even if once I reach it, I find out there is still a lot to be learned.

smart.fm core 10000 - Nukemarine - 2009-09-06
@MeNoSavvy, I studied all of the Core 2000 deck. Turns out, the easier words were near the end of that group, so the last 500 words went by very fast. It was in Core 6000 that I just gave up trying to deal with their order of words directly.

As for sharing my results, it's not going to be useful showing the group of words. It all depends on the anime or J-drama I choose to dissect (mining is no longer the correct term for me). Had I gone with different shows, the words being activated would be different.

That said, a legitimate word frequency list based upon scanning Drama and Anime scripts would be a huge boon. I'm talking something about the most common 1000 to 2000 words outside of grammar and particles. To make them further accessible, organize the frequency list into the KO2001 kanji order.
I've been hoping for this for a while now, but the only thing lacking is a tool that can sort Japanese words on the KO2001 order.

I too would LOVE if Anki somehow got the ability to interact directly with SMART.FM's enormous cache of vocabulary and sentences. It seems easy enough if Anki just tracked item entry numbers in your account, but used those numbers to access the correct material from Smart.FM to display as you choose. Anki would be the cooler, more variable older brother of the iKnow application.

@Codexus, I agree. If you went item by item through Core 6000, it would seem stupid as they are based on that newspaper frequency. On the other hand, like you said, the casual words are in there, just hidden. Thankfully, Anki's search feature has come such a long way in the last two years that using Core 6000 and Tanuki as a corpus works great. Probably annoying to Damien though since my deck is 35 meg in size, with 60% still suspended.

There are many words I'm coming across that are not in Core or Tanuki series. Not sure if they're useful, but without them I cannot follow the drama.

smart.fm core 10000 - ruiner - 2009-09-06
I don't care much about the 'order' of c6k as I was going to simply finish it as a whole on the side, but now I'm thinking that if I'm doing video-viewing decks only (just making sure I can understand the audio/semantic context in relation to the moving pictures, not spelling or anything), then a nice supplement might be to use the c6k deck as a supplementary reading/writing corpus for the stuff I do in the video deck. Originally I was just going to use the shared subs2srs audio/image decks for this, but since those are limited for now, I might as well do it with c6k as well.

Also Nuke, did you see my recent post in the subs2srs thread on audio splicing? Curious if you have any workarounds for that. Do you just listen to the timings then cut the cards to avoid opening/ending songs and cut-off lines? Or not bother?

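A minimal sketch of the kind of tool Nukemarine asks for above: sorting words by a fixed kanji order. Everything named here is an assumption for illustration. `KANJI_ORDER` is a five-kanji placeholder, not the real KO2001 sequence (which would have to be loaded from your own copy of the list), and the rule "rank a word by the latest-introduced kanji it contains" is only one reasonable reading of "sort on the KO2001 order".

```python
# Placeholder order for illustration only. This is NOT the real KO2001
# sequence; in practice, load the full ordered kanji list from a file.
KANJI_ORDER = "日本人大学"


def is_kanji(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def sort_key(word, rank):
    """Rank a word by the latest-introduced kanji it contains."""
    unknown = len(rank)  # kanji missing from the order sort after all known ones
    ranks = [rank.get(ch, unknown) for ch in word if is_kanji(ch)]
    # Kana-only words get -1 so they come before any kanji word.
    # The word itself is the tie-breaker, so the result is deterministic.
    return (max(ranks) if ranks else -1, word)


def sort_by_kanji_order(words, order=KANJI_ORDER):
    """Return the words sorted so each needs only kanji seen so far in `order`."""
    rank = {kanji: i for i, kanji in enumerate(order)}
    return sorted(words, key=lambda w: sort_key(w, rank))


if __name__ == "__main__":
    sample = ["大学", "日本", "人", "本", "たぬき", "日本人", "学ぶ"]
    for word in sort_by_kanji_order(sample):
        print(word)
```

With this key, a word only appears once every kanji in it has been introduced, which is probably the property you want when studying readings in KO2001 order. Characters outside the main CJK block (for example the repetition mark 々) are simply ignored by `is_kanji` in this sketch.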
smart.fm core 10000 - Nukemarine - 2009-09-06
@ruiner, I'm still not quite sure what you're suggesting. Is this for cards in the SRS? What I was doing was merging cards from time to time by moving audio and text to the same card. On others where audio was cut off short or started late, I would trim the text on the card. This was a bit tedious and didn't accomplish much so I stopped.

For songs, I recently got lyrics and made audio clips to put into Anki. Figured if I'm going to listen to the drama, may as well understand the sound track too.

smart.fm core 10000 - ruiner - 2009-09-06
Nukemarine Wrote:
@ruiner, I'm still not quite sure what you're suggesting. Is this for cards in the SRS? What I was doing was merging cards from time to time by moving audio and text to the same card. On others where audio was cut off short or started late, I would trim the text on the card. This was a bit tedious and didn't accomplish much so I stopped.

I mean for putting audio splices on the iPod.

smart.fm core 10000 - nwatkins - 2009-09-06
Nukemarine Wrote:
That said, a legitimate word frequency list based upon scanning Drama and Anime scripts would be a huge boon. I'm talking something about the most common 1000 to 2000 words outside of grammar and particles. To make them further accessible, organize the frequency list into the KO2001 kanji order. I been hoping this for awhile now, but only thing lacking is a tool that can sort Japanese words on the KO2001 order.

Looks like some people were working on a similar thing. http://forum.koohii.com/showthread.php?tid=3216&page=2

smart.fm core 10000 - Thora - 2009-09-06
Nukemarine Wrote:
...using Core 6000 and Tanuki as a corpus works great

I'm curious how people are using the Tanuki list. The sentences and mini definitions are a great idea, but it looks like it would require a lot of work to convert all the hiragana to kanji to be useful.
The sentences contain too much hiragana (I'm guessing the omitted kanji were expected to be learned later in the book/program.) Also, the words have the target kanji in hiragana (e.g. compounds of mixed hiragana and kanji.)

smart.fm core 10000 - bodhisamaya - 2009-09-06
They are adding new Cerego sentences every day so I assume it is for the purpose of reaching 10,000. http://smart.fm/sentences

I am beginning to think the most efficient way to use smart.fm is to go to a list, click sentences, and just listen to sentences all day while reading the kanji rather than going into the iKnow application. It kind of defeats the SRS but more sentences are practiced with audio and kanji in context per hour with this approach.

smart.fm core 10000 - ruiner - 2009-09-06
bodhisamaya Wrote:
They are adding new Cerego sentences every day so I assume it is for the purpose of reaching 10,000. http://smart.fm/sentences

Actually the last Cerego sentence added was mid-July, and those were for a classic movie list. I get the feeling they're not going to bother carrying on w/ the Core series. Last they blogged about it was before they changed from iKnow to smart.fm, perhaps that's part of the change.

smart.fm core 10000 - Nukemarine - 2009-09-06
Thora Wrote:
Nukemarine Wrote: ...using Core 6000 and Tanuki as a corpus works great
I'm curious how people are using the Tanuki list. The sentences and mini definitions are a great idea, but it looks like it would require a lot of work to convert all the hiragana to kanji to be useful. The sentences contain too much hiragana (I'm guessing the omitted kanji were expected to be learned later in the book/program.) Also, the words have the target kanji in hiragana (e.g. compounds of mixed hiragana and kanji.)

Well, since I'm using it as a Corpus (am I using that word right?), I only convert hiragana on the entry I'm activating if it's necessary. I'm in the habit of searching both the hiragana and kanji version of the word in addition to a portion of it. To be honest, Core 6000 usually has the word I'm looking for so far.

smart.fm core 10000 - Thora - 2009-09-06
I see. I actually started converting it b/c I thought it'd be a great deck for people to have. (In its current form, it's kind of confusing.) But the tedium was too much for me. Then I thought it'd make more sense to add the short sentences and definitions to the existing KiC vocab deck. But I couldn't see an easy way to automate that process either.

I suppose having the smart.fm stuff with audio probably makes such a Tanuki/KiC deck less appealing, eh? I thought of it because I think KiC contains more words (around 9000).

smart.fm core 10000 - bodhisamaya - 2009-09-06
ruiner Wrote:
bodhisamaya Wrote: They are adding new Cerego sentences every day so I assume it is for the purpose of reaching 10,000. http://smart.fm/sentences
Actually the last Cerego sentence added was mid-July, and those were for a classic movie list. I get the feeling they're not going to bother carrying on w/ the Core series. Last they blogged about it was before they changed from iKnow to smart.fm, perhaps that's part of the change.

They are doing something with sentences. That page is for newly created or edited sentences and I have noticed several from Cerego every day I look.

smart.fm core 10000 - travis - 2009-09-06
Some of those recent sentences seem to be for Japanese people learning English. They could work both ways, but the audio is English.

smart.fm core 10000 - ruiner - 2009-09-06
travis Wrote:
Some of those recent sentences seem to be for Japanese people learning English. They could work both ways, but the audio is English.

Also, a cursory look tells me that whenever there's activity on an old sentence (like a French translation), it seems to appear on that /sentences page. I was basing the July 15th date on Cerego's user page, the most recent 'created a sentence' (which seems out of date, but I still think the above holds true.)
Also, I don't think it'd make sense for them to just release random piecemeal sentences to the public. If they were doing another Core series I think they'd wait and then add it all at once and make a big deal out of it.

smart.fm core 10000 - Nukemarine - 2009-09-06
Thora Wrote:
I see. I actually started converting it b/c I thought it'd be a great deck for people to have. (In its current form, it's kind of confusing.) But the tedium was too much for me. Then I thought it'd make more sense to add the short sentences and definitions to the existing KiC vocab deck. But I couldn't see an easy way to automate that process either.

That's why I merged the two (Core 2k and 6k with Tanuki). If it's not in the Core deck but in Tanuki, I activate the card in Tanuki. I'll change kana words to kanji if applicable. In addition, I'll add in a simple English definition if the Japanese one is not too obvious.

Once I stopped worrying about "having to know" so many words, I actually began adding many words faster than I thought I would. Plus, as these words are either from cards from subs2srs or a new word used in an activated card from Core 2000 or Tanuki, I'm getting double or triple reinforcement combined with the knowledge the word I'm testing is immediately useful somewhere else.

I honestly think we have more than enough material "compiled" to get the beginning self-studier well on their way. By this point, it's about what you can do for yourself. So even a "kanjified" Tanuki list would be overboard. Although advanced material group projects like Kanzen Master show that even higher levels can have guided material to use.

@Ruiner, OK, you were talking about putting audio splices back on the iPod. Yeah, I decided not to go that route. Just having the drama broken into smaller 3 1/2 minute segments has been more than enough benefit. If theme songs were a real big problem, I'd have snipped it out in Audacity prior to segmenting the show, but haven't felt it necessary now.
With 80 one-hour episodes spanning 20 different shows, I think I'd burn out trying to snip out every annoying bit <@_@>

smart.fm core 10000 - ruiner - 2009-09-06
Nukemarine Wrote:
@Ruiner, ok you were talking about putting audio splices back on the iPod. Yeah, I decided not to go that route. Just having the drama broken into smaller 3 1/2 minute segments has been more than enough benefit. If theme songs were a real big problem, I'd have snipped it out in Audacity prior to segmenting the show, but haven't felt it necessary now. With 80 one-hour episodes spanning 20 different shows, I think I'd burn out trying to snip out every annoying bit <@_@>

So you end up listening to a bunch of theme songs mixed in with the speech? I don't think I could handle that. Do you never find having lines cut off at the beginning and end of segments annoying? I'll probably end up marking off 1-3 minute sections in Audacity after snipping the OP/ED songs, then listening to make sure there's a silence at each marker before exporting as multiple mp3s. I think 3:30 is too long, will probably stick with 1-1.5 minutes.

smart.fm core 10000 - Blahah - 2009-09-06
Nukemarine Wrote:
Well, since I'm using it as a Corpus (am I using that word right?)

Not quite, a corpus is just a collection of writings, usually from one author or group. You can't use something as a corpus, it either is a corpus or not. In this case, it is

smart.fm core 10000 - cangy - 2009-09-07
Nukemarine Wrote:
I studied all of the Core 2000 deck. Turns out, the easier words were near the end of that group, so the last 500 words went by very fast. It was in Core 6000 that I just gave up trying to deal with their order of words directly.

After the first 2 steps of core 2000 the order is pretty random, so I just suspended them all and went through and picked out the more interesting ones. Rinse, repeat.

The thing I really found annoying about the order, or lack thereof, is the uncontrolled introduction of kanji, making learning the readings unnecessarily difficult. So, for the next ones (core 6000, ko2001) I've been thinking about sorting the sentences based on contained kanji, using various orderings, such as ko2001, RTK, frequency, or a derived order for the collection which will result in the most sentences matched per additional kanji...

Nukemarine Wrote:
To make them further accessible, organize the frequency list into the KO2001 kanji order. I been hoping this for awhile now, but only thing lacking is a tool that can sort Japanese words on the KO2001 order.

...so I have a Perl script that will do that, which I'll put up soon when I get a chance to finish the analysis, but you can have it now if you want to play with it and have a suitable environment, or I can just do the sort if you have a list ready.

smart.fm core 10000 - trusmis - 2009-09-07
Maybe irrelevant here, but where do you get the subtitles for the subs2srs program? Doesn't inputting text in the front and back and clicking save take a lot of time?

After some time of core6000 processing, I also decided it was far better to input my own words that made sense in the stuff I am seeing/reading. The chance that a word you read in a book appears again in the same book is far greater than that of a random core6000 word appearing in that book.
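
The "derived order" cangy mentions (the order that matches the most sentences per additional kanji) could be approximated with a simple greedy heuristic. The sketch below is not cangy's Perl script, which was not posted in this thread. It is one possible interpretation: at each step, take the remaining sentence that needs the fewest kanji you have not met yet. The function names and the sample sentences are made up for illustration.

```python
def is_kanji(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def kanji_in(text):
    """Set of distinct kanji appearing in a sentence."""
    return {ch for ch in text if is_kanji(ch)}


def greedy_order(sentences):
    """Return (ordered_sentences, kanji_order).

    Greedy heuristic: repeatedly pick the remaining sentence that requires
    the fewest not-yet-introduced kanji. Ties go to the sentence that comes
    first in the input. This is quadratic in the number of sentences, which
    is fine for a few thousand cards.
    """
    remaining = list(sentences)
    known = set()       # kanji introduced so far
    kanji_order = []    # same kanji, in order of introduction
    ordered = []
    while remaining:
        best = min(remaining, key=lambda s: len(kanji_in(s) - known))
        remaining.remove(best)
        ordered.append(best)
        for ch in best:  # new kanji are recorded in reading order
            if is_kanji(ch) and ch not in known:
                known.add(ch)
                kanji_order.append(ch)
    return ordered, "".join(kanji_order)


if __name__ == "__main__":
    sample = ["日本語を勉強する", "本を読む", "日本に行く", "私は本を読む"]
    ordered, kanji_order = greedy_order(sample)
    for sentence in ordered:
        print(sentence)
    print("kanji order:", kanji_order)
```

A greedy choice like this is not guaranteed to give the best possible order, but it keeps each new card close to what you already know. To follow a fixed order such as ko2001 or RTK instead, you would sort the sentences by the position of their latest kanji in that list rather than deriving the order from the collection.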