yogert909 Wrote: I highly doubt 90% of core6k regularly shows up in anime. As I mentioned earlier, I've been comparing wordlists of various corpora and the overlap is a lot less than I expected. It's a project that I'm still working on when I have time between work and studying (I'm still studying core actually..) so I hesitate to throw precise numbers out there. But I will say that I've compared core 6k against the several thousand subtitles on dramanote and there's not one episode that core covers even 60% of unique words. The average coverage seems to be in the low 40% range.
Is that core 6k alone (4,000 words)? How is the coverage of core 2k+6k on those dramanote subs? Also I've found drama to be more challenging than anime, so when you have the time, it would be nice to know how much of strictly core 10k and how much of core 2+6+10 is really in your dramanote corpus for a better overview.
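To make the question concrete, here is a rough Python sketch of the two percentages that seem to be in play: how much of an episode's unique vocabulary a list covers, and how much of a list actually shows up in an episode. These are not the same number. It assumes the subtitles have already been split into dictionary-form words (that needs a morphological analyzer, not shown here), and the word lists and episode below are made-up toy data, not real core or dramanote contents.

```python
def unique_coverage(episode_words, word_list):
    """Share of the episode's unique words that appear in word_list."""
    unique = set(episode_words)
    if not unique:
        return 0.0
    return len(unique & set(word_list)) / len(unique)


def share_of_list_seen(episode_words, word_list):
    """Share of word_list that appears at least once in the episode."""
    words = set(word_list)
    if not words:
        return 0.0
    return len(words & set(episode_words)) / len(words)


# Toy stand-ins (hypothetical, NOT the real core decks or a real subtitle file)
core2k = {"食べる", "行く", "人"}
core6k = {"状況", "影響"}
episode = ["食べる", "行く", "状況", "妖怪", "行く", "人", "神社", "食べる"]

cov_6k_only = unique_coverage(episode, core6k)          # 1 of 6 unique words
cov_2k_6k = unique_coverage(episode, core2k | core6k)   # 4 of 6 unique words
seen_6k = share_of_list_seen(episode, core6k)           # 1 of 2 list words

print(f"6k alone: {cov_6k_only:.0%}, 2k+6k: {cov_2k_6k:.0%}, 6k seen: {seen_6k:.0%}")
```

Running the same two functions over core 10k alone and over 2k+6k+10k would give the breakdown asked for above.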
But yes, I don't have hard numbers, it's just my impression. Browsing through 2k, it definitely feels like 90% of it shows up. I've methodically browsed through about 5% of 6k, and so far the share of words I recognize from anime seems to be around 80% (once you've seen a word in the wild, the feeling of "seen in the wild" remains). Some I even heard in Gingitsune just a few days ago (I watched that with English subs, but I can still listen).
That said, you're not the first one to express disappointment in core (as I recall it's usually from people who try to read LN/novels), so I guess I can count myself lucky not to have experienced it. However broad a tool core might be, statistically it's bound to be less useful to some people, but I'm not yet convinced those are the majority.
yogert Wrote: This question for me seems to boil down to enjoyment. Clearly it would be useful to know every word of core10k. But wouldn't it be more enjoyable if I studied 2k words and could understand every word of an anime or manga or news article? If I have an electronic text, it's not hard to use a few tools like cb's japanese text analysis tool to make myself a core norwegian wood, or a core totoro, or a core whatever. I don't know how many unique words there are in the average anime movie - but it's got to be around 2k, manga even less, and a news article, a few hundred if that. What if I study those words and then enjoy Japanese a lot sooner? Then learn some more words and enjoy another book or movie? I'd still learn 10k common words, but it would be much more enjoyable.
Yes, but some of us can't use those tools. Plus, as a beginner I was often overwhelmed, and I was more interested in building a general understanding than in just understanding Totoro. And let's say there are 2k unique words in a movie: 2k is about two months of learning, meaning you would spend two months on Totoro? In a way that's almost more boring than core.
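The back-of-the-envelope math behind "2k is two months" can be written out as a small Python sketch. The pace of 33 new words a day is my own assumption (roughly 2,000 words over 60 days), and the word counts for manga and news are just placeholders in the spirit of the guesses quoted above, not measured figures.

```python
import math


def days_to_learn(word_count, new_words_per_day):
    """Whole days needed to get through word_count new words at a fixed daily pace."""
    return math.ceil(word_count / new_words_per_day)


PACE = 33  # assumed new words per day, not a measured figure

# Rough unique-word counts; the manga and news numbers are placeholders
materials = {"anime movie": 2000, "manga volume": 1000, "news article": 300}

for name, words in materials.items():
    print(f"{name}: about {days_to_learn(words, PACE)} days of pre-study")
```

At that pace the movie-sized list really does come out at about two months, while a news article is closer to a week and a half.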
There's also the problem of having to learn the words before you can enjoy the material (at least for me, enjoyment is directly linked to understanding). Regardless of where the words come from, you still have to grind through the gruesome task of Anki-ing them. This is also why intensive reading (of subs) is so much more enjoyable for me, because I barely Anki anything: I just do a look-up and move on, and it will stick eventually. But I just couldn't imagine doing that at a low level. The better you get, the more any text tends to look like i+1, and even if it doesn't, you have the power to parse your own i+1 out of an i+4. As a newbie, whenever there's too much information you break down.
Edited: 2015-06-11, 3:12 pm