SRS sucks! - Printable Version
kanji koohii FORUM (http://forum.koohii.com)
Forum: Learning Japanese (http://forum.koohii.com/forum-4.html)
Forum: Learning resources (http://forum.koohii.com/forum-9.html)
Thread: SRS sucks! (/thread-4201.html)

SRS sucks! - Transparent_Aluminium - 2009-10-18
Here's another sample of kanji, this time in the 2600-2615 range (according to this list http://www.tidraso.co.uk/kanji_frequency.html):

茉 苺 艇 舛 聘 羹 罫 縣 篆 箙 箏 窈 租 祢 祠

聘 is the only kanji I knew, as part of 招聘. Maybe the Japanese members on this forum could tell us how common they think these kanji are. They all look pretty obscure to me.

Edit: ...I just noticed Yudan beat me to this. I guess we're drawing pretty similar conclusions. I'm surprised to see 贅沢 rank so high.

Quote: (怯 is almost at 3000, but 怯える and 卑怯 are pretty common, maybe just not in newspapers.)

These newspaper-based frequency lists are probably biased towards specialized vocabulary. It would be more useful to also have a list based on novels or webpages.

Here's another 15 kanji starting from 2600 from the Wikipedia frequency list (http://forum.koohii.com/showthread.php?pid=55384#pid55384).

妓 瓜 腱 禎 拼 鼠 亥 炒 頬 湛 麒 祠 楯 喘 筈 渚

These seem more common overall: 楯 (old version of 盾?) 炒 亥 鼠 瓜

SRS sucks! - yudantaiteki - 2009-10-18
Transparent_Aluminium Wrote: Here's another 15 kanji starting from 2600 from the Wikipedia frequency list (http://forum.koohii.com/showthread.php?pid=55384#pid55384).

This is interesting:

"173 kanji make up 50% all kanji in Wikipedia.
454 kanji cover 75% of all kanji in Wikipedia.
874 kanji cover 90%
1214 kanji cover 95%
2061 kanji cover 99%
2456 kanji cover 99.5%
3489 kanji cover 99.9%"

I always suspected that the rough numbers of the newspaper list would be duplicated in other places (though not necessarily with the exact same kanji), and that seems to confirm that guess.

BTW where did you get the actual frequency list? Every link I could find in the thread was either broken, or to an unwieldy "google docs" spreadsheet. Is there just a text file I can copy into JWPce to compare it with the newspaper one?

SRS sucks!
- Transparent_Aluminium - 2009-10-18
Just go to this spreadsheet: http://spreadsheets.google.com/ccc?key=rvzyeuraL5bhNblVube-D_g&hl=en

Google docs allows you to export it as a text file.

Edit: Do you have the numbers from the newspaper list so that we could compare them?

SRS sucks! - igordesu - 2009-10-18
Well, it seems clear that, despite differences in how many kanji different Japanese people "know," the Japanese form a very, very literate society overall. Also, as other people have mentioned, the Japanese don't use an "SRS" to maintain this ability. So, though I realize there is a difference between us and native speakers, is it fair to assume that we should be able to do the same thing w/o an SRS?

I mean...I can look back on my own life and my experiences reading growing up, and I can clearly see the differences in reading/writing ability that I had at certain stages; from elementary to middle school, from middle school to high school, and from high school to now. Though I have certainly read A LOT over the years...I don't think it's an amount that is only attainable by native speakers. With enough interesting and varied reading material, couldn't we simulate or repeat that experience with Japanese to achieve the same level of success?

SRS sucks! - mezbup - 2009-10-18
igordesu Wrote: Well, it seems clear that, despite differences in how many kanji different Japanese people "know," the Japanese form a very, very literate society overall. Also, as other people have mentioned, the Japanese don't use an "SRS" to maintain this ability.

Bear in mind the difference being they've had kanji drilled into their heads their entire school life. In a way it sorta gives them a 1up on us but I suppose that after you've learned to read and write, enough reading will maintain your abilities without the need for an SRS really.

SRS sucks!
- Fillanzea - 2009-10-18
igordesu Wrote: With enough interesting and varied reading material, couldn't we simulate or repeat that experience with Japanese to achieve the same level of success?

I would say a qualified yes to that, if we're primarily talking about acquiring vocabulary, and not the basic building blocks of grammar and pronunciation that do seem to be much easier to acquire for very young children.

It's a qualified yes because I think it does take a huge volume of reading to support learning more obscure vocabulary words. The people I know who have really good vocabularies in English probably read at least 100,000 words a week on a regular basis, or went through an academic period when they were doing that much reading. I can't manage that in Japanese.

And also because in Japanese, pronunciation isn't transparent from seeing the written language. There are so many words that I initially learned as "mumble-mumble" where I had some vague or even precise intuition of the meaning, but had no idea how to pronounce them. You get this kind of error in English (I used to think that bologna and baloney were different lunchmeats), but not nearly as often.

SRS sucks! - igordesu - 2009-10-18
mezbup Wrote:
    igordesu Wrote: Well, it seems clear that, despite differences in how many kanji different Japanese people "know," the Japanese form a very, very literate society overall. Also, as other people have mentioned, the Japanese don't use an "SRS" to maintain this ability.
    Bear in mind the difference being they've had kanji drilled into their heads their entire school life. In a way it sorta gives them a 1up on us but I suppose that after you've learned to read and write, enough reading will maintain your abilities without the need for an SRS really.

I could see that. The only thing is, it's kind of difficult to define "learned to read and write," you know? I can go back and look at books that I read in middle school, and I realize that, although I certainly could read them and understand them, it would be dishonest to say that I "understood" or could "read" them to the same extent that I can read and understand lots of things today.

SRS sucks! - liosama - 2009-10-18
No he's trying to say "they must only know X at Y time because this is what it says in the curriculum" is a wrong statement.

kids are bound to find countless new kanji in books they read, whereas others, as has been mentioned, will simply know what the word means by the sound of it and then within a read or two, will be able to write the character. So if anything 1000 or whatever is the expected number for a 15yo student, I feel, is underestimated. Just how the Jouyou set underestimates how many kanji adults actually know which was what, 3000-6000?

SRS sucks! - yudantaiteki - 2009-10-18
Quote: No he's trying to say "they must only know X at Y time because this is what it says in the curriculum" is a wrong statement.

Japan is no different from the US in this regard. It's like the vocab/spelling tests through middle school in the US -- when the teacher gives a list of words, some kids will already know them all from their own reading, but there will be other kids who hardly know any of them (and won't learn them from the vocab test). Of course there will be some 15-year olds who know a large number of kanji beyond what they theoretically have been taught in school, but there will be some other kids who know less, because they sleep during class and don't go to juku or do any work, and don't read much.

Quote: kids are bound to find countless new kanji in books they read

If they read, yes.

liosama Wrote: Just how the Jouyou set underestimates how many kanji adults actually know which was what, 3000-6000?

Where do you get this 3000-6000 number?

SRS sucks! - ruiner - 2009-10-18
Anyone read this book? A History of Writing in Japan, by C. Seeley.
On page 2 (introduction) he says modern non-specialist texts use 3000-3500 kanji: http://books.google.com/books?id=KCZ2ya6cg88C&lpg=PP1&dq=seeley%20japan&pg=PA2#v=onepage&q=&f=false

Not sure what he bases that on, though surely it's in the notes someplace. ;p

SRS sucks! - mezbup - 2009-10-18
I don't really get why people are arguing for such a high number as 3000 kanji for the average person. If that were the case then the average person could pass kanken 1.5 with little to no study but that just isn't the case. My good friend 茉莉恵ちゃん (まりえちゃん) said she studied pretty hard to pass level 2 which is 1000 kanji less than 1.5. 2 covers the approx. 2000 jouyou and 1.5 covers a total of about 3000. What gives guys?

SRS sucks! - pm215 - 2009-10-18
ruiner Wrote: Anyone read this book? A History of Writing in Japan, by C. Seeley. On page 2 (introduction) he says modern non-specialist texts use 3000-3500 kanji:

Yes, I have that book (it's pretty good if you're interested in the history of the writing system and aren't allergic to academic writing). p157 clarifies this a bit:

Quote: Since family names and place names (and certain other proper nouns, eg company names) utilised a large number of characters that were otherwise little used, this point represented a partial obstacle to orthographic simplification, and is the major reason why the number of different Chinese characters employed in newspapers and magazines has never dropped below about 3200, even after the orthographic reforms of the late 1940s onwards.

with a footnote giving a source:

Quote: Concerning the numbers of different characters used in newspapers and magazines, see, for instance, Kokuritsu kokugo kenkyuujo, Gendai shinbun no kanji, p32

...which implies that if you're prepared to ignore proper names (or hope for furigana) then the 3000-3500 figure is a red herring.
On the subject of books, this language log post includes some interesting quotes from _Literacy and Script Reform in Occupation Japan_ on literacy rates (albeit rates in the 40s and 50s). I'll have to remember to check it out next time I'm passing the university library...

SRS sucks! - Fillanzea - 2009-10-18
ruiner Wrote: Anyone read this book? A History of Writing in Japan, by C. Seeley. On page 2 (introduction) he says modern non-specialist texts use 3000-3500 kanji: http://books.google.com/books?id=KCZ2ya6cg88C&lpg=PP1&dq=seeley%20japan&pg=PA2#v=onepage&q=&f=false

Note that he says that about 1000 of those are used essentially only for proper names. And I can buy that newspapers and magazines might use so many proper names that you would need an extra 1000 kanji to write them, though that's far less true for literature.

SRS sucks! - Transparent_Aluminium - 2009-10-18
mezbup: The kanken covers more than just reading kanji. According to wikipedia: "The test examines ability to read and write kanji, to understand their meanings and use them correctly in sentences, and to identify correct stroke order". Therefore, someone could be only able to pass the Kanken level 2 but still be able to read or understand 3000 characters to a certain level.

SRS sucks! - mezbup - 2009-10-18
Transparent_Aluminium Wrote: mezbup: The kanken covers more than just reading kanji. According to wikipedia: "The test examines ability to read and write kanji, to understand their meanings and use them correctly in sentences, and to identify correct stroke order". Therefore, someone could be only able to pass the Kanken level 2 but still be able to read or understand 3000 characters to a certain level.

My point exactly. Hence, reading 3000 and "knowing" 3000 are slightly different. The kanken tests if you have thorough knowledge of kanji rather than just superficial knowledge. Although it's definitely possible they can read somewhere in the 2500 - 3000 range.
I'm guessing it may be closer to 2500 but numbers and guesses are totally arbitrary and we all know this. I think we ought to define our terms a little more clearly on what it means to "know" 3000 kanji.

SRS sucks! - Fillanzea - 2009-10-18
I went googling around for data and hit upon this very interesting link: http://www.nier.go.jp/homepage/jouhou/system/rep10.html

They ran textual analysis on a number of generalist nonfiction books. The number of individual kanji in each book ranged from 958 to 2206, with most falling around the 1400-1700 range (the average was 1472). The 40 books together used a total of 3,836 kanji.

On average, each book used 1,288 Jouyou kanji, so 87.5% of the kanji used were Jouyou. However, that jumps up to an average of 98% if you count each individual instance of kanji usage. (e.g. book #3 had 37088 total instances of kanji, of which 36870 were Jouyou.) One book used 621 non-Jouyou kanji, but the average was 184.3.

The major outlier in difficulty, made up of only 92% Jouyou kanji, was 故事成語, which means "Idioms derived from historical events or classical literature of China." Well, yes, I would expect that to have a ton of non-Jouyou kanji.

My reading of the statistics is that they align very well with what yudan taiteki has been saying: you can read a great deal with about 1400-1700 kanji, and once you reach a baseline of somewhere around there, it's helpful for further learning to be driven by the fields you're actively studying or have an interest in. Because certainly "Idioms derived from Chinese literature" is going to have a different base of vocabulary than "The psychology of drinking alcohol."

SRS sucks! - pm215 - 2009-10-18
pm215 Wrote: with a footnote giving a source:

(Chasing references...) This turns out to be freely downloadable, although it is a 70MB PDF that has not been particularly cleanly scanned. Haven't actually read it yet :-)

SRS sucks! - mezbup - 2009-10-18
Those are some really interesting figures.
It's kinda painful thinking of the minuscule gain you get from an extra 1000 - 2000 kanji. Is there any information on kanji usage in works of fiction?

SRS sucks! - yudantaiteki - 2009-10-18
mezbup Wrote: Those are some really interesting figures. It's kinda painful thinking of the minuscule gain you get from an extra 1000 - 2000 kanji.

I've always found that encouraging rather than painful -- it means that you don't have to study those extra 1000-2000 kanji and you can spend that time doing something else. You already have to waste so much time wrestling with the writing system in Japanese, it's nice to be spared any extra effort.

I think those statistics posted by Fillanzea are very interesting -- even in books using less than 1500 characters, close to 200 of them would be non-Jouyou. I think every experienced Japanese learner knows how out of touch the Jouyou list is with actual usage, but it's nice to see statistics to back that up.

SRS sucks! - Fillanzea - 2009-10-18
I have been searching around for anything on works of fiction, and I've not found anything yet. (If anyone wants to try, 数量的分析 is a good keyword for 'frequency analysis'). My intuition is that you need fewer kanji for fiction but you'll see a short list of non-Jouyou kanji that authors really like to use, 誰 being the example that comes to mind right away.

If I knew at all how to use mecab to generate frequency analysis stuff, it would be an interesting experiment, but I don't know where to get e-texts except for internet fanfiction and the public-domain texts at Aozora, and I don't think that either of those would apply very well to contemporary literature.

SRS sucks! - yudantaiteki - 2009-10-18
Of course one thing that these frequency studies don't mention is which kanji have furigana in the texts in question.

SRS sucks! - Fillanzea - 2009-10-18
Yeah. I'm reading 近代日本の小説 and it's certainly not a children's book but they have furigana on things like 忘却、批評家、嘆き、欧米圏、違和感、発掘、even 状況 and 刺激.
I don't know why.

Edit: It looks like this imprint of the publisher is specifically intended for less educated readers, and maybe younger ones, so that would explain it. Oh well, so much the better for me.

SRS sucks! - TGWeaver - 2009-10-18
i think that the SRS is good for languages that you don't share an alphabet with. i don't think that i'd ever use one with a european language, but i do find that the SRS really helps with kanji.

SRS sucks! - Transparent_Aluminium - 2009-10-18
Great find Fillanzea. Interesting data but I'm not sure what to make of this exactly.

SRS sucks! - Nukemarine - 2009-10-19
Personally, I would think frequency lists gleaned from Dramanote scripts would be the best for putting together quick and dirty "Learn this First" material. Get Kanji frequency that accounts for 90% to 95% of what's in there, then organize that à la RTK Lite. Get Vocabulary frequency that also accounts for 90 to 95% and organize that via KO2001 list.

Something tells me that what you'd get with the above would be very similar to what we already have with RTK Lite and KO2001, i.e. 1100 Kanji and about 3500 words.

With 95% covered, you're in that area that one could be learning by immersion. In other words, what can get you away from the SRS faster.
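
For anyone who wants to try the "how many kanji cover 90-95%" idea on their own scripts or e-texts, here is a minimal sketch in Python. It is not the method behind any of the lists quoted in this thread. It assumes that "kanji" means any character in the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), the function names are made up for illustration, and the sample sentence is invented rather than taken from a real corpus. The last two lines just redo the arithmetic on the nonfiction figures quoted above, to show the difference between counting distinct kanji and counting every occurrence.

```python
from collections import Counter


def is_kanji(ch):
    """Assumption: 'kanji' = the basic CJK Unified Ideographs block only."""
    return "\u4e00" <= ch <= "\u9fff"


def kanji_counts(text):
    """Count every kanji occurrence in a string, skipping kana, Latin, punctuation."""
    return Counter(ch for ch in text if is_kanji(ch))


def kanji_needed(counts, target):
    """Smallest number of most-frequent kanji whose occurrences reach
    `target` (a fraction between 0 and 1) of all kanji occurrences."""
    total = sum(counts.values())
    if total == 0:
        return 0
    running = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        running += n
        if running >= target * total:
            return rank
    return len(counts)


# Invented sample text; swap in a drama script or book to get real numbers.
sample = "日本語の本を読む。日本の本は日本語で読む。"
counts = kanji_counts(sample)
for target in (0.5, 0.75, 0.9, 0.95):
    print(f"{target:.0%} coverage needs {kanji_needed(counts, target)} kanji")

# Figures quoted earlier in the thread: share of distinct kanji per book
# that are Jouyou, versus share of all kanji occurrences in book #3.
print(f"distinct-kanji share: {1288 / 1472:.1%}")
print(f"per-occurrence share (book #3): {36870 / 37088:.1%}")
```

Sorting the counts and walking down the list until the running total passes the target is all a coverage table amounts to; the kanji kept at the 90% or 95% cutoff would be the candidates for a "Learn this First" list.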