Thread: Kanji Frequency in Wikipedia (/thread-3216.html)
Forum: kanji koohii FORUM (http://forum.koohii.com), Learning Japanese, General discussion
Kanji Frequency in Wikipedia - gombost - 2013-03-26

Sure, it's always good to set the bar slightly higher than comfortable. But the main point here is priority. It's much more useful to learn the 1200 most frequent Japanese words than to add kanji number 3000 to 4200 from a kanji frequency list to your SRS. Of course, it's fine if one of your long-term goals is to be able to use 4000+ kanji confidently, but reaching a level where you can enjoy native materials is much closer than that, so to me learning words seems more rewarding at this point. It's not a big detour though, so it certainly won't hurt you.

Kanji Frequency in Wikipedia - Jiroukun - 2013-12-04

I know it may be too late and that @shang might not look at this thread anymore... But it's worth asking: would it be possible for you to provide the program you made for others to use? I would like to use the version [one?] you used to make the Japanese MeCab file located here: http://shang.kapsi.fi/kanji/jawp-mecab-words.csv

Would it be possible for someone to use your program to find the frequency of words in another text? For example, if I were to copy a 2ch board and get all the text, would your program be able to parse out all the Japanese words?

Sorry for all the questions, I just want to make sure my post is detailed enough so you know what I'm asking o.o

Kanji Frequency in Wikipedia - lauri_ranta - 2013-12-05

@Jiroukun If you use OS X or Linux, you can use a shell command like this:

    for f in *.txt;do mecab -F'%t %f[6]\n' "$f"|awk '$1~/[267]/&&$2{print $2}';done|sort|uniq -c|sort

%t is the character type of the word or token, where 2 is a word with at least one kanji, 6 is hiragana, and 7 is katakana. %f[6] is the lemma of the word, that is, the feature at index 6 (counting from zero) in the default output. If you use OS X, you can install MeCab by running `brew install mecab mecab-ipadic` after installing Homebrew.

Kanji Frequency in Wikipedia - Jiroukun - 2013-12-05

Thanks for the reply!
I've used Linux before, but I couldn't really get into it for whatever reason... I'm on PC now, but I guess I could run a VM and try it that way! Thanks again xD
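For anyone who wants to try the counting half of lauri_ranta's pipeline before installing MeCab, here is a minimal sketch. It assumes the input is already in the `%t %f[6]` shape that command produces (one `type lemma` pair per line). The function name `count_lemmas` and the sample lines are made up for illustration and are not part of MeCab. It also sorts numerically in descending order and forces the C locale, which is a choice made here to keep the grouping of Japanese text byte-exact, not something the original command does.

```shell
#!/bin/sh
# count_lemmas: hypothetical helper name, not part of MeCab.
# Reads "TYPE LEMMA" lines (the shape printed by mecab -F'%t %f[6]\n')
# on stdin and prints "COUNT LEMMA" lines, most frequent first.
count_lemmas() {
    # Keep tokens whose character type is 2 (kanji), 6 (hiragana) or
    # 7 (katakana) and whose lemma field is non-empty. Lines such as
    # "EOS" have no matching type and are dropped.
    awk '($1 == 2 || $1 == 6 || $1 == 7) && $2 != "" { print $2 }' |
        LC_ALL=C sort |                 # group identical lemmas byte-wise
        LC_ALL=C uniq -c |              # prefix each lemma with its count
        LC_ALL=C sort -k1,1nr -k2       # highest count first, ties by lemma
}

# With MeCab installed, real input would come from something like:
#   mecab -F'%t %f[6]\n' some-file.txt | count_lemmas
# Here a few hand-written sample lines stand in for MeCab output.
printf '%s\n' '2 日本語' '6 の' '2 日本語' '3 。' '7 テキスト' '6 の' '2 日本語' |
    count_lemmas
```

On the sample lines this should list 日本語 with a count of 3, の with 2, and テキスト with 1, while the punctuation token (type 3) is filtered out. For text copied from elsewhere, such as a 2ch board, the assumption is that it is saved as UTF-8 plain text before being passed to `mecab`.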