Kanji compounds/words per Lesson of Book 1 ?? - Printable Version

+- kanji koohii FORUM (http://forum.koohii.com)
+-- Forum: Learning Japanese (http://forum.koohii.com/forum-4.html)
+--- Forum: General discussion (http://forum.koohii.com/forum-8.html)
+--- Thread: Kanji compounds/words per Lesson of Book 1 ?? (/thread-10839.html)

Kanji compounds/words per Lesson of Book 1 ?? - choubatsu - 2013-05-29

Hi everyone, I'm wondering if anyone knows of any lists of kanji compounds or words organized by Lesson of RTK1. I learned Japanese through a totally different system and my reading ability is advanced. I already know the readings of the kanji and can read thousands of compounds, but my writing skills are weak. I am going through RTK1 now, even though I know it's not recommended to do it backwards. I'd like to practice by writing words I already know and can read (but can't write from memory). If I do this lesson by lesson, I'm hoping to fill in the gaps. For example, from Lessons 1 and 2, words like 明日, 早く, 古本, stuff like that.

Or does anyone know of a program or site where you can input a list of selected kanji (i.e. Lessons 1-5 or RTK1) and have it produce a list of words or compounds containing only those kanji?

Thanks in advance, and I hope to be a regular contributor on the forum.

Kanji compounds/words per Lesson of Book 1 ?? - EratiK - 2013-05-29

If you go to the Reviewing the Kanji main site, the vocab shuffle in the Labs tab does just that. One condition though: it has to be a learned kanji (three or more successful reviews).

Kanji compounds/words per Lesson of Book 1 ?? - ktcgx - 2013-05-29

I think it's better to do it the 'pure' method, and then just add the readings later... You only really need the writing, because you can already recognise and read a lot of kanji, right? So don't clutter your study with things you already know.^^

Kanji compounds/words per Lesson of Book 1 ?? - choubatsu - 2013-05-30

Well, it's not really things I already 'know', or at least know fully. I could easily read a word like 懲罰, for example, but there's no way I could write it from memory. If I just go through the pure method, I think I'll still be at a loss as to which kanji to use for the words I already know. That's why I wanted to start practicing writing as I go.
But I'm not actually sure what the best way to move forward is.

Kanji compounds/words per Lesson of Book 1 ?? - choubatsu - 2013-05-30

ktcgx Wrote: I think it's better to do it the 'pure' method, and then just add the readings later... You only really need the writing, because you can already recognise and read a lot of kanji, right? So don't clutter your study with things you already know.^^

I see. Is there a downloadable list? Or do you have to do it through the site? I use Anki.

Kanji compounds/words per Lesson of Book 1 ?? - lauri_ranta - 2013-05-30

If you use OS X, you can paste this to Terminal:

for x in edict_sub kanjidic;do curl ftp://ftp.monash.edu.au/pub/nihongo/$x|iconv -f euc-jp -t utf-8>$x.txt;done;sed -En 's/^([^ ]+).* L([0-9]+) .*/\2 \1/p' kanjidic.txt|sort -n|sed -n 1,94p|cut -d' ' -f2|tr -d '\n'|ruby -KUe 'kanji=STDIN.read;puts IO.read("edict_sub.txt").scan(/^([#{kanji}]{2}) \[(.*?)\] \/(?:\(.*?\) )*(.*?)\//).map{|l|l.join("\t")}'

It also worked on Ubuntu when I ran sudo apt-get install curl ruby1.9.1 and changed sed -E to sed -r.

sed -n 1,94p selects RTK frame numbers 1 to 94. KANJIDIC still uses fifth edition frame numbers.

[#{kanji}]{2} only matches two-kanji compounds. Change it to [#{kanji}\343\201\201-\343\202\237]{2,} to include words with hiragana.

Example output (where I replaced tabs with two spaces):

一日  ついたち  first day of the month
目下  めした  subordinate
明朝  みょうちょう  tomorrow morning
一元  いちげん  unitary

Kanji compounds/words per Lesson of Book 1 ?? - buonaparte - 2013-05-30

As far as I understand your problem (I may be wrong, of course), you don't seem to know
1. kanji stroke order rules - they are very simple
2. bushu (radicals) and their Japanese names.

You can find all the necessary info here: http://users.bestweb.net/%7Esiom/martian_mountain/K.7z
The file includes two fonts as well: a stroke order font and a calligraphic font. There are HTML files there with the full new jouyou kanji list and example words. One of the lists is in Heisig order.
You may find this offline dictionary useful too. http://zkanji.sourceforge.net/

Kanji compounds/words per Lesson of Book 1 ?? - choubatsu - 2013-05-30

buonaparte Wrote: As far as I understand your problem, I may be wrong of course, you don't seem to know

I do know stroke order rules but not the names of the radicals. But I'm more interested in being able to write all the words I already know how to read, if that makes sense. So I'm going through RTK1 and using the stories to write each individual kanji. But I'd like to supplement that by practicing writing words I know that use kanji from the Lessons of RTK1 I've done so far.

Kanji compounds/words per Lesson of Book 1 ?? - choubatsu - 2013-05-30

lauri_ranta Wrote: If you have access to a Unix shell:

I appreciate your time in answering my question, but I'm afraid my technical skills aren't up to that task. I don't have access to a Unix shell (to be honest, I don't really know what it is). But I will continue to look for ways to generate words from a given list of kanji.

Kanji compounds/words per Lesson of Book 1 ?? - choubatsu - 2013-05-30

ktcgx Wrote: I think it's better to do it the 'pure' method, and then just add the readings later... You only really need the writing, because you can already recognise and read a lot of kanji, right? So don't clutter your study with things you already know.^^

Doing that now and it looks good. Thanks. What I'm really looking for is the data source that the vocab shuffle tool uses to generate its words.

Kanji compounds/words per Lesson of Book 1 ?? - dizmox - 2013-05-30

If you're familiar with most radicals and understand general stroke order rules, I think you can safely just reverse the cards in your vocab deck, starting with the simplest (so they go hiragana -> kanji). Indeed, if you can, you should, since practicing writing compounds will be more useful. At least this is what I did when I forgot most of RTK a year or two ago.
Just redownloaded core 6000, and reversed about 1500 cards. A friend did a similar thing.

Kanji compounds/words per Lesson of Book 1 ?? - ktcgx - 2013-05-30

choubatsu Wrote: ktcgx Wrote: I think it's better to do it the 'pure' method, and then just add the readings later... You only really need the writing, because you can already recognise and read a lot of kanji, right? So don't clutter your study with things you already know.^^ I see. Is there a downloadable list? Or do you have to do it through the site? I use Anki.

There are downloadable, public RTK1 decks on Anki, but personally I prefer this site, as I find it much easier to use. Plus, there are all the stories published here to help you remember how to write the characters...

Kanji compounds/words per Lesson of Book 1 ?? - Inny Jan - 2013-05-30

@lauri_ranta Cleverly done

@choubatsu If your machine is a Mac, you do have access to a Unix command line - this is where lauri_ranta's command can be executed. Below is my attempt to make his script a bit more accessible (# [nb] is a comment and as such has no effect on the execution of the command):

Code:
# [1] Download publicly available kanji dictionaries: Edict and Kanjidic - save them to edict_sub.txt and kanjidic.txt files
# [2] From all lines in kanjidic.txt select a kanji and its frame number - print them as (number, kanji)
# [3] Sort the output of [2] - this sorts the kanji according to the frame number
# [4] From the output of [3] select lines from 1 to 94, i.e. the first 94 frames
# [5] From the output of [4] strip out the first column (the frame number)
# [6] From the output of [5] remove linebreaks - all the selected kanji are concatenated on one line now
# [7] Store the output of [6] to a variable 'kanji'
# [8] Read in edict_sub.txt
# [9] In the string created at [8] select complete entries that contain two-kanji compounds; the characters for the compounds are only those that are stored in the 'kanji' variable
# [10] The output of [9] is (compound, reading, meaning), each on a separate line - join those with a tab character. This is our final output.

Kanji compounds/words per Lesson of Book 1 ?? - choubatsu - 2013-06-09

Sorry, I'm a PC user.

Kanji compounds/words per Lesson of Book 1 ?? - ファブリス - 2013-06-09

EratiK Wrote: If you go to the Reviewing the Kanji main site, the vocab shuffle in the Labs tab does just that. One condition though: it has to be a learned kanji (three or more successful reviews).

In addition to what EratiK mentioned, you can also do so without adding flashcards on the Labs page by entering a Heisig index. You will get a random selection of kanji compounds made of characters learned up to the given Heisig frame number. You don't get to choose a specific range of characters, but you can speed through the ones you've already seen with the space bar.

Kanji compounds/words per Lesson of Book 1 ?? - choubatsu - 2014-10-29

Sorry to re-open this discussion after so long, but I'm still looking for answers. How does the Language lab generate words just from an input of a selection of characters? In other words, if I tell it I've studied up to Frame 750, how does it then generate a list of words which contain only Kanji from those frames?

* Edit * Also, what database of vocabulary is it searching for matching Kanji?

Kanji compounds/words per Lesson of Book 1 ?? - choubatsu - 2014-10-29

Inny Jan Wrote: @lauri_ranta

Do you know if this is possible on a PC somehow?

Kanji compounds/words per Lesson of Book 1 ?? - yogert909 - 2014-10-29

I believe this is what you are looking for, but I'm a Mac user so I haven't used it. https://www.cygwin.com/

Kanji compounds/words per Lesson of Book 1 ?? - john555 - 2014-10-29

Why not try this? Get a couple of books like the ones in the links below. Look up the kanji from Heisig for which you want compounds, and write down all the compounds that include only kanji up to the point in Heisig you desire. (We're not all computer programmers on this forum.)

http://www.amazon.com/Essential-Kanji-Characters-Systematically-Reference/dp/0834802228/ref=sr_1_1?s=books&ie=UTF8&qid=1414626642&sr=1-1&keywords=essential+japanese+characters
http://www.amazon.com/Guide-Remembering-Japanese-Characters/dp/0804820384/ref=sr_1_1?s=books&ie=UTF8&qid=1414626667&sr=1-1&keywords=remembering+japanese+characters
http://www.amazon.com/Guide-Reading-Writing-Japanese-Characters/dp/4805311738/ref=sr_1_1?s=books&ie=UTF8&qid=1414626688&sr=1-1&keywords=florence+sakade

Kanji compounds/words per Lesson of Book 1 ?? - yogert909 - 2014-10-29

I just ran it and pasted the output below. Apparently it only goes up to frame 94. Email me if you like this and I'll send you a longer list so as not to fill up this thread with thousands of lines of text.

一員  いちいん  person
一丸  いちがん  lump
一月  いちがつ  January
一見  いっけん  look
一元  いちげん  unitary
一口  ひとくち  mouthful
一首  いっしゅ  tanka
一寸  ちょっと  just a minute
一世  いっせい  generation
一切  いっさい  all
一千  いっせん  1,000
一旦  いったん  once
一丁  いっちょう  one sheet
一二  いちに  the first and second
一日  いちにち  first day of the month
一日  ついたち  first day of the month
一品  いっぴん  item
一目  ひとめ  glance
凹凸  おうとつ  unevenness
下見  したみ  preview
下旬  げじゅん  month (last third of)
下町  したまち  low-lying part of a city (usu. containing shops, factories, etc.)
下品  げひん  vulgarity
九九  くく  multiplication table
九月  くがつ  September
九十  きゅうじゅう  ninety
九日  ここのか  the ninth day of the month
月見  つきみ  viewing the moon
月日  がっぴ  date
月日  つきひ  time
元首  げんしゅ  ruler
元旦  がんたん  New Year's Day
元日  がんじつ  New Year's Day
五月  ごがつ  May
五十  ごじゅう  fifty
五日  いつか  the fifth day of the month
口上  こうじょう  vocal message
工員  こういん  factory worker
項目  こうもく  item
左右  さゆう  left and right
三月  さんがつ  March
三十  さんじゅう  thirty
三千  さんぜん  3000
三日  みっか  the third day of the month
三百  さんびゃく  300
四月  しがつ  April
四十  よんじゅう  forty
四千  よんせん  four thousand
四日  よっか  fourth day of month
四百  よんひゃく  four hundred
自首  じしゅ  surrender
自白  じはく  confession
自負  じふ  conceit
自明  じめい  obvious
七月  しちがつ  July
七十  しちじゅう  seventy
七日  なのか  the seventh day of the month
首唱  しゅしょう  advocacy
十一  じゅういち  11
十九  じゅうきゅう  19
十月  じゅうがつ  October
十五  じゅうご  15
十三  じゅうさん  13
十四  じゅうし  14
十七  じゅうしち  17
十二  じゅうに  12
十日  とおか  the tenth day of the month
十八  じゅうはち  18
十万  じゅうまん  100,000
十六  じゅうろく  16
上下  うえした  top and bottom
上下  じょうげ  top and bottom
上旬  じょうじゅん  first 10 days of month
上昇  じょうしょう  rising
上田  じょうでん  high rice field
上品  じょうひん  elegant
真上  まうえ  just above
占有  せんゆう  exclusive possession
早口  はやくち  fast-talking
早朝  そうちょう  early morning
卓上  たくじょう  desktop
中元  ちゅうげん  15th day of the 7th lunar month
中古  ちゅうこ  used
中旬  ちゅうじゅん  middle of a month
中世  ちゅうせい  Middle Ages (in Japan esp. the Kamakura and Muromachi periods)
中中  なかなか  very
中日  ちゅうにち  China and Japan
丁目  ちょうめ  district of a town
朝日  あさひ  morning sun
町中  まちなか  downtown
頂上  ちょうじょう  top
直下  ちょっか  directly under
的中  てきちゅう  striking home
凸凹  でこぼこ  unevenness
二月  にがつ  February
二見  ふたみ  forked (road, river)
二十  にじゅう  twenty
二世  にせい  nisei
二日  ふつか  second day of the month
二百  にひゃく  two hundred
日中  にっちゅう  daytime
日日  ひにち  the number of days
八月  はちがつ  August
八十  はちじゅう  eighty
八丁  はっちょう  skillfulness
八日  ようか  the eighth day of the month
百万  ひゃくまん  1,000,000
品目  ひんもく  item
万一  まんいち  emergency
明朝  みょうちょう  tomorrow morning
明日  あした  tomorrow
明日  あす  tomorrow
明日  みょうにち  tomorrow
明白  めいはく  obvious
目下  めした  subordinate
目下  もっか  at present
目上  めうえ  superior
目的  もくてき  purpose
目白  めじろ  white-eye family of birds (Zosteropidae)
六月  ろくがつ  June
六十  ろくじゅう  sixty
六日  むいか  sixth day of the month

Kanji compounds/words per Lesson of Book 1 ?? - aldebrn - 2014-10-29

yogert909 Wrote: I just ran it and pasted the output below. Apparently it only goes up to frame 94.

Kanji compounds/words per Lesson of Book 1 ?? - yogert909 - 2014-10-29

Hey Aldebrn, thanks. I figured that out, but I just didn't want to fill the thread with 1000s of lines of text. I know just enough about programming to make small edits to existing scripts, but unfortunately not enough to write anything novel. I'd love to be able to come up with the kinds of things you write, but I don't have the time to learn. Btw, is fuzzy-anki still working? I've tried it recently and it didn't seem to be working like it did the first few times.

Kanji compounds/words per Lesson of Book 1 ?? - aldebrn - 2014-10-30

@choubatsu, take a look at https://gist.github.com/fasiha/42026df0e1a5c5d063ec#file-choubatsu-txt (and the documentation above it). It's a list of all 3000-odd kanji from RTK volumes 1 and 3, and for each kanji, there's a list of "compounds" in Edict (a popular but now-deprecated? open-source online J-E dictionary) that are built solely out of preceding kanji. Is this something like what you're looking for?
I know it's not grouped into lessons, and it doesn't have the nice accompanying definitions/readings, but these can be easily fixed. One caveat is that when I say "compound", I mean basically any string of consecutive kanji. So there are numerous "compounds" which are actually concatenations of individual compounds, e.g., 朝鮮民主主義人民共和国 and 北海道開発庁長官 (which MeCab-wakati parses as 北海道開発庁 長官) and 国際協力事業団 (with spaces: 国際 協力 事業 団) & ... sorry.

The code to build this is in JavaScript right now, and it's much slower than the shell/Ruby script by lauri_ranta (since it's doing a lot more work; this list took ~20 minutes to generate). Based on feedback, I can make it faster & extend it to be more flexible so anyone can easily build lists like this given (1) some kanji and (2) a corpus of text. (Especially now that MeCab runs in JavaScript, we can have all kinds of linguistic parsing fun in the browser.)

Edit: of all 14'000 "compounds" in Edict (not just those using the 2200 RTK1 kanji), here's a breakdown according to their length:

1 kanji: 1851 "compounds"
2 kanji: 10725 "compounds"
3 kanji: 1596 "compounds"
4 kanji: 361 "compounds"
5 kanji: 53 "compounds"
6 kanji: 11 "compounds"
7 kanji: 8 "compounds"
8 kanji: 1 "compound"
11 kanji: 1 "compound"

yogert909 Wrote: Hey Aldebrn, thanks. I figured that out, but I just didn't want to fill the thread with 1000s of lines of text. I know just enough about programming to make small edits to existing scripts, but unfortunately not enough to write anything novel. I'd love to be able to come up with the kinds of things you write, but I don't have the time to learn.

(1) I think just replacing 94 with 9400 does something somewhat bad: it reorganizes the order of the printout so that 一 is no longer the first line printed: the 一 block of words at the top of your list there is some 200 lines in. I think this might be due to Ruby's regexp implementation?
(2) fuzzy-anki should still be working; I used its "review history" mode (versus "deck browse mode") a couple of days ago. It's terribly confusing and user-unfriendly, so let me know if something specific isn't working and I can try to simplify that (github issues: https://github.com/fasiha/fuzzy-anki/issues or email). I plan on improving that tool, but time spent coding is time not spent on Japanese.

(3) You come up with cool ideas that the coders can then go implement, so good on you!

Kanji compounds/words per Lesson of Book 1 ?? - choubatsu - 2014-10-31

aldebrn Wrote: so anyone can easily build lists like this given (1) some kanji and (2) a corpus of text. (Especially now that MeCab runs in Javascript, we can have all kinds of linguistic parsing fun in the browser.)

That's basically what I'm looking for: a way to enter a certain number of Kanji and extract from a corpus only the compounds which contain those Kanji. I'd love to learn how to do it myself but, as I mentioned, I have zero experience with things like mecab. Do you have any advice on how to get started using programs like mecab? Or on how to get started learning basic coding for linguistic problems? BTW, I really appreciate everyone's help and the great suggestions I've seen on this thread. 誠に有難う!

Kanji compounds/words per Lesson of Book 1 ?? - aldebrn - 2014-10-31

choubatsu Wrote: aldebrn Wrote: so anyone can easily build lists like this given (1) some kanji and (2) a corpus of text. (Especially now that MeCab runs in Javascript, we can have all kinds of linguistic parsing fun in the browser.) That's basically what I'm looking for. A way to enter in a certain number of Kanji and extract from a corpus only the compounds which contain those Kanji.

What did you think of this list I generated using a list of 2200 kanji from RTK1 and Edict as a corpus: https://gist.github.com/fasiha/42026df0e1a5c5d063ec#file-choubatsu-txt ? (It'll be a snap to make the code accept user input, so you could paste in your own kanji list & corpora, but I was looking for feedback on the output format.)

choubatsu Wrote: I'd love to learn how to do it myself but as I mentioned, I have zero experience with things like mecab. Do you have any advice on how to get started using programs like mecab? Or to get started learning how to do basic code for linguistic problems?

I'm the wrong person to ask about how to get started, since I'm learning all this as I go along. Jeff Berhow (a fellow Koohito) has a YouTube tutorial on installing MeCab on Windows, and if you don't want to go through all that pain, I compiled MeCab to JavaScript so you could play with it in your browser (click "Examples" there to see what options you can use). MeCab is a total mystery to me: all the documentation is in Japanese, so I have no idea what it is actually capable of. I just see other people using it to make amazing things (like Anki's Japanese Support plugin, which adds furigana to kanji-only text, or the wakati mode that adds spaces to Japanese) and reverse-engineer how they did it.

But for specific problems like this, I can show you how I did it and answer questions. I generated your list of compounds using this software: https://github.com/fasiha/compounds-per-kanji/blob/master/code.js It's about as stupid an implementation as you can imagine, which is why it's terrifically slow. (I am pretty sure it can be sped up using the technique @lauri_ranta demonstrated...)
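
For anyone who wants to try the core idea from this thread without the full shell pipeline: the filtering step (keep only dictionary words written entirely with kanji you have already studied) can be sketched in a few lines of Ruby, the language lauri_ranta's one-liner ends with. This is only a rough sketch under stated assumptions, not the code behind the vocab shuffle and not aldebrn's JavaScript. The helper name compounds_from, its min_len parameter, and the sample data are made up for illustration, and the lines are assumed to look like "word [reading] /gloss/...", the same shape lauri_ranta's regex matches.

```ruby
require 'set'

# Hypothetical helper (not from the thread): returns [word, reading, first_gloss]
# for every entry whose headword is at least min_len characters long and is
# written only with characters found in known_kanji.
def compounds_from(edict_lines, known_kanji, min_len: 2)
  # Assumed EDICT-style line: "word [reading] /(tags) first gloss/..."
  entry = %r{\A(\S+) \[(.*?)\] /(?:\(.*?\) )*(.*?)/}
  allowed = known_kanji.chars.to_set
  rows = []
  edict_lines.each do |line|
    match = entry.match(line)
    next unless match                 # skip lines in some other format
    word, reading, gloss = match.captures
    next if word.length < min_len     # by default, single kanji are skipped
    next unless word.each_char.all? { |c| allowed.include?(c) }
    rows << [word, reading, gloss]
  end
  rows
end

# Illustrative data only: not a real RTK lesson list or dictionary extract.
sample = [
  "明日 [あした] /(n-t) tomorrow/",
  "古本 [ふるほん] /(n) secondhand book/",
  "懲罰 [ちょうばつ] /(n) punishment/",
  "日 [ひ] /(n) day/"
]
known = "日月明早古本"

# Prints the 明日 and 古本 rows, tab-separated; 懲罰 and 日 are filtered out.
compounds_from(sample, known).each { |row| puts row.join("\t") }
```

With real data you would pass the lines of a UTF-8 copy of the dictionary (for example the edict_sub.txt that lauri_ranta's command writes) and your own kanji string. Keeping the known kanji in a Set means each character check is a single lookup, so the whole job is one pass over the file.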