
Mighty Morphin Morphology

#34
Hello, lurking for three years, first post, etc.

Boy.pockets Wrote: Another thing I thought might be cool (maybe you are already doing this): using the kanji readings to look for the next best word to use. For example, say you already know '出席「しゅっせき」', then an easy one to learn next might be '出廷「しゅってい」' (especially if you already know 'tei' from another word).
I've had this idea for years and finally got around to working on it. I've been doing this manually all along (favouring "knowable" words as much as I can), and can't imagine learning vocab any other way. I've tried, but prefer this method. Searching and adding words manually is very slow, so I am automating it!

overture2112 Wrote: Problem: given a kanji compound, how can you determine which parts of the reading are from which kanji?
That really is the core problem. I was surprised to find that nothing already exists to do this. I've started a project that takes kanji readings from KANJIDIC2 and, given a word and its reading, tries to work out which part of the reading belongs to which character. I'll build a database of solutions for the entries in JMdict. I have the basics down, and it works well, but there are lots of other features of Japanese left to account for (as in the example above: 出's reading is しゅつ, but the つ turns into a small っ in some compounds).
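
To make the idea concrete, here is a rough Python sketch of the matching step. This is not the code from the project, just an illustration: the READINGS table is hand-written for a few characters (a real one would be loaded from KANJIDIC2), the names solve and candidate_readings are made up for this example, and it only handles plain kanji, kana that match themselves, and the つ to っ change. Okurigana and other sound changes are left out.

```python
# Hypothetical hand-written table for illustration only;
# a real table would be loaded from KANJIDIC2.
READINGS = {
    "小": ["しょう", "こ", "お", "ちい"],
    "牛": ["ぎゅう", "うし"],
    "停": ["てい"],
    "出": ["しゅつ", "で", "だ"],
    "席": ["せき"],
    "廷": ["てい"],
}


def candidate_readings(kanji):
    """Yield each listed reading, plus a variant where a final つ becomes っ."""
    for r in READINGS.get(kanji, []):
        yield r
        if len(r) > 1 and r.endswith("つ"):
            yield r[:-1] + "っ"


def solve(word, reading):
    """Return every way to split `reading` across the characters of `word`.

    Each solution is a list of (character, reading_part) pairs.
    """
    if not word:
        # Success only if the reading has been fully consumed too.
        return [[]] if not reading else []
    head, rest = word[0], word[1:]
    solutions = []
    if head in READINGS:
        # Try every known reading of this kanji as a prefix of the reading.
        for r in candidate_readings(head):
            if reading.startswith(r):
                for tail in solve(rest, reading[len(r):]):
                    solutions.append([(head, r)] + tail)
    elif reading.startswith(head):
        # Anything not in the table is treated as kana that reads as itself.
        for tail in solve(rest, reading[1:]):
            solutions.append([(head, head)] + tail)
    return solutions


if __name__ == "__main__":
    for word, reading in [("小牛", "こうし"), ("バス停", "バスてい"), ("出席", "しゅっせき")]:
        print(word, reading, solve(word, reading))
```

Returning a list of solutions rather than the first match is deliberate, since some words can be split in more than one way.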

It's on github: https://github.com/ntsp/ryuujouji
ryuujouji = 粒状字 = granular characters, which I think describes it well. It's nowhere near complete or usable and has no documentation, but I'll get around to that in the next few days. Here is some sample output for your entertainment:

Code:
Solving: 小牛 == こうし
Solution # 0
小 -- こ  -- kanji
牛 -- うし  -- kanji

Solving: バス停 == バスてい
Solution # 0
バ -- バ  -- kana
ス -- ス  -- kana
停 -- てい  -- kanji

Solving: 非常事態 == ひじょうじたい
Solution # 0
非 -- ひ  -- kanji
常 -- じょう  -- kanji
事 -- じ  -- kanji
態 -- たい  -- kanji

Solving: 建て替える == たてかえる
Solution # 0
建て -- たて  -- kanji (た) with okurigana (て)
替える -- かえる  -- kanji (か) with okurigana (える)
Solution # 1
建て -- たて  -- kanji (た) with okurigana (て)
替え -- かえ  -- kanji (か) with okurigana (え)
る -- る  -- kana
Edited: 2011-05-21, 10:44 pm