
looking for an ordered onyomi index

#1
Hi guys,

Disclaimer: I did my best to search the forum (and the net) before starting a new thread, but all I could find were dead links to lists that are not exactly what I am looking for.

I'm looking for an index that lists kanji by onyomi and, ideally, subgroups the kanji sharing a given onyomi by radical/primitive. So, for example, if I were to search for the reading リン, I would get:
輪倫侖棆淪綸
隣鱗燐麟
林淋琳
etc.

Please note that a kanji can have more than one onyomi and should appear under each of them, e.g., the kanji 強 would belong to both the キョウ list and the ゴウ list.

Is there such an index in existence?
Thanks!
#2
This should be easy to make, but how exactly do you want to group the kanji for each onyomi by radicals/primitives? RTK ordering is easy; anything else will require you to specify which database to use.
#3
Grouped by phonetic primitive, sorted by how good the primitive is at predicting on'yomi (I don't remember the exact details of the sorting): http://forum.koohii.com/showthread.php?p...#pid195510
#4
Vempele Wrote: Grouped by phonetic primitive, sorted by how good the primitive is at predicting on'yomi (I don't remember the exact details of the sorting): http://forum.koohii.com/showthread.php?p...#pid195510
Oh my, this is truly frightening in its comprehensiveness.
Thanks a lot!
#5
I wrote a Ruby script that uses data from kanjidic and http://cjkdecomp.codeplex.com:

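require "nokogiri"

# on'yomi => list of kanji with that reading
yomi = Hash.new { |hash, key| hash[key] = [] }
# kanji => list of its components
decomp = {}

# Each line looks like 留:d(卯,田); capture the character and its components.
IO.readlines("cjk-decomp-0.4.0.txt").each { |l|
  next unless l =~ /(.+):.+\((.*)\)/
  decomp[$1] = $2.split(",")
}

# Collect on'yomi for kanji that have a Heisig (RTK) index.
Nokogiri.XML(IO.read("kanjidic2.xml")).css("character").each { |e|
  next if e.css("dic_ref[dr_type='heisig']").empty?
  kanji = e.css("literal").text
  e.css("reading[r_type='ja_on']").each { |x| yomi[x.text] << kanji }
}

# For each reading, print one line per component that occurs at least
# twice among the components of the kanji with that reading.
yomi.each { |y, kanji|
  allcomponents = kanji.map { |x| decomp[x] }.flatten.compact
  allcomponents.select { |x| allcomponents.count(x) >= 2 }.uniq.each { |x|
    puts y + " " + kanji.select { |k| decomp[k] && decomp[k].include?(x) }.join
  }
}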

I uploaded the output to http://19a5b0.s3-website-us-west-2.amazo...y-yomi.txt.

The list is limited to RTK kanji, but it includes all yomi in kanjidic and not just jouyou yomi. I didn't try to remove groups of kanji like 綾綸 that share only a non-phonetic component.

Kanji for リン:

リン 倫輪綸
リン 淋琳
リン 燐隣鱗麟
リン 綾綸
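
One possible way to drop groups like 綾綸, as a rough sketch only: keep a shared component only if the component itself has the reading in question. The tables below are small hand-written toy data, not the actual cjk-decomp or kanjidic entries, and the names DECOMP, COMPONENT_YOMI and phonetic_groups are made up for this illustration. In a real script the component readings would have to come from kanjidic or some other source.

```ruby
# Toy decomposition table (hand-written for illustration, not cjk-decomp data).
DECOMP = {
  "倫" => ["亻", "侖"],
  "輪" => ["車", "侖"],
  "綸" => ["糸", "侖"],
  "綾" => ["糸", "夌"],
  "淋" => ["氵", "林"],
  "琳" => ["王", "林"],
}

# On'yomi of the components themselves (hypothetical lookup table).
COMPONENT_YOMI = {
  "侖" => ["リン", "ロン"],
  "林" => ["リン"],
  "糸" => ["シ"],
}

# Group kanji by shared component, keeping only components that
# themselves carry the given reading.
def phonetic_groups(reading, kanji)
  # uniq per kanji so a doubled component inside one kanji counts once
  components = kanji.flat_map { |k| (DECOMP[k] || []).uniq }
  shared = components.select { |c| components.count(c) >= 2 }.uniq
  shared
    .select { |c| COMPONENT_YOMI.fetch(c, []).include?(reading) }
    .map { |c| kanji.select { |k| DECOMP[k]&.include?(c) }.join }
end

puts phonetic_groups("リン", %w[倫 輪 綸 綾 淋 琳])
```

With this toy data the 糸 group (綾綸) is filtered out because 糸 is not read リン, while the 侖 and 林 groups stay. The heuristic would also drop groups whose phonetic component has no reading in the lookup table, so it trades some recall for cleaner output.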
#6
Thanks Lauri - that's exactly what I was looking for!
#7
lauri_ranta Wrote: I wrote a Ruby script that uses data from kanjidic and http://cjkdecomp.codeplex.com
Oh snap, I've never seen this decomposition dataset before! It looks pretty good, although I did notice a flaw in 留:d(卯,田). Do you have any experience that could let you compare CJKDecomp against CJKVI or KanjiVG, two venerable decomposition databases?