
Kanji lists, Joyo, non-joyo etc. (Sticky topic)

#26
I limited my script to only search for those marked as "common" in EDICT, which, although flawed, is better than going through tens of thousands of compounds that are as useless as 川虫, for example.
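For anyone who wants to try the same kind of filter, here is a minimal sketch. It assumes the plain-text EDICT layout, where entries flagged as common carry a "(P)" marker among the slash-separated fields. The function name and the sample lines (including the placeholder glosses) are made up for illustration; this is not the actual script mentioned above.

```python
def common_entries(lines):
    """Yield (headword, line) for EDICT-style lines that carry the (P) marker."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Everything after the headword/reading is separated by "/".
        fields = line.split("/")
        if "(P)" in fields:
            headword = fields[0].split()[0]
            yield headword, line


# Hypothetical sample lines in EDICT layout; the glosses are placeholders.
sample = [
    "川 [かわ] /(n) placeholder gloss/(P)/",
    "川虫 [かわむし] /(n) placeholder gloss/",
]

common = [word for word, _ in common_entries(sample)]
print(common)
```

With a real EDICT file you would pass the opened file object instead of `sample`, using whatever encoding your copy of the file is stored in.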
Edited: 2012-08-04, 12:42 pm
#27
Is there any way to get a list of all the kanji in Unicode that are for Japanese use? I just want to see it. I know that the Dai Kanwa Jiten has 50,000+. I just want to see them all for myself without having to pay out the ass for something in print. Also, does anyone know of scans of books, or titles of books, for studying for Kanken level 1? I would really appreciate it. :)
#28
I could parse you a list (13,109 characters), but there are no readings or meanings attached.

You could also check http://homepage2.nifty.com/TAB01645/ohara/index.htm for fun.
Edited: 2012-08-04, 2:25 pm
#29
Or http://www.rikai.com/library/kanjitables...code.shtml
#30
Here you go: http://pastebin.com/PrBgV1mW

The link by kitakitsune includes many characters that are only found in China, Korea, etc., not Japan.
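If you want to narrow a full Unicode table down yourself, one rough approach is to keep only the characters that a Japanese legacy encoding can represent. The sketch below assumes that "encodable with Python's `shift_jis` codec" is an acceptable stand-in for "used in Japan"; that is only an approximation, it is not how the pastebin list above was made, and the function names are my own.

```python
def encodable(ch, codec="shift_jis"):
    """Return True if the character can be encoded with the given codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def japanese_subset(chars, codec="shift_jis"):
    """Keep only the characters the codec can represent."""
    return [c for c in chars if encodable(c, codec)]


# The basic CJK Unified Ideographs block, U+4E00..U+9FFF.
cjk_block = [chr(cp) for cp in range(0x4E00, 0xA000)]
jis_kanji = japanese_subset(cjk_block)

print(len(cjk_block), "code points in the block")
print(len(jis_kanji), "of them encodable as shift_jis")
```

Swapping in a wider codec such as `euc_jis_2004` gives a larger set; which one counts as "Japanese use" is a judgment call.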
Edited: 2012-08-04, 2:47 pm
#31
If you browse through the Buddhist Taishō Shinshū Daizōkyō database at http://21dzk.l.u-tokyo.ac.jp/SAT/index_en.html (written in classical Chinese), you'll come across kanji that are not even encoded in Unicode yet. Perhaps you could make reading and understanding the entire Buddhist canon your goal in life, and thus complete your kanji odyssey. :)
#32
Guys, please move discussion to new topics. The sticky thread is more like a wiki. If we have lots of comments here it won't be of much use to have a sticky. Comments will be cleaned up. See first post.
#33
ファブリス Wrote: Guys, please move discussion to new topics. The sticky thread is more like a wiki. If we have lots of comments here it won't be of much use to have a sticky. Comments will be cleaned up. See first post.
OK, I'll try reorganising this thread from next week. I haven't tried such a thing before, so I might accidentally make a mess of it, hence the week's notice (i.e. save anything you want to keep, just in case).

Note to anyone posting in this thread: In general, please post links to data rather than the data itself. Thank you.
#34
Sounds good! If you want to avoid messing up one very long post, I'd leave the contributions in their respective posts, perhaps with some trimming, so the original posters can continue to edit them. Feel free to delete my posts when you get to it. :)
#35
The Japanese government has a list (印刷標準字体) of defined standard character shapes for 1022 non-standard (表外漢字, 人名用漢字) characters (weird, I know).

http://internet.watch.impress.co.jp/www/...iohyo7.pdf

sorted:
http://internet.watch.impress.co.jp/www/.../sortl.pdf

spreadsheet:
http://internet.watch.impress.co.jp/www/...iohyo7.xls

Two missing kanji in the lists above are no. 740 (鄧) and no. 1004 (﨟)

html:
http://www.jca.apc.org/~earthian/aozora/...ougai.html

Hmm, seems some of the standard shapes are wrong in the above lists.

Here's the official list with the correct shapes, but not all kanji are searchable (images instead of text):

http://kokugo.bunka.go.jp/kokugo_nihongo...taihyo.pdf

The following 147 kanji were added to the joyo set in 2010, but can still be found in the 印刷標準字体 list, because that list hasn't been revised since:

挨, 宛, 闇, 椅, 畏, 萎, 茨, 咽, 淫, 臼, 餌, 怨, 岡, 臆, 俺, 苛, 牙, 崖, 蓋, 骸, 柿, 顎, 葛, 釜, 瓦, 韓, 玩, 畿, 僅, 巾, 串, 窟, 稽, 詣, 隙, 桁, 鍵, 舷, 股, 乞, 勾, 喉, 梗, 頃, 痕, 挫, 塞, 阪, 埼, 柵, 拶, 斬, 嫉, 腫, 呪, 蹴, 拭, 尻, 芯, 腎, 裾, 凄, 醒, 戚, 脊, 煎, 羨, 腺, 詮, 膳, 狙, 遡, 捉, 袖, 遜, 唾, 堆, 戴, 誰, 綻, 酎, 捗, 潰, 爪, 諦, 溺, 貼, 妬, 賭, 栃, 頓, 謎, 鍋, 匂, 捻, 罵, 箸, 斑, 氾, 汎, 膝, 肘, 阜, 蔽, 蔑, 蜂, 貌, 勃, 昧, 枕, 蜜, 冥, 餅, 妖, 沃, 侶, 賂, 弄, 麓, 脇, 丼, 傲, 刹, 哺, 喩, 嗅, 嘲, 毀, 彙, 恣, 惧, 慄, 拉, 摯, 曖, 鬱, 璧, 瘍, 箋, 籠, 緻, 羞, 訃, 諧, 貪, 踪, 辣

So really, it's a list of 875 non-standard character shapes.

Of those, 339 are jinmeiyou:

斡, 按, 庵, 鞍, 已, 夷, 葦, 謂, 溢, 鰯, 蔭, 迂, 烏, 云, 曳, 奄, 堰, 淵, 鳶, 燕, 凰, 鴨, 襖, 瓜, 珂, 迦, 嘩, 榎, 蝦, 臥, 俄, 峨, 駕, 芥, 廻, 恢, 晦, 堺, 蟹, 鎧, 樫, 筈, 萱, 函, 柑, 竿, 菅, 雁, 其, 祁, 箕, 窺, 徽, 祇, 掬, 汲, 灸, 笈, 厩, 鋸, 卿, 蕎, 饗, 禽, 喰, 寓, 戟, 頁, 訣, 蕨, 倦, 捲, 牽, 喧, 硯, 諺, 乎, 袴, 跨, 糊, 醐, 庚, 杭, 肴, 巷, 恰, 腔, 幌, 煌, 膏, 閤, 縞, 藁, 劫, 壕, 轟, 忽, 惚, 昏, 叉, 些, 蓑, 坐, 晒, 柴, 砦, 犀, 榊, 窄, 撒, 薩, 珊, 纂, 讃, 仔, 弛, 此, 砥, 斯, 獅, 而, 竺, 雫, 悉, 櫛, 柘, 這, 灼, 錫, 雀, 惹, 諏, 竪, 濡, 葺, 蒐, 輯, 鍬, 鷲, 廿, 粥, 閏, 楯, 馴, 杵, 汝, 哨, 秤, 湘, 摺, 裳, 鞘, 篠, 杖, 茸, 嘗, 埴, 燭, 賑, 壬, 訊, 錐, 栖, 棲, 甥, 貰, 錆, 蹟, 屑, 尖, 穿, 閃, 釧, 揃, 撰, 疏, 楚, 蘇, 宋, 湊, 槍, 漕, 噌, 叢, 粟, 噂, 樽, 鱒, 詫, 陀, 舵, 楕, 苔, 殆, 碓, 醍, 托, 凧, 坦, 耽, 湛, 歎, 灘, 馳, 筑, 紐, 厨, 註, 儲, 帖, 喋, 牒, 寵, 槌, 辻, 挺, 釘, 梯, 逞, 鼎, 綴, 鄭, 薙, 蹄, 鵜, 荻, 擢, 姪, 辿, 纏, 佃, 淀, 兎, 兜, 堵, 宕, 沓, 套, 桶, 萄, 逗, 樋, 橙, 櫂, 祷, 撞, 沌, 遁, 杷, 琶, 頗, 播, 芭, 煤, 柏, 箔, 莫, 曝, 畠, 絆, 幡, 挽, 磐, 蕃, 庇, 枇, 毘, 梶, 琵, 疋, 畢, 豹, 瓢, 廟, 瀕, 斧, 葡, 撫, 蕪, 吻, 焚, 瞥, 篇, 娩, 鞭, 圃, 蒲, 戊, 牡, 姥, 菩, 捧, 逢, 蓬, 鞄, 鋒, 牟, 卜, 俣, 沫, 迄, 蔓, 蒙, 勿, 籾, 尤, 釉, 楢, 輿, 傭, 螺, 蕾, 洛, 裡, 掠, 笠, 溜, 劉, 梁, 菱, 淋, 鱗, 煉, 漣, 憐, 簾, 魯, 櫓, 鷺, 狼, 肋, 窪, 隈, 或, 椀, 碗, 曾, 檜, 禰

Edit: On January 7th, 2015, 巫 was added to the jinmeiyou list, so 340 of those are jinmeiyou kanji.

There's also a subset called 簡易慣用字体: 22 simplified characters that can be used instead of the 印刷標準字体. The 22 kanji are (with the standard form in brackets):

唖(啞), 頴(穎), 鴎(鷗), 撹(攪), 麹(麴), 鹸(鹼), 噛(嚙), 繍(繡), 蒋(蔣), 醤(醬), 曽(曾), 掻(搔), 痩(瘦), 祷(禱), 屏(屛), 并(幷), 桝(枡), 麺(麵), 沪(濾), 芦(蘆), 蝋(蠟), 弯(彎)

Three of these (曽, 痩, 麺) were added to the joyo set in 2010, though, so it's really a list of 19.
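The counting can be double-checked with a few lines of code. This is just a sketch of the subtraction, using the 22 simplified forms and the 3 joyo additions exactly as listed in this post; the variable names are mine. The same pattern works for the 1022 minus 147 calculation if you have both lists as text.

```python
# The 22 簡易慣用字体 (simplified forms) as listed in this post.
kan_i_kanyo = "唖頴鴎撹麹鹸噛繍蒋醤曽掻痩祷屏并桝麺沪芦蝋弯"

# The three that were added to the joyo set in 2010.
added_to_joyo_2010 = set("曽痩麺")

# Keep the original order while dropping the joyo additions.
remaining = [k for k in kan_i_kanyo if k not in added_to_joyo_2010]

print(len(kan_i_kanyo), "-", len(added_to_joyo_2010), "=", len(remaining))
print("".join(remaining))
```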
Edited: 2015-07-30, 10:32 am
#36
I'm having trouble finding a list of RTK3 characters organized by on'yomi. There are a few links, but all of them seem to be dead...
#37
2013 GSF Jouyou Kanji by Con Kolivas
http://ck.kolivas.org/Japanese/kanji.html

Contains 2136 Jouyou kanji.
It also has example words, but only all-kanji words, none with okurigana. So 食 しょく or 草食 そうしょく, but not 食べる たべる or 食う くう, for example.

By ‘radicals’ he means 236 ‘kanji elements/components’, not the 214 classical 部首 ぶしゅ (bushu) classifiers (lit. section headers), which are also called radicals.

So, for example, 心 and 忄 are two separate ‘radicals’ in his list, even though 心 こころ (kokoro) and 忄 立心偏 りっしんべん (risshinben) are the same bushu.

Example:
心 こころ (kokoro) as in 愛, 悪, 泌
忄 立心偏 りっしんべん (risshinben) as in 悦, 憶


While listing his radicals (elements/components) he sometimes uses a kanji containing the element, not the element itself.

Examples:
Kanji: 泌 and its radicals: 丶 ノ 汁 心
(汁 instead of 氵 三水 sanzui 水 mizu)
or
忙 instead of 忄 立心偏 りっしんべん (risshinben) as in 悦, 憶.
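If you want to compare his component list against the classical bushu, you could fold the positional variants back onto their parent radical first. The sketch below only covers the two pairs mentioned in this post (忄 to 心, 氵 to 水); the table and function names are hypothetical, and a real mapping would need many more entries.

```python
# Variant component -> the classical radical (bushu) it belongs to.
# Only the two pairs mentioned in this post; a full table would be longer.
VARIANT_TO_BUSHU = {
    "忄": "心",  # risshinben, the left-side form of kokoro
    "氵": "水",  # sanzui, the left-side form of mizu
}


def to_bushu(component):
    """Map a variant form to its parent bushu; other components pass through."""
    return VARIANT_TO_BUSHU.get(component, component)


def same_bushu(a, b):
    """True if two components belong to the same classical radical."""
    return to_bushu(a) == to_bushu(b)


print(same_bushu("心", "忄"))
print(same_bushu("心", "氵"))
```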
#38
Didn't know where to put it, so here goes:

Jouyou Kanji, 2136 characters
The author used media on the internet in 2013, in conjunction with the help of the Google [tm] search engine to develop a much more modern list. This list was last updated February 22nd 2013.
http://ck.kolivas.org/Japanese/sorted_freq_list.txt

Japanese Kanji Character Frequency Chart
The 1000 Chinese characters of most frequent appearance in the Asahi Shimbun with links to etymologies:
http://www.kanjinetworks.com/eng/kanji-d...-chart.cfm
Free Online Kanji Etymology Dictionary
http://www.kanjinetworks.com/eng/kanji-d...ionary.cfm
– Covering more than 6,500 Chinese characters as used in Japan –

Joyo Kanji (Jouyou Kanji, Grade 1-7, 2136 characters)
http://www.saiga-jp.com/language/kanji_list.html
and the dictionary with audio examples:
http://www.saiga-jp.com/kanji_dictionary.html
#39
Joyo kanji that have a Simplified Chinese counterpart with RTK (old) and RTK (new) numbers:

http://pastebin.com/FNat1BL3

Total: 1828
Same Unicode code point: 1320 (but not necessarily same shape, stroke order or stroke number)
Different Unicode code point: 508

For RTK:
http://pastebin.com/gMt5H35L

Total: 2363
Same Unicode code point: 1662 (but not necessarily same shape, stroke order or stroke number)
Different Unicode code point: 701

(both lists are limited to 3,500 "level one" hanzi only)
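The same/different split is just a comparison of the two characters in each pair. Here is a small sketch of that step, assuming the data is available as (kanji, hanzi) pairs. The four sample pairs and the function name are only for illustration and are not taken from the pastebin files.

```python
def split_by_code_point(pairs):
    """Split (kanji, hanzi) pairs by whether both sides are the same character."""
    same = [(j, c) for j, c in pairs if j == c]
    different = [(j, c) for j, c in pairs if j != c]
    return same, different


# Illustrative (Japanese, Simplified Chinese) pairs.
pairs = [
    ("学", "学"),
    ("国", "国"),
    ("気", "气"),
    ("広", "广"),
]

same, different = split_by_code_point(pairs)
print("Total:", len(pairs))
print("Same:", len(same))
print("Different:", len(different))
```

Note that string equality only says the code points match; as mentioned above, the printed shape can still differ between Japanese and Chinese fonts.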

Here is the same for Traditional Chinese (as used in Taiwan, limited to the 4808 hanzi from 常用國字標準字體表, the Table of Standard Commonly Used Chinese Characters):

http://pastebin.com/PLRagNSF

Total: 1881
Same Unicode code point: 1713 (but not necessarily same shape, stroke order or stroke number)
Different Unicode code point: 168

For RTK:
http://pastebin.com/5DxYpEqY

Total: 2542
Same Unicode code point: 2312 (but not necessarily same shape, stroke order or stroke number)
Different Unicode code point: 230
Edited: 2014-04-27, 1:04 pm
#40
buonaparte Wrote: Didn't know where to put it, so here goes:
Thanks! btw, Pauline is here. Welcome.
#41
(2012-07-12, 3:03 am)kazeatari Wrote: It doesn't seem this has been posted, so here is the official jouyou kanji list, with example words selected by Monbushou and notes. Those with ⇔ are particularly important (imho), 'cause they point out kun'yomi that have the same pronunciation.
Ex.: 宛てる ⇔ 当てる、充てる

This is a link to the table
Here you can download the pdf

These links are dead now.
Edited: 2015-12-18, 3:56 pm
#42
(2015-12-18, 3:56 pm)fkb9g Wrote:
(2012-07-12, 3:03 am)kazeatari Wrote: It doesn't seem this has been posted, so here is the official jouyou kanji list, with example words selected by Monbushou and notes. Those with ⇔ are particularly important (imho), 'cause they point out kun'yomi that have the same pronunciation.
Ex.: 宛てる ⇔ 当てる、充てる

This is a link to the table
Here you can download the pdf

These links are dead now.
文化庁 | 国語施策・日本語教育 | 国語施策情報 | 常用漢字表の音訓索引
http://kokugo.bunka.go.jp/kokugo_nihongo...index.html

常用漢字表
http://kokugo.bunka.go.jp/kokugo_nihongo...101130.pdf
#43
Hi, I'm trying to see this post, related to RTK errata by Woelpad, as listed in Katsuo's post. But the forum says that I'm not logged in or have not done email confirmation, etc. I don't think that's the case, as the forum did not request from me such confirmation, and I'm logged in.

Is there another way I can see this errata post/list or any way I can access the post?

http://forum.koohii.com/showthread.php?p...7#pid19257
#44
@mcabel It should work now.
#45
I like this list of the 2,500 most frequently used kanji in newspapers:

http://tangorin.com/common_kanji
#46
(2010-11-24, 6:52 am)danaduck Wrote: ko1 extra primitives (only using data from rtk1):
吾 呂 旦 舌 升 丸 寸 占 貝 頁 句 勺 首 乙 刀 刃 貫 肖 泉 炎 里 朱 犬 介 王 玉 軍 周 士 吉 是 衣 匕 比 皆 旨 曽 虫 己 亀 羊 羽 固 忍 志 我 戒 刑 史 吏 又 爪 至 谷 皮 夫 竹 付 任 丙 勿 尺 戸 甲 斤 両 斗 廿 矢 弓 弘 与 老 孝 官 穴 糸 幾 玄 系 卸 厄 酉 豆 皿 即 辛 幸 害 垂 斉 央 甘 洪 亜 舟 鳥 免 馬 且

Total: 100 Kanji (ko2:33, ko3:58, other:9)

I think 矛 halberd should be included as a ko group 1 primitive for 予 beforehand. See RTK 1 (6th ed) frames 1311 and 1719.

Also 舞 dance is a ko group 1 primitive for 無 nothingness (as it introduces oaken tub). See RTK 1 (6th ed) frames 1912 and 1913.
Edited: 2016-03-17, 6:10 pm
#47
I made a current list of the 漢検 (Kanji Kentei) characters with English keywords from RTK. It's a text file of tab-separated values that you can copy-paste into a spreadsheet application. Here's the download link.

Note that the 準1級 and 1級 levels cover characters that are outside the JIS X 0208 character set.
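If you would rather process the file with a script than paste it into a spreadsheet, something like the following should work. It is only a sketch: the column order (kanji, level, keyword) is an assumption, so check it against the actual file, and the four sample rows are illustrative stand-ins rather than an excerpt from the download.

```python
import csv
import io
from collections import defaultdict

# Assumed layout: kanji <TAB> level <TAB> keyword. The real file may differ.
sample_tsv = "一\t10\tone\n右\t10\tright\n雨\t10\train\n引\t9\tpull\n"


def group_by_level(tsv_text):
    """Group (kanji, keyword) tuples by their level column."""
    groups = defaultdict(list)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or malformed lines
        kanji, level, keyword = row[0], row[1], row[2]
        groups[level].append((kanji, keyword))
    return dict(groups)


groups = group_by_level(sample_tsv)
for level, entries in groups.items():
    print(level, len(entries), "".join(k for k, _ in entries))
```

For the real file, read it with `open(path, encoding="utf-8")` (assuming it is UTF-8) and pass the text to `group_by_level`.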
#48
(2017-03-16, 10:38 am)fkb9g Wrote: I made a current list of the 漢検 (Kanji Kentei) characters with English keywords from RTK. It's a text file of tab-separated values that you can copy-paste into a spreadsheet application. Here's the download link.

Note that the 準1級 and 1級 levels cover characters that are outside the JIS X 0208 character set.

I downloaded your list and took a brief look at it. I don't understand what the coverage of your list is.

* Does it comprise all of the RTK1+3 kanjis that are in one of the kentei levels? No, it can't, because then the number of kanjis would have to be no more than the number of kanjis in RTK1+3, namely 3000 kanjis; however your list contains 5640 kanjis.

* Does it comprise all of the kentei kanjis? No, it can't, because the number of kentei kanjis is 6355 whereas your list contains only 5640 kanjis.

* Does it comprise all of the kentei kanjis excluding level 1? No, it can't, because the number of kentei kanjis excluding level 1 is 2965 whereas your list contains 5640 kanjis.

So what does your list cover?
Edited: 2017-03-16, 11:13 am
#49
(2017-03-16, 11:12 am)ItaiB Wrote: * Does it comprise all of the kentei kanjis? No, it can't, because the number of kentei kanjis is 6355 whereas your list contains only 5640 kanjis.

* Does it comprise all of the kentei kanjis excluding level 1? No, it can't, because the number of kentei kanjis excluding level 1 is 2965 whereas your list contains 5640 kanjis.

I created my list from official 漢検 publications (with the 旧字体 characters intentionally left out).

What's the source for your numbers (5640 and 2965)?
Edited: 2017-03-16, 11:37 am
#50
(2017-03-16, 11:36 am)fkb9g Wrote: What's the source for your numbers (5640 and 2965)?

The English Wikipedia article on Kanji Kentei. The corresponding Japanese Wikipedia article has different numbers (namely 2994 for level pre-1 and "approx. 6000" for level 1), but they still don't coincide with your list.