
Kanji Combinations Graph

#1
I made this out of curiosity.

--

Update: I added graphs by usage frequency. They should be more relevant to people studying for fluency.

Sub-graph of large "frequency by usage" graph:
[Image: graph.png]
Edited: 2016-12-02, 10:06 pm
#2
(2016-11-20, 8:56 pm)b_j_b Wrote: I made this out of curiosity.
http://bbj.x10host.com/kanji-graph/

Just putting it out there if it helps or entertains anyone.

Thanks.  I took a look at the graph.  I guess my question is, how can it be used to help learn kanji compounds?
#3
Reminds me of this:


#4
How many kanji/words does this contain?
#5
Somebody did something similar, but there the connection criterion is the radical.

http://www.bibiko.de/KanjiSimNet
#6
b_j_b Wrote:Shows relationships between kanji of the most numerous compounds in the Japanese language.

What are those relationships?
#7
(2016-11-21, 9:40 am)RnBandCrunk Wrote: Somebody did something similar, but there the connection criterion is the radical.

http://www.bibiko.de/KanjiSimNet

I took a look...not sure what the purpose of this is though.

Can't you just look up kanji in a character dictionary to see what compounds it is in?
#8
(2016-11-21, 11:18 am)ファブリス Wrote:
b_j_b Wrote:Shows relationships between kanji of the most numerous compounds in the Japanese language.

What are those relationships?

I think what he means to say is that the graph links characters that appear together in many compounds.
There's either a link or there isn't, so it's just showing relationships that are above a certain threshold.

I'm not sure what value there would be to this graph, it's mostly just a random distribution of 2-character terms that are often used as prefixes or suffixes in big words.

The densely connected bits are somewhat more interesting, but I don't really know what meaning to take from that.

I feel like a similar graph for 'most common kanji compounds' would be more interesting, but I don't know how useful that would be either.... although it would be a nice reference for creating or solving crossword puzzles!
#9
(2016-11-20, 10:05 pm)phil321 Wrote:
(2016-11-20, 8:56 pm)b_j_b Wrote: I made this out of curiosity.
http://bbj.x10host.com/kanji-graph/

Just putting it out there if it helps or entertains anyone.

Thanks.  I took a look at the graph.  I guess my question is, how can it be used to help learn kanji compounds?

I'm using RTK 1-3 and recommend that. I was just curious what the relationships would look like when visualized.

(2016-11-21, 6:42 am)Zarxrax Wrote: How many kanji/words does this contain?

The small graph has 69 kanji and 45 edges (connecting lines between kanji).
The large graph has 257 kanji and 211 edges.
See below for details.

(2016-11-21, 12:57 pm)SomeCallMeChris Wrote:
(2016-11-21, 11:18 am)ファブリス Wrote:
b_j_b Wrote:Shows relationships between kanji of the most numerous compounds in the Japanese language.

What are those relationships?

I think what he means to say is that the graph links characters that appear together in many compounds.
There's either a link or there isn't, so it's just showing relationships that are above a certain threshold.

I'm not sure what value there would be to this graph, it's mostly just a random distribution of 2-character terms that are often used as prefixes or suffixes in big words.

The densely connected bits are somewhat more interesting, but I don't really know what meaning to take from that.

I feel like a similar graph for 'most common kanji compounds' would be more interesting, but I don't know how useful that would be either.... although it would be a nice reference for creating or solving crossword puzzles!

Steps taken to generate nodes (kanji) and edges (relationships between kanji):
1. Download JMdict_e.gz from http://www.edrdg.org/jmdict/j_jmdict.html
2. Extract all <keb> entries that contain only 常用漢字 (jōyō kanji)
3. For each <keb> entry, add 1 to the weight of the relationship between every pair of kanji in that entry
4. Retain only relationships with weight>=n (n=50 for large graph, n=100 for small graph)

In short, the weights and filtering are based on how many words in the dictionary file contain each pair (or tuple) of kanji, not on how frequently a word occurs in a Japanese corpus.
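The steps above can be sketched roughly as follows in Python. This is only a minimal sketch, not the script actually used for the graphs: the function names `kanji_pair_weights` and `filter_edges`, the toy XML sample, and the four-character stand-in for the 常用漢字 list are all made up for illustration. It also assumes a <keb> string counts only if every one of its characters is in the allowed set, and that a pair gets at most 1 per word.

```python
from collections import Counter
from itertools import combinations
import xml.etree.ElementTree as ET

# Toy stand-in for the dictionary file (hypothetical data, not real JMdict content).
SAMPLE = """<JMdict>
<entry><k_ele><keb>学校</keb></k_ele></entry>
<entry><k_ele><keb>学生</keb></k_ele></entry>
<entry><k_ele><keb>小学校</keb></k_ele></entry>
<entry><k_ele><keb>学び舎</keb></k_ele></entry>
</JMdict>"""

# Tiny stand-in for the full 常用漢字 list (assumption for the example).
ALLOWED = set("学校生小")


def kanji_pair_weights(xml_text, allowed_kanji):
    """For each unordered kanji pair, count the <keb> strings containing both."""
    weights = Counter()
    root = ET.fromstring(xml_text)
    for keb in root.iter("keb"):
        chars = set(keb.text or "")
        # Skip single-character words and words with any character
        # outside the allowed set (kana, non-jōyō kanji, ...).
        if len(chars) < 2 or not chars <= allowed_kanji:
            continue
        # Sorting gives each pair one canonical (a, b) key.
        for pair in combinations(sorted(chars), 2):
            weights[pair] += 1
    return weights


def filter_edges(weights, n):
    """Keep only the pairs whose weight is at least n."""
    return {pair: w for pair, w in weights.items() if w >= n}


weights = kanji_pair_weights(SAMPLE, ALLOWED)
print(filter_edges(weights, 2))  # {('学', '校'): 2}
```

With the real dictionary you would pass the full 常用漢字 set and a much higher threshold; the surviving pairs are the edges, and the kanji that appear in them are the nodes.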

~

When I get the time, I'll redo the experiment based on word frequency, probably with data linked from http://ftp.monash.edu.au/pub/nihongo/00INDEX.html

A graph weighted and filtered by word frequency should be more relevant to people studying for fluency.
Edited: 2016-11-21, 5:36 pm
#10
(2016-11-21, 12:57 pm)SomeCallMeChris Wrote:
(2016-11-21, 11:18 am)ファブリス Wrote:
b_j_b Wrote:Shows relationships between kanji of the most numerous compounds in the Japanese language.

What are those relationships?

I think what he means to say is that the graph links characters that appear together in many compounds.
There's either a link or there isn't, so it's just showing relationships that are above a certain threshold.

I'm not sure what value there would be to this graph, it's mostly just a random distribution of 2-character terms that are often used as prefixes or suffixes in big words.

The densely connected bits are somewhat more interesting, but I don't really know what meaning to take from that.

I feel like a similar graph for 'most common kanji compounds' would be more interesting, but I don't know how useful that would be either.... although it would be a nice reference for creating or solving crossword puzzles!

I don't know about the theoretical use of that, but it sure gives some kind of insight into where one would encounter a given kanji. Showing such relationships is actually a "thing" used in linguistic studies, I believe. Other languages have something similar as well, e.g. the German dictionary "Duden" shows example words with which the searched word is often seen.
See http://www.duden.de/rechtschreibung/aufwendig and scroll further down to "Typische Verbindungen" (typical combinations).
#11
Update: see the original post for graphs by usage frequency and a screenshot.