So there are a lot of tools for Japanese linguistics floating about: quantitative text analysis, corpus linguistics, etc. It's still a developing area, but I've been browsing around and finding new toys and publications on the topic.
Japanese WordNet - What's cool is the hypernym/hyponym structure. If you look up a word (e.g. 白) and click on its entry, then in addition to the Japanese + English results it shows you the word's hypernyms and hyponyms (e.g. 白 is a hyponym of "achromatic colour", which is in turn a hyponym of "colour"). The online version also has pictures (e.g. 車 has a tiny car icon). I've barely played with it offline, if at all, but I think its structure has potential.
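To make the hypernym/hyponym idea concrete, here's a minimal sketch in Python. It does not touch the real Japanese WordNet data: the HYPERNYM table and both function names are made up for illustration, and it assumes one parent per word, whereas the real WordNet works on synsets that can have several hypernyms.

```python
# Toy hypernym links; illustrative entries only, NOT pulled from Japanese WordNet.
# Assumption: each word has at most one parent (real WordNet allows several).
HYPERNYM = {
    "白": "無彩色",    # white -> achromatic colour
    "黒": "無彩色",    # black -> achromatic colour
    "無彩色": "色",    # achromatic colour -> colour
}


def hypernym_chain(word, links=HYPERNYM):
    """Follow hypernym links upward until a word has no parent."""
    chain = [word]
    seen = {word}
    while chain[-1] in links:
        parent = links[chain[-1]]
        if parent in seen:  # guard against cycles in hand-made data
            break
        chain.append(parent)
        seen.add(parent)
    return chain


def hyponyms(word, links=HYPERNYM):
    """Direct hyponyms: every word whose parent is `word`."""
    return sorted(w for w, parent in links.items() if parent == word)


if __name__ == "__main__":
    print(" -> ".join(hypernym_chain("白")))
    print(hyponyms("無彩色"))
```

Walking up from 白 gives the same chain as the example above (白, then achromatic colour, then colour), and hyponyms() goes the other way.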
There's also KH Coder, a tool for content analysis that has tonnes of features. Menus are in Japanese. I've been playing with the KWIC concordancing and its collocation sorting for word search results. There's also lots of stuff for graphing, statistics, part-of-speech analysis, and sentence decomposition/parsing. Still playing with it. There are a lot of other programs out there for Japanese, but I couldn't find any others that worked well, or at all.
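For anyone who hasn't seen KWIC (keyword in context) output before, here's a rough sketch of the idea in Python. This is not how KH Coder does it internally. It assumes a plain character window instead of tokenized text, and the function names and sample sentence are invented for illustration.

```python
def kwic(text, keyword, width=5):
    """Return a (left, keyword, right) tuple for each occurrence of keyword.

    Assumption: context is measured in characters, since raw Japanese text
    has no spaces; a real tool would more likely work on tokenized text.
    """
    rows = []
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        left = text[max(0, start - width):start]
        right = text[end:end + width]
        rows.append((left, keyword, right))
        start = text.find(keyword, start + 1)
    return rows


def sort_by_right_context(rows):
    """Sort concordance lines by what follows the keyword."""
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    sample = "白い車と黒い車が走る"  # made-up sample sentence
    for left, kw, right in sort_by_right_context(kwic(sample, "車", width=2)):
        print(f"{left} [{kw}] {right}")
```

Sorting on the right (or left) context is what makes recurring collocations line up next to each other in the results.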
Ya, I recently found the Japanese WordNet too. I had known about the Princeton one for English, but hadn't realized till a few days ago (when I saw in the history log that Breen had included WordNet in the WWWJDIC) that they had translated it entirely into Japanese (so to speak). Now that I've found it, my original idea for a reverse dictionary lookup tool for Japanese might finally be possible. I've always had an issue where sometimes I know part of the word I want and its general semantic meaning, but have no way to look it up. So having a way to dump in the partial fragment, select a group of words semantically similar to the target, and have it list candidates would be awesome, to say the least.
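Very roughly, the reverse lookup described above could look like the Python sketch below. Everything in it is hypothetical: the LEXICON table and its category labels are hand-made stand-ins for what Japanese WordNet would supply, and reverse_lookup() is just a name picked for illustration.

```python
# Hand-made toy lexicon: word -> semantic categories.
# Assumption: in a real tool these categories would come from something like
# Japanese WordNet hypernyms; here they are invented for illustration.
LEXICON = {
    "白": {"色"},
    "白鳥": {"鳥", "動物"},
    "白熊": {"動物"},
    "黒": {"色"},
    "車": {"乗り物"},
    "自転車": {"乗り物"},
    "電車": {"乗り物"},
    "車輪": {"部品"},
}


def reverse_lookup(fragment, categories, lexicon=LEXICON):
    """List words that contain `fragment` and share at least one category."""
    wanted = set(categories)
    return sorted(
        word
        for word, cats in lexicon.items()
        if fragment in word and cats & wanted
    )


if __name__ == "__main__":
    # "I know it has 車 in it and it's some kind of vehicle."
    print(reverse_lookup("車", ["乗り物"]))
```

The fragment narrows things by spelling and the categories narrow things by meaning, so 車輪 drops out of the vehicle query even though it contains 車.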
That's interesting. I've been reading about different ways of using semantic categories for word acquisition, and though I've lost track of which papers interested me, the idea of using Japanese WordNet to cluster words semantically (via hypernyms and such) struck me. Might refine it using some ideas from here: http://findarticles.com/p/articles/mi_7 … _n57103270
Meanwhile I'm still working out collocations in KH Coder and AntConc. The latter features regex and batch query support, so that could be fun. Edit: Ah! Clusters.
Edit: Oh, I see what you mean, re: Japanese WordNet at WWWJDIC. The JW links in search results. It's apparently used at Weblio also.
For the second bit, no, it's showing you the dependency relations between words/strings, but it does first run a morphological analyzer along the lines of MeCab (JUMAN for KNP and ChaSen for CaboCha, respectively). For a more detailed explanation of KNP: http://citeseerx.ist.psu.edu/viewdoc/do … p;type=pdf