There are a lot of tools floating about for Japanese linguistics: quantitative text analysis, corpus linguistics, etc. It's a still-developing area, but I've been browsing around and finding new toys and publications on the topic.
For publications, check out:
Japanese word sketches: towards a new version
A large public-access corpus for Japanese
A web corpus and word sketches for Japanese
For toys we have:
Japanese Wordnet - What's cool is the hypernym/hyponym structure. If you enter a word (e.g. 白) and click on its entry, then in addition to the Japanese + English results it shows the word's hypernyms and hyponyms (e.g. 白 is an achromatic colour, which is in turn a hyponym of colour). The online version also has pictures (e.g. 車 has a tiny car icon). I've barely played with it offline, if at all, but I think its structure has potential.
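To make the hypernym idea concrete, here is a minimal sketch in Python. It does not use the actual Japanese Wordnet data or any Wordnet API: the `HYPERNYM` table and the `hypernym_chain` function are hypothetical names, and the table is a hand-written toy holding only the 白 example from above.

```python
# Toy hypernym table (hypothetical, hand-written; NOT real Wordnet data).
# Each key maps to its immediate hypernym, i.e. the more general term.
HYPERNYM = {
    "白": "achromatic colour",
    "achromatic colour": "colour",
}


def hypernym_chain(word, hypernym_of):
    """Follow hypernym links upward from word until no parent is left."""
    chain = [word]
    seen = {word}
    while chain[-1] in hypernym_of:
        parent = hypernym_of[chain[-1]]
        if parent in seen:  # guard against accidental cycles in the table
            break
        chain.append(parent)
        seen.add(parent)
    return chain


print(" -> ".join(hypernym_chain("白", HYPERNYM)))
```

A real wordnet allows several hypernyms per sense, so a single-parent table like this is a simplification.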
There's also KH Coder, a content analysis tool with tonnes of features. Menus are in Japanese. I've been playing with the KWIC concordancing and the collocation sorting for word search results. There's also plenty for graphing, statistics, part-of-speech analysis, and sentence decomposition/parsing. Still playing with it. There are a lot of other programs out there for Japanese, but I couldn't find any that worked well, or at all.
For an online but more limited concordancer using tools/data provided by the authors of the publications listed above: http://www.someya-net.com/concordancer/index_j.html
There are also a lot of links via the KH Coder page, here:
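For anyone unfamiliar with KWIC (keyword in context), here is a rough sketch of the idea in Python. This is not how KH Coder does it: the function names `kwic` and `sort_by_right_context` and the sample sentence are made up for illustration, and the context window is counted in characters rather than words, an assumption that sidesteps Japanese word segmentation entirely.

```python
def kwic(text, keyword, width=3):
    """Return a (left, keyword, right) tuple for every occurrence of keyword.

    The context is `width` characters on each side, not words, so no
    tokenizer is needed (a simplifying assumption for this sketch).
    """
    hits = []
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        left = text[max(0, start - width):start]
        right = text[end:end + width]
        hits.append((left, keyword, right))
        start = text.find(keyword, start + 1)
    return hits


def sort_by_right_context(hits):
    """Group hits that share the same following characters by sorting on them."""
    return sorted(hits, key=lambda hit: hit[2])


# Made-up sample text, not taken from any corpus.
sample = "白い車が走る。黒い車が止まる。赤い車が走る。"
for left, kw, right in sort_by_right_context(kwic(sample, "車")):
    print(f"{left}\t[{kw}]\t{right}")
```

Sorting on the right-hand context puts hits with the same following characters next to each other, which is the basic idea behind eyeballing collocations in a concordance.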
http://khc.sourceforge.net/
Or rather: http://khc.sourceforge.net/link.html
Edit: Oh! Just as I wrote that I couldn't get others to work (re: Free Japanese collocations), I got Laurence Anthony's AntConc to work: http://www.antlab.sci.waseda.ac.jp/software.html
Thanks to Thora for that last one. I tried it before and couldn't get it to work, but now it's fine (user error corrected).
Edited: 2011-05-24, 10:29 pm
