Sorry for making this a new thread instead of adding to the "Software" thread. This is an alpha-level port of MeCab to JavaScript, so that it can run in any recent web browser, and I'm soliciting bug reports and feature requests.
Please try it out at http://fasiha.github.io/mecab-emscripten/
Background: MeCab is technically a morphological analyzer and part-of-speech tagger. What that means to me, as a non-linguist, is "it puts spaces between Japanese words (-O wakati), it converts Japanese to katakana (-O yomi), it breaks down entire sentences into parts-of-speech (-O chasen), and through some crazy magic, Damien Elmes' Japanese Support Anki plugin can add reasonably accurate furigana to Japanese text (not yet supported since it uses Kakasi, another tool, in conjunction with MeCab, with some Python glue in between)".
It's usually a pain in the a$$ to install. A kind Koohiite helped me (and many others) by making a video tutorial on getting it set up in Windows 7. When someone has to make a video tutorial on installing and using a piece of software, you know we're still in the 1990s.
So as a mini-project I put it through the Emscripten cross-compiler, which compiled the C++ source code to JavaScript, so now it runs in Firefox, Chrome, Safari, and Chrome on iPhone (the ones I've tested so far that work; Safari on iPhone doesn't work yet). It takes a few seconds to download the 50 MB dictionary, but once it's ready, type or paste some 日本語, enter a flag like "-O chasen", click Submit, and get your result.
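If you want to do something with the "-O chasen" result beyond reading it, here is a minimal sketch of splitting it into fields. It assumes the tab-separated column layout used by the IPA dictionary's chasen format (surface, reading, base form, part of speech, inflection type, inflection form, with a final "EOS" line); other dictionaries may differ. The function names (parseChasen, toWakati, toYomi) are hypothetical and not part of this port, and the sample text is hand-written for illustration, not captured from the page.

```javascript
// Hand-written sample in the assumed "-O chasen" layout (not real captured output).
const sample = [
  "日本語\tニホンゴ\t日本語\t名詞-一般\t\t",
  "を\tヲ\tを\t助詞-格助詞-一般\t\t",
  "話す\tハナス\t話す\t動詞-自立\t五段・サ行\t基本形",
  "EOS",
].join("\n");

// Turn chasen-style text into an array of token objects.
// Assumption: six tab-separated columns per token line, "EOS" ends a sentence.
function parseChasen(output) {
  const tokens = [];
  for (const line of output.split(/\r?\n/)) {
    if (line === "" || line === "EOS") continue; // skip blanks and sentence markers
    const [surface, reading, base, pos, inflectionType, inflectionForm] =
      line.split("\t");
    tokens.push({
      surface,
      reading: reading || "",
      base: base || "",
      pos: pos || "",
      inflectionType: inflectionType || "",
      inflectionForm: inflectionForm || "",
    });
  }
  return tokens;
}

// Space-separated words, similar in spirit to "-O wakati".
function toWakati(tokens) {
  return tokens.map((t) => t.surface).join(" ");
}

// Katakana readings joined together, similar in spirit to "-O yomi".
// Falls back to the surface form when a token has no reading.
function toYomi(tokens) {
  return tokens.map((t) => t.reading || t.surface).join("");
}

const tokens = parseChasen(sample);
console.log(toWakati(tokens));
console.log(toYomi(tokens));
```

The point is only that the chasen-style output carries enough information to rebuild the other two views, so one run can serve all three purposes.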
It's worked pretty well on all the input I've given it, but I'm sure there are flaws to fix and improvements to add. Please feel free to post here if you don't want to make a GitHub account to post on the bugtracker there (https://github.com/fasiha/mecab-emscripten/issues).
Edited: 2014-09-17, 11:05 pm
