
For FLTR users, this program will conjugate verbs for you

#26
Thank you for the detailed replies. It would be great if such a tool could be developed… and run on Mac OS too ;-)
#27
aldebrn Wrote: LingQ is only a tiny bit better than FLTR or LWT in this regard. It runs MeCab on the input text but doesn't reduce the word to its stem, nor does it do any de-conjugation. So the definitions it shows for a "word" are often totally incorrect, e.g., in 「家 で 仕事 を する ように なり」, MeCab says なり is the (infinitive) form of なる, but LingQ doesn't understand that, so the definitions it shows are the ones people have made for なり (particle, noun, etc.). I don't mind paying $10/mo and being online to use LingQ, and I really like the approach and some of the material on it, but its Japanese NLP is too primitive to be usable. And that's understandable, because Japanese NLP is hard.

But I definitely think you can make a much better app that's tailored to Japanese. I've been working on one recently. It uses Ve as a linguistic frontend to MeCab. Ve is free open source software by Kimtaro, who runs jisho.org, and indeed, Ve is exactly what parses sentences on beta.jisho.org (try pasting the above sentence into beta.jisho.org). In fact, I've tweeted LingQ asking them to integrate beta.jisho.org into LingQ because it is exactly what they need: very smart parsing and recombination of morphemes (raw MeCab output) into Japanese words, de-conjugation of verbs, integration with all the dictionaries, etc. Kimtaro is working on a REST API for jisho.org, which will be a huge help for writing Japanese LingQ clones, but right now it's not ready (tweet him asking about it :) ). The one thing that is very nice on beta.jisho but that Kimtaro hasn't yet open sourced is the verb de-conjugator, which helpfully suggests things like "嫌いたくなくて looks like an inflection of 嫌う, with these forms: Nai-form. It indicates the negative form of the verb". He will release it when he has time to do some cleanup on it. Maybe in the meantime Cronos's software here can be used for de-conjugation? But with Ve and MeCab, you would just (haha, "just") need to write the code to interface with JMdict and the other dictionaries, plus all the front-end stuff. I'm writing it as a webapp like LingQ (with mobile-ready datastores so you can use it on a phone while offline, and it'll sync up with the server when it gets network back), but obviously you can easily make a non-internet version too.

Sorry for this incoherent post, but to summarize it, I definitely think there's a big opportunity to make a much-improved version of LingQ for Japanese, powered by high-quality NLP tools that are all free and open source. I'd love to help someone work on this, or to bounce ideas and suggestions around.

Edit: I'd discussed this topic a few days ago: http://forum.koohii.com/showthread.php?p...#pid217019
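To make the lookup problem in the quoted post concrete, here is a minimal Python sketch of surface-form lookup versus dictionary-form lookup. It does not call MeCab or Ve: the token list is a hand-written stand-in for analyzer output on the example sentence, and the Token class, the DICTIONARY glosses, and both lookup functions are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Token:
    surface: str  # the text as it appears in the sentence
    lemma: str    # the dictionary form an analyzer would report


# Hand-written stand-in for analyzer output on 家で仕事をするようになり,
# using the same segmentation as the quoted post (assumption, not real MeCab output).
TOKENS = [
    Token("家", "家"),
    Token("で", "で"),
    Token("仕事", "仕事"),
    Token("を", "を"),
    Token("する", "する"),
    Token("ように", "ように"),
    Token("なり", "なる"),
]

# Toy dictionary with made-up glosses, standing in for JMdict.
DICTIONARY = {
    "家": "house; home",
    "仕事": "work; job",
    "なる": "to become",
    "なり": "or; whether (particle)",
}


def lookup_by_surface(token):
    """Look up the word exactly as written (the behaviour criticised above)."""
    return DICTIONARY.get(token.surface)


def lookup_by_lemma(token):
    """Prefer the dictionary form, falling back to the surface form."""
    return DICTIONARY.get(token.lemma, DICTIONARY.get(token.surface))


last = TOKENS[-1]
print(last.surface, "by surface:", lookup_by_surface(last))
print(last.surface, "by lemma:", lookup_by_lemma(last))
```

With this toy data, the surface lookup for なり returns the particle entry, while the lemma lookup returns the entry for なる, which is the one the reader actually wants in this sentence.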
I've been learning a lot with Beta Jisho, and also noticed how incredibly helpful the sentence deconstructor was. I was actually wondering yesterday whether they had released any of their algorithms as open source. So that's good to know.

This sort of niche program is definitely a great opportunity for any programmer.
#28
Ve/MeCab/beta.jisho.org make enough wrong or ~so-so sentence splits (even one confusing split every hundred sentences is, I think, too many) that I've concluded it's best for a teacher or writer posting on LingQ or a clone to supervise the automatic sentence segmentation and make sure the dictionary definition linked to each word is correct (right kanji, right reading, right sense). I've bought some ~cheap graded readers on Kindle, like 'Hikoichi' by Clay and Yumi Boutwell, for this purpose, until I get enough skill to do this myself, or I kidnap some fluent speakers and make them do this for us :)

Don't worry @jmigot, I use all three desktop operating systems plus iOS daily and have zero patience for switching devices/computers just to use an app. JavaScript forever! (This is basically the whole idea of microservices: write servers that speak only in JSON data to other servers and clients, and then put all the display logic in the client (a browser or an app).)
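As a rough illustration of the "server speaks only JSON, client does the display" split described above, here is a small Python sketch (Python only to keep it short; the same shape applies in JavaScript). The function names, the request/response fields, and the whitespace split standing in for real segmentation are all assumptions made for the example, not anything from an actual app.

```python
import json


def handle_parse_request(request_body: str) -> str:
    """Server side: JSON in, JSON out, with no display logic at all."""
    request = json.loads(request_body)
    # Whitespace split is only a stand-in for real segmentation (e.g. MeCab/Ve).
    words = request["text"].split()
    response = {"words": [{"surface": word} for word in words]}
    return json.dumps(response, ensure_ascii=False)


def render_as_text(response_body: str) -> str:
    """Client side: every presentation decision lives here."""
    response = json.loads(response_body)
    return " | ".join(word["surface"] for word in response["words"])


reply = handle_parse_request(json.dumps({"text": "家 で 仕事"}))
print(render_as_text(reply))
```

Because the server returns plain data, a browser, a phone app, or a desktop client could each render the same reply differently without any change on the server.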
Edited: 2015-01-23, 6:03 am
#29
I like this program, and it is fun to use to help produce lists of words, and words in context, that you don't know or want to learn better. However, as a text reader, I find it inconvenient that it has no search or bookmark feature for finding your place after you close it. It always opens at the very first page of the document. Is there some workaround that you know of for this shortcoming? I would think that being able to search and bookmark is a pretty basic feature of any reading software, and for me its absence is a pretty big downside to actually using this program to read for fun.
Edited: 2015-08-01, 9:41 am