I'm trying LingQ and I really like the fact that new words are highlighted in blue while words you're learning are highlighted in yellow. But, as you already know if you've tried LingQ, it treats every conjugation as a separate word. So I was thinking of building something similar, but without this drawback.
The thing is, I don't understand how it works for Japanese behind the scenes. If someone could help me understand this, it would be much appreciated.
Here are my questions:
1) When you import a text, it segments the words in a default way, I think with MeCab. So far, no problem. But it gives you the option to modify this default behavior.
For example, if I import a text that contains the word "飲みます", LingQ parses it as "飲み" - "ます", treating it as two words. But if I select the two words "飲み" and "ます" together, it gives me the option to treat them as a single word and add a LingQ for it. So I add a LingQ for the word "飲みます", and from that moment on, every other instance of "飲みます" in that lesson and in other lessons is automatically treated as a single word instead of two, as it was before. How is that possible if every lesson is parsed via MeCab (at least, I'm assuming it is, but maybe I'm wrong)? How does MeCab know that from now on it must parse that word as a single word? Does it rely on some kind of glossary? When you add a LingQ that modifies the default way a word is parsed, does it modify the entry for that word in the glossary so that from then on MeCab will parse it that way?
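My guess (an assumption on my part, not confirmed behavior of LingQ) is that the app doesn't reconfigure MeCab at all: it keeps a per-user glossary of linked phrases and merges adjacent tokens as a post-processing step after the default segmentation. Here's a minimal sketch in Python; the token lists are hard-coded stand-ins for real MeCab output:

```python
# Sketch: merge adjacent tokens against a user glossary, applied AFTER
# the morphological analyzer has done its default segmentation.
# The token list below stands in for real MeCab output (an assumption).

def merge_tokens(tokens, glossary, max_len=4):
    """Greedily merge runs of adjacent tokens whose concatenation is in the glossary."""
    merged = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate phrase first, down to 2 tokens.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = "".join(tokens[i:i + n])
            if candidate in glossary:
                match = (candidate, n)
                break
        if match:
            merged.append(match[0])
            i += match[1]
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Default segmentation of 「水を飲みます」 (stand-in for MeCab output):
tokens = ["水", "を", "飲み", "ます"]

# After the user links 飲み+ます once, the phrase goes into their glossary:
glossary = {"飲みます"}

print(merge_tokens(tokens, glossary))  # ['水', 'を', '飲みます']
```

The appeal of this design is that MeCab stays untouched and stateless; every lesson is re-segmented the same way, and only the cheap merge pass changes per user. (The alternative would be compiling the phrase into a MeCab user dictionary, which MeCab does support, but that requires re-parsing and seems heavier for this use case.)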
It wouldn't be difficult to build something similar that treats every conjugation as a single word, given MeCab's ability to deconjugate words (or am I remembering wrong?). But I would also like to give the user the option to modify the way a word is parsed, as LingQ does, because by default MeCab breaks words in a way that I don't think is optimal... So, if someone knows the mechanism behind this, that would be awesome.
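You're remembering right in the sense that MeCab's output includes a dictionary form (lemma) for each token, at least with dictionaries like IPAdic or UniDic. So "treating every conjugation as one word" can be just grouping by lemma. A sketch of that grouping step; the (surface, lemma) pairs are hard-coded stand-ins for the analyzer's output, since how you wire up MeCab itself is an assumption here:

```python
from collections import defaultdict

# Sketch: group surface forms under their dictionary form (lemma).
# The (surface, lemma) pairs stand in for the lemma field MeCab emits
# with a dictionary such as UniDic or IPAdic; the grouping logic is
# the point, not the exact analyzer API.

def group_by_lemma(analyzed_tokens):
    """Map each dictionary form to the set of surface forms seen for it."""
    vocab = defaultdict(set)
    for surface, lemma in analyzed_tokens:
        vocab[lemma].add(surface)
    return dict(vocab)

# Conjugated forms of 飲む as they might come out of the analyzer:
analyzed = [
    ("飲み", "飲む"),   # stem, e.g. from 飲みます
    ("飲ん", "飲む"),   # e.g. from 飲んだ
    ("飲め", "飲む"),   # imperative
    ("ます", "ます"),   # the polite auxiliary stays its own entry
]

print(group_by_lemma(analyzed))
# 飲み / 飲ん / 飲め all count toward one known word: 飲む
```

With this, the user's "known/learning" status would be stored per lemma, so marking 飲む once covers every conjugation, while the glossary-merge trick from question 1 can still sit on top for user-defined multi-token phrases.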