Mighty Morphin Morphology

#10
overture2112 Wrote: Problem: given a kanji compound, how can you determine which parts of the reading are from which kanji?
I didn't find a great solution, but I uploaded a new version of the plugin that assigns a ranking to vocab (stored in the 'vocabRank' field), which seems useful enough to be of benefit.


-- For nerds:
Basically the scoring works like this:

It looks through each kanji of each morpheme in the Expression field (skipping kana) and gives +20 pts if you know a word with that kanji, plus an extra +50 pts if the kanji sits at the same position in both words. If they share a position, it also compares the first and/or last character of the two readings and gives a +100 pt bonus if they match, but it only does this when the kanji in question is at the very beginning or end of the word.

That is, we assume the first character of the reading comes from the first character of the expression, the last character of the reading comes from the last character of the expression (unless either is kana), and that, for a given kanji, readings that start with the same character are the same reading. Obviously not the best solution, but it's a start.

If you know another word that gives a higher score, it uses that score instead, and the result is averaged over the number of characters considered (i.e., the non-kana ones).
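
For anyone who wants to poke at it, here is a rough sketch of the per-morpheme scoring described above. It is not the plugin's actual code: the names `is_kanji`, `kanji_score` and `morpheme_score` are made up for illustration, "same position" is assumed to mean the same index counted from the start of each word, the kanji check only covers the basic CJK block, and the +100 bonus is assumed to be awarded once even if both the first and last characters match.

```python
def is_kanji(ch):
    """Rough check (assumption): basic CJK Unified Ideographs block only."""
    return "\u4e00" <= ch <= "\u9fff"


def kanji_score(kanji, idx, word, reading, known):
    """Best score for one kanji of `word` over all known (expression, reading) pairs."""
    best = 0
    at_start = idx == 0
    at_end = idx == len(word) - 1
    for k_expr, k_reading in known:
        if kanji not in k_expr:
            continue
        score = 20  # you know a word containing this kanji
        if idx < len(k_expr) and k_expr[idx] == kanji:
            score += 50  # same index in both words (assumed meaning of "position")
            # Edge checks only apply when the kanji starts or ends the word.
            first_ok = at_start and reading[:1] == k_reading[:1]
            last_ok = (at_end and idx == len(k_expr) - 1
                       and reading[-1:] == k_reading[-1:])
            if first_ok or last_ok:
                score += 100  # awarded once (assumption)
        best = max(best, score)  # a higher-scoring known word wins
    return best


def morpheme_score(word, reading, known):
    """Average of the per-kanji scores; None for a kana-only morpheme."""
    scores = [kanji_score(ch, i, word, reading, known)
              for i, ch in enumerate(word) if is_kanji(ch)]
    if not scores:
        return None
    return sum(scores) / len(scores)
```

With `known = [("学生", "がくせい")]`, scoring 学校 (がっこう) gives 170 for 学 (known, same position, same first reading character) and 0 for 校, so the morpheme averages out to 85.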

Finally, all the scores for the morphemes in your expression field are averaged (kana-only morphemes are ignored), and an expression field containing only kana-only morphemes gets a total score of -10.
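
The final averaging step can be sketched like this. Again, this is only an illustration under assumptions, not the plugin's code: `vocab_rank` is a hypothetical function name, and kana-only morphemes are assumed to be represented as `None` in the input list.

```python
def vocab_rank(morpheme_scores):
    """Average the per-morpheme scores, ignoring kana-only morphemes (None).

    An expression with nothing but kana-only morphemes gets -10.
    """
    scored = [s for s in morpheme_scores if s is not None]
    if not scored:
        return -10  # only kana-only morphemes in the expression
    return sum(scored) / len(scored)
```

For example, `vocab_rank([85.0, None, 170.0])` averages just the two scored morphemes and gives 127.5, while `vocab_rank([None, None])` gives -10.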
-----

Thus: the new 'vocabRank' feature does a great job of identifying words which are easy to learn, but it doesn't necessarily find all of them (i.e., low-scoring words could still be plenty easy, because of the positional dependence and other limitations with respect to readings).