Mighty Morphin Morphology

#5
Boy.pockets Wrote:I have just been thinking about making something similar - I had got up to the part of naming it. :)
I got stuck on that part, thus the boring name for now.

Boy.pockets Wrote:I am interested to know how you are going about detecting the words. My initial thought was to use sentence glossing (via WWWJDIC's glossing feature), but I have no idea how well that would work.
I'm using MeCab to split sentences into morphemes. If you look at morpheme.py you can see the specific options passed to MeCab.
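
For anyone who hasn't seen MeCab output before, here is a rough sketch of turning it into morphemes in Python. This is not the code from morpheme.py: the `Morpheme` type, the `parse_mecab_output` helper, and the sample text are made up for illustration, and the sketch assumes MeCab's default output with the IPA dictionary (surface form, a tab, then comma-separated features with the part of speech first and the base form seventh). The options actually used may produce a different layout.

```python
from typing import List, NamedTuple


class Morpheme(NamedTuple):
    surface: str  # the text as it appears in the sentence
    pos: str      # top-level part of speech, e.g. 名詞 (noun)
    base: str     # dictionary form, falling back to the surface form


def parse_mecab_output(text: str) -> List[Morpheme]:
    """Parse default-format MeCab output (assumed IPA dictionary layout)."""
    morphemes = []
    for line in text.splitlines():
        # Skip blank lines and the end-of-sentence marker.
        if not line.strip() or line.strip() == "EOS":
            continue
        surface, _, features = line.partition("\t")
        fields = features.split(",")
        pos = fields[0] if fields and fields[0] else ""
        # Field 7 is the base form in the assumed layout; "*" means unknown.
        base = fields[6] if len(fields) > 6 and fields[6] != "*" else surface
        morphemes.append(Morpheme(surface, pos, base))
    return morphemes


# Hand-written sample in the assumed format for これはペンです.
SAMPLE = (
    "これ\t名詞,代名詞,一般,*,*,*,これ,コレ,コレ\n"
    "は\t助詞,係助詞,*,*,*,*,は,ハ,ワ\n"
    "ペン\t名詞,一般,*,*,*,*,ペン,ペン,ペン\n"
    "です\t助動詞,*,*,*,特殊・デス,基本形,です,デス,デス\n"
    "EOS\n"
)

for m in parse_mecab_output(SAMPLE):
    print(m.surface, m.pos, m.base)
```

Keeping the base form alongside the surface form means conjugated words can be matched against what you already know by their dictionary form.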

Boy.pockets Wrote:I was also thinking it would be cool to be able to consider the grammar of the sentence as well. Say, a sentence that you know is <noun>+desu. So it could use patterns like that to help calculate the distance between two sentences (the n+1 ness).
I considered this as well, but decided it was too difficult to determine whether you 'know' a grammar point: I've 'seen' a lot of grammar in various sentences and got by on a simplified understanding of it. Grammar patterns are also not as easy to detect as words.

That said, I can get the morphemes in order and with part-of-speech information, so perhaps someone could look at the morpheme output of some sentences and figure out some simple rules to detect various patterns.
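
To give an idea of what such a rule might look like, here is a minimal sketch that checks for the <noun>+desu pattern over an ordered list of morphemes. It is only an illustration, not part of the plugin: the `has_noun_copula` name and the (base form, part of speech) tuple layout are made up, and the part-of-speech labels (名詞 for noun, 助動詞 for auxiliary verb) assume the IPA dictionary's tags.

```python
from typing import List, Tuple

# Hypothetical layout: (base form, top-level part of speech), in sentence order.
Morph = Tuple[str, str]

COPULA_BASES = ("です", "だ")


def has_noun_copula(morphemes: List[Morph]) -> bool:
    """Return True if a noun is immediately followed by です or だ."""
    # Walk over adjacent pairs of morphemes.
    for (_, pos_a), (base_b, pos_b) in zip(morphemes, morphemes[1:]):
        if pos_a == "名詞" and pos_b == "助動詞" and base_b in COPULA_BASES:
            return True
    return False


# これはペンです: noun + particle + noun + copula
pen = [("これ", "名詞"), ("は", "助詞"), ("ペン", "名詞"), ("です", "助動詞")]
# 私は走る: noun + particle + verb
run = [("私", "名詞"), ("は", "助詞"), ("走る", "動詞")]

print(has_noun_copula(pen))
print(has_noun_copula(run))
```

A rule this simple will miss plenty of cases (for example, anything inserted between the noun and the copula), which is part of why detecting grammar is harder than detecting words.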