
Mighty Morphin Morphology

Does anybody know what dictionary MeCab uses 'under the hood' in the Anki & MorphMan installation?

I'm asking because for a work project I've found that the 'neologd' dictionary (https://github.com/neologd/mecab-ipadic-neologd) is much more useful than the out-of-the-box ipadic that MeCab normally uses, at least if you do an installation for Python. ipadic does a rather poorer job of splitting morphemes than neologd, i.e. too many four-kanji or two-kanji words become individual two- or one-kanji morphemes.

I suspect that might improve what I use MorphMan for: extracting the morphemes from a text, comparing those with my 'known' morphemes, and then learning only the new ones in optimal order.
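
As a rough illustration of that workflow, here is a minimal Python sketch. It is not MorphMan's actual algorithm: the function names are hypothetical, the tokenizer defaults to plain whitespace splitting (for Japanese you would pass in a MeCab-based tokenizer instead), and "optimal order" is approximated by greedily picking the sentence with the fewest unknown morphemes.

```python
def unknown_morphemes(sentence, known, tokenize=str.split):
    """Return the morphemes of `sentence` that are not in `known`, without repeats."""
    found = []
    for morpheme in tokenize(sentence):
        if morpheme not in known and morpheme not in found:
            found.append(morpheme)
    return found


def learning_order(sentences, known, tokenize=str.split):
    """Greedy ordering (an assumption, not MorphMan's method).

    Repeatedly pick the sentence with the fewest unknown morphemes
    (at least one), then treat its morphemes as known.
    Sentences with nothing new are skipped.
    """
    known = set(known)
    remaining = list(sentences)
    order = []
    while remaining:
        scored = []
        for index, sentence in enumerate(remaining):
            count = len(unknown_morphemes(sentence, known, tokenize))
            if count > 0:
                scored.append((count, index, sentence))
        if not scored:
            break  # everything left contains only known morphemes
        _, index, sentence = min(scored)
        order.append(sentence)
        known.update(unknown_morphemes(sentence, known, tokenize))
        remaining.pop(index)
    return order
```

With `known = {"a"}` and the sentences `"a b c"`, `"a b"` and `"a"`, the sketch schedules `"a b"` first, then `"a b c"`, and drops `"a"` because it teaches nothing new.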
(2017-10-19, 10:59 am)Irixmark Wrote: Does anybody know what dictionary MeCab uses 'under the hood' in the Anki & MorphMan installation? [...]

It's ipadic as far as I'm aware. The mecab executable and its dictionaries are bundled with the Japanese support plugin (not included in Anki/MorphMan). Everything is in "<User>/Anki/addons/japanese/support".
(2017-10-20, 1:50 am)kaegi Wrote: It's ipadic as far as I'm aware. [...]

Thanks. It seems that it's going to be difficult to change without possibly breaking the add-on, but I'll fiddle with it and see if it makes a difference.
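
For trying out a different dictionary outside Anki first, a sketch along these lines might help. It assumes the mecab-python3 package is installed and an ipadic-format dictionary whose default output is one "surface<TAB>features" line per token followed by "EOS". `-d` is MeCab's option for the system dictionary directory, but the neologd path shown is only a placeholder, since the real location depends on how neologd was installed. The helper names are made up for illustration.

```python
def tagger_args(dicdir=None):
    """Build the option string for MeCab.Tagger; -d selects the dictionary directory."""
    return "" if dicdir is None else "-d " + dicdir


def parse_surfaces(mecab_output):
    """Pull the surface forms out of MeCab's default text output."""
    surfaces = []
    for line in mecab_output.splitlines():
        if not line or line == "EOS":
            continue  # skip the end-of-sentence marker and blank lines
        surfaces.append(line.split("\t", 1)[0])
    return surfaces


def split_morphemes(text, dicdir=None):
    """Tokenize `text` with MeCab, optionally using another dictionary."""
    import MeCab  # from mecab-python3; imported lazily so the helpers above work without it
    tagger = MeCab.Tagger(tagger_args(dicdir))
    return parse_surfaces(tagger.parse(text))


# Usage (needs MeCab installed; the second path is a placeholder):
# print(split_morphemes("東京都に住んでいます"))
# print(split_morphemes("東京都に住んでいます", "/path/to/mecab-ipadic-neologd"))
```

Running both calls on the same text and comparing the two lists is a quick way to see whether the dictionary change makes a difference before touching the add-on.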
Hey guys! I've been learning Korean for a while and I've just stumbled upon MorphMan (MM). But after 2-3 tries I still haven't figured out exactly how to configure it so it works with my subs2srs sentences. Can anyone guide me, please? Thank you!
Hey ethereal. You're trying to get it working with your Korean sentences, is that right? My understanding is that Korean is written with spaces between words, so, provided you don't mind losing word family recognition, the Spaced Delimiter should be sufficient. Can you explain what particular issue you're having?
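
To see what losing word family recognition means in practice, here is a tiny sketch of a space-based morphemizer in Python. It is an assumption about the general approach, not MorphMan's actual code, and `space_morphemes` is a made-up name. Because particles and endings stay attached to the word, 학교에 and 학교는 count as two unrelated morphemes even though both contain 학교 (school).

```python
def space_morphemes(sentence):
    """Treat every whitespace-separated chunk as one morpheme."""
    return sentence.split()


first = space_morphemes("학교에 가요")   # "(I) go to school"
second = space_morphemes("학교는 커요")  # "the school is big"

# No chunk is shared, so knowing 학교에 does not mark 학교는 as known.
shared = set(first) & set(second)
```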
(2017-10-25, 1:51 am)NinKenDo Wrote: Hey ethereal. You're trying to get it working with your Korean sentences, is that right? [...]

Hi, NinKenDo, thanks for answering! 

Yes, I'm trying to make it work with my Korean sentences because, as you said, I understood that the Space Delimiter should work. But I'm not sure how I need to configure it to... "make it work", I guess?

I haven't found a guide that explains exactly how to make it work. I know I need to tweak the field names, as mine are different (the default "Expression" is my "Reading" field, for example, which is where the actual Korean sentence goes). MorphMan also creates tags with different names. I'm unsure how to make MM understand which morphemes (words) I already know, for example. There's also info in the wiki that says to create 3 different fields, e.g. one for i+1, and sort them by number.

For example, this is one of my subs2srs cards:
[Image: AECeIhz.png]

But I'm unsure what I have to do with them.

In general I don't really know how to make it work completely and I'm also afraid of getting everything wrong. 

Thank you again for replying :)
Edited: 2017-10-25, 8:03 am
No worries. So, what you need to do is create a field in your notes called "Focus Morph" and one called "MorphMan Index". Once you've done that, click Tools -> MorphMan Preferences. On the Note Filter page, set Note type to the note type you use for your Korean sentences. Under Fields, give it the field that holds your expression; in this case that's "Reading", I guess. Morphemizer should be set to "Language with spaces", and Modify should be ticked.

Then go to the Extra Fields tab and set Focus morph (*) to "Focus Morph" and MorphMan Index to "MorphMan Index".

Click Apply, then go to Tools -> MorphMan Recalc and wait.
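
The settings above can be summarized as data, with a quick check that the note type really has every field they refer to. This is only a sketch for double-checking before running Recalc: the dictionary layout, the note type name and `missing_fields` are hypothetical and are not MorphMan's own configuration format.

```python
# Hypothetical summary of the preferences described above.
settings = {
    "note_type": "Korean subs2srs",  # placeholder; use your own note type name
    "fields": ["Reading"],           # field holding the Korean sentence
    "morphemizer": "Language with spaces",
    "modify": True,
    "extra_fields": {
        "focus_morph": "Focus Morph",
        "morphman_index": "MorphMan Index",
    },
}


def missing_fields(note_fields, settings):
    """List the configured fields that the note type does not have yet."""
    wanted = list(settings["fields"]) + list(settings["extra_fields"].values())
    return [name for name in wanted if name not in note_fields]
```

For a note type that only has "Reading", "Audio" and "Image", `missing_fields` reports "Focus Morph" and "MorphMan Index", which are exactly the two fields that need to be created first.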

Let me know how you go, and if anything was unclear.
Edited: 2017-10-26, 4:34 am
(2017-10-26, 4:33 am)NinKenDo Wrote: No worries. So, what you need to do is create a field in your notes called Focus Morph and one called MorphMan Index. [...]

OK, I did that and it seems to be working well. The database manager shows this, though (I haven't touched anything, just in case; I hope it's what it should look like):
[Image: DaaMdJF.png]

Now I would like to tell MorphMan to set aside the sentences whose meaning I already know. For example, 진짜 means 'really', and I would like to ignore the phrases that have just that word. I think I just press "K" when I'm on that card, right?
Yeah, the database manager is kinda its own thing; it should indeed look like that when you open it up. I should probably make some videos or something to explain how to use MM and this stuff better. Yes, pressing "K" will do the trick, if you don't mind doing it manually like that. It will also eventually happen just from reviewing those sentences enough. You CAN use the DB Manager to get the result quicker, but you could really mess things up fiddling with it before you understand it properly, so I'd recommend you just manually mark cards as known, like you suggested.
(2017-10-26, 10:42 am)NinKenDo Wrote: Yeah, the database manager is kinda its own thing, it should indeed look like that when you open it up. [...]

Yeah, I prefer not to touch the database and stuff like that, just in case. Thank you so much, NinKenDo! I have about 6500 sentences now, so I'm sure MM will help with this :D

Thank you!