
Mecab, Unidic: prioritize one reading over another

#1
I'm using Mecab with Unidic and it's mostly fine for what I'm doing, except that it gives わたくし rather than わたし as the reading for 私.
I know there is a way to change this behaviour, but I don't know how.
#2
I don't know about Mecab, but if you're using kuromoji you can give it a user dictionary. The user dictionary format in the stable version of kuromoji is too limited, but in the unstable/git version you can just use the same format as lex.csv. This is what unnamed japanese text analyzer does with its user dictionary.

Mecab should have something similar. If that doesn't work right, you can modify unidic's lex.csv directly, but I haven't tried that myself.
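
For the lex.csv route, here is a rough Python sketch of nudging word costs so that one reading wins. It only relies on what Mecab dictionary CSVs have in common: surface form, left context ID, right context ID and word cost in the first four columns, followed by feature fields. The sample rows are made up and far shorter than real unidic entries, `prefer_reading` is a hypothetical helper, and matching the reading against any feature field is an assumption, since the column layout differs between unidic versions.

```python
import csv
import io


def prefer_reading(rows, surface, reading, margin=1):
    """Return a copy of rows in which entries for `surface` that carry
    `reading` in a feature field cost less than the competing entries.

    Assumes Mecab CSV layout: surface, left ID, right ID, cost, features...
    """
    rows = [list(r) for r in rows]
    same_surface = [r for r in rows if r and r[0] == surface]
    preferred = [r for r in same_surface if reading in r[4:]]
    others = [r for r in same_surface if reading not in r[4:]]
    if not preferred or not others:
        return rows  # nothing to rebalance
    # Lower cost means more preferred, so go just below the cheapest rival.
    target = min(int(r[3]) for r in others) - margin
    for r in preferred:
        r[3] = str(min(int(r[3]), target))
    return rows


# Simplified, made-up rows (real unidic rows have many more fields).
sample = """私,1,1,5000,代名詞,ワタクシ
私,1,1,6000,代名詞,ワタシ
猫,2,2,4000,名詞,ネコ
"""

rows = list(csv.reader(io.StringIO(sample)))
updated = prefer_reading(rows, "私", "ワタシ")
for row in updated:
    print(",".join(row))
```

Word cost is only part of the picture: connection costs count too, so this may not be enough if the competing entries have different context IDs. An edited lex.csv also has to be recompiled (with mecab-dict-index) before Mecab sees the change. The same kind of row could instead go into a user dictionary loaded with -u, which leaves the original unidic files untouched.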
Edited: 2018-01-11, 2:40 pm
#3
Thank you wareya!