cb's Japanese Text Analysis Tool

(2017-04-21, 9:27 pm)tanaquil Wrote: Is it possibly counting all instances of 電子 + all instances of 音? I don't really understand how the analyzer works.

(Also, I just had to use yomichan to verify that the compound should be pronounced denshi-on, not denshi-oto. Sometimes I really hate Japanese.)

If that were the case, 電子 should show up as a separate word on its own line from 音. That's really strange. From my understanding, the text analyser de-inflects and tokenizes words via MeCab or Jparser and then simply counts the tokens.

@Zarxrax, I wonder what would happen if you switched tokenizers, and whether you'd get different results.
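
To make the "tokenize, then count" idea concrete, here is a rough sketch of how I picture it. This is not the tool's actual code. The `tokenize` function is a toy greedy longest-match splitter standing in for MeCab or Jparser, the two dictionaries are made up for the example, and the de-inflection step is left out entirely. It only shows that whether 電子音 is counted as one word or as 電子 plus 音 depends on what the tokenizer's dictionary contains.

```python
from collections import Counter


def tokenize(text, dictionary):
    """Greedy longest-match tokenizer: a toy stand-in, NOT MeCab or Jparser."""
    tokens = []
    max_len = max(len(word) for word in dictionary)
    i = 0
    while i < len(text):
        # Try the longest possible dictionary entry starting at position i first.
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if piece in dictionary:
                tokens.append(piece)
                i += n
                break
        else:
            # No dictionary entry matched: emit the single character as a token.
            tokens.append(text[i])
            i += 1
    return tokens


def word_frequencies(text, dictionary):
    """Count how often each token appears (assumed to be all the analyzer does)."""
    return Counter(tokenize(text, dictionary))


text = "電子音電子音"

# Hypothetical dictionary that knows the compound as a single word.
with_compound = word_frequencies(text, {"電子音", "電子", "音"})

# Hypothetical dictionary that only knows the two parts.
without_compound = word_frequencies(text, {"電子", "音"})

print(with_compound)
print(without_compound)
```

With the first dictionary the compound gets a single entry. With the second, 電子 and 音 each get their own entry and their own count, which is what I would expect to see in the report if the analyzer were really splitting the word.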
