Is there a tool that can break apart a Japanese sentence?
Forum: kanji koohii FORUM (http://forum.koohii.com), Learning Japanese (http://forum.koohii.com/forum-4.html), Learning resources (http://forum.koohii.com/forum-9.html)
Thread: /thread-11704.html

vgambit - 2014-03-19
That is, I have a sentences deck. I would like to take from that deck all of the sentences that I know, and break them up, word by word, so that I can use cb's frequency report generator to compare them with my Japanese ebooks. Essentially, so I can figure out how readable a novel is before I dive into it.

The problem is, cb's tool needs each word to be on its own line; for some reason, even though it seems to do so for frequency analysis, the readability analysis tool does not automatically break apart sentences into words.

So essentially what I'm asking for is a tool that can parse out words from a list of sentences, and spit out a (non-repeating) list of those words. I'm a programmer, so I can make it myself if anything, but I'd hate to repeat the work if someone else has already done it.

unauthorized - 2014-03-19
If you are not afraid of the command line, mecab can do exactly what you want. See http://sourceforge.net/projects/mecab

Gareth - 2014-03-19
MeCab can split sentences into words. You could probably write something simple to run a bunch of sentences through and remove duplicates.

vgambit - 2014-03-19
And cb's tool uses MeCab... so why can't it also apply it to user files for readability analysis? lol.

afterglowefx - 2014-03-19
Read this thread, it appears to be exactly what you want to do. http://forum.koohii.com/showthread.php?tid=11611

And looking at it again I forgot to get back to sholum, whoops.

cb4960 - 2014-03-19
Using your exported sentences as input, you could generate a Word Frequency report with cb's Japanese Text Analysis Tool. The second column of this report will contain the unique words.

vgambit - 2014-03-26
cb4960 Wrote: "Using your exported sentences as input, you could generate a Word Frequency report with cb's Japanese Text Analysis Tool. The second column of this report will contain the unique words."

Awesome. I will try to get around to this tonight. Thank you.

vgambit - 2014-04-03
Finally got around to it. The readability rating of the books I'm interested in is at around 3%. I think I'll get back to those in a year or so, lol.
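
For anyone who would rather script it, here is a rough sketch of what Gareth describes above: run the sentences through MeCab, then drop the duplicates. It is not how cb's tool works internally, just one way to do it. It assumes the `mecab` command-line program is installed and on the PATH, and that its dictionary is UTF-8 (some installs use another encoding, in which case `encoding` needs changing). The function names are made up for illustration.

```python
import subprocess
from typing import Iterable, List


def unique_words(wakati_lines: Iterable[str]) -> List[str]:
    """Return each distinct token once, in first-seen order.

    Each input line is expected to be already segmented, with tokens
    separated by whitespace (the shape of `mecab -Owakati` output).
    """
    seen = set()
    ordered = []
    for line in wakati_lines:
        for token in line.split():
            if token not in seen:
                seen.add(token)
                ordered.append(token)
    return ordered


def segment_with_mecab(sentences: Iterable[str]) -> List[str]:
    """Pipe sentences through the `mecab` CLI, one sentence per line.

    Assumption: `mecab` is on the PATH and uses a UTF-8 dictionary.
    """
    result = subprocess.run(
        ["mecab", "-Owakati"],  # wakati mode: tokens separated by spaces
        input="\n".join(sentences) + "\n",
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout.splitlines()
```

Calling `unique_words(segment_with_mecab(sentences))` on the exported sentences and writing the result one word per line should give the kind of list the readability analysis wants. Note that wakati output contains surface forms, so inflected pieces and punctuation show up as separate entries; those may need filtering before comparing against a frequency report.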