Hello,
I have just released version 4.0 of cb's Japanese Text Analysis Tool.
Download cb's Japanese Text Analysis Tool v4.0 via SourceForge
What Changed?
● Added the User-based Readability Report option.
Given a list of words that the user already knows, this report estimates how readable each text is, based on the percentage of words in the text that the user already knows.
Name: user_based_readability_report.txt
Format:
Field 1: Readability expressed as a percentage (0-100): the total number of non-unique known words divided by the total number of non-unique words (non-unique means every occurrence is counted).
Field 2: Total number of non-unique words
Field 3: Total number of non-unique known words
Field 4: Total number of non-unique unknown words
Field 5: Readability expressed as a percentage (0-100): the total number of unique known words divided by the total number of unique words.
Field 6: Total number of unique words
Field 7: Total number of unique known words
Field 8: Total number of unique unknown words
Field 9: Filename
The report is sorted by readability (Field 1).
To generate this report, the "File that contains a list of words that you already know" option must be filled in. If a line contains multiple tab-separated columns, then the word is assumed to be in the first column.
● Renamed the old Readability report to Formula-based Readability report.
● Updated MeCab to version 0.996.
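For anyone curious how the numbers in the User-based Readability Report relate to each other, here is a rough Python sketch. It is not the tool's actual code. The function names (load_known_words, user_based_readability) and the dictionary keys are made up for illustration, and it assumes the text has already been split into words (the tool uses MeCab for that step; here a plain list of tokens stands in).

```python
from collections import Counter


def load_known_words(lines):
    """Collect known words, taking the first tab-separated column of each line."""
    known = set()
    for line in lines:
        word = line.rstrip("\r\n").split("\t")[0].strip()
        if word:  # skip blank lines
            known.add(word)
    return known


def user_based_readability(tokens, known):
    """Compute Fields 1-8 of the report for one already-tokenized text.

    Hypothetical helper: the key names are illustrative, not the tool's.
    """
    counts = Counter(tokens)

    # Non-unique counts: every occurrence of a word is counted.
    total = sum(counts.values())
    total_known = sum(n for word, n in counts.items() if word in known)

    # Unique counts: each distinct word is counted once.
    unique = len(counts)
    unique_known = sum(1 for word in counts if word in known)

    def percent(part, whole):
        return 100.0 * part / whole if whole else 0.0

    return {
        "readability_non_unique": percent(total_known, total),  # Field 1
        "total_words": total,                                   # Field 2
        "known_words": total_known,                             # Field 3
        "unknown_words": total - total_known,                   # Field 4
        "readability_unique": percent(unique_known, unique),    # Field 5
        "unique_words": unique,                                 # Field 6
        "unique_known_words": unique_known,                     # Field 7
        "unique_unknown_words": unique - unique_known,          # Field 8
    }


if __name__ == "__main__":
    known = load_known_words(["猫\tcat\n", "好き\n", "\n"])
    tokens = ["猫", "が", "好き", "猫", "犬"]  # assumed pre-segmented text
    print(user_based_readability(tokens, known))
```

In the sample at the bottom, 3 of the 5 word occurrences are known, and 2 of the 4 distinct words are known, so the two readability fields differ. That is the reason the report gives both a non-unique and a unique percentage.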
cb4960
