I updated the Core 6000 text file on my website so it's now based on the TSV file from this thread instead of kore.txt from https://sites.google.com/site/ankinihongo/home/kore. I added columns for furigana for the vocabulary items, RTK keywords, first translations of the vocabulary items, word types like two-kanji compound or katakana, word frequency, and sentence difficulty based on the frequency of morphemes.
Edit: I have updated the text file on my website again so it's based on the same JSON files as the files in this thread. I copied the furigana for the sentences from Savii's TSV file though.
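A word-type column like the one mentioned above could be derived from the script of each vocabulary item. The following is only a rough sketch: the function name `word_type`, the label strings, and the exact Unicode ranges are assumptions for illustration, not the rules actually used to build the file.

```python
import re

# Unicode block ranges (assumed to be good enough for this data):
# hiragana U+3040-309F, katakana U+30A0-30FF (includes the long-vowel mark),
# common kanji U+4E00-9FFF.
KATAKANA_ONLY = re.compile(r"[\u30a0-\u30ff]+")
TWO_KANJI = re.compile(r"[\u4e00-\u9fff]{2}")
HIRAGANA_ONLY = re.compile(r"[\u3040-\u309f]+")

def word_type(word):
    """Return a coarse, hypothetical word-type label for one vocabulary item."""
    if KATAKANA_ONLY.fullmatch(word):
        return "katakana"
    if TWO_KANJI.fullmatch(word):
        return "two kanji compound"
    if HIRAGANA_ONLY.fullmatch(word):
        return "hiragana"
    return "other"
```

For example, `word_type("学生")` gives `"two kanji compound"`, while a mixed word like 食べる falls through to `"other"`.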
Some errors or inconsistencies in formatting in the original data:
About 100 hiragana sentences have two consecutive spaces or spaces around punctuation characters
A few fields have a space at the end or two consecutive spaces
Error in the translation for the sentence: 218, 747, 1281, 1632, 2580, 3269, 3367, 4511, 4724, 4923, 5405, 5538, 5966
Error in the translation for the vocabulary item: 1015
Wrong reading in the kanji sentence: 5896
Wrong reading in the hiragana sentence: 4582, 4725
Wrong bold part in the kanji sentence: 831, 4165, 4472
Wrong bold part in the hiragana sentence: 187, 3309, 5482, 5598
Full-width characters in the hiragana sentence: 1176, 1611, 3137, 4525
No thousands separator in the hiragana sentence: 878, 1558
No period at the end of the kanji sentence: 1529, 2914
ASCII space in the kanji sentence: 1782, 5748
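Some of the purely mechanical checks in the list above (spacing, full-width characters, missing final punctuation) could be automated roughly as follows. This is a sketch under assumptions: the function names, the issue labels, and the exact patterns are made up for illustration and are not the script that produced the list. Wrong translations, readings, and bold parts still need manual review, and the thousands-separator check is left out because four-digit numbers can also be years.

```python
import re

# Full-width digits and Latin letters (U+FF10-FF19, U+FF21-FF3A, U+FF41-FF5A).
FULLWIDTH_ALNUM = re.compile(r"[０-９Ａ-Ｚａ-ｚ]")
# An ASCII space directly before or after a Japanese comma or period.
SPACE_AROUND_PUNCT = re.compile(r" [、。]|[、。] ")

def check_hiragana(sentence):
    """Return a list of formatting issues found in a hiragana sentence."""
    issues = []
    if "  " in sentence:
        issues.append("double space")
    if sentence != sentence.strip():
        issues.append("leading/trailing space")
    if SPACE_AROUND_PUNCT.search(sentence):
        issues.append("space around punctuation")
    if FULLWIDTH_ALNUM.search(sentence):
        issues.append("full-width alphanumeric")
    return issues

def check_kanji(sentence):
    """Return a list of formatting issues found in a kanji sentence."""
    issues = []
    if " " in sentence:
        issues.append("ASCII space")
    # Assumes every kanji sentence should end in 。, ？ or ！.
    if not sentence.endswith(("。", "？", "！")):
        issues.append("no final punctuation")
    return issues
```

Running both functions over every row and printing the row number with any non-empty result would give a list in the same shape as the one above.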
Actually there are so few errors that it might be better to just not make any changes to the original data.
I have also made HTML files for reviewing the vocabulary and sentences.
Edited: 2014-01-21, 2:45 am