I am chipping away at Core6K right now, a bit over 4K sentences in. I re-ordered the sentences a while ago to get through all kanji contained in the Core6K set, and my "KanjiStats" display now says a total of 1616 unique kanji have been encountered. Details here:
1616 total unique kanji.
Old Jouyou: 1519 of 1945 (78.1%).
New Jouyou: 63 of 191 (33.0%).
Jinmeiyou (reg): 22 of 645 (3.4%).
Jinmeiyou (var): 0 of 145 (0.0%).
12 non-jouyou kanji.
Grade 1: 80 of 80 (100.0%).
Grade 2: 159 of 160 (99.4%).
Grade 3: 198 of 200 (99.0%).
Grade 4: 198 of 200 (99.0%).
Grade 5: 181 of 185 (97.8%).
Grade 6: 165 of 181 (91.2%).
JuniorHS: 538 of 939 (57.3%).
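For anyone curious how a tally like the one above can be produced, here is a minimal sketch in Python. It is not the actual KanjiStats code; the function names, the sample sentences and the tiny reference set are all made up for illustration, and it only looks at the basic CJK Unified Ideographs block.

```python
def unique_kanji(sentences):
    """Collect every distinct character in the basic CJK Unified Ideographs block."""
    return {ch for s in sentences for ch in s if "\u4e00" <= ch <= "\u9fff"}


def coverage(seen, reference):
    """Return (hits, total, percent) for one reference set of kanji."""
    hits = len(seen & reference)
    return hits, len(reference), 100.0 * hits / len(reference)


# Hypothetical toy data: two sentences and a five-character placeholder set.
# A real run would load the deck's sentences and the full grade/Joyo lists.
sentences = ["私は日本語を勉強します。", "今日は月曜日です。"]
grade1_sample = set("日月本一二")  # placeholder, NOT the real Grade 1 list

seen = unique_kanji(sentences)
hits, total, pct = coverage(seen, grade1_sample)
print(f"{len(seen)} total unique kanji.")
print(f"Grade 1 (sample): {hits} of {total} ({pct:.1f}%).")
```

Swapping in the real reference lists (old Joyo, new Joyo, Jinmeiyou, school grades) would give one line per list, in the same shape as the output quoted above.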
I've been floating around this forum for a while and have read so much discussion about Core6K/KiC/KO2001 that I've simply lost track of all the related projects, criticisms and whatnot.
So my question is: Has any group effort been organized and completed (or currently underway) to systematically make an edited expansion of Core6K to cover all the Joyo Kanji (old + new Joyo)?
So, I've started this project myself (and am a whole 25 kanji in), but I would be willing to turn it into a group effort if anyone else is interested. Before the flames start ("KiC has all the joyo readings, dumbass!"), I have some comments:
1) I call this Core6K Expansion because I think similar guiding principles should be used for the sentences: Simple, short, focused on a single vocab item or two with demonstration of common readings.
2) A group effort could get this done quickly -- I am currently doing 6 Kanji per day via Jim Breen's dictionary and alc. Ten people could compile a quality set of sentences within two weeks.
3) Seems like a reference list many people could profit from.
4) I've tried using RhinoSpike and have thus far gotten 1 (really nice quality!) recording of a sentence. I don't know what sort of luck people have had with RS, but if we split up our sentences we will automatically get:
a) OK to good quality audio reading of each sentence
b) A sentence proof-read by a native Japanese speaker
c) A final product that gives native, natural audio for all Joyo Kanji (when combined with Core6K).
(The audio part would be a perk)
5) Finally: I think one other guiding principle should be efficiency. I don't know what the proper expectation is, but at ~2 sentences per remaining kanji character I'm expecting to get about 1000 sentences. I could easily see this exploding into a 1500 to 2000 sentence thing, which might not be too bad, really, but it might be a bit excessive. A final editing step may be necessary to whittle down the collection.
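A back-of-the-envelope check of the numbers in this list, as a small Python sketch. The counts come straight from the KanjiStats output quoted above; the helper names are just for illustration, and the 6 kanji per person per day is the pace mentioned in point 2.

```python
# Counts taken from the KanjiStats output quoted earlier in this post.
old_joyo_total, old_joyo_seen = 1945, 1519
new_joyo_total, new_joyo_seen = 191, 63

# Joyo kanji (old + new) that Core6K never touches.
remaining = (old_joyo_total - old_joyo_seen) + (new_joyo_total - new_joyo_seen)


def sentences_needed(kanji_count, per_kanji):
    """Total sentences if every remaining kanji gets `per_kanji` sentences."""
    return kanji_count * per_kanji


def days_needed(kanji_count, people, kanji_per_person_per_day=6):
    """Days to cover all kanji at a fixed daily pace (rounded up)."""
    daily = people * kanji_per_person_per_day
    return -(-kanji_count // daily)  # ceiling division


print(remaining)                       # 554
print(sentences_needed(remaining, 2))  # 1108
print(days_needed(remaining, 10))      # 10
print(days_needed(remaining, 1))       # 93
```

So roughly 550 kanji remain, about 2 sentences each lands a bit over 1000 sentences, and ten people at 6 kanji a day fit inside the two-week estimate.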
Alright, sorry for being long-winded about this. If there is considerable interest from the community, let's get a spreadsheet going and talk a little bit about organizing; otherwise I'll keep at it myself one character at a time and share the results in a few months when I finish.
I've seen some "fan-made Core 1000" project threads, but I don't think anything ever came from them (someone correct me if I'm wrong).
I really like the idea of an expansion to Core6k, and would likely find it useful when done.
My only suggestion would be to maybe use something besides EDICT and Eijirou (alc's dictionary) as sources of example sentences, since they're user-contributed and might contain errors. I usually use Kodansha's 大 dictionary, the 中 version of which is accessible from Weblio, but another professional dictionary like Progressive would work as well.
Good suggestion, Bokusenou! It's totally obvious, but I never thought of using the old Japanese-English dictionary (book) on my bookshelf. Each word has some example sentences in it. Also, the massive "grammar sentences" Anki deck will presumably have quality-controlled content (apart from typos) -- it may have sentences that are too long and complicated for good SRS material, but that judgment could be made by the contributor.
Are there any other sources that have quality-assured English translations? I have this Kodansha's Japanese-English dictionary, there's the Grammar Sentences Anki deck (content taken from the 3 published grammar dictionaries), the Japanese Sentence Patterns for Effective Communication spreadsheet...
The way I have it set up now is that I took the remaining kanji not included in Core6K (old + new Joyo) and split them into groups of 6 kanji. Anyone who wants to help can sign up for a group of 6 kanji at a time and then file their sentences into the spreadsheet. There are about 90 groups of 6 kanji, so if we can get a few people doing one group of kanji per day we can finish this up pretty quickly. After it's done we can use the lookup-sentences script to fill a vocabulary field.
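The grouping itself is easy to reproduce. A minimal Python sketch, assuming the remaining kanji are available as a plain list; the `chunk` helper and the sample characters are only for illustration and are not the actual sign-up list:

```python
def chunk(items, size=6):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Sample characters only; the real input would be every Joyo kanji
# (old + new) that does not appear in Core6K.
remaining = list("挨曖宛嵐畏萎椅彙茨咽淫唄鬱")

groups = chunk(remaining)
for number, group in enumerate(groups, start=1):
    print(number, "".join(group))
```

The last group can come up short when the total is not a multiple of 6, so a volunteer who picks it just gets a lighter day.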
If you are interested, please send me a private message so I can give you access to the Google doc. I put the first sentences I've worked on into a spreadsheet. If you want to edit and contribute, please message me; otherwise you can publicly view the content here as it grows:
https://spreadsheets.google.com/ccc?key … y=CNyDw8AB
Organizational comments also welcome.
Remember though that the Core series is not based on filling in the jouyou kanji with vocabulary. It's a list of the most common words found in newspapers. Then you have sample sentences for those words.
For example, here's a list I made by taking a Frequency table (top 15000 Japanese words on the web), and removing direct copies of Core words. What's left is about 10000 words. Now, about 1000 or so are grammar type words or common okurigana which you can choose to skip over.
Frequency list minus Core
It's a simple matter to take that 10k word list and split it into five groups of 2000 each. Then sort each of those with Cangy. After that, just add them to Anki with a sample sentence plug-in. Not perfect, but an easy way to follow up Core 2k/6k if you really have to do vocab lists.
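For reference, the "frequency list minus Core" step and the split into groups can be sketched in a few lines of Python. The function names and the toy word lists are assumptions for illustration; the Cangy sort and the Anki import are not shown.

```python
def minus_core(frequency_words, core_words):
    """Drop words already covered by Core, keeping the frequency order."""
    core = set(core_words)
    return [w for w in frequency_words if w not in core]


def split_into_groups(words, group_size=2000):
    """Cut the leftover list into consecutive groups of `group_size` words."""
    return [words[i:i + group_size] for i in range(0, len(words), group_size)]


# Toy data standing in for the 15000-word frequency table and the Core list.
frequency_words = ["の", "日本", "時間", "挨拶", "椅子", "曖昧"]
core_words = ["日本", "時間"]

leftover = minus_core(frequency_words, core_words)
groups = split_into_groups(leftover, group_size=2)  # 2 here, 2000 for the real list
print(leftover)
print(groups)
```

With a leftover list of about 10000 words and the default `group_size=2000`, this gives the five groups described above.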
Have you seen the Japanese CorePlus deck on Anki? (I think it is based on a most common words list)
When I finished Core 6000, I used that deck to learn all the JLPT 1/2/3/4 tagged words. (I imported my existing Core6000 decks into that deck and trimmed down the number of fields, since it is so bulky. I only have 3 fields left, simply expression, reading and meaning, but you can do whatever you like.)
After that you can, in principle, learn them all, but I decided I was tired of learning words (my vocab had become disproportionately good while my comprehension lagged behind), so now I just unsuspend words I come across in the 'wild'.
Btw I think that you should not focus on Kanji, but on words. In the beginning I focused on learning readings for all the Kanji as well, since that gives you a real measure of progress, but in the end I think it is about comprehending what is actually used, and not about being able to read Kanji that no one uses. Just my 5 cents.
EDIT: Seems you have a bit of a different take on vocab learning, since you use audio and sentences and all. I did that at first as well, but went on to straight vocab because it is more efficient. Anyway, sorry if I missed your point a bit, but maybe my post can help you anyway.
Last edited by kame3 (2011 January 03, 9:53 pm)
edit: in my tired state, I posted in the wrong thread.
Last edited by Blahah (2011 January 04, 11:07 am)
My interest with this small project (and this is a joint response to the CorePlus suggestion and NukeMarine's) is to smooth out readings of the 常用漢字 -- they are called 常用 for a reason, and I find both Core6K's and KO2001's decision to stop at some arbitrary frequency cutoff a bit frustrating. I think it's important to recognize the difference between frequency and familiarity. There are many words that are familiar but infrequent and thus never make it into these frequency cutoff lists. So I don't see this as learning "readings of Kanji nobody uses" but rather "readings of words that are not used frequently but everyone knows how to read".
Right now as it stands the remaining ~1700 Core6K words I have are going to be dead easy, I know it. A couple hundred sentences after having completed exposure to all unique kanji in the series, I am pretty much familiar with how to read every sentence I come across (barring the occasional new reading of a learned character) -- the difficulty is, as you pointed out Kame, comprehension. I am increasingly finding myself reading words without difficulty and with good accuracy but having little idea what their precise definition is. That's fine. It is something only solved by building a large vocabulary, for which NukeMarine's suggestion is close to ideal, or the word-list approach for something slightly more streamlined.
I am simply interested in making an efficient, systematic set of sentences that go from Core6K's 1616 kanji to all of the Old+New Joyo Kanji in as few sentences as possible. I see this as a modest expansion of Core6K that is targeted and easily conquerable and will 1) not be a huge study burden to the student and 2) give a good foundation for reading.
I am going to continue with this project myself, and I think it would be great if I could get some help -- it won't take that long with a few people doing a few sentences per day.
As a side comment: One of my biggest frustrations in reading native printed material is coming across compounds that have kanji I don't know the readings of, which means I have to use stroke-count or primitive-based lookup or some other roundabout method of getting the definition. None of those options are too hard, but if you're looking up at least 2 to 5 words per page it can be a helluva lot more cumbersome than looking at an unfamiliar word and immediately knowing how to type it into your computer and get the definition, which is a simple 5-second process. This is one motivator for me to do this project.
This should help then. It's based on the MEXT list, so it's jouyou based. Sheet Two has the vocabulary list. Again, just do a quick sweep of words you already did with Core and the dupes. After that, it's the not so small matter of studying.
https://spreadsheets.google.com/ccc?key … l=en#gid=1
This sounds like a really good idea. In addition to the reasons you already mentioned, it will also mean you can stop RTK reviews, which seems like a big deal to me as I find them really boring.
I was thinking of doing something similar once I finished Core 6k, but I don't really want to do it until then. (I'm about 850 sentences in.) If you're still doing it when I've finished, I'll definitely help out. If not, I guess the only thing I have to say is thanks for sharing ^^.
If you're looking for Japanese speakers, you could try asking on Lang8. I'm sure some people there would be interested in a Japanese/English trade. The guy mentioned in http://forum.koohii.com/viewtopic.php?p … 40#p128440 seems like a good bet.
Last edited by Splatted (2011 January 05, 2:25 pm)