Joined: Aug 2006
Posts: 39
Thanks:
0
It just represents linear growth; I think it's there for illustration purposes. I assumed "const" meant a constant function, which sounded intuitive, although I have no clue if that's what it's called in English. Sorry.
Edited: 2010-04-17, 5:02 pm
Joined: Aug 2009
Posts: 34
Thanks:
0
The rate at which new unique kanji appear is constant. On the graph it looks like for every 1000 sentences there are approx. 280 new kanji, and this doesn't change, so at 2000 sentences there will be 560 kanji and so on.
I think...
Edit: gah, Kubelek beat me...:)
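To make the arithmetic above concrete, here is a minimal sketch of the straight-line reading of the graph. The 280-per-1000 figure is only the eyeballed estimate from this post, and `expected_kanji` is a made-up helper name, not anything from the actual plotting script.

```python
# Rate read off the graph by eye (an approximation, not a measured value).
NEW_KANJI_PER_1000_SENTENCES = 280


def expected_kanji(sentences_seen, rate_per_1000=NEW_KANJI_PER_1000_SENTENCES):
    """Kanji seen so far, assuming new kanji arrive at a constant rate."""
    return sentences_seen * rate_per_1000 / 1000


for n in (1000, 2000, 6000):
    print(n, expected_kanji(n))
```

Doubling the number of sentences doubles the kanji count, which is all a straight line through the origin says.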
Edited: 2010-04-17, 5:01 pm
Joined: Dec 2006
Posts: 364
Thanks:
0
the slope of the graph is the rate of new kanji introduction per sentence, so a flat horizontal line means no new kanji are being introduced
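A rough sketch of how such a curve could be computed, for anyone who wants to try it on their own deck. This is not the script that produced the graph; the function names are invented, the sample sentences are toy data, and "kanji" is assumed to mean any character in the basic CJK Unified Ideographs block.

```python
def kanji_in(sentence):
    """Set of kanji in a sentence (assumes the U+4E00..U+9FFF block only)."""
    return {ch for ch in sentence if "\u4e00" <= ch <= "\u9fff"}


def cumulative_unique_kanji(sentences):
    """For each position, the number of distinct kanji seen so far."""
    seen = set()
    curve = []
    for sentence in sentences:
        seen |= kanji_in(sentence)
        curve.append(len(seen))
    return curve


# Toy data, not taken from Core 2k/6k.
sentences = ["日本語を勉強する", "日本に行く", "本を読む", "また本を読む"]
curve = cumulative_unique_kanji(sentences)

# Slope between consecutive points = new kanji introduced by each sentence.
slopes = [b - a for a, b in zip([0] + curve, curve)]
print(curve)   # [5, 6, 7, 7]
print(slopes)  # [5, 1, 1, 0]
```

The last sentence adds no new kanji, so its slope is 0 and the curve is flat there.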
nuke is nukemarine's suggestion of dividing into 3 sections (core2k steps 1-2, core2k steps 3-, and core6k) and sorting by ko2001 order within each of those sections
because unsorted and nuke cover the same sentences within those sections, they have to meet at those points
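A small sketch of why the curves have to meet at the section boundaries, under some assumptions: the kanji order here is a made-up stand-in for the ko2001 order, each sentence is keyed by the latest kanji it contains in that order (my guess at what "sorting by ko2001 order" means in practice), and all names and data are hypothetical.

```python
def kanji_in(sentence):
    """Set of kanji in a sentence (assumes the U+4E00..U+9FFF block only)."""
    return {ch for ch in sentence if "\u4e00" <= ch <= "\u9fff"}


def cumulative_unique_kanji(sentences):
    """For each position, the number of distinct kanji seen so far."""
    seen, curve = set(), []
    for sentence in sentences:
        seen |= kanji_in(sentence)
        curve.append(len(seen))
    return curve


def sort_within_sections(sentences, boundaries, kanji_order):
    """Sort each section separately; sentences never cross a boundary."""
    rank = {k: i for i, k in enumerate(kanji_order)}

    def key(sentence):
        # Position of the latest-introduced kanji; unknown kanji sort last.
        ranks = (rank.get(k, len(rank)) for k in kanji_in(sentence))
        return max(ranks, default=-1)

    result, start = [], 0
    for end in list(boundaries) + [len(sentences)]:
        result.extend(sorted(sentences[start:end], key=key))
        start = end
    return result


# Toy data: two sections of two sentences each, hypothetical kanji order.
unsorted = ["語を学ぶ", "日本", "本を読む", "日"]
resorted = sort_within_sections(unsorted, boundaries=[2], kanji_order="日本語学読")

a = cumulative_unique_kanji(unsorted)
b = cumulative_unique_kanji(resorted)
print(a, b)
```

Each section holds the same set of sentences in both orderings, so the set of kanji seen by the end of a section is identical; the curves can differ inside a section but agree at positions 2 and 4 in this toy example.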
from the discontinuities in the unsorted slope at those points, you can see that they are the only real divisions within the original ordering
fancynuke is the same division into 3 sections, but within each section the sort goes first by the kanji already used in the previous section (ordered by their frequency in the current section), then by the remaining kanji in the current section (also by frequency); this basically delays introducing new kanji as much as possible and hence flattens the graph
I assume the flat section in ko2001 is due to finally reaching a common kanji that is used in many sentences containing only previously seen kanji
const isn't an actual sort order; it's the average rate for the whole collection, and is there as a reference for what I think would be the ideal: a constant rate of new kanji introduction
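For reference, a minimal sketch of how a line like const could be derived from any of the other curves. `const_reference` is a hypothetical name and the input is toy data; the only assumption is that the curve is the cumulative count of unique kanji per sentence position.

```python
def const_reference(curve):
    """Straight line from the origin to the curve's final point.

    `curve[i]` is the number of unique kanji after sentence i + 1.
    The average rate is total kanji divided by total sentences.
    """
    total_kanji = curve[-1]
    rate = total_kanji / len(curve)
    return [rate * (i + 1) for i in range(len(curve))]


# Toy cumulative curve: front-loaded kanji, 8 in total over 4 sentences.
curve = [5, 6, 7, 8]
print(const_reference(curve))  # [2.0, 4.0, 6.0, 8.0]
```

Every ordering of the same sentences ends at the same total, so the reference line comes out the same no matter which sort order's curve is passed in.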
Joined: Jul 2007
Posts: 2,313
Thanks:
22
**Edit: guess I replied while Cangy was posting his. Eh, it happens**
Well, I think he's referring to my moniker since I was asking about sorting Core 2k/6k according to 2001KO order.
The "Nuke" is sorting the entire Core 2k/6k according to 2001KO order (then RTK if into 2k1KO). The "Fancy Nuke" refers to the list where you study the first 400 sentences of Core 2k (sorted), then the next 1600 of Core 2k (sorted), then the Core 6k (sorted). It works on the idea that there are common words using more advanced kanji that should be learned before the less common words using simpler kanji. Of course, that means you don't get new kanji in later groups of sentences until a bit later in Core 2k steps 3-10, and even later in Core 6k.
Edited: 2010-04-18, 5:01 pm
Joined: Jan 2010
Posts: 247
Thanks:
0
@cangy, @Nukemarine
Many thanks! That clarifies a lot!
Edited: 2010-04-18, 12:06 pm