
KO2001 and iKnow (Group Effort Request)

Ok, just updated the front page. Also requested Ropsta add the credits to the lists he compiled to identify who helped here. I removed the minor lists that were completed to aid use.

Out of interest, anyone know a program or a way to use a spreadsheet to prioritize sorting? What I mean is, given a group of kanji, the program looks for words in a list with those kanji (anywhere in the word) and organizes that list based on that.
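If a spreadsheet can't do it, a short script might. Here is a minimal sketch in Python of the idea described above: given a group of kanji, pull every word that contains any of them (anywhere in the word) to the front of the list. The function name `prioritize` and the sample words are made up for illustration; they are not from any of the lists in this thread.

```python
def prioritize(words, kanji_group):
    """Return words containing any kanji from kanji_group first, the rest after.

    Within each part the original list order is kept.
    """
    group = set(kanji_group)
    hits = [w for w in words if group & set(w)]
    rest = [w for w in words if not group & set(w)]
    return hits + rest


# Hypothetical sample data, only to show the call.
words = ["学校", "たべる", "水道", "先生", "道"]
print(prioritize(words, "水道"))
```

The match is on single characters, so a word qualifies no matter where in the word the kanji appears.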
Reply
Nukemarine Wrote:Ok, just updated the front page. Also requested Ropsta add the credits to the lists he compiled to identify who helped here. I removed the minor lists that were completed to aid use.

Out of interest, anyone know a program or a way to use a spreadsheet to prioritize sorting? What I mean is, given a group of kanji, the program looks for words in a list with those kanji (anywhere in the word) and organizes that list based on that.
I haven't seen such a thing, but it sounds useful.
Reply
Ok, I guess StarOffice Calc has a way to sort of do it, but it works on the first character only. It's a matter of making a user-defined sort order. For a larf, I'm going to try to organize 401-2000 of Core Basic, and 2001-4000 of Core Intermediate.

I'll post the order in Google Docs with requests for opinions on how the order came out. To make it a useful list, it'll have numbers to help you organize it via Smart.fm or the pseudo-KO2k1 method. That way, it's a simple matter of pasting into the full list on Google Docs, re-sorting into KO2k1 order, then importing to Anki.

Speaking of which, which thread had a link to the kanji in the KO2k1 order?
Reply
Nukemarine Wrote:Ok, just updated the front page. Also requested Ropsta add the credits to the lists he compiled to identify who helped here.
No problem. I checked the list you made for use as an example. Let me know if I missed something.
Reply
2001.Kanji.Odyssey Word List 36 (526-540)
2001.Kanji.Odyssey Word List 37 (541-555)

Some of the sentences are missing if there weren't any Cerego ones:

539/3
540/1; 540/3
542/3
545/4
550/3
554/2

I can't tell if the user submitted ones are any good, so I've just left them out for now.
Reply
391-405 finished.

http://smart.fm/lists/68276

On to the next set...
Reply
2001.Kanji.Odyssey Word List 34 (496-510)
2001.Kanji.Odyssey Word List 35 (511-525)

Nukemarine, the other 2 lists need to be shifted down a group. I started at the end of the book and worked backwards, not at the end of the group.

Missing sentences:
499/2
502/4a; 502/4b
505/2b
508/1 (no sentence with this pronunciation)
509/4

511/1
511/3
515/4
524/3
525/2
Edited: 2009-03-29, 10:16 am
Reply
2001.Kanji.Odyssey Word List 33 (481-495)

Missing sentences:
481/3
482/2
483/4
489/4
Reply
406-420 finished.

http://smart.fm/lists/69168

travis, since you're working backwards, I'll jump ahead and call 556-570.
Reply
pubbie,

Not quite sure how you're doing this, as my connection is too slow to check out your work. Now, are you organizing the Core2k and Core6k vocabulary words using the KO2k1 order, then using that order to sort the sentences attached to the vocabulary words?

Reason I ask is that it sounds like your script is looking for kanji in the entire sentence and then putting that in order. That could mean a sentence gets used up to 10 times if it has 10 kanji in it.

A better method may be: sort the Core2k and 6k vocabulary words with your script into KO2001 order, put the sorted list in a spreadsheet, and add a number column (for re-sorting). With that, it's not too much work to sort both the Core2k and 6k spreadsheets that have everything (photo, audio, sentence, translations, etc.).
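The number-column idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the field names (`word`, `sentence`, `order`) and the sample rows are hypothetical, and the already-sorted word list is taken as given. Each word gets its position in the sorted list as a number, and the full rows (with audio, sentences and so on) are then sorted by that number.

```python
def number_words(sorted_words):
    """Map each word to its 1-based position in the already-sorted list."""
    return {word: n for n, word in enumerate(sorted_words, start=1)}


def resort_full_sheet(rows, numbers, word_field="word"):
    """Add an 'order' column to the full rows and sort by it.

    Words missing from the sorted list are placed at the end.
    """
    last = len(numbers) + 1
    numbered = [dict(row, order=numbers.get(row[word_field], last)) for row in rows]
    return sorted(numbered, key=lambda r: r["order"])


# Hypothetical data: a sorted vocabulary list and a "full" sheet in another order.
sorted_words = ["水", "道", "水道"]
full_rows = [
    {"word": "水道", "sentence": "sentence C"},
    {"word": "水", "sentence": "sentence A"},
    {"word": "道", "sentence": "sentence B"},
]
for row in resort_full_sheet(full_rows, number_words(sorted_words)):
    print(row["order"], row["word"], row["sentence"])
```

Because the number travels with the row, the sheet can be put back into this order at any time after sorting it some other way.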

Edit: Can someone compile the next set of completed sentences (361 to 450, I believe)? Don't forget to credit who compiled which set.
Edited: 2009-04-01, 4:52 pm
Reply
Let me know if there are any problems with the compiled lists.

http://smart.fm/lists/70065


Mmmmm craaaacckkk....
Edited: 2009-04-01, 5:42 pm
Reply
The problem is that we DON'T have a list of the Kanji in KO2001 order... and it would take AGES to make one up manually. I asked in another thread if someone with a scanner or one of those pens could do this, but no-one responded...

I think this is worthwhile, because there are a lot of people who don't use iKnow/Smart.fm but use ANKI instead...
Edited: 2009-04-02, 5:22 am
Reply
SammyB Wrote:The problem is that we DON'T have a list of the Kanji in KO2001 order... and it would take AGES to make one up manually. I asked in another thread if someone with a scanner or one of those pens could do this, but no-one responded...

I think this is worthwhile, because there are a lot of people who don't use iKnow/Smart.fm but use ANKI instead...
So use the iKnow importer for Anki. Problem solved.
Reply
Pubbie, I don't think it's just me; others like to learn vocabulary via the sentence method too. Some are concerned with the entire sentence. I'm concerned mainly with the word and only secondarily with the sentence.

Plus, you can be systematic with vocabulary words. I have the top 60,000 words in Japanese, a list which was developed by scanning 250 million characters on websites. Removing kana, romaji, numerals and special characters leaves you with a useful list. However, iKnow sort of kind of has done this already by using vocabulary lists developed by frequency in newspapers. What I wanted was to take those frequency-based lists, then organize them using the KO2k kanji order. Reason being, people say they learn words much, much faster that way.

Anyway, I recommend that your program (if it doesn't do this already) delete all but the last dupe. By last, I mean the sentence or vocabulary word filed under the more "difficult" kanji, i.e. the one later in the list. So if 水道 comes up, there's an entry under 水 and one under 道. The word stays later in the list under the 道 kanji, while the entry under 水 is deleted. If the program comes across all-kana words or sentences, those get evenly spaced out in the list.
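A rough Python sketch of that rule, under stated assumptions: instead of generating one entry per kanji and deleting the earlier ones, it files each word once under its latest kanji in the order, which should give the same result. The kanji order in the example is made up and is not the real KO2001 order. `is_kanji` only checks the basic CJK Unified Ideographs block, so characters such as 々 are not counted, and kanji missing from the order are treated as coming after everything else.

```python
def is_kanji(ch):
    # Basic CJK Unified Ideographs block only; an approximation.
    return "\u4e00" <= ch <= "\u9fff"


def order_words(words, kanji_order):
    """Place each word under its latest kanji; spread all-kana words evenly."""
    rank = {k: i for i, k in enumerate(kanji_order)}
    unranked = len(rank)  # kanji not in the order sort after all listed ones
    kanji_words, kana_words = [], []
    for w in words:
        ranks = [rank.get(ch, unranked) for ch in w if is_kanji(ch)]
        if ranks:
            kanji_words.append((max(ranks), w))  # keep only the "last" entry
        else:
            kana_words.append(w)
    kanji_words.sort(key=lambda pair: pair[0])  # stable: ties keep input order
    result = [w for _, w in kanji_words]
    gap = len(result) / (len(kana_words) + 1)
    # Insert from the back so earlier insert positions stay valid.
    for i in reversed(range(len(kana_words))):
        result.insert(int((i + 1) * gap), kana_words[i])
    return result


# Hypothetical kanji order and word list, for illustration only.
print(order_words(["水道", "学校", "水", "たべる", "道"], "水道学校"))
```

The same function would work on sentences instead of words, since it only looks at the characters in each string.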

For Core2k and 6k sentences, recommend the following "Bunches"

Core 2000
First 400
Last 1600 (401-2000)

Core 6000
Next 1000 (2001-3000)
Next 1000 (3001-4000)
Next 1000 (4001-5000)
Last 1000 (5001-6000)

My reasoning: The first 400 are very common words that presented no problem to people based on comments. The next set became a bear very fast. So these 1600 can be organized. After that, I say organize the next groups of 1000 on the idea that you have nice controlled blocks of words for quick learning and usefulness.
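For anyone scripting this, the bunches above could be cut out of a frequency-ordered list like so. A minimal sketch in Python: the bunch labels are just names I made up for the ranges listed above, and the list of words is assumed to be in frequency order with the first item being word number 1.

```python
# (label, first word number, last word number), both ends inclusive.
BUNCHES = [
    ("Core 2000: first 400", 1, 400),
    ("Core 2000: last 1600", 401, 2000),
    ("Core 6000: 2001-3000", 2001, 3000),
    ("Core 6000: 3001-4000", 3001, 4000),
    ("Core 6000: 4001-5000", 4001, 5000),
    ("Core 6000: 5001-6000", 5001, 6000),
]


def split_into_bunches(words):
    """Slice a frequency-ordered list (words[0] is word number 1) into bunches."""
    return {label: words[first - 1:last] for label, first, last in BUNCHES}


# Stand-in data: word numbers instead of real words.
bunches = split_into_bunches(list(range(1, 6001)))
for label, chunk in bunches.items():
    print(label, len(chunk))
```

Each bunch can then be re-ordered by kanji on its own, so words never move from one bunch to another.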
Reply
You know, I don't think it even has to be the 2001ko order... It just has to be an order that builds from easy to hard. The official Japanese schooling order would work better than nothing, in fact.

I think I'm going to try to find some time this weekend to write a script that will sort things like you are suggesting, Nukemarine. I don't think it'll be too hard, actually... It'll be harder to optimize it than anything.
Reply
wccrawford Wrote:You know, I don't think it even has to be the 2001ko order... It just has to be an order that builds from easy to hard. The official Japanese schooling order would work better than nothing, in fact.

I think I'm going to try to find some time this weekend to write a script that will sort things like you are suggesting, Nukemarine. I don't think it'll be too hard, actually... It'll be harder to optimize it than anything.
I was thinking the same thing, what with the Kanji in Context order being the most notable one. That tanuki list that had 9000 sentences in KiC order was a gold mine (not to mention having JAPANESE DEFINITIONS OF WORDS) which was what got me on the KO2k thing with iKnow.

To be honest, I know there's a point you stop doing systematic learning with pre-generated material. You don't go past a certain number of kanji, a certain number of grammar points and a certain number of vocabulary words. At a point we do get diminishing returns. After 3000 kanji, 300 grammar points, and 6000 most common vocabulary words learned via the sentence method and RTK, I know it's going to be all passive input past that. Then it's a matter of doing what Khatzumoto suggested at the beginning: plugging in stuff that interests me.

Sorry to go off on a tangent. I'm just getting my internal thinking out there as to why this can be useful and how it can be useful. Alyks over on his rants page makes a good point that we on this board tend to waste a lot of non-Japanese time getting together lots of Japanese material. I'm REALLY guilty of that. However, I see a benefit not only to myself but to many others should this material be there for ready use in whatever manner the user ultimately chooses.
Edited: 2009-04-02, 3:17 pm
Reply
Umm, what happened to pubbie's posts? Did I miss something or was this behind the scenes?
Reply
I just wanted to thank everyone for their efforts making these lists.

I went through KO, and now I'm using the smart.fm lists for listening practice by importing them into Anki. I can read most of the sentences, but listening is often really hard for me the first time round, so I think it will be worthwhile doing this listening practice.
Reply
pubbie Wrote:wccrawford,
I will also try scripting something that meets the criteria stated by nukemarine. Let's see who gets done first!
On the subject of optimization: my script to generate an Anki deck with sentences for ~900 KO2001-ordered kanji created an output file that was 1.3 gigabytes big... Worst-case time complexity... I have a talent for these things.
EDIT: nukemarine, I sometimes delete my own posts due to paranoia :(
I've got a busy weekend, so you'll probably win. ;) In fact, I haven't even decided what language I want to use yet. -sigh- PHP is what I use for work all the time, so it'll probably be that... But I have a feeling Ruby might be a better choice. Or at least more fun.
Reply
Nukemarine Wrote:To be honest, I know there's a point you stop doing systematic learning with pre-generated material. You don't go past a certain number of kanji, a certain number of grammar points and a certain number of vocabulary words. At a point we do get diminishing returns. After 3000 kanji, 300 grammar points, and 6000 most common vocabulary words learned via the sentence method and RTK, I know it's going to be all passive input past that. Then it's a matter of doing what Khatzumoto suggested at the beginning: plugging in stuff that interests me.
I agree. I had already pretty much decided to get through the vocab for 2001ko and then start on grabbing vocab from mangas I'm reading. (Actually, I'm trying to find a way to do both at once, but the mass-imports into Anki make that difficult... Hmm...)
Reply
wccrawford Wrote:
pubbie Wrote:wccrawford,
I will also try scripting something that meets the criteria stated by nukemarine. Let's see who gets done first!
I've got a busy weekend, so you'll probably win. ;) In fact, I haven't even decided what language I want to use yet. -sigh- PHP is what I use for work all the time, so it'll probably be that... But I have a feeling Ruby might be a better choice. Or at least more fun.
Sadly, PHP doesn't seem to be up to the task. I had it just about done and then realized that PHP's bad Unicode handling keeps messing it up. At least, I'm pretty sure that's the problem. I'll probably have to use Ruby instead.
Reply
pubbie Wrote:
wccrawford Wrote:Sadly, PHP doesn't seem to be up to the task. I had it just about done and then realized that PHP's bad Unicode handling keeps messing it up. At least, I'm pretty sure that's the problem. I'll probably have to use Ruby instead.
PHP code is almost exactly like Perl code, you could probably easily run what you have with the Perl interpreter. Or if you want you could post the code somewhere so I can have a looksie :D
Perl is crap compared to Python and Ruby though. I mean, PHP is easiest to use so it's recommended but if you're going to switch the language used anyway, why not go for something superior.
Reply
I agree that Python & Ruby are nice. But their Unicode support kinda lets you down.

Perl's Unicode support is flawless. But then you gotta take care not to write "write-only" code.

How is PHP with Unicode?
Edited: 2009-04-05, 7:18 am
Reply
Tobberoth Wrote:
pubbie Wrote:
wccrawford Wrote:Sadly, PHP doesn't seem to be up to the task. I had it just about done and then realized that PHP's bad Unicode handling keeps messing it up. At least, I'm pretty sure that's the problem. I'll probably have to use Ruby instead.
PHP code is almost exactly like Perl code, you could probably easily run what you have with the Perl interpreter. Or if you want you could post the code somewhere so I can have a looksie :D
Perl is crap compared to Python and Ruby though. I mean, PHP is easiest to use so it's recommended but if you're going to switch the language used anyway, why not go for something superior.
Or why not go the whole way and learn C# & .NET and join the pros in 2009?
;)
Edited: 2009-04-05, 7:22 am
Reply
Lol, and why not COBOL then?
Reply