Interest in developing Hanzi.Odyssey.3501? - Printable Version
+- kanji koohii FORUM (http://forum.koohii.com)
+-- Forum: Learning Chinese (http://forum.koohii.com/forum-17.html)
+--- Forum: Chinese and Hanzi (http://forum.koohii.com/forum-20.html)
+--- Thread: Interest in developing Hanzi.Odyssey.3501? (/thread-13127.html)

Interest in developing Hanzi.Odyssey.3501? - mafried - 2009-05-31

There are a lot of people here who are also learning Mandarin Chinese. I don't need to explain the benefits of KO2001 to anyone here. But unfortunately there isn't anything quite like it for Chinese (if there is, please let me know!). Would anyone here be interested in helping to create a community-developed Hanzi Odyssey series? We'd have to recruit some native speakers to help out, of course, and I know a few. But there's a lot of work that we could do too in structuring it. I think it'd be a great way to both study and give back to the community at the same time.

Interest in developing Hanzi.Odyssey.3501? - Nukemarine - 2009-05-31

The process seems simple enough, it just has to be done in steps.

Step 1 - Organize Hanzi into thematic groups of five. Early on this should be simple as it'll mesh with KO2001, but later Hanzi will take some thinking. The Hanzi should be 3000 in number or about that, right?

Step 2 - Gather the most frequently used words. I'm sure a list like this is lying around. Another option is to use a site-scanning program that can tabulate word frequency.

Step 3 - Organize the frequency words from Step 2 using the hanzi groups from Step 1. Someone earlier said they made a program that organized sentences based on Kanji, so it's not a stretch to make a program that organizes Chinese words based on the Hanzi inside them. Words with less-used Hanzi go toward the end, even if they also contain a very common Hanzi.

Step 4 - Groom the list for obvious errors and to balance words per hanzi. Since Step 3 is an automatic process, Step 4 is important to remove problems such as one hanzi getting 20 words and another getting 3.

Step 5 - Utilize smart.fm's Chinese core. Hope that Cerego releases an intermediate and advanced list for Chinese.

Interest in developing Hanzi.Odyssey.3501?
- mafried - 2009-05-31

Actually the first 50 or so hanzi will likely be quite different, since there are some very common and important grammatical hanzi that are relatively rare in Japanese, where native hiragana words/particles are used instead (I'm thinking of 是=です and 的=の of course, but there are many more). But the KO2001 ordering will certainly be a useful reference. Step 1 is something anyone who has finished RTK/RTH can do.

There are a lot of frequency lists that can be used as a basis for your Steps 2, 3, and 4. I'm considering writing a script that will compare these frequency lists and select a dozen or so of the most common compounds. I would then like to have a native speaker narrow it down to the really important ones, like CosCom did, and maybe add in some that really should have been included but slipped through the frequency counts.

As for the example sentences, if we got this far I was hoping to pay a creative Chinese student at my university (or one of my gf's friends) to come up with some good i+1 example sentences. I could even record audio. But smart.fm would be a good backup choice.

My biggest worry in doing all this is that mistakes will creep in at any level, simply because we're all still learners of the language. I could write scripts to do most of it myself, but I'm not confident the results will be worth learning from. If, on the other hand, this were a community-driven project, there'd be a lot of eyes looking at it and we should be alright.

EDIT: Nukemarine, are you learning Chinese? I can't remember.

Interest in developing Hanzi.Odyssey.3501? - Jarvik7 - 2009-05-31

A word of warning about word frequency: consider what the frequency list was generated from. The common Japanese one that people use was generated from a financial newspaper and is thus not at all representative of typical Japanese. This is why iKnow's official word lists are so bizarre. You learn political and financial 専門用語 (specialist jargon) before really common stuff like "mom" or "hand".
Failing the pre-existence of something like this, there is probably a parser out there that can generate a frequency list given text input. Feeding the entire contents of the Chinese Wikipedia (this can be downloaded in a big zip afaik) into it would produce something better than financial garbage.

Interest in developing Hanzi.Odyssey.3501? - mafried - 2009-05-31

Yeah, that's why I want to include native speakers in the process. The frequency list is a tool, as frequency and importance are often correlated. But they are certainly not the same.

EDIT: the other tool to use is standard vocabulary lists like the ones you find in textbooks. I've found that lists based on frequency often leave out a small number of items from a list of common vocabulary (the frequency list might net you 'table', 'chair', and 'bed', but leave out 'couch'. Why? Who knows.)

Interest in developing Hanzi.Odyssey.3501? - Nukemarine - 2009-05-31

Jarvik, it's for that reason that one would sort the frequency list into an intuitive order (the Hanzi.Odyssey.3500 order, you could say). If such a thing were done with iKnow's list, the 2000 words would still include 経済 and 手 and お母さん, but the word for economics would be shifted to the end due to using an infrequent kanji, instead of being the 8th word you learn in Step 3. So you'll still have wonky words, but they'll be at the end of the list.

Jarvik, I would personally use newsgroups, as this will be people talking about stuff they like using words they know. Another option could be scanning TV and movie scripts, if such a thing is possible. No matter what's used, there will be odd words that get in there.

Mafried, no, I'm not learning Chinese. I can't even grasp Japanese yet.

Interest in developing Hanzi.Odyssey.3501? - Jarvik7 - 2009-05-31

A frequency list itself should be an intuitive order. The problem is when you generate that list from a source that isn't representative of typical language (as in a financial newspaper).
Personally I've never used KO, but I'm assuming they used Hadamitzky & Spahn style ordering, where you only get compounds consisting of characters you already know. That's probably a better idea than pure frequency, but the frequency should still be respected in a general way.

Wikipedia was just an example, but it is the easiest way to get a ton of general readership text all at once with minimal manual labor. It also covers a very broad range of topics (all of human knowledge if their charter was fulfilled), so while a lot of uncommon words will get in, the common ones would still be at the top. Movie scripts or newsgroups would also of course work, but you would have a much smaller dataset and thus less reliable statistics.

Interest in developing Hanzi.Odyssey.3501? - mafried - 2009-05-31

No, there's nothing inherent about making a frequency list that ensures an intuitive ordering. In fact it's usually quite the opposite. For example, the five most common English nouns are time, person, year, way, and day, in that order. Time, year, and day are thematic, for sure. But what are person and way doing there? I would group week (#17) with the time nouns, and learn man (#7) and woman (#14) alongside person. And while we're at it, why is man so much higher on the list than woman? That's not intuitive.

Frequency lists will be useful for getting a list of words to include, but a fair amount of manual tweaking will be required to make the learning process as efficient as possible. Thank you for the suggestions for making a frequency list though, that may come in handy.

Interest in developing Hanzi.Odyssey.3501? - Nukemarine - 2009-05-31

Correct, it's doing both things. You group words thematically, which is kind of easier to do in Japanese and Chinese by using Kanji, but the words you're using are from a frequency list.
The frequency means you get more bang for your buck (the likelihood that you'll see/use the word in the real world), while the ordering makes the learning more intuitive.

Even after all that, there's the other problem that one word can have many different uses. Getting over that hurdle is going to require native help. There's no way around that.

Interest in developing Hanzi.Odyssey.3501? - mafried - 2009-05-31

Nukemarine Wrote: The Hanzi should be 3000 in number or about that, right?

That's how many Heisig/Richardson decided to include in RTH... which was about as arbitrary as Harbaugh's 4400 in zhongwen.com. I just as arbitrarily decided to split the difference for now; the actual number may be more or may be less.

Unlike with Japanese, there's no real hanzi-use standard, or a means of enforcing one. If you want to use a word that includes a rare hanzi... you really have no choice but to use that hanzi or find a synonym that uses more common characters. In Japanese it's easier because words using rare kanji can be replaced with their kana equivalents.

Interest in developing Hanzi.Odyssey.3501? - ghinzdra - 2009-05-31

I intend to take up Chinese in the near future (either by the end of this year or in the summer of next year), so I have a deep interest in this idea. I have two remarks.

- I don't see the point of thematic groups / intuitive order... KO2001 has been written this way, true. And kudos to them. The more you get for your money the better. But do you have the slightest idea how hard it is to pull off on 3000 kanji? Especially when it's done by amateurs, which means a large but irregular workforce? Endless arguments about the best order... on 3000 kanji!!! I think THE master of the frequency list is Tim Ferriss, who learned 5 languages through this device (among them Chinese), and I've never heard a single word from him about intuitive order... It's just about getting a maximum of understanding for a minimum of learning, and that's it. I agree it's a very nice feature though.
But I don't think we can afford it. Let's aim for something within our reach.

- It seems to me the most efficient way is to put together an online hanzi frequency list and a hardcover vocabulary frequency list.

For the hanzi frequency list, http://lingua.mtsu.edu/chinese-computing/statistics/ seems to be an excellent basis.

For the vocabulary frequency list, a couple of them:

- A Frequency Dictionary of Mandarin Chinese: Core Vocabulary for Learners. Authors: Richard Xiao; Paul Rayson; Tony McEnery. Here is a sample; it looks great to me. You have the frequency number (even if I trust the lingua list more), a word, a sentence, and its English translation: http://media.routledgeweb.com/pdf/9780415455862/sample_mandarin.pdf
- Title: 6000 Chinese Words: A Vocabulary Frequency Handbook. ISBN: 957638527X. Author: James Erwin Dew.
- The Chinese Language Corpus site at Academia Sinica has published a book called "6,000 Chinese Words".

We would just have to follow the order of lingua.mtsu.edu and add an example taken from the chosen book for each character. Simple and efficient.

Besides, I have some thoughts about cross-referencing with RTK/RTH based on what David556 provided: http://forum.koohii.com/viewtopic.php?id=2235

Interest in developing Hanzi.Odyssey.3501? - mafried - 2009-06-09

Thanks for the useful info, ghinzdra. The lists will be quite important when the work starts.

Regarding your first point, I think the "benevolent dictator" model is what should be followed here. The final say on what order and which words to include would be given to a very open-minded and unopinionated person. I'd be willing (but not anxious) to step up to the plate for this, although I can't start working on it until later this summer.

Regarding the intuitive order... I can't offer anything other than anecdotal evidence of "it works for me!" I have trouble remembering words in isolation, but I'm not just talking about the context of a sentence.
I can learn the word for "mother" in any language as easily as anyone else, but for only, say, 2-3x the effort I could also learn "father" and the words for "brother", "sister", "sibling", "cousin", "parent", "aunt", "uncle", etc. all at once. What's more, seeing these together builds associative connections that make the words easier to recall when they are actually needed. But maybe that's just me. All I'm suggesting is that we take the words of a frequency list and group them thematically.

In any case, I can't start work on this until the end of July at the earliest. But it's definitely going to happen, and any help from forum members would be greatly appreciated.

Interest in developing Hanzi.Odyssey.3501? - Jarvik7 - 2009-06-09

mafried Wrote: Unlike with Japanese, there's no real hanzi-use standard, or a means of enforcing one. If you want to use a word that includes a rare hanzi.. you really have no choice but to use that hanzi or find a synonym that uses more common characters.

Actually they frequently use simpler characters for their phonetic values to represent a complex character too, much like Japanese ateji.

Interest in developing Hanzi.Odyssey.3501? - Hashiriya - 2009-06-09

I couldn't imagine going through the process of learning yet another Asian language... I think I'll just stick with learning Japanese for now.

Interest in developing Hanzi.Odyssey.3501? - Codexus - 2009-06-10

Mmmmm, I'm ashamed. I've just bought myself a subscription to the new chineseclass101 podcast from the guys at Innovative Language, and I guess now I'll have to renew my chinesepod account that expired two weeks ago too. Of course, I don't have the time to really learn Chinese, but I just can't resist pretending I'm going to anyway v_____v

Interest in developing Hanzi.Odyssey.3501? - zer0range - 2009-08-03

Any further development with this? I am willing to contribute some time to getting things read and critiqued by my native-speaking teachers.

Interest in developing Hanzi.Odyssey.3501? - mafried - 2009-08-04

It has remained stalled for the moment. I'm back in the country but there are a lot of things taking up my time now: new job, moving into a new apartment, etc. It's taking me longer than I thought to get settled. That, and what time I've had for studying is being taken up by the one hanzi, one picture thread, and more recently the smaller project of sentence mining ZhongWen Red / Green / Blue. So I'm not ready to make any promises on my end. But any help from others would be welcome and appreciated, so thank you.
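
Editor's sketch: the ordering procedure Nukemarine describes in Steps 3 and 4 (place each frequency-list word under the hanzi groups, push words containing a later hanzi toward the end, then check the balance of words per hanzi) might look roughly like the Python below. Nobody in the thread posted this code. The function names `order_words_by_hanzi` and `flag_unbalanced`, the thresholds, and the toy character order and frequency counts are all assumptions made up for illustration; a real run would start from an actual hanzi order and a real frequency list such as the ones linked above.

```python
from collections import defaultdict


def order_words_by_hanzi(hanzi_order, word_freq):
    """Attach each word to the latest-learned hanzi it contains.

    hanzi_order: characters in the intended learning order (assumed unique).
    word_freq:   dict of word -> frequency count (hypothetical data).
    Returns (ordered, skipped): ordered is a list of (hanzi, [words]) pairs
    in learning order, words sorted by descending frequency; skipped holds
    words containing a character that is not in hanzi_order.
    """
    rank = {ch: i for i, ch in enumerate(hanzi_order)}
    buckets = defaultdict(list)
    skipped = []
    for word, freq in word_freq.items():
        if not word or any(ch not in rank for ch in word):
            skipped.append(word)
            continue
        # A word is only learnable once its latest hanzi has been covered,
        # so a common word with one late hanzi moves toward the end.
        latest = max(word, key=lambda ch: rank[ch])
        buckets[latest].append((freq, word))
    ordered = []
    for ch in hanzi_order:
        by_freq = sorted(buckets[ch], key=lambda pair: -pair[0])
        ordered.append((ch, [word for _, word in by_freq]))
    return ordered, skipped


def flag_unbalanced(ordered, min_words=1, max_words=5):
    """List (hanzi, word_count) pairs outside the wanted range (Step 4)."""
    return [(ch, len(words)) for ch, words in ordered
            if not min_words <= len(words) <= max_words]


if __name__ == "__main__":
    # Toy data only: neither the order nor the counts come from a real list.
    toy_order = ["是", "的", "我", "手", "经", "济"]
    toy_freq = {"是": 100, "我": 90, "经济": 80, "我的": 50, "妈妈": 40, "手": 30}
    ordered, skipped = order_words_by_hanzi(toy_order, toy_freq)
    for hanzi, words in ordered:
        print(hanzi, words)
    print("skipped:", skipped)
    print("needs grooming:", flag_unbalanced(ordered))
```

With the toy data, 经济 ends up last despite its high count because 济 comes last in the assumed order, which is the effect Nukemarine describes for 経済 in iKnow's list. Hanzi left with no words are flagged for the manual grooming pass, and the thematic grouping and native-speaker review the thread calls for remain manual steps that this sketch does not attempt.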