Hanzi counter? - Printable Version

+- kanji koohii FORUM (http://forum.koohii.com)
+-- Forum: Learning Chinese (http://forum.koohii.com/forum-17.html)
+--- Forum: Chinese and Hanzi (http://forum.koohii.com/forum-20.html)
+--- Thread: Hanzi counter? (/thread-13190.html)

Hanzi counter? - deathtrap - 2010-03-02

I'm looking for an application or script that can count the number of unique hanzi in a given text. The Anki plugin for this doesn't seem to work, at least not for me. Anyone know of any?

Hanzi counter? - deathtrap - 2010-03-03

Nobody?

Hanzi counter? - Evil_Dragon - 2010-03-03

http://lingua.mtsu.edu/chinese-computing/vp/index.php?CNTEXT_Session=a7e6820f6530305a13701485187ede20

Maybe this might be of help for you.

Hanzi counter? - HerrPetersen - 2010-03-03

What you ask for can be found here: http://www.chinese-forums.com/showthread.php?t=28000 (check out HedgePig's Excel sheet)

Hanzi counter? - ファブリス - 2010-03-03

If that's any help, regular expressions can check for Unicode blocks and Unicode scripts (see bottom of page). You could filter out all unwanted characters with one simple regexp, then split the string with another simple regexp, then remove all duplicates by assigning the characters as keys in a hash or something like that.

I actually didn't know about these until last year... and had been using hexadecimal code points in the RevTK code for a long while (>_>)

Hanzi counter? - vorpal - 2010-04-25

Do you just want the number? This could be done in Python in one line of code. If you need it done, let me know. (I can also get you the list of characters in one line, if need be.)

Hanzi counter? - unauthorized - 2010-04-27

vorpal Wrote: Do you just want the number? This could be done in Python in one line of code. If you need it done, let me know. (I can also get you the list of characters in one line, if need be.)

Well, it could be done in one line of C, but that doesn't mean you should stack up all your code on the same line. Weird formatting was cute the first time someone did it on IOCCC, but not anymore.

There are multiple solutions above that can do what OP wanted. He would have no doubt figured it out or given up by now.

Hanzi counter? - deathtrap - 2010-04-27

Thanks for your responses, guys. As a programmer myself I could build it quite easily, but I was hoping to save myself the time: if one was already made, I wouldn't have to divert time from studying.
HerrPetersen's link had the right solution: an Excel file posted there lets you input a list of characters to match against, and it gives you the unique characters in and not in that list. Quite handy. I should probably create a web-based version of it for anyone who needs it. Thanks for the help, guys.

Hanzi counter? - FooSoft - 2010-04-27

You can use or tweak my Perl scripts; I did this exact same thing for counting kanji in Wikipedia! http://foosoft.net/japanese/kanji_frequency/

Hanzi counter? - Vaste - 2010-04-29

For the record, in Python it is this difficult:

Code:
print len(set(u'日本語今後コンゴ日本語'))

excluding from a list:

Code:
sometext=u"""

Hanzi counter? - FooSoft - 2010-04-30

You can just iterate over every character in the file, checking its Unicode code block, right? All the characters are nicely organized, so you just have to know the ranges.
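
For anyone landing here later, here is a minimal sketch of the code-point-range idea in Python 3. A plain `set()` over the text counts every distinct character, so the katakana in the sample string would be counted too; filtering by range first leaves only the hanzi. Assumptions that are not from the thread: "hanzi" is taken to mean the CJK Unified Ideographs block (U+4E00 to U+9FFF) plus Extension A (U+3400 to U+4DBF), and the names `HANZI`, `unique_hanzi` and `split_by_known` are made up for illustration. The second function roughly mirrors the in-list / not-in-list split described for the Excel sheet.

```python
import re

# Assumption: "hanzi" = CJK Unified Ideographs (U+4E00-U+9FFF) plus
# Extension A (U+3400-U+4DBF). Rarer extension blocks are left out.
HANZI = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")


def unique_hanzi(text):
    """Return the set of distinct hanzi found in text."""
    return set(HANZI.findall(text))


def split_by_known(text, known):
    """Split the unique hanzi in text into (in known, not in known)."""
    found = unique_hanzi(text)
    known_set = unique_hanzi(known)
    return found & known_set, found - known_set


sample = "日本語今後コンゴ日本語"
print(len(unique_hanzi(sample)))  # 5: the katakana コ, ン, ゴ are skipped

seen, new = split_by_known(sample, "日本")
print(sorted(seen), sorted(new))
```

If your texts use rarer characters, you would need to widen the ranges in `HANZI`; the two blocks above are just a reasonable starting point.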