Script to replace Kanji with Heisig Keywords
kanji koohii FORUM (http://forum.koohii.com), thread /thread-3204.html

Script to replace Kanji with Heisig Keywords - lifeflaw - 2009-06-02

Hello,

As I study Japanese words and sentences in a flashcard program, I decided to add a field on the answer side to show Heisig's keywords.

Example:
Question: 最低
Answer: reading: さいてい, meaning: at least, etc.
Heisig Keywords: [utmost][lower]

I am looking for a script that can generate the "Heisig Keywords" field automatically. The script can be for a flashcard program, a spreadsheet, etc. I already searched the Internet but couldn't find any such utility. Does anyone know of a ready-made script?

Thanks in advance.

Script to replace Kanji with Heisig Keywords - jreaves - 2009-06-02

I add the Heisig keywords to most of my flashcards. I have a Python script that can be run from the command line, as well as a web version. You can try out the web version here: http://www.reavesmd.com/cgi-bin/heisig.cgi

I apologize in advance for the user interface, which is essentially nonexistent.

If you're comfortable with the command line, I'll be glad to share the other version of the script, which can be useful for adding a Heisig column to spreadsheets before importing them into Anki.

What would be even better would be an Anki plugin to automatically generate a Heisig keyword field, similar to the way the reading generation works....

Script to replace Kanji with Heisig Keywords - lifeflaw - 2009-06-02

Thank you so much for your reply.

jreaves Wrote: If you're comfortable with the command line, I'll be glad to share the other version of the script, which can be useful for adding a Heisig column to spreadsheets before importing them into Anki.

Yes, a command line version would be perfect for my needs; that would be a tremendous time saver for me. I really appreciate it.

jreaves Wrote: What would be even better would be an Anki plugin to automatically generate a Heisig keyword field, similar to the way the reading generation works....

Yes, I had the same thought.
I guess that others don't care so much about including the keywords in their cards.

Script to replace Kanji with Heisig Keywords - jreaves - 2009-06-02

Here's the script. You'll need Python installed on your system. Paste the script into a text file, edit the path for the Heisig data file, make the script executable, and give it a run.

I believe I got the Heisig file through the Anki website. I may have downloaded the Heisig Anki database and exported to get the text file. Anyway, if you have trouble with that, e-mail me and I'll send you a copy. It's too big to paste here.

    #!/usr/bin/python
    # Usage: Search column 3 of inputfile for kanji. Rewrite each line in inputfile to the screen with an additional Heisig field at the end.
    # inputfile is assumed to be tab-separated and UTF-8 encoded.
    #
    # ./addheisig.py inputfile 2 > outputfile
    # Column numbering starts with 0
    #
    # Same as above, but format the output with just the keywords, not the kanji.
    #
    # ./addheisig.py inputfile 2 -k > outputfile
    #
    import sys, codecs

    RTK = 'heisig-data-2.txt' # Change this path/file name for your system

    file = codecs.open(sys.argv[1], 'r', encoding='utf-8')
    col = sys.argv[2] # Column in the input file where we look for kanji
    keywords_only = False
    if len(sys.argv) > 3 and sys.argv[3] == '-k':
        keywords_only = True

    for line in file:
        searched = []
        heisig = []
        parts = line.split('\t')
        for ch in parts[int(col)]:
            if ord(ch) >= 0x4E00 and ord(ch) <= 0x9FBF and ch not in searched:
                searched.append(ch)
                rtk = codecs.open(RTK, 'r', encoding='utf-8')
                for entry in rtk:
                    rtkparts = entry.split('\t')
                    if rtkparts[0] == ch:
                        if keywords_only:
                            heisig.append(rtkparts[1])
                        else:
                            heisig.append(rtkparts[0] + '-' + rtkparts[1])
                        break
                rtk.close()
        print line.strip().encode('utf-8'),
        print '\t'.encode('utf-8'),
        if len(heisig) != 0:
            # Oddly, if these lines are combined into one, the output cannot be redirected to a file.
            h = ', '.join(heisig)
            print h.encode('utf-8'),
        else:
            print " ".encode('utf-8'),
        print ''

    file.close()
    sys.exit(0)

Script to replace Kanji with Heisig Keywords - lifeflaw - 2009-06-03

It is working perfectly. Thank you so much for your kind help.
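
For anyone reading this on Python 3, the same approach could be sketched roughly as below. This is a minimal sketch, not jreaves' script: the function names (`is_kanji`, `load_keywords`, `heisig_field`, `add_heisig_column`) are hypothetical, the two-entry sample stands in for the real Heisig data file, and it assumes the layout the script above assumes (tab-separated, kanji in column 0, keyword in column 1). It reads the keyword data once into a dictionary rather than reopening the data file for every character, and the spacing of its output is not guaranteed to match the original byte for byte.

```python
# Sketch only: a Python 3 take on the same idea. All function names are
# illustrative and are not part of jreaves' script.

def is_kanji(ch):
    """True for the code point range the original script checks (0x4E00-0x9FBF)."""
    return 0x4E00 <= ord(ch) <= 0x9FBF


def load_keywords(lines):
    """Build a kanji -> keyword dict from tab-separated lines.

    Assumes column 0 is the kanji and column 1 the keyword, as the
    original script does; any further columns are ignored.
    """
    keywords = {}
    for entry in lines:
        parts = entry.rstrip("\n").split("\t")
        if len(parts) >= 2:
            # Keep the first entry for a kanji, like the original's `break`.
            keywords.setdefault(parts[0], parts[1])
    return keywords


def heisig_field(text, keywords, keywords_only=False):
    """Return the Heisig field for the kanji in text, in order of first appearance."""
    seen = []
    for ch in text:
        if is_kanji(ch) and ch in keywords and ch not in seen:
            seen.append(ch)
    if keywords_only:
        return ", ".join(keywords[ch] for ch in seen)
    return ", ".join(ch + "-" + keywords[ch] for ch in seen)


def add_heisig_column(line, col, keywords, keywords_only=False):
    """Append the Heisig field to one tab-separated line (columns start at 0)."""
    parts = line.rstrip("\n").split("\t")
    return "\t".join(parts + [heisig_field(parts[col], keywords, keywords_only)])


# Two-entry stand-in for the real data file, using the keywords from the first post.
sample_data = ["最\tutmost\n", "低\tlower\n"]
keywords = load_keywords(sample_data)
print(add_heisig_column("1\tat least\t最低\n", 2, keywords))
print(add_heisig_column("1\tat least\t最低\n", 2, keywords, keywords_only=True))
```

With a real data file, `load_keywords` could be fed an open file handle, for example `open(path, encoding="utf-8")`, since it only iterates over lines. Kanji missing from the data are skipped, so a line without any known kanji simply gets an empty final column.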