Script to grab japanese definitions for wordlists
+- kanji koohii FORUM (http://forum.koohii.com)
+-- Forum: Learning Japanese (http://forum.koohii.com/forum-4.html)
+--- Forum: Learning resources (http://forum.koohii.com/forum-9.html)
+--- Thread: Script to grab japanese definitions for wordlists (/thread-3144.html)
Script to grab japanese definitions for wordlists - Tobberoth - 2009-05-26

With the release of mentat_kgs' awesome ruby script for ripping tbs news, I thought I'd create a simple script myself.

Most people here read tons of Japanese stuff, and we all know how boring it is to have a dictionary by your side and constantly look words up. Some people get around this by simply reading and then noting each new word in a list. This list can then later be used for mining.

What this script does is take a simple list of Japanese words, search dic.yahoo.co.jp for definitions and example sentences, then put together a file with this information. It works with both Daijisen and Daijirin. Like mentat's script, you need ruby and hpricot to use it.

http://rapidshare.com/files/237794913/grabdef.zip

It's very easy to use. Simply put the script in some folder and put a text file with a list of words in the same folder. Use the terminal to run the command, supplying input and output as arguments.

Example: In a folder you have grabdef.rb. You save input.txt in the same folder, and it contains this:

言葉
翌日
探す

You use the terminal and run "ruby grabdef.rb input.txt output.txt". After it's completed, you'll have a new file, output.txt, which contains something like

言葉 1 日本語の定義はここぞ. etc.

I've only tried the script on Linux; hopefully it works on Windows and OS X as well. The input file must be saved as UTF-8. You don't have to supply the output filename. If it's not there, the script will automatically save the definitions as "output.txt", overwriting any such file in the folder.

Final note: it searches Daijisen by default. If you want to use Daijirin instead, open grabdef.rb in a text editor and change

DNAME = "&dname=#{DAIJISEN}"

to

DNAME = "&dname=#{DAIJIRIN}"

I know there are probably websites and maybe plugins to Anki which do more or less the same thing, but this saves me time at least. Let me know if you find it useful.
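For anyone curious about the overall flow described above, here is a minimal sketch of it. This is not the actual grabdef.rb: it assumes one word per line in the input, and the dic.yahoo.co.jp lookup is replaced by a hypothetical `lookup` callable, so the names `read_words`, `grab_definitions` and `DEFAULT_OUTPUT` are illustrative only.

```ruby
# Sketch only, not the real grabdef.rb: the dictionary lookup is passed in as
# a callable so the read/lookup/write flow can run without network access.

DEFAULT_OUTPUT = 'output.txt' # name the post says is used when no output is given

# Read a UTF-8 word list, assuming one word per line, and skip blank lines.
def read_words(path)
  File.readlines(path, encoding: 'UTF-8').map(&:strip).reject(&:empty?)
end

# Look up every word and write "word, definition, blank line" to output_path.
# `lookup` is a hypothetical stand-in for the dic.yahoo.co.jp + hpricot step.
def grab_definitions(input_path, output_path = DEFAULT_OUTPUT, lookup:)
  File.open(output_path, 'w:UTF-8') do |file|
    read_words(input_path).each do |word|
      file.write("#{word}\n#{lookup.call(word)}\n\n")
    end
  end
  output_path
end
```

Calling `grab_definitions('input.txt', lookup: some_callable)` without an output path would write to output.txt and overwrite any existing file of that name, which matches the default behaviour described in the post.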
Please let me know if you find any glitches and bugs as well, especially if you know Ruby and have suggestions. I know the output isn't very pretty atm; I might rework the script to create prettier output files later.

EDIT: I have edited the script to work on Windows and added a simple Readme.

Script to grab japanese definitions for wordlists - nac_est - 2009-05-26

It works! Thanks a lot, it's another great idea. I'm thinking about how to use it in a smart way. It could be very useful for the times when I want to put a definition on a card. But I input one card at a time, so it wouldn't be efficient that way. I could instead type all the sentences in a text document beforehand, then locate all the words I want to look up, then look them up with your nice script and finally enter them quickly into Anki. How do you do it?

Script to grab japanese definitions for wordlists - ahibba - 2009-05-26

That's great. Has anyone tried it (and mentat_kgs' other script) on Windows? It would be more helpful if it were an Anki plugin.

Script to grab japanese definitions for wordlists - Tobberoth - 2009-05-26

While I agree it might have been easier to use as a plugin to Anki, we should remember that not everyone uses Anki. A plugin for Anki would only work there; this script works everywhere. As for Windows, you should really have no problem with my script since it doesn't actually run any external commands, it's all pure ruby. The problems that can come up are two-fold:

1. The input needs to be UTF-8, and notepad kinda sucks there. As long as you have a decent text editor (e text editor, notepad++ or SciTE for example) it shouldn't be a problem.
2. You need ruby-gems installed and then, using that, install hpricot. It shouldn't be any harder to install on Windows than on Linux, but I haven't tried it myself.

Script to grab japanese definitions for wordlists - ahibba - 2009-05-26

Thank you for the explanation.

1. I don't have problems with UTF-8.
I have Win32Pad, EditPad Pro and other advanced text editors.
2. This is why I asked if someone has tried it, because I don't have these things. I know that Ruby is a programming language, but I have never used it. I'll search for it and hpricot.

Script to grab japanese definitions for wordlists - Tobberoth - 2009-05-26

ahibba Wrote: Thank you for explanation.

Don't worry about the installation, Ruby is very easy in this regard. First, go to http://www.ruby-lang.org and get the Ruby one-click installer. As you can tell from the name, it's really easy to install. Once installed, if you know how, you should probably add the location of the ruby executable to your PATH (if the one-click installer doesn't do that for you, which is possible). This makes it so that when you're in command-line mode (Start -> Run and type cmd) you can simply write "ruby rubyscript.rb" to run scripts.

Anyway, to get ruby-gems, just go here and download the latest .zip: http://rubyforge.org/frs/?group_id=126

Unpack the zip, go into that directory using cmd and run "ruby setup.rb". When done, simply run "gem install hpricot" from cmd and it should automatically download and install hpricot for you. Once done, you can run my script just fine. mentat's script will work as well (but you need to get mencoder for that one to work, instructions in his thread).

Script to grab japanese definitions for wordlists - superdry - 2009-05-27

Not working for me on Windows. I'm only getting gibberish in the output file.

Script to grab japanese definitions for wordlists - mentat_kgs - 2009-05-27

To work on Windows you need to change line 55 from

File.open(save_file, 'w') do |file|

to

File.open(save_file, 'wb') do |file|

Script to grab japanese definitions for wordlists - cangy - 2009-05-27

Nice idea.
I was thinking of doing something similar for studying sentences, feeding them into http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwjdic.cgi?9T

Script to grab japanese definitions for wordlists - Tobberoth - 2009-05-27

mentat_kgs Wrote: To work on windows you need to change the line 55

Ah, I'll be darned, it has to write the kanji as binary? Well, I'll fix it and upload a new version right away.
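For reference, the 'wb' change boils down to something like the sketch below. The helper name `save_definitions` is made up for illustration, and the thread does not establish why text mode produced gibberish on Windows. What binary mode does guarantee is that Ruby writes the bytes exactly as given, without the "\n" to "\r\n" newline translation that text mode applies on Windows.

```ruby
# Hypothetical helper illustrating the 'w' -> 'wb' change suggested in the
# thread. In binary mode the string's bytes are written unchanged.
def save_definitions(save_file, text)
  File.open(save_file, 'wb') do |file|
    file.write(text)
  end
end
```

Reading the file back with `File.binread` should return the same bytes that were passed in, which is an easy way to check that nothing was altered on the way to disk.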