With the release of mentat_kgs's awesome Ruby script for ripping TBS news, I thought I'd create a simple script myself.
Most people here read tons of Japanese stuff, and we all know how boring it is to have a dictionary by your side and constantly look words up. Some people get around this by simply reading and then noting each new word in a list. This list can then later be used for mining.
What this script does is take a simple list of Japanese words, search dic.yahoo.co.jp for definitions and example sentences, then put together a file with this information. It works with both Daijisen and Daijirin.
Like mentat's script, you need Ruby and Hpricot to use it.
http://rapidshare.com/files/237794913/grabdef.zip
It's very easy to use. Simply put the script in a folder and put a text file with a list of words in the same folder. Use the terminal to run the command, supplying the input and output filenames as arguments. Example:
In a folder you have grabdef.rb. You save input.txt in the same folder, containing this:
言葉
翌日
探す
You use the terminal and run "ruby grabdef.rb input.txt output.txt". After it's completed, you'll have a new file, output.txt, which contains something like
言葉
1 日本語の定義はここぞ.
etc.
I've only tried the script on Linux; hopefully it works on Windows and OS X as well. The input file must be saved as UTF-8. You don't have to supply the output filename: if it's missing, the script automatically saves the definitions as "output.txt", overwriting any file of that name in the folder.
Final note: it searches Daijisen by default. If you want to use Daijirin instead, open grabdef.rb in a text editor and change
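For anyone curious how the input and output handling described above could look, here is a rough sketch. It is not the actual code from grabdef.rb; the helper names parse_args and read_words and the DEFAULT_OUTPUT constant are made up for illustration. It only shows the fallback to "output.txt" and reading the word list as UTF-8.

```ruby
# Hypothetical helpers; grabdef.rb's real internals are not shown in this post.
DEFAULT_OUTPUT = "output.txt"

# Returns [input_path, output_path]; the output path falls back to DEFAULT_OUTPUT
# when only the input filename is given.
def parse_args(argv)
  input, output = argv
  raise ArgumentError, "usage: ruby grabdef.rb input.txt [output.txt]" if input.nil?
  [input, output || DEFAULT_OUTPUT]
end

# Reads one word per line as UTF-8, trimming whitespace and skipping blank lines.
def read_words(path)
  File.readlines(path, encoding: "UTF-8").map(&:strip).reject(&:empty?)
end
```

With the example above, parse_args(["input.txt"]) would give back "input.txt" and "output.txt", and read_words("input.txt") would give the three words as an array.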
DNAME = "&dname=#{DAIJISEN}"
to
DNAME = "&dname=#{DAIJIRIN}"
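To show why changing that one constant is enough, here is a small sketch of how DNAME might be appended to a search URL. Everything except the DNAME line itself is an assumption: the dictionary ID values are placeholders (the real ones are in grabdef.rb), and SEARCH_BASE, the "p" parameter and the search_url helper are hypothetical.

```ruby
require "uri"

# Placeholder IDs; the real values are defined in grabdef.rb and not shown here.
DAIJISEN = "daijisen-id"
DAIJIRIN = "daijirin-id"

# Same shape as the line in the script; swap the constant to change dictionary.
DNAME = "&dname=#{DAIJISEN}"

# Hypothetical endpoint and query parameter name, for illustration only.
SEARCH_BASE = "http://dic.yahoo.co.jp/search?p="

# Builds a search URL for one word, percent-encoding the Japanese text.
def search_url(word, dname = DNAME)
  SEARCH_BASE + URI.encode_www_form_component(word) + dname
end
```

Since the dictionary choice is just a suffix on the query string, switching from Daijisen to Daijirin doesn't touch anything else in the script.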
I know there are probably websites and maybe Anki plugins which do more or less the same thing, but this saves me time at least. Let me know if you find it useful. Please let me know if you find any glitches or bugs as well, especially if you know Ruby and have suggestions. I know the output isn't very pretty at the moment; I might rework the script to create prettier output files later.
EDIT: I have edited the script to work on Windows and added a simple Readme.
Edited: 2009-05-27, 8:40 am
