Tobberoth
Member
From: Sweden
Registered: 2008-08-25
Posts: 3364
With the release of mentat_kgs's awesome Ruby script for ripping TBS news, I thought I'd create a simple script myself.
Most people here read tons of Japanese stuff, and we all know how boring it is to have a dictionary by your side and constantly look words up. Some people get around this by simply reading and then noting each new word in a list. This list can then later be used for mining.
What this script does is take a simple list of Japanese words, search dic.yahoo.co.jp for definitions and example sentences, then put together a file with this information. It works with both Daijisen and Daijirin.
Like mentat's script, you need Ruby and Hpricot to use it.
http://rapidshare.com/files/237794913/grabdef.zip
It's very easy to use. Simply put the script in some folder and put a text file with a list of words in the same folder. Use the terminal to run the command, supplying the input and output filenames as arguments. Example:
In a folder you have grabdef.rb. You save input.txt in the same folder, and it contains this:
言葉
翌日
探す
You use the terminal and run "ruby grabdef.rb input.txt output.txt". After it's completed, you'll have a new file, output.txt, which contains something like
言葉
1 日本語の定義はここぞ.
etc.
I've only tried the script on Linux; hopefully it works on Windows and OS X as well. The input file must be saved as UTF-8. You don't have to supply the output filename. If it's not there, the script will automatically save the definitions as "output.txt", overwriting any file with that name in the folder.
Final note: it searches Daijisen by default. If you want to use Daijirin instead, open grabdef.rb in a text editor and change
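For anyone curious how the argument handling described above could look in Ruby, here is a minimal sketch. It is not taken from grabdef.rb; the helper names parse_args and read_words and the DEFAULT_OUTPUT constant are made up for illustration.

```ruby
# Hypothetical helpers; grabdef.rb itself may be organised differently.
DEFAULT_OUTPUT = "output.txt"

# Returns [input_path, output_path] from a command-line argument array.
# The output path falls back to DEFAULT_OUTPUT when it is not supplied.
def parse_args(argv)
  raise ArgumentError, "usage: ruby grabdef.rb input.txt [output.txt]" if argv.empty?
  [argv[0], argv[1] || DEFAULT_OUTPUT]
end

# Reads one word per line from a UTF-8 file, skipping blank lines.
def read_words(path)
  File.readlines(path, encoding: "UTF-8").map(&:strip).reject(&:empty?)
end
```

In a script you would call it as input, output = parse_args(ARGV) and then loop over read_words(input).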
DNAME = "&dname=#{DAIJISEN}"
to
DNAME = "&dname=#{DAIJIRIN}"
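If you'd rather not edit the constant by hand each time, the choice could be looked up by name instead. This is only a sketch: the codes below are placeholders (the real values of DAIJISEN and DAIJIRIN are whatever grabdef.rb defines), and DICTIONARIES and dname_param are names I made up.

```ruby
# Placeholder codes, NOT the real ones. Copy the actual values of
# DAIJISEN and DAIJIRIN from grabdef.rb.
DICTIONARIES = {
  "daijisen" => "DAIJISEN_CODE",
  "daijirin" => "DAIJIRIN_CODE"
}

# Builds the "&dname=..." query fragment for the chosen dictionary.
# Defaults to Daijisen, matching the script's default.
def dname_param(name = "daijisen")
  code = DICTIONARIES.fetch(name) do
    raise ArgumentError, "unknown dictionary: #{name}"
  end
  "&dname=#{code}"
end
```

The dictionary name could then come from a third command-line argument, for example.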
I know there are probably websites and maybe plugins for Anki which do more or less the same thing, but this saves me time at least. Let me know if you find it useful. Please let me know if you find any glitches or bugs as well, especially if you know Ruby and have suggestions.
I know the output isn't very pretty at the moment; I might rework the script to create prettier output files later.
EDIT: I have edited the script to work on Windows and added a simple Readme.
Last edited by Tobberoth (2009 May 27, 8:40 am)
Tobberoth
While I agree it might have been easier to use as a plugin for Anki, we should remember that not everyone uses Anki. A plugin for Anki would only work there; this script works everywhere.
As for Windows, you should really have no problem with my script since it doesn't rely on any shell commands; it's all pure Ruby. There are two problems that can come up:
1. The input needs to be UTF-8, and Notepad kinda sucks there. As long as you have a decent text editor (e text editor, Notepad++ or SciTE for example) it shouldn't be a problem.
2. You need RubyGems installed and then, using that, you install Hpricot. It shouldn't be any harder to install on Windows than on Linux, but I haven't tried it myself.
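If you're not sure whether your editor really saved the word list as UTF-8, a quick Ruby check along these lines can tell you. This is a sketch with made-up helper names, not part of grabdef.rb. The second helper also drops the byte order mark that some editors put at the start of UTF-8 files.

```ruby
# Returns true when the file's bytes form valid UTF-8.
def valid_utf8_file?(path)
  File.binread(path).force_encoding(Encoding::UTF_8).valid_encoding?
end

# Reads the file as UTF-8, dropping a leading byte order mark if present.
def read_utf8_without_bom(path)
  File.read(path, mode: "r:BOM|UTF-8")
end
```

A file saved as Shift_JIS, for instance, should fail the first check.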
Last edited by Tobberoth (2009 May 26, 4:04 pm)
Tobberoth
ahibba wrote:
Thank you for explanation.
1. I don't have problems with UTF-8. I have Win32Pad, EditPad Pro and other advanced text editors.
2. This is why I asked if someone has tried it. Because I don't have these things. I know that Ruby is a programming language, but I never used it. I'll search for it and hpricot.
Don't worry about the installation, Ruby is very easy in this regard. First, go to www.ruby-lang.org and get the Ruby one-click installer. As the name suggests, it's really easy to install.
Once it's installed, you should probably add the folder containing the ruby executable to your PATH, if you know how (it's possible the one-click installer doesn't do that for you). This makes it so that when you're in command-line mode (Start -> Run and type cmd) you can simply write ruby rubyscript.rb to run scripts. Anyway, to get RubyGems, just go here and download the latest .zip:
http://rubyforge.org/frs/?group_id=126
Unpack the zip, go into that directory using cmd and run "ruby setup.rb". When that's done, simply run "gem install hpricot" from cmd and it should automatically download and install Hpricot for you.
Once done, you can run my script just fine. mentat's script will work as well (but you need to get mencoder for that one to work; instructions are in his thread).
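To confirm the install worked before running anything else, you can ask Ruby whether it can load the library. A minimal sketch (library_available? is just a name I picked):

```ruby
# Returns true if `require name` succeeds, false if the library is missing.
def library_available?(name)
  require name
  true
rescue LoadError
  false
end
```

Save it in a file, add a line like puts library_available?("hpricot") at the end, and run it with ruby. On very old Ruby versions you may need require "rubygems" at the top for installed gems to be found.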
Last edited by Tobberoth (2009 May 26, 5:26 pm)