
Epwing2Anki - Tool For Automatically Generating Anki Vocabulary Cards

#1
Hello,

Epwing2Anki may be used to automatically or semi-automatically create Japanese Anki vocabulary cards based on a provided list of words and one or more of your favorite EPWING dictionaries and/or the included EDICT J-E dictionary and Tatoeba example sentence corpus. Epwing2Anki is for people like me who hate making vocabulary cards by hand.

Download the latest version of Epwing2Anki via SourceForge

You will need Windows (XP/Vista/7) with .NET Framework v3.5 installed.

To demonstrate, here are some screenshots of the dialogs in the order that you would typically see them:

[Image: Welcome_v1.0.png]


[Image: Setup_Dictionaries_v1.0.png]
In the above screenshot you can see that I've added 2 J-E dictionaries (『研究社 新和英大辞典 第5版』 and EDICT) and 2 J-J dictionaries (『大辞林 第2版』 and 『広辞苑第六版』). 『研究社 新和英大辞典 第5版』 is at a higher priority than EDICT, so when looking up words, Epwing2Anki will first look at 『研究社 新和英大辞典 第5版』. If a word is not found in 『研究社 新和英大辞典 第5版』, Epwing2Anki will try to look up the word in EDICT and so on.
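
The priority rule described above can be pictured with a minimal Python sketch. This is not Epwing2Anki's actual code; the `DICTIONARIES` list, its toy entries, and the `lookup` function are hypothetical stand-ins used only to illustrate "try the highest-priority dictionary first, then fall back".

```python
from typing import Optional, Tuple

# Hypothetical stand-ins for real dictionaries, listed from highest to
# lowest priority. Each maps a word to a definition.
DICTIONARIES = [
    ("Kenkyusha", {"猫": "a cat"}),
    ("EDICT", {"猫": "cat", "犬": "dog"}),
]

def lookup(word: str, dictionaries=DICTIONARIES) -> Optional[Tuple[str, str]]:
    """Return (dictionary_name, definition) from the first dictionary that has the word."""
    for name, entries in dictionaries:
        if word in entries:
            return name, entries[word]
    return None  # the word was not found in any dictionary
```

With this toy data, `lookup("猫")` is answered by the first dictionary, while `lookup("犬")` falls through to EDICT.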


[Image: Setup_Examples_v1.0.png]
You can see that I only want to take example sentences from 『研究社 新和英大辞典 第5版』 and Tatoeba. I do this by disabling all of the other example dictionaries.


[Image: Setup_Card_Layout_v1.1.png]
Here I've told Epwing2Anki that my cards should have an expression, reading, a definition from the highest priority J-E dictionary, a definition from the highest priority J-J dictionary and example sentences.


[Image: Setup_IO_v1.2.png]
Here I've elected to take a semi-automated approach: I want to manually disambiguate when there are multiple entries for the same word but I want to let Epwing2Anki pick the best example sentences.


[Image: FineTune_v1.6.png]
This dialog appears when the "fine-tune options" link in the previous screenshot is clicked. It allows you to select a number of options to further customize your cards.


[Image: Progress_v1.0.png]
This dialog will appear when the user presses the "Create Anki Import File!" button. The next two dialogs below will appear as necessary for the current word being processed.


[Image: Disambiguate_v1.0.png]
If you see this dialog it means that multiple entries for the same word have been found in the same dictionary. You need to select which entry to use in order to continue. You may either click the links or use the 1-9 number keys.


[Image: Choose_Examples_v1.0.png]
If you decide to uncheck the "Automatically choose up to n example sentences" option on the "Setup Inputs and Outputs" page, this page would show you all the example sentences for the current word from each enabled example dictionary. You may choose however many you like to go into your card.


[Image: Results_v1.0.png]
You will see this dialog when Epwing2Anki finishes creating the Anki import file. This dialog will inform you if anything went wrong such as when a word is not found in any of the dictionaries and Epwing2Anki could not create a card for it. You can also open the directory that contains the Anki import file from this dialog.


At this point you can take the Anki import file that was generated and... import it into Anki.

When you exit the program, all of your settings will automatically be saved.

Here is an example showing a card generated from 『研究社 新和英大辞典 第5版』:
[Image: Example1.png]

Have Fun!
cb4960
Edited: 2012-08-18, 11:58 am
#2
Currently Supported Dictionaries:
●『研究社 新和英大辞典 第5版』(Kenkyusha Shin Waei Daijiten 5th Edition [J-E])
●『研究社 新英和・和英中辞典』(Kenkyusha Shin Eiwa-Waei Chujiten [J-E])
●『大辞林 第2版』/『三省堂 スーパー大辞林』(Daijirin 2nd Edition [J-J])
●『広辞苑第六版』(Kojien 6th Edition [J-J])
●『大辞泉』(Daijisen [J-J])
●『明鏡国語辞典』(Meikyo Kokugojiten [J-J])
● EDICT/Tatoeba (Included in Epwing2Anki by default. EDICT is used for readings and definitions, and Tatoeba is used for example sentences.)

Future Plans:
● Support 『ジーニアス英和・和英辞典』.
● Support 『デイリーコンサイス英和辞典 第5版』.
● Add step-by-step instructions on how to import the Anki import file into Anki.
● Maybe add the ability to generate cards for words from arbitrary Japanese text (using Mecab to parse the text).
● Maybe add some sort of ability to generate a card on-the-fly for a word on the clipboard by pressing a global hotkey. The entry can either be appended to an Anki import file or sent directly to Anki via Anki's Real-Time Import plugin.
● Maybe add support for online dictionaries.

Epwing2Anki Needs YOU!
Please let me know
● If you find a bug or typo. If it's a crash bug, it might help to send me the appropriate log in the Logs directory.
● If you think something about the interface is unclear or could be better. If you see something that makes you think "what the heck does that do?", let me know.
● If you have a feature suggestion. There is nothing too big or too small (no guarantee that I'll actually do it though).

Tips:
Rikaisama is an easy way to generate word lists while browsing the web.
Japanese Text Analysis Tool can be used to create word lists from arbitrary Japanese text that are sorted by frequency.
Edited: 2015-09-05, 4:24 pm
#3
Nice layout. I need to get my hands on some EPWING dictionaries though. Wish I could extract them from my electronic dictionary.

I think the only thing I might consider changing is allowing for dynamic fields but having formatting for the data (what you now have as fields). It's not that big of a deal though.
#4
I have no idea how to get EPWING dictionaries into the computer. >.> I tried googling, looking through RevTK wiki at the EPWING software section. Either that or I'm just reading things incorrectly.

I'm not exactly knowledgeable about this sort of thing. I know EPWING dictionaries can be in handheld electronic dictionaries but that's about it.
Edited: 2012-07-17, 12:02 am
#5
Unless you are willing to fork over ~$100 for the DVD that has the dictionary on it, the only real way to get an EPWING dictionary is through less scrupulous means: torrents, direct downloads, etc. You'll have to search for them on your own though.
#6
cb4960, you are a god and you invented the holy grail of Japanese learners! I was like "wtf!?!?" when I woke up in the morning and discovered this!! THANK YOU SO MUCH!!!!

[Edit]

Is there a way to combine the japanese.txt from Rikai-sama with this plugin? That would be awesome. Basically, I would like this plugin only to "add" example sentences for each card automatically and a J-J definition from a random epwing dictionary to my japanese.txt file or so, so that I can import it into my usual anki vocabulary deck. Is that possible, cb4960? :D
Edited: 2012-07-17, 1:55 am
#7
This plugin is pretty cool. It's similar to the idea I had for dictscrape (http://forum.koohii.com/showthread.php?tid=9652&page=1). (Thank you for linking to it by the way.)

When I get some time I also plan on adding epwing support for my plugin. Is it difficult getting the example sentences from epwing dictionaries like the 研究社 新和英大辞典 第5版? It is relatively difficult to get definitions and example sentences from the yahoo dictionaries because there are so many types of wild formatting. It's all using Web 1.0 style <table>'s in order to format the entries, not nice <div>s and <span>s.
#8
vix86 Wrote:Nice layout. I need to get my hands on some EPWING dictionaries though. Wish I could extract them from my electronic dictionary.

I think the only thing I might consider changing is allowing for dynamic fields but having formatting for the data (what you now have as fields). It's not that big of a deal though.
Could you possibly provide an example? Thanks.
#9
Tori-kun Wrote:Is there a way to combine the japanese.txt from Rikai-sama with this plugin? That would be awesome. Basically, I would like this plugin only to "add" example sentences for each card automatically and a J-J definition from a random epwing dictionary to my japanese.txt file or so, so that I can import it into my usual anki vocabulary deck. Is that possible, cb4960? :D
The easiest thing to do right now is save words to your "japanese.txt" using $d and then when you have enough words to make it worthwhile, run japanese.txt through Epwing2Anki. This is actually the way I do it. Using this method, I'm not distracted with card creation each time I come across a word I want to learn while reading.

(And just to clarify, Epwing2Anki is not a plugin, rather it is a stand-alone Windows program).
#10
partner55083777 Wrote:When I get some time I also plan on adding epwing support for my plugin. Is it difficult getting the example sentences from epwing dictionaries like the 研究社 新和英大辞典 第5版? It is relatively difficult to get definitions and example sentences from the yahoo dictionaries because there are so many types of wild formatting. It's all using Web 1.0 style <table>'s in order to format the entries, not nice <div>s and <span>s.
I imagine parsing EPWING dictionaries is a lot easier than parsing the web-based yahoo dictionaries. For 研究社 新和英大辞典 第5版, each example sentence is on a separate line and always starts with one of two bullet characters.
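
To illustrate why line-based entries are easy to parse, here is a rough Python sketch. It is not Epwing2Anki's parser. Of the two bullet characters, only "▲" is named elsewhere in this thread; "◧" is a placeholder assumption for the second one, and `extract_examples` is a hypothetical helper.

```python
# "▲" appears elsewhere in this thread; "◧" is only a placeholder for the
# second bullet character, which the thread does not name.
EXAMPLE_BULLETS = ("▲", "◧")

def extract_examples(entry_text: str, bullets=EXAMPLE_BULLETS) -> list:
    """Return the lines of a dictionary entry that begin with a bullet character."""
    examples = []
    for line in entry_text.splitlines():
        stripped = line.strip()
        # str.startswith accepts a tuple and matches any of its members
        if stripped.startswith(bullets):
            examples.append(stripped)
    return examples
```

Because every example sits on its own line and starts with a known marker, no HTML-style structure needs to be untangled.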
#11
cb4960 Wrote:Could you possibly provide an example? Thanks.
So instead of setting up fixed fields for each data point, allow users to create their own fields dynamically and type whatever they want into them. Where they want dictionary data, they use a token such as %edict% or %kenkyuusha%, and the program substitutes the data at those points. Think of how Anki fills in fields in a card template; that sort of thing.
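
A minimal Python sketch of the token idea, for illustration only. The `fill_template` function and the token names are hypothetical; nothing here is part of Epwing2Anki.

```python
import re

def fill_template(template: str, data: dict) -> str:
    """Replace each %name% token with data[name]; unknown tokens are left untouched."""
    def substitute(match):
        key = match.group(1)
        # Fall back to the original token text when no data is available
        return data.get(key, match.group(0))
    return re.sub(r"%(\w+)%", substitute, template)
```

For example, `fill_template("%expression%\t%edict%", {"expression": "猫", "edict": "cat"})` yields the expression and the definition separated by a tab.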

Without using it much, it's hard to say if it would really make a big difference or be useful. It's mostly a nitpick and just a wish to try and keep stuff closer to my current models instead of making a new one for a specific program. Honestly, I wouldn't bother adding it were I you unless more people request it; there are better things to spend your time working on. I could probably just write a python script to rework the outputted data into something I can use anyway.

Don't worry about it.

Great work!
#12
vix86 Wrote:So instead of setting up fixed fields for each data point, allow users to create their own fields dynamically and type whatever they want into them. Where they want dictionary data, they use a token such as %edict% or %kenkyuusha%, and the program substitutes the data at those points. Think of how Anki fills in fields in a card template; that sort of thing.
Ah. I do use the token approach for Rikaisama, but I think that the approach that I ended up using in Epwing2Anki is easier to setup, potentially less confusing, and probably powerful enough for most people when paired with Anki's own question/answer layout.
#13
I'm having a problem right now with getting it to recognize any of my dictionaries. I just get the generic "Can't find the title" in the CATALOGS file error. I'm using the right dictionaries, and they normally work... is it a .Net thing, a Win7-64 thing, or is it a Path thing?

Oh, and I'd really like to have tokens. I spent a couple of hours last night cleaning up output to re-import, and it's a royal pain. -_- Maybe under an "Advanced" tab?

Edit: Just double-checked. EB Pocket has no problems reading any of them, so I'm stumped.
Edited: 2012-07-17, 5:04 pm
#14
Here's the last log entry:

Quote:
17:08:21.591: Epwing2Anki version: 1.0.0.0
17:08:21.602: Microsoft Windows NT 6.1.7601 Service Pack 1
17:08:21.602: FormMain_Load
17:09:14.251: ** Could not find title of EPWING dictionary!
17:09:15.711: Sorry, this EPWING dictionary is not supported yet.
17:09:41.851: ** Could not find title of EPWING dictionary!
17:09:42.717: Sorry, this EPWING dictionary is not supported yet.
17:10:24.472: ** Could not find title of EPWING dictionary!
17:10:25.835: Sorry, this EPWING dictionary is not supported yet.
17:11:06.103: ** Could not find title of EPWING dictionary!
17:11:07.590: Sorry, this EPWING dictionary is not supported yet.
17:12:20.356: ** Could not find title of EPWING dictionary!
17:12:21.490: Sorry, this EPWING dictionary is not supported yet.
17:12:38.131: The following EPWING dictionaries are supported:

●『研究社 新和英大辞典 第5版』
(Kenkyusha New Japanese-English Dictionary 5th Edition)

●『大辞林 第2版』
(Daijirin 2nd Edition [J-J])

●『三省堂 スーパー大辞林』
(Sanseidou Super Daijirin - it contains 『大辞林 第2版』)

●『広辞苑第六版』
(Kojien 6th Edition [J-J])
17:12:51.209: ** Could not find title of EPWING dictionary!
17:12:52.151: Sorry, this EPWING dictionary is not supported yet.
17:12:54.506: FormMain_FormClosing
#15
rich_f Wrote:I'm having a problem right now with getting it to recognize any of my dictionaries. I just get the generic "Can't find the title" in the CATALOGS file error. I'm using the right dictionaries, and they normally work... is it a .Net thing, a Win7-64 thing, or is it a Path thing?

Oh, and I'd really like to have tokens. I spent a couple of hours last night cleaning up output to re-import, and it's a royal pain. -_- Maybe under an "Advanced" tab?

Edit: Just double-checked. EB Pocket has no problems reading any of them, so I'm stumped.
Do you have any non-ASCII characters in the file path of the CATALOGS file? If so, you will need to remove them. If that doesn’t solve the issue, let me know. Also if you put Epwing2Anki in the Program Files directory, try moving it somewhere else. By default Windows restricts write privilege to folders in Program Files. Epwing2Anki needs to write to its own local directories, so that can be a problem.
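
A quick way to check a path for non-ASCII characters is sketched below in Python. The `non_ascii_chars` helper and the example paths are hypothetical and only illustrate the check.

```python
def non_ascii_chars(path: str) -> list:
    """Return the non-ASCII characters found in a path, in order of appearance."""
    return [ch for ch in path if ord(ch) > 127]
```

An empty list means the path is plain ASCII; anything else lists the characters that would need to be renamed away.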

“Tokens” and Epwing2Anki’s “Fields” are synonymous.

For example, say that in some other program you have these tokens:
%e = expression
%r = reading
%d = definition from EDICT
spaces = tab

If you wanted an expression, reading, and definition in your import file you would specify the tokens like this:
“%e %r %d”

To get the equivalent import file in Epwing2Anki, you would add fields to the Card Layout listbox in the Setup Card Layout page like this:
Expression
Reading
Definition: EDICT

So in other words:
%e = “Expression”
%r = “Reading”
%d = “Definition: EDICT”

When you go to generate the import file with Epwing2Anki, the import file will look something like this:

expression_for_word1<tab>reading_for_word1<tab>definition_for_word1
expression_for_word2<tab>reading_for_word2<tab>definition_for_word2
etc.

The above import file has 3 fields/columns.

For line 1, these are the fields/columns:
Field1: expression_for_word1
Field2: reading_for_word1
Field3: definition_for_word1

For line 2, these are the fields/columns:
Field1: expression_for_word2
Field2: reading_for_word2
Field3: definition_for_word2

In the Anki import dialog, you would just map whatever you call your Anki expression field to Field 1 of the import file, you would map whatever you call your Anki reading field to Field 2 of the import file, and you would map whatever you call your Anki definition field to Field 3 of the import file.

Anki doesn’t force you to map each field. If you want to use just the expression and definition fields from the import file (for whatever reason), you can leave Field 2 in the Anki import dialog blank.
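
The column-to-field mapping can be pictured with a small Python sketch. This is not how Anki itself is implemented; `read_import_file` and the field names are hypothetical, and the sketch assumes a plain tab-separated file with one note per line.

```python
import csv
import io

def read_import_file(text: str, field_names: list) -> list:
    """Parse tab-separated text; field_names[i] names column i, and None skips that column."""
    notes = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        note = {name: value for name, value in zip(field_names, row) if name is not None}
        notes.append(note)
    return notes
```

Passing `["Expression", None, "Meaning"]` as `field_names` corresponds to leaving Field 2 unmapped in the import dialog.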

The Anki import dialog also has a neat ability to update certain fields in existing cards using fields from the import file. So for example, say you only have cards with 3 fields: expression, reading and definition. Now say you added a fourth “example sentences” field. For all your existing cards, the example sentences fields would now be blank. Using Epwing2Anki you can create an import file containing example sentences for the words that are already in your deck. Then you can tell Anki to update only the example sentences field for those cards from your import file.

Clear as mud?
#16
Okay, I changed the directory names, now I get "MSVCR71.dll is missing," or just a repeat of the previous errors. I tried moving it from C:\ to Downloads, because Downloads has plenty of write permissions, but no dice.

The other stuff is clear... ish. I suppose I'll understand it better when I go through the process.

EDIT: And yeah, I have .Net installed. Hrm.
Edited: 2012-07-17, 6:33 pm
#17
rich_f Wrote:Okay, I changed the directory names, now I get "MSVCR71.dll is missing," or just a repeat of the previous errors. I tried moving it from C:\ to Downloads, because Downloads has plenty of write permissions, but no dice.

The other stuff is clear... ish. I suppose I'll understand it better when I go through the process.

EDIT: And yeah, I have .Net installed. Hrm.
It looks like I need to distribute MSVCR71.dll (Microsoft Visual C Runtime library). Please download it from here:
http://www.mediafire.com/?xigsrmnv7ybt0sn

And then place it in Epwing2Anki's Utils\eplkup directory. If it works, I'll release a new version with this dll.
Edited: 2012-07-17, 7:25 pm
#18
Did the trick!

I'm already testing DictScrape as well, so I'll try to test both apps as I can. (I have a linux box running DictScrape in one room, and my main PC in the other...)

Just ran the test file... holy crap that's fast! o_o

Okay, looking at the output in Excel... I set it for 7 example sentences, and it crammed them all in one cell. Same goes for the readings. Is that how it's supposed to work? How do I take that output and turn it into 7 individual entries in a sentences deck?

Also, is there a way to lop off the English translations and put them in a separate field?

Generally, my cards are like this:

Front: Sentence full of Japanese.
Back: reading field + meaning field

reading is the sentence in kana;
meaning is whatever I put in "notes," which is where I would dump an EN translation, and any definitions.
Edited: 2012-07-17, 8:09 pm
#19
rich_f Wrote:Did the trick!
Great! Thanks for being my beta tester.

rich_f Wrote:Okay, looking at the output in Excel... I set it for 7 example sentences, and it crammed them all in one cell. Same goes for the readings. Is that how it's supposed to work? How do I take that output and turn it into 7 individual entries in a sentences deck?
Yes, that's how it's supposed to work for now. The emphasis is on vocabulary cards rather than sentence cards. This may change in the future. One way to break the sentences into separate fields is to open the import file that Epwing2Anki creates in Notepad++ and replace "▲" with "\t▲" (make sure that the "Extended" search mode is selected). [Edit: Now that I think about it, this would be difficult to import into Anki because Anki expects a consistent number of tabs in the import file and not all words will have 7 sentences.]
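
One possible workaround for the inconsistent tab count is to pad every line to the same number of example columns. The Python sketch below is only an illustration under stated assumptions: the `split_examples` helper is hypothetical, and it assumes every example in the chosen column starts with "▲" and that the column index is known.

```python
def split_examples(line: str, examples_column: int, max_examples: int, bullet: str = "▲") -> str:
    """Spread the bulleted examples of one column over max_examples columns, padding with blanks."""
    fields = line.rstrip("\n").split("\t")
    # Split on the bullet and put it back in front of each non-empty piece
    parts = [bullet + p.strip() for p in fields[examples_column].split(bullet) if p.strip()]
    parts = parts[:max_examples]                      # drop any extras
    parts += [""] * (max_examples - len(parts))       # pad so every line has the same width
    return "\t".join(fields[:examples_column] + parts + fields[examples_column + 1:])
```

Every output line then has the same number of tabs, whether the word had one example or seven.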

rich_f Wrote:Also, is there a way to lop off the English translations and put them in a separate field?
No, but that's a good idea. You can, however, use the "Example Sentences (without translation)" field to add only the Japanese part.
Edited: 2012-07-17, 8:27 pm
#20
Yeah, the \t solution would require splitting the results into two files: one for definitions, and one for example sentences. Not impossible, just 面倒くさい。 (Maybe a radio button option or something? Like Sentence Priority vs. Vocab Priority? Sentence priority would do the stuff below that I like, Vocab priority would work like it works now.)

I could lop off the EN translation, but I'm used to having the translation to check to make sure I read the sentence right (both for pronunciation and for meaning), and I'm sure beginners would love to have those translations as well.

I've never really done well with just vocab cards. Not enough context for me.

One of the things dictscrape does that I really like is that it creates a combined definition field-- headword, kana, accent, and definition. If there was a way to tab out all of those example sentences, and just tag them with the translation field and the "definition," that would be perfect.

I just want a way to combine the best of the 研究社 dictionary with the best of the Yahoo dictionaries in my sentences, and I don't want to spend a lot of time doing it. The Holy Grail of card-making. :D
#21
The output sentences are tweak-able with regex. The trickier ones are the words that don't end in periods. I have to dig up the old regexp for searching for JP words to fix that one.
#22
rich_f Wrote:Yeah, the \t solution would require splitting the results into two files: one for definitions, and one for example sentences. Not impossible, just 面倒くさい。 (Maybe a radio button option or something? Like Sentence Priority vs. Vocab Priority? Sentence priority would do the stuff below that I like, Vocab priority would work like it works now.)
The implementation should be pretty simple. I just need to add an "example cards" checkbox or something like you said and then make one card for each example sentence. So if 7 examples were found for a word, Epwing2Anki would put 7 lines in the import file, one for each example. And the other fields (expression, reading, etc.) would, of course, be repeated. I can probably get it in this weekend.
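
The "one card per example sentence" idea could look roughly like the Python sketch below. It is an illustration of the described behaviour, not the eventual Epwing2Anki implementation; `expand_to_example_cards` and its three-field layout are hypothetical.

```python
def expand_to_example_cards(expression: str, reading: str, examples: list) -> list:
    """Return one tab-separated import line per example, repeating the other fields."""
    return ["\t".join([expression, reading, example]) for example in examples]
```

A word with 7 examples would produce 7 lines, each carrying the same expression and reading.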

rich_f Wrote:I've never really done well with just vocab cards. Not enough context for me.
That's what subs2srs is for :)
Edited: 2012-07-17, 9:32 pm
#23
Okay, thanks. That'll save me a lot of time, and some sanity. I tried to figure out how to handle regex searches for fragments without periods that weren't tab-delimited... and an hour went by before I knew it. I got close-ish, but the last 30% were putting up a tough fight.

I'll wait for the next version. :D
#24
Hello,

I have just released version 1.0.1 of Epwing2Anki.

Download Epwing2Anki v1.0.1 via SourceForge

What Changed?

This is just a quick release to add msvcr71.dll (Microsoft Visual C runtime library) to the distribution so that my EPWING lookup tool will function properly on computers that don't already have this dll installed.

cb4960
#25
@cb4960: Thanks for your reply! I tried as you said, however...

My japanese.txt file has the following format:
Expression | Reading | Meaning (in German) | Audio

Now, how can I preserve exactly this data and just "add" the sentence/definition from a J-J and a J-E dictionary to this already existing file or another new .tsv file, that still has the information (i.e. German meaning and [sound:$a.mp3] audio path) saved from the original japanese.txt file?
Right now, I have to add the definition and sentence manually by copying and pasting from the 研究社/明鏡 dictionaries, which is a bit troublesome.
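
Until something like this is built in, one workaround is to merge the two files outside the program. The Python sketch below is a rough illustration under stated assumptions: both files are tab-separated, the expression is the first column in each, and `merge_by_expression` is a hypothetical helper, not part of Epwing2Anki or Rikaisama.

```python
def merge_by_expression(original_lines: list, generated_lines: list) -> list:
    """Append the generated fields (all but the first column) to the matching original lines."""
    extra = {}
    for line in generated_lines:
        fields = line.rstrip("\n").split("\t")
        extra[fields[0]] = fields[1:]
    # Widest set of generated fields, so every merged line gets the same column count
    width = max((len(v) for v in extra.values()), default=0)
    merged = []
    for line in original_lines:
        fields = line.rstrip("\n").split("\t")
        addition = list(extra.get(fields[0], []))
        addition += [""] * (width - len(addition))  # blank cells when nothing was generated
        merged.append("\t".join(fields + addition))
    return merged
```

The original columns (including the German meaning and the audio path) are kept as they are, and the new definition and example columns are appended at the end.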

Perhaps you could add the field "Audio" and "Saved Meaning" to Epwing2Anki?