kanji koohii FORUM
Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Printable Version




Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - rachels - 2011-01-11

Large vocabulary deck for sharing -
The corePLUS deck - available at the anki download site - has been around for a while but I have recently given it a major rewrite and tidy-up. It has the core words and sentences, plus additional vocabulary (> 20,000 in total) taken from the words listed as common by Jim Breen's Edict.

In addition there are:

Word lists (tags) for textbooks - Genki 1&2, IATIJ and Tobira, by chapter.
A more complete set of tags for JLPT 1 - 5 (new levels) 8000+
A field listing common homophones and their definitions
A field listing if the word is 'usually kana' as stated by Edict.
A field for transitive-intransitive verb pairs, with definitions.
Edict grammar - so you can tell what kind of る verb a kore word/other word is; noun, adj, polite, humble, honorific, abbreviation, etc.
Tags for a large part of the vocabulary from Kanji Odyssey
Tags numbering RTK2 words
The other stuff from the kore spreadsheets, sentences, definitions, sound links, fields for sorting by sentence kanji....
Sound file references in the format (kana) – (kanji).mp3 eg: りょかん- 旅館.mp3 for non-kore words.
Links to example sentences on the web - useful if plugins are not available, eg anki online....

Thanks to all the people (not me) who put lots of work into it - Edict dictionary files, core spreadsheets, sound files, audio download plugin, franki, online word lists and various anki decks from which word lists were adapted, etc. Especially the new version of anki which allows overwriting of fields.


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - animehunter123 - 2011-01-11

Thank you everyone for this wonderful deck! The new jlpt update is amazing!!!

One day, I hope that a kind soul can add a japanese meaning field for the sanseido or goo.ne.jp dictionary lookups.


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Thora - 2011-01-11

rachels Wrote:Especially the new version of anki which allows overwriting of fields .
Thanks for letting me know! (I was never able to get franki working properly)
I'm sure many people will appreciate all the work you've done on this, rachels.


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Tori-kun - 2011-01-11

Thanks a lot! Brilliant!


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - rachels - 2011-01-12

Thora Wrote:Thanks for letting me know! (I was never able to get franki working properly)
The overwrite fields functionality in Anki (1.2 - now out) is great. It works well if you are matching on a unique key. Franki can still be better though, if you wanted to match on a key that occurs several times in the deck. Also, it is worth noting that if you overwrite to the tags field, it does overwrite the pre-existing tags, not append.
I think that being able to overwrite fields in decks easily lets people share information more easily. E.g. if someone had word lists for a different textbook to incorporate into this or any other deck, or more example sentences, Japanese definitions, corrections of mistakes etc, it would now be easy enough to post a deck with just those facts, or even just those 2 or 3 fields, which others could download, export to a text file and re-import using the update option. With tags, though, you might need to import to a temporary field and then create the tags.
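
As a rough illustration of the tag caveat above: an overwrite replaces the old tags, whereas most people would want the new tags appended. The sketch below is plain Python, not Anki code. It assumes tags are stored as one space-separated string, and merge_tags is a made-up helper name.

```python
def merge_tags(existing: str, incoming: str) -> str:
    """Append incoming tags to the existing ones, keeping order and dropping repeats."""
    merged = []
    for tag in existing.split() + incoming.split():
        if tag not in merged:
            merged.append(tag)
    return " ".join(merged)


# Overwriting would leave only the incoming tags; merging keeps both sets.
old_tags = "CORE jlpt5"
new_tags = "Genki1 jlpt5"
overwritten = new_tags
appended = merge_tags(old_tags, new_tags)
```

Importing the new tags into a temporary field first, as suggested above, gives you the chance to do this kind of merge by hand before the old tags are lost.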


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Tori-kun - 2011-01-16

With regards and respect to the creator of this deck, omitting the "to" in front of verbs is unbearable and can lead to misunderstandings, I find! E.g. 見せる - show. It's "to show" [and not "a show", for instance!]. Hope we can work on that? Currently I'm at 400 new cards and have added the "to" to all the verb forms occurring so far..

btw, I always thought 明日 is あした and not あさ, or am I wrong? Then it's a mistake in the sentence. The audio says "asa" and "ashita" is written (although the furigana say "asa", too)


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Asriel - 2011-01-16

@Tori-kun -- see if it's あす instead of あさ。"Asu" is a reading for 明日, although it's not as common as "ashita"
I've found that computerized kana-izers and the like often give "Asu" instead of "Ashita"


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Tori-kun - 2011-01-16

@Asriel -- yeah, it's "asa".. never come across it, but as long as it's not entirely wrong it does not matter, although.. "unnecessary" stuff in my head kind of has to be undone. It remains ashita here forever, I guess.

Just corrected about 200 verb forms (adding the "to") -.- And added a "Kanji" tag. If anyone is interested in my new deck, send me a mail.


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - rachels - 2011-01-17

I had a look at the deck and the lack of a 'to' at the beginning of verb definitions seems to be mostly confined to the words from the core spreadsheet. I think I might leave the definitions as they are, but as it is annoying you, you can do the following.

1. Find all godan verbs in the core words by entering this into the browser search box:
"edictgrammarSadv5 tag:CORE"
Don't type the quotation marks.

2. select these words, and then tag or Mark the words

3. Repeat for ichidan verbs
"edictgrammarSadv1 tag:CORE"

omit tag:CORE if you want to work on the whole deck and not just core words.

4. Select all tagged words then do search and replace - with the drop-down box limit the operation to the Meaning field. Tick the regular expressions box. In the search field put "(.+)" always omitting the quotes. In the replacement field put "to \1"

5. Repeat 4. with "^to to " in the search field and "to " in the replacement field, if you have (especially when working on non-core verbs) ended up with any Meaning fields like "to to dawn". In fact, in the core verbs, this one - 明ける - should be the only problem one, so you can edit it directly and omit step 5., I think. I tried (very briefly) to do 4. and 5. in a 1 step search and replace, but failed.
I hope this helps.
It's not as complicated as it sounds, but of course do a backup first and use the edit undo function if you need to.
Alternatively, looking at the codes after the definitions like v5k for -ku verbs v5s for -su, etc will also let you know when you have a verb. http://www.csse.monash.edu.au/~jwb/wwwjdicinf.html#code_tag = list of codes, mostly obvious, that you might come across.
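
For what it's worth, steps 4 and 5 can probably be folded into one pattern with a negative lookahead, so that definitions already starting with "to " are left alone. The sketch below uses Python's re module directly. I have not tried the same pattern in Anki's own find-and-replace box, and add_to_prefix is just an illustrative name.

```python
import re


def add_to_prefix(meaning: str) -> str:
    """Prefix a verb definition with "to " unless it already starts with "to "."""
    # (?!to ) is a negative lookahead: the match is skipped when the
    # field already begins with "to ", so "to dawn" is not doubled up.
    return re.sub(r"^(?!to )(.+)", r"to \1", meaning)


fixed_plain = add_to_prefix("show")
fixed_existing = add_to_prefix("to dawn")
fixed_lookalike = add_to_prefix("touch")
```

Note that this only touches the start of the field, so a definition like "show, display" becomes "to show, display" rather than "to show, to display".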

I'm not planning on frequently (or perhaps ever) updating the deck, but I will, in the next few days, repost one more time. I didn't properly identify which word in the verb pair field was the transitive or intransitive one, so I've added that info (thanks yudantaiteki for the useful spreadsheet). I've also changed the card template to get rid of a lot of the white space where fields are empty.

If there are any major systematic errors, though, please let me know. There will be some errors and duplications. I tried to avoid duplications like 便利 and 便利な, びっくり and びっくりする, etc, but some unwanted things have crept in. I'm just going to ignore them or correct as I go along, at least for now.


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Tori-kun - 2011-01-17

@rachels -- Once you decide to update the deck, I would really like you to include the "to"-thing for the benefit of all. I'm rather busy now and not that good with computers in general, so.. sorry for being unable to fix it on my own.

Another thing i would like to mention/suggest for improvement is the addition of する at nouns being able to build a form (unlike in the english language), I don't mean 洗濯する (to do the laundry) or stuff like that.. unfortunately i cannot find these nouns, but it does not make sense in english combining them with "to do" (suru in Japanese).

Oh, and I restructured my card layout; it looks better and seems to be more effective. The less information you have to keep in your mind as a "straight" beginner, the better. I did not delete anything, just switched things off and added the 'Kanji' template/model, which you can see below [Just a question. Heisig says you should always review the way it is displayed in vocab.png, but.. if I want to write or say something in Japanese, I need to know the words actively, right? So would YOU recommend me doing 'Kanji' -> see kanji.png picture? Or do you do that actually? Thanks for your help and replies.]

http://www.imagebanana.com/code/vdyia7gx/vocab.PNG
http://www.imagebanana.com/code/rkfi83co/kanji.PNG


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - rachels - 2011-01-17

I guess I could edit the verb definitions as you request. At least it would be consistent with the non-core verb definitions. I'll also tag the suru verbs for you, but not edit them in any way - they are used as nouns too, and I don't want them to appear twice when I can help it. They should already be marked vs-s or vs in the grammar field.
Use the deck any way you find useful. Your cards look fine and some sort of active production practise is important. I believe, as others here have said, that it is best to practice specific skills to get good at them – listening practice for improved listening skills, reading practice for improving your reading etc. People master them in different orders. Some have become fluent in the language before even learning kanji. Other people (on this forum) have said that they have gotten to the point where they can read most kanji words, without necessarily knowing what they mean. A balanced approach is probably best, but do what interests you most first. You can get to the same end by different paths.
The deck's got pretty much all the core spreadsheet info - so it can be used the same way as any core deck - with the flexibility to do different things with it later. To do sentence reading based on kanji frequency you should download the 'reset creation times' plugin to re-sort the cards on the k2001-index field or rtklitek2001-index field.
For myself, I try to do most of my learning by listening to stuff and only reviewing the words in Anki afterwards. I use a different deck for kanji and for grammar and will get back into KO2001 for reading (perhaps), but I use this deck to review/practise listening recognition of words (increasing passive vocabulary more than active vocabulary). I just listen to the Japanese word and try to come up with the meaning(s) in English. I also listen to the sentence before looking at the translation for extra listening practice, but don't mark myself right or wrong on it. Only if I feel inclined to do I look at the example sentences to get a better idea of how the word is used, or look at the homophone field - perhaps if I didn't get the meaning on the card, but think I got a correct meaning for it - just to clarify the situation. I find this approach quite useful for listening practice. For me, trying to learn several words with the same meaning at once tends to lead to confusion. I need to solidly learn the most important way of saying it first, eg あした before あす. However, learning several meanings for the same Japanese kana word - I like. Eg かいだん - stairs / conversation / ghost story. Without really trying, I tend to form a loose association or picture combining the elements, which helps me remember all the words more easily than one alone. I would be interested to know what others think about this and whether the homophone field seems useful to them too.


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Tori-kun - 2011-01-18

Talking about the homophone field it is actually quite irritating and confusing me, I must say honestly. I switched it off right now. My aim is just to get a broad amount of vocabulary (so the readings in fact, not the meaning of the kanji and its droppings, the so called Okurigana Tongue) into my head, in order to use them actively and understand them.
I think your idea of making stories with homophone readings can be really worthwhile and helpful, although i haven't tried for myself. Sounds reasonable and reliable!
Please let us know about the date you are updating the deck.. I wonder: I learnt like 500 new cards with this deck, how can I "update" this deck without losing the plan, which cards are due and which aren't, rachels? Would be a pity, if I had to go through them once again.

Edit: Ah, before I forget. Concerning the 'suru': there are also phrases in Japanese which are built with 'aru' or 'naru', whereas we use a "to do" phrase in English!
Just to list a few..
その問題に関する記事を読みました。 (this one is listed as expression with 'suru' included)
私は別に気になりません。 -- ?
もうお金が無くなりました。 -- ?

Whereas this one is rather clear ('naru' - to become): 彼は医者になりました。
Just for your information; thanks for the efforts!


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - walruz - 2011-01-20

Just a question: Definitions seem to be from JDIC, but where is the information in the "Meaning" field from?
I'm wondering specifically because 一切 is listed as "not at all, not one bit" in the Meaning field, but as "all/everything/without exception/the whole/entirely/absolutely" in the definition, which seems a little paradoxical?


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - rachels - 2011-01-20

The Meanings field is from the Core spreadsheets, or sometimes the other sources eg lists from textbooks or JLPT, ie any of the sources apart from Edict. In this case it was from the core spreadsheets (appearing in a negative sentence).
Kenkyusha's New Japanese-English Dictionary - 一切…しない never do; do not do at all.

@Tori-kun The changes I mentioned will probably appear by the middle of next week. (Currently travelling).


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - rachels - 2011-01-26

@ Tori-kun
On how to update the deck. If you downloaded it between 12 - 26.1.11, it's basically the same deck with just one extra field called TransIntrans at the very bottom of the list, and minor changes to some Meaning fields. So you could...

1. backup
2. download the new deck and export it to a text file.
3. add the new field to your deck.
5. make sure the 'key' field is set to 'Prevent Duplicates'. If you haven't changed this in the card layout, fields section, it should already be done.
6. go to the import menu and select the above text file. Don't do import, choose overwrite (all your progress will be kept). Your fields should match exactly and you won't need to change them. For the field to use for matching - change it from field 1 = Expression to field 3 = key.
ie. Field in File : Field 3
Field in Deck : key

It will take a few hours so perhaps run it overnight.

I've deleted about 50 - 100 duplicates in the last few days. If you want to capture those changes too - add in steps 4. and 7.
4. delete all unseen cards. Select unseen cards by typing is:new in the browser.
7. Then repeat 6. but import, not overwrite as above.

Add the info in TransIntrans to your card template if you wish to.
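
To show why the matching field has to be unique, here is a small model of what an overwrite-by-key does. This is only a sketch of the idea and not Anki's actual import code: the sample facts, the "due" value standing in for your progress, and the name overwrite_by_key are all made up for illustration.

```python
def overwrite_by_key(deck, rows, key="key"):
    """Update matching facts in place and return the rows that matched nothing."""
    # One entry per key value: if a key occurred twice in the deck,
    # only the last fact with that key would ever be updated.
    index = {fact[key]: fact for fact in deck}
    unmatched = []
    for row in rows:
        fact = index.get(row[key])
        if fact is None:
            unmatched.append(row)   # would need a normal import instead
        else:
            fact.update(row)        # only fields present in the row change
    return unmatched


# Made-up sample data; "due" stands in for existing review progress.
deck = [
    {"key": "k001", "Expression": "開ける", "Meaning": "open", "due": 12},
    {"key": "k002", "Expression": "見せる", "Meaning": "show", "due": 3},
]
rows = [
    {"key": "k001", "Meaning": "to open", "TransIntrans": "開く (intransitive)"},
    {"key": "k999", "Meaning": "to exist", "TransIntrans": ""},
]
unmatched = overwrite_by_key(deck, rows)
```

The first fact picks up the new Meaning and TransIntrans values while its "due" value is untouched, and the row with an unknown key is left over, which is roughly what the separate import in step 7 is for.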

-------------------------------------------------------------------------------------------------------------------
If you have a corePLUS deck from earlier - all the fields don't match and you'll just have to look through and select what you want. Or for a quick and messy approach - Note I haven't tested this..

1. backup
2. download the new deck.
4. delete all unseen cards from your deck (that you haven't created yourself). Select by typing is:new in the browser.
7. Then import anki deck, not text file. You'll have to use Expression as the matching field here as the key field does not exist in the old version of the deck. Expression isn't always unique though.

Perhaps I should test this. You might end up with duplicates if the models in your new and old decks don't match ?? Does anyone know?
-------------------------------------------------------------------------------------------------------------------
No more changes are coming - unless I notice some significant errors.


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Teskal - 2011-01-27

How big is the File?

I see the size 26333.78, but anki in the 1.2 version downloaded already over 50000 and is not stopping...

I got error:
Traceback (most recent call last):
File "C:\cygwin\home\dae\Home\anki\win\build\pyi.win32\anki\outPYZ1.pyz/ankiqt.ui.getshared", line 205, in accept
File "C:\cygwin\home\dae\Home\anki\win\build\pyi.win32\anki\outPYZ1.pyz/ankiqt.ui.getshared", line 233, in handleFile
File "C:\cygwin\home\dae\Home\anki\win\build\pyi.win32\anki\outPYZ1.pyz/ntpath", line 108, in join
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 1: ordinal not in range(128)
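
For reference, the last line of the traceback can be reproduced outside Anki. The sketch below assumes (a guess, not something confirmed in this thread) that a path component contained a non-ASCII character stored as UTF-8 bytes. 0xe3 is the first byte of every hiragana and katakana character in UTF-8, though other encodings also use that byte. It runs on Python 3, whereas Anki 1.2 ran on Python 2, where this kind of decode happened implicitly when byte strings and unicode strings were joined.

```python
# One ASCII character followed by a hiragana character, as UTF-8 bytes.
raw = "aあ".encode("utf-8")

failure = None
try:
    raw.decode("ascii")          # ASCII only covers bytes 0x00-0x7f
except UnicodeDecodeError as err:
    failure = err

# Position and value of the byte that could not be decoded.
bad_position = failure.start
bad_byte = raw[bad_position]

# Decoding with the encoding that was actually used works fine.
decoded = raw.decode("utf-8")
```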


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Tori-kun - 2011-01-27

@Teskal -- The anki deck is 212.450kb big. (media not counted for core f.e.!)


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Valentina - 2011-01-27

I am sorry to say that I got the same error as Teskal. I thought it was just me.


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - Tori-kun - 2011-01-27

So you get the error while downloading? I hadn't had these problems in the early afternoon today.. Strange. Or are you talking about the converting?


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - rachels - 2011-01-28

I'll check the file and re-package/re-upload. Sorry if there has been some sort of error.


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - rachels - 2011-01-29

Checked the deck and it seems to have been OK.
However, something went awry in the process of exporting a packaged deck for upload. Some but not all of the sound files inadvertently got in there, but less than 2 MB, so that was only part of the problem.
Anyway, the deck has been repackaged, re-uploaded and successfully downloaded. Let me know if there are any further problems.
I upgraded from 1.2.4 to Anki 1.2.5 a few days ago, so see below. Basically the download of the deck will be bigger - 52555.94KB. The deck itself will be the same size.

"Changes in 1.2.5
* When exporting decks for the shared decks area, the decks are no longer stripped of their caching information. This makes them slightly bigger, but means that no lengthy unpacking step is necessary when they are downloading, which can be particularly painful on slower computers and mobile devices. "


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - rachels - 2011-05-10

Can someone let me know if this deck displays OK on the iPhone/iPod? Also, is the ruby/furigana showing up OK there? Still ironing out some issues on the Android platform...

edit - deck displays fine in Ankidroid now - since about late 2011


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - beans - 2011-05-29

Thank you rachels, I like this deck! However, I found some facts with 感 in the Expression field that don't make sense to me. Their Reading / Meaning fields contain the following:

うん fortune,luck
しまった Damn it!
はい wear,put on (sword)

They are all non-Core facts. So probably some bug in the script for creating non-Core facts? Which would mean there might be some more incorrect facts in the deck?


Large vocab deck with tags for Genki, IJ, Tobira, JLPT plus homophones - rachels - 2011-05-29

It's a big deck and there are bound to be some mistakes. I couldn't possibly verify everything that I imported. For example, for some of the textbook words I imported, I tried to remove some of the particles like に, and ended up changing
難い (にくい) (aux-adj) difficult, hard, (P)
into くい
Did I fix that one in the latest version of the shared deck? I hope so.
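
A guess at how that kind of slip happens, as a sketch: removing the particle character wherever it occurs also eats the に at the start of にくい. Removing it only when it stands alone as a separate word avoids that. The input format (particle separated by a space) and both function names are assumptions for illustration, not the script that was actually used.

```python
def strip_particle_naive(entry: str, particle: str = "に") -> str:
    """Remove every occurrence of the particle character (too aggressive)."""
    return entry.replace(particle, "")


def strip_particle_token(entry: str, particle: str = "に") -> str:
    """Remove the particle only where it is a whitespace-separated word."""
    return " ".join(tok for tok in entry.split() if tok != particle)


damaged = strip_particle_naive("にくい")
intact = strip_particle_token("にくい")
cleaned = strip_particle_token("に 会う")
```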

For the words you mentioned - the tags indicate that they came from the jlpt site I used
http://www.tanos.co.uk/jlpt/
and looking at the word list from that site it seems that some of the words written in kana have been marked with (感) in one of the columns.
They made their way into the deck, since, as noted above, I'm afraid I haven't looked at every word. So thanks for pointing this out.
Considering how it happened, if you put this search string into the browser
Expression:感 tag:jlpt* -Expression:感_* -Expression:_*感
it should find the problem words.
It looks like there might be only 4 affected words, the 3 you mentioned plus
ね (感) value,price,cost,worth,merit.
I'll try and fix them next time I have to post the deck. I'll also have another look at the data files I used. Please let me know if you find similar problems.
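
As I read it, the intent of the search string above is "Expression is just 感 on its own, and the fact has a jlpt tag". The same filter can be sketched in plain Python over exported facts. The tag names and the two control facts are made up, and is_suspect is an illustrative name. This models the intent only, not Anki's search syntax.

```python
def is_suspect(fact: dict) -> bool:
    """True when Expression is exactly 感 and some tag starts with "jlpt"."""
    has_jlpt_tag = any(tag.startswith("jlpt") for tag in fact["tags"].split())
    return fact["Expression"] == "感" and has_jlpt_tag


# The four readings come from this thread; the tags are hypothetical.
facts = [
    {"Expression": "感", "Reading": "うん", "tags": "jlpt2"},
    {"Expression": "感", "Reading": "しまった", "tags": "jlpt3"},
    {"Expression": "感", "Reading": "はい", "tags": "jlpt1"},
    {"Expression": "感", "Reading": "ね", "tags": "jlpt2"},
    {"Expression": "感じる", "Reading": "かんじる", "tags": "jlpt4 CORE"},  # control
    {"Expression": "感", "Reading": "かん", "tags": "CORE"},                # control
]

suspect_readings = [f["Reading"] for f in facts if is_suspect(f)]
```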