kanji koohii FORUM
Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Printable Version


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Nayr182 - 2014-08-19

Nayr's Core5000 is no longer supported.

An existing copy can be found here: https://www.dropbox.com/s/srgy6alqsqb52dg/Core5000_v2.5.apkg?dl=0

To follow my new project, please visit http://www.unlockjapanese.com


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Rotasu - 2014-08-19

IMO,

I would find this deck more useful if the word were highlighted or bold on the front side. I also don't much like the romaji on the back side next to the word. It would have been nice if it were in kana, maybe with a line break before and after it.

I don't know about everyone else, but I like to see 'noun' rather than 'n.' or 'particle' rather than just 'p.'. Looking through the cards, there are some shortened words I don't even know, like 'cp.' and 'adn.'

Interesting deck to look through once I'm done with learning all my vocab from my current deck.


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Nayr182 - 2014-08-20

Thanks for your reply, I agree about the romaji. Unfortunately that's how it is in the book, and this deck is a verbatim copy of the content within that book.

I'll make a list of all the shortened forms to help people understand what they mean, as it would take too long to change them all manually.


Thanks for the feedback.


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - jasberg - 2014-08-20

Thanks for the deck. It looks like it has a lot of quality content. It would be amazing if it had native audio for words and sentences.


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Nayr182 - 2014-08-20

Abbreviations used in the deck:

adn. - adnominal
adv. - adverb
aux. - auxiliary
conj. - conjunction
cp. - compound
i-adj. - i-adjective
interj. - interjection
n. - noun
na-adj. - na-adjective
num. - numeral
p. - particle
p. case - case particle
p. conj. - conjunctive particle
p. disc. - discourse particle
pron. - pronoun
v. - verb
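
For anyone who would rather see the full names on their cards, the list above can be turned into a simple lookup table. This is only a minimal Python sketch: the mapping is copied from the list above, while the function name `expand_pos` is a hypothetical helper and not something that ships with the deck.

```python
# Abbreviations and expansions copied from the list above.
POS_ABBREVIATIONS = {
    "adn.": "adnominal",
    "adv.": "adverb",
    "aux.": "auxiliary",
    "conj.": "conjunction",
    "cp.": "compound",
    "i-adj.": "i-adjective",
    "interj.": "interjection",
    "n.": "noun",
    "na-adj.": "na-adjective",
    "num.": "numeral",
    "p.": "particle",
    "p. case": "case particle",
    "p. conj.": "conjunctive particle",
    "p. disc.": "discourse particle",
    "pron.": "pronoun",
    "v.": "verb",
}


def expand_pos(tag: str) -> str:
    """Return the full part-of-speech name, or the tag unchanged if unknown."""
    return POS_ABBREVIATIONS.get(tag.strip(), tag)
```

Unknown tags are passed through untouched, so a tag that is missing from the list would stay visible rather than silently disappear.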


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Stansfield123 - 2014-08-20

Nayr182 Wrote:Thanks for your reply, I agree about the romaji. Unfortunately that's how it is in the book, and this deck is a verbatim copy of the content within that book.

I'll make a list of all the shortened forms to help people understand what they mean, as it would take too long to change them all manually.
It's fine that the kana, romaji, and notes are there; the only issue is that they're in the same field.

But it's nothing to worry about, I'm sure someone will fix that once the full deck is up. All it takes is a little programming to separate the first word from the rest of the text, in that field.
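
As a rough idea of how little programming that is, here is a minimal Python sketch. It assumes the field looks like the examples quoted later in the thread (for instance "若い wakai i-adj. young"), with the Japanese word first and whitespace after it. The function name `split_first_word` is hypothetical.

```python
def split_first_word(field: str) -> tuple[str, str]:
    """Split a combined field into (first word, everything after it).

    str.split() with no separator splits on any Unicode whitespace,
    so a full-width space after the Japanese word is handled too.
    """
    parts = field.split(None, 1)
    if not parts:
        return "", ""
    word = parts[0]
    rest = parts[1].strip() if len(parts) > 1 else ""
    return word, rest
```

Entries whose first element itself contains a space would need extra handling; this only covers the simple case.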


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - aldebrn - 2014-08-20

Nayr182 Wrote:**Have started the process of adding native Japanese voice to all cards. Will update deck when completed.**
Curious about how you swung this. Kidnapped a busload of Japanese tourists? :P


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - john555 - 2014-08-20

Nayr182 Wrote:Thanks for your reply, I agree about the romaji. Unfortunately that's how it is in the book, and this deck is a verbatim copy of the content within that book.

I'll make a list of all the shortened forms to help people understand what they mean, as it would take too long to change them all manually.


Thanks for the feedback.
Some of us, though (e.g., me), actually like seeing romaji at the same time as we are learning vocabulary.

The way I'm studying the new vocabulary for each lesson in my textbook is as follows:

1. I look at the English word.

2. I write the English word in Japanese, using kanji plus, where applicable, okurigana. As I write the word, I say it out loud.

3. I check my vocalization against the romaji version to make sure I got it right.


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Nayr182 - 2014-08-20

aldebrn Wrote:
Nayr182 Wrote:**Have started the process of adding native Japanese voice to all cards. Will update deck when completed.**
Curious about how you swung this. Kidnapped a busload of Japanese tourists? :P
My wife is Japanese, just have to suck up a bit more than usual :p


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - RawToast - 2014-08-21

Nayr182 Wrote:Another deck I have made in the past is a Genki 1 and 2 deck with native audio.
My beginner Genki deck can be found here: https://ankiweb.net/shared/info/3894365375
Bit of a hijack.

I remember trying that Genki deck, the audio was great -- but the vocab seemed to be changed. Perhaps these are the differences between edition 1 and 2 of the book or your wife provided more natural sentences? This review summed it up quite well:

Quote:Already within chapter 1 I've noticed some odd things:

< snip -- some name grumble I don't care about >

I can understand replacing kana with basic-level kanji, e.g. さかな to 魚, but for grammar points changing もらう to 貰う is just going to cause confusion. Not only is the kana form more commonly used, but you won't see 貰う until you reach N1 level materials.

せんもん (専門) (specialty) has been replaced with a rarer word not found in Genki - I believe 専業, I've already changed this in mine. I find this odd as if you used this deck after studying per chapter you would have no clue. The kanji for せんもん hasn't even been shown in the book!
And back to the great resource you've provided :) It may be possible to use a regular expression to remove the roumaji, or to add a tab/comma so that it's moved to another field.


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Nayr182 - 2014-08-21

RawToast Wrote:
Nayr182 Wrote:Another deck I have made in the past is a Genki 1 and 2 deck with native audio.
My beginner Genki deck can be found here: https://ankiweb.net/shared/info/3894365375
Bit of a hijack.

I remember trying that Genki deck, the audio was great -- but the vocab seemed to be changed. Perhaps these are the differences between edition 1 and 2 of the book or your wife provided more natural sentences? This review summed it up quite well:

Quote:Already within chapter 1 I've noticed some odd things:

< snip -- some name grumble I don't care about >

I can understand replacing kana with basic-level kanji, e.g. さかな to 魚, but for grammar points changing もらう to 貰う is just going to cause confusion. Not only is the kana form more commonly used, but you won't see 貰う until you reach N1 level materials.

せんもん (専門) (specialty) has been replaced with a rarer word not found in Genki - I believe 専業, I've already changed this in mine. I find this odd as if you used this deck after studying per chapter you would have no clue. The kanji for せんもん hasn't even been shown in the book!
And back to the great resource you've provided :) It may be possible to use a regular expression to remove the roumaji, or to add a tab/comma so that it's moved to another field.
Yeah I have read those comments.

He is correct: I used a lot of kanji that aren't contained in the Genki 1 and 2 books, but I did that on purpose as I wanted to study the sentences in their fully kanji-fied natural form.

Back when I made the deck (12 months or so ago) I may have accidentally written a kanji-fied version of a word that is commonly seen in furigana. But if it was anything too strange my wife would most likely have picked it up, and now I can read both the kanji and furigana versions, so I don't really see it as a problem.

Genki 1 and 2 only introduce 317 kanji; I think my Genki decks introduced 750-ish, from memory. (It has been a while since I last looked at the deck.)

I went from a total beginner, did nothing but learn hiragana, katakana and RTK, and was able to get through the deck. Sure, it was challenging at times, but nothing too bad.

I used the 2nd edition of Genki 1 and 2 which uses 専攻 instead of 専門, and I can only assume there are other differences between versions.


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - RawToast - 2014-08-21

Quote:He is correct: I used a lot of kanji that aren't contained in the Genki 1 and 2 books, but I did that on purpose as I wanted to study the sentences in their fully kanji-fied natural form.
Coming from RTK, many people opt for that approach :) The IMEs will do that too if you're not careful.

Look forward to the new deck; the audio in the Genki deck was brilliant. I still have it on my phone for random listening :)


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - aldebrn - 2014-08-21

RawToast Wrote:And back to the great resource you've provided :) It may be possible to use a regular expression to remove the roumaji, or to add a tab/comma so that it's moved to another field.
I threw together a web-browser-based display for APKG files because I wanted a closer look at this deck: http://fasiha.github.io/fuzzy-anki/

(It's "web-browser-based", not "web-based", since it's all client-side JavaScript: you're not uploading an APKG file to anyone except yourself, so you could use this to view personal decks in APKG files.)

It'll actually detect this Core5000 deck and make some changes to it: it'll find the "[kana readings]" and put them in "span" HTML tags that can then be styled with CSS. I've temporarily chosen to just make them smaller. (If you upload the APKG file you downloaded from Ankiweb.net to the app, you'll see what I mean.) I'm working (on the side) on a regular expression to deal with the roumaji in a similar way (put it in span tags so people can choose to hide it entirely, make it small, surround it with brackets, whatever), but it's a bit complicated because you have multiple words, e.g.,
Quote:人 hito n. person, people, human being
若い wakai i-adj. young
and sometimes there are multiple roumaji separated by commas before the part-of-speech tag. I'll work something out. The changes the app makes can't yet be exported as a modified APKG file (the tool is about twelve hours old), but I will probably add that. Also useful might be a manual, non-programmatic way to edit the contents one-at-a-time like in Anki's horrible browser, that might come later too.

But all that is boring technical preface to this: I like this deck! I fully appreciate the shortcomings of frequency-based approaches, thanks to erlog, who said back in 2009,
erlog Wrote:A kanji's frequency of appearing has little to do with how important it is for understanding. In fact, you could make the case that frequency and importance have an inverse relationship. The less frequent kanji are probably more important because they are only used when they are necessary. The same goes for words.
...
How important is the word Monday in this sentence: Monday, my father died. -or- Yesterday, I started choking while eating an apple.

Those mundane words are the most common like Monday, my, father, apple, and eating. The less common words are where the true meaning of the sentence lies.
But nonetheless, this is useful for me. The sentences are funny :)
Quote:ケーキとかチョコレートばかり 食[た]べるから、 君[きみ]は 太[ふと]るんだ。 You gain weight, because you always eat cake and chocolate.



Nayr's Core5000 deck (Frequency Dictionary of Japanese) - aldebrn - 2014-08-21

Nayr182, is there supposed to be a "<" in this one?: ございます (<ござる) gozai masu v. [very polite form of “de aru”]

Also, #889 has this number in the "Word" field: 889 体験(する) taiken(suru) n. experience v. experience, have experience of


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Nayr182 - 2014-08-21

aldebrn Wrote:Nayr182, is there supposed to be a "<" in this one?: ございます (<ござる) gozai masu v. [very polite form of “de aru”]

Also, #889 has this number in the "Word" field: 889 体験(する) taiken(suru) n. experience v. experience, have experience of
Thanks, I'll have a look tonight and make sure to fix it up when I release the next update (which will include voice).
Have added 100 voices so far, so shouldn't be too far away.


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - aldebrn - 2014-08-22

There's three reasons why the "Word" field was problematic:
1- "人 hito n. person, people, human being" and "若い wakai i-adj. young" on the same field
2- "いろいろ iroiro adv., na-adj. various" having two parts of speech
3- "余り amari adv. the rest n. (not) much" and many others like this, with more than one part-of-speech/translation pairs.

But I think I got it working where each of the four pieces of information in this Word field got tagged according to what it was (kana/kanji, roumaji, part-of-speech, or translation). And the techniques used should hopefully work when you update the deck. For now, and for this deck in particular, those four sub-fields are just colored differently by Fuzzy-Anki to easily verify that the algorithm worked. Try uploading the APKG (as it exists on Ankiweb.net) to http://fasiha.github.io/fuzzy-anki/

Personally I'd like to see this tagging of both (1) kana in the Reading field, and (2) the different parts of the Word field, with "span" HTML tags in the final deck, leaving users free to style them however they want: hide roumaji, expand part-of-speech abbreviations, etc. You can even make the deck's default styling be so it exactly mimics what it is currently (kana surrounded by [] in Reading), so it'll still look the same to you. If you agree and don't want to do this tagging yourself, I can figure out how to export the data as an APKG or CSV, before or after your next update.

Thanks for the deck and the fun programming mini-project.
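
For readers curious what this kind of tagging could look like, here is a minimal Python sketch. It is not the Fuzzy-Anki code, which is not shown in this thread. It assumes every entry has the shape "word roumaji POS translation [POS translation ...]" and uses the abbreviation list posted earlier to find the part-of-speech tags. The function names and the CSS class names (`word`, `roumaji`, `pos`, `translation`) are hypothetical.

```python
import re
from html import escape

# Part-of-speech abbreviations taken from the list posted earlier in the thread.
POS_TAGS = [
    "adn.", "adv.", "aux.", "conj.", "cp.", "i-adj.", "interj.", "n.",
    "na-adj.", "num.", "p.", "p. case", "p. conj.", "p. disc.", "pron.", "v.",
]

# Longest tags first, so "p. conj." is tried before the shorter "p.".
_TAG = "|".join(re.escape(t) for t in sorted(POS_TAGS, key=len, reverse=True))

# One or more comma-separated tags delimited by whitespace,
# e.g. "n." or "adv., na-adj.".
POS_RUN = re.compile(rf"(?<!\S)(?:{_TAG})(?:,\s*(?:{_TAG}))*(?=\s|$)")


def parse_word_field(field: str):
    """Split a Word field into (word, roumaji, [(pos, translation), ...])."""
    field = field.strip()
    matches = list(POS_RUN.finditer(field))
    if not matches:
        raise ValueError(f"no part-of-speech tag found in {field!r}")
    head = field[: matches[0].start()].split(None, 1)
    if not head:
        raise ValueError(f"no word before the first tag in {field!r}")
    word = head[0]
    roumaji = head[1].strip() if len(head) > 1 else ""
    senses = []
    for i, match in enumerate(matches):
        # The translation runs until the next tag, or to the end of the field.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(field)
        senses.append((match.group(0), field[match.end():end].strip()))
    return word, roumaji, senses


def tag_word_field(field: str) -> str:
    """Wrap each sub-field in a span so it can be styled or hidden with CSS."""
    word, roumaji, senses = parse_word_field(field)
    parts = [f'<span class="word">{escape(word)}</span>']
    if roumaji:
        parts.append(f'<span class="roumaji">{escape(roumaji)}</span>')
    for pos, translation in senses:
        parts.append(f'<span class="pos">{escape(pos)}</span>')
        parts.append(f'<span class="translation">{escape(translation)}</span>')
    return " ".join(parts)
```

This sketch handles one entry at a time, so a field holding two entries (reason 1 above) would have to be split first. It would also misfire on a translation that happens to contain a standalone token such as "n.".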


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Stansfield123 - 2014-08-22

aldebrn Wrote:There's three reasons why the "Word" field was problematic:
1- "人 hito n. person, people, human being" and "若い wakai i-adj. young" on the same field
2- "いろいろ iroiro adv., na-adj. various" having two parts of speech
3- "余り amari adv. the rest n. (not) much" and many others like this, with more than one part-of-speech/translation pairs.

But I think I got it working where each of the four pieces of information in this Word field got tagged according to what it was (kana/kanji, roumaji, part-of-speech, or translation). And the techniques used should hopefully work when you update the deck. For now, and for this deck in particular, those four sub-fields are just colored differently by Fuzzy-Anki to easily verify that the algorithm worked. Try uploading the APKG (as it exists on Ankiweb.net) to http://fasiha.github.io/fuzzy-anki/

Personally I'd like to see this tagging of both (1) kana in the Reading field, and (2) the different parts of the Word field, with "span" HTML tags in the final deck, leaving users free to style them however they want: hide roumaji, expand part-of-speech abbreviations, etc. You can even make the deck's default styling be so it exactly mimics what it is currently (kana surrounded by [] in Reading), so it'll still look the same to you. If you agree and don't want to do this tagging yourself, I can figure out how to export the data as an APKG or CSV, before or after your next update.

Thanks for the deck and the fun programming mini-project.
Ideally, they should be moved into separate fields rather than just tagged. That's how Anki is meant to work, you shouldn't have to use html to organize content.

Plus, tags are not as flexible as fields. You can display them differently, or even make them disappear, but you can't move them.


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - aldebrn - 2014-08-22

Stansfield123 Wrote:Ideally, they should be moved into separate fields rather than just tagged. That's how Anki is meant to work, you shouldn't have to use html to organize content.
I forgot to address that. Yes, the word and roumaji can, and should be, put in separate fields, but what about the parts-of-speech and translations, of which there can be one or two, as in "adv. the rest n. (not) much"? I'd hate to see "POS1, Translation1, POS2, Translation2", especially since in the next 4000 he might want to put a third pair... Those two, I fear, may have to stay together in one field.

Stansfield123 Wrote:Plus, tags are not as flexible as fields. You can display them differently, or even make them disappear, but you can't move them.
Let me introduce you to my little friends removeChild and appendChild :D But you're probably right, nobody wants to see a JavaScript-scarred template in Anki, or have to learn JavaScript just to switch the order of two fields. (I personally like JavaScript in Anki templates since it means I don't ever have to open the awful Anki card browser/editor: I edit the facts in a deck from a nicely formatted-for-humans Markdown file which is converted to JSON and served by a local webserver to Anki's browser. Don't ask for this to work on Ankiweb or mobile, though that might be a very nice project...)


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Nayr182 - 2014-08-22

aldebrn Wrote:There's three reasons why the "Word" field was problematic:
1- "人 hito n. person, people, human being" and "若い wakai i-adj. young" on the same field
2- "いろいろ iroiro adv., na-adj. various" having two parts of speech
3- "余り amari adv. the rest n. (not) much" and many others like this, with more than one part-of-speech/translation pairs.

But I think I got it working where each of the four pieces of information in this Word field got tagged according to what it was (kana/kanji, roumaji, part-of-speech, or translation). And the techniques used should hopefully work when you update the deck. For now, and for this deck in particular, those four sub-fields are just colored differently by Fuzzy-Anki to easily verify that the algorithm worked. Try uploading the APKG (as it exists on Ankiweb.net) to http://fasiha.github.io/fuzzy-anki/

Personally I'd like to see this tagging of both (1) kana in the Reading field, and (2) the different parts of the Word field, with "span" HTML tags in the final deck, leaving users free to style them however they want: hide roumaji, expand part-of-speech abbreviations, etc. You can even make the deck's default styling be so it exactly mimics what it is currently (kana surrounded by [] in Reading), so it'll still look the same to you. If you agree and don't want to do this tagging yourself, I can figure out how to export the data as an APKG or CSV, before or after your next update.

Thanks for the deck and the fun programming mini-project.
Thanks for showing so much interest in my deck, and for going to the effort of making a program to sort out the fields etc.

For now though I am just focusing my energy on making the deck and getting it out there before I start changing it around. Once I have the full 5000 cards loaded with voice please feel free to change, optimise, alter or do anything you like.

I personally don't mind the current layout. When reviewing I only really glance at the answer side for a split second, so the romaji, whilst not ideal, doesn't really bother me.


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Stansfield123 - 2014-08-22

aldebrn Wrote:I forgot to address that. Yes, the word and roumaji can, and should be, put in separate fields, but what about the parts-of-speech and translations, of which there can be one or two, as in "adv. the rest n. (not) much"? I'd hate to see "POS1, Translation1, POS2, Translation2", especially since in the next 4000 he might want to put a third pair... Those two, I fear, may have to stay together in one field.
Yes, of course, that makes sense. And you're right about the Javascript, I forgot you can do that with Anki. But still...I'm a programmer, and I'm yet to use script tags in my Anki template. It's probably best if people can just download a deck with separate fields.


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - murtada - 2014-08-22

Thanks a ton for this deck


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Nayr182 - 2014-08-22

deleted


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Nayr182 - 2014-08-26

aldebrn Wrote:I fully appreciate the shortcomings of frequency-based approaches, thanks to erlog, who said back in 2009,
erlog Wrote:A kanji's frequency of appearing has little to do with how important it is for understanding. In fact, you could make the case that frequency and importance have an inverse relationship. The less frequent kanji are probably more important because they are only used when they are necessary. The same goes for words.
...
How important is the word Monday in this sentence: Monday, my father died. -or- Yesterday, I started choking while eating an apple.

Those mundane words are the most common like Monday, my, father, apple, and eating. The less common words are where the true meaning of the sentence lies.
There is some truth in erlog's words. However, I believe all words hold the same amount of importance in a language.

Whilst it's true "Monday, my, father, apple, and eating" aren't the most 'important' words of the sentence, if one doesn't know their meaning then that's 80% of the sentence you don't know.

If I had a choice between knowing all of the common words that are used 80% of the time and only having to research the meaning of 1 'important' word, or knowing the important words but not the remaining 80% of the sentence, I know what I would choose.


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Termy - 2014-08-29

Is there any easy way to mirror all the cards into audio-first cards as well, and also mix them up so it's like 1 sentence-first card and then 1 audio card (not the same sentence on them though), or perhaps 3 of each in a row or something like that?


Nayr's Core5000 deck (Frequency Dictionary of Japanese) - Stansfield123 - 2014-08-29

aldebrn Wrote:But all that is boring technical preface to this: I like this deck! I fully appreciate the shortcomings of frequency-based approaches, thanks to erlog, who said back in 2009,
erlog Wrote:A kanji's frequency of appearing has little to do with how important it is for understanding. In fact, you could make the case that frequency and importance have an inverse relationship. The less frequent kanji are probably more important because they are only used when they are necessary. The same goes for words.
...
How important is the word Monday in this sentence: Monday, my father died. -or- Yesterday, I started choking while eating an apple.

Those mundane words are the most common like Monday, my, father, apple, and eating. The less common words are where the true meaning of the sentence lies.
That's nice. However, the fact remains (fact backed by some rather basic math) that, if one wishes to rely on immersion (comprehensible input, or even output) to some extent, rather than purely studying, the optimal way to study a language is to study the most frequent and mundane morphemes first. That is how one will ensure that increasingly complex native materials are comprehensible as quickly as possible.

Furthermore, a person is most likely to achieve full understanding of a sentence like "Monday, my father died." (or any other sentence not involving a specialized field) by studying morphemes in order of frequency. Again, fairly basic math. Yes, that person will learn the word "died" last out of the four, but it will likely still be faster than a person studying in some other order.
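
The post doesn't spell out the "basic math". One common way to illustrate it is to assume word frequencies follow a Zipf-like distribution (frequency proportional to 1 / rank) and ask what share of running text the most frequent words cover. The Python sketch below rests entirely on that assumption: the function name `zipf_coverage`, the vocabulary size, and the exponent are illustrative and not figures from the Frequency Dictionary.

```python
def zipf_coverage(known: int, vocabulary: int, exponent: float = 1.0) -> float:
    """Share of running text covered by the `known` most frequent words,
    assuming word frequency is proportional to 1 / rank**exponent."""
    weights = [1.0 / rank ** exponent for rank in range(1, vocabulary + 1)]
    return sum(weights[:known]) / sum(weights)


if __name__ == "__main__":
    # Hypothetical 50,000-word vocabulary; compare different study depths.
    for known in (1000, 2000, 5000):
        print(known, round(zipf_coverage(known, 50_000), 3))
```

Under this assumption each extra word learned in frequency order adds the largest possible share of coverage, which is the intuition behind studying the most frequent items first. Real corpora only approximately follow such a curve.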