Vocabulary building?

#76
I would say you could stop at 20k. 40k is insanely high and would be at the level of a well-educated native. 30k everyone knows. Whoever said that a native English speaker probably only knows 12k was talking about a 10-year-old.

Yeah, children are fluent. They are also children. They do not function like adults in the adult world, and whilst there are many factors that contribute to that, language ability is certainly one of them. So, if you are happy with the language ability of a child, fine. Don't expect to be able to function like an adult with the language ability of one.
#77
You guys are probably all right about the speed being too slow. I just don't think I would even be ABLE to go faster.

I don't use premade decks, I make all my own cards. I have to look up a word in my pocket Kenkyusha and electronic Kenkyusha and combine their definitions and then put that into my internal wiki, complete with example sentences. Then I gotta categorize those five into categories and stuff on my wiki based on their lexical category and JLPT level, and sometimes a word will have the same reading as another one so I'll have to make a disambiguation page and stuff. This alone can take like 45 minutes sometimes for five.

Then I have to add the cards to Anki, front and back (for both Kanji and Kana readings, with meaning and the opposite reading always on the back), and then review. At this point my number of cards is already up to like 200 or something if I've been actively adding five every day. I get my sonomama DS dictionary and write the reverse side of every single card, so this takes me an hour sometimes. I always pick hard so that doesn't always help.

So optimally I'd be done in like two hours... that's fine I guess, but it doesn't always go so fast. I'll always get distracted by something, or end up reading a Wikipedia article for half an hour or some shit, or watch some YouTube video. Then it's already been 4 hours and I'm not even done yet. 35 is like 7x this, it's absolutely unrealistic for me. I would die.
Edited: 2011-05-17, 7:23 am
#78
That doesn't sound much different than my pace, even though my methods differ wildly. I use iKnow right now (again, I came back) because it seems to make them stick the best for me. I might learn 30 new words in a day, but then the next week is spent reviewing them and not learning other new ones.

As for making your own cards, I think that would be the only way that might stick as well as iKnow for me, but I just haven't got the patience.

In short, I'm saying: Don't fret it. Go your own pace and feel good about your accomplishments. Don't focus on what others are doing, or claim to be doing.
#79
KMDES Wrote:The average English native probably only knows between 12k-20k anyway.
eh? I don't want to doubt you but I quote:

"The average speaker knows, as a low ball-estimate, about 60,000 words. I think the proper estimate is closer to 80,000..."

Source: Paul Bloom, Lecture 6 of the Yale "Intro Psych" lectures on iTunes U (about 26 minutes into the lecture).

(Although it is a little bit vague, and he could be talking about words in the sense of morphemes.)
Edited: 2011-05-17, 7:47 am
#80
wccrawford Wrote:Don't focus on what others are doing, or claim to be doing.
I agree, but it is useful to try out the methods that people suggest.
#81
caivano Wrote:
wccrawford Wrote:Don't focus on what others are doing, or claim to be doing.
I agree, but it is useful to try out the methods that people suggest.
I'll agree with 'can be useful'. It can also be a major waste of time, distract you from your studies, and lead you down a spiraling path to despair.

It's important to find balance between doing what you know works, and experimenting with new things.
#82
Coreth Wrote:You guys are probably all right about the speed being too slow. I just don't think I would even be ABLE to go faster.

I don't use premade decks, I make all my own cards. I have to look up a word in my pocket Kenkyusha and electronic Kenkyusha and combine their definitions and then put that into my internal wiki, complete with example sentences. Then I gotta categorize those five into categories and stuff on my wiki based on their lexical category and JLPT level, and sometimes a word will have the same reading as another one so I'll have to make a disambiguation page and stuff. This alone can take like 45 minutes sometimes for five.

Then I have to add the cards to Anki, front and back (for both Kanji and Kana readings, with meaning and the opposite reading always on the back), and then review. At this point my number of cards is already up to like 200 or something if I've been actively adding five every day. I get my sonomama DS dictionary and write the reverse side of every single card, so this takes me an hour sometimes. I always pick hard so that doesn't always help.

So optimally I'd be done in like two hours... that's fine I guess, but it doesn't always go so fast. I'll always get distracted by something, or end up reading a Wikipedia article for half an hour or some shit, or watch some YouTube video. Then it's already been 4 hours and I'm not even done yet. 35 is like 7x this, it's absolutely unrealistic for me. I would die.
Coreth buddy, all that complicated stuff you're doing is largely unnecessary; it's no wonder it's taking you so long. Certainly go at your own pace, but I'd advise against giving yourself all that extra work. I almost never bother SRSing now, but when I did, I could easily add 50-100 or more a day depending on how many words I jotted down that day. It would take maybe 10-20 minutes to create the cards and then I would just review them in 5min sessions whenever I felt like it. Card creation time: a couple of seconds. Card review time: a couple of seconds. Just enter the words you collect into an empty text field such as here and use Rikaichan's 's' feature to save them to a text file for import into Anki. You'll make more progress rapidly reviewing 100 words in simple J->E fashion, even if you forget half of them (which in itself is no big deal because you didn't spend much time making them), than trying to nail down every nuance of 5 words. The more words you have loaded into your short-termish memory in this way, the greater the chance of hearing/reading them in the wild, shifting them further into your long-term memory through real use, and the better you'll get at understanding what you're reading and listening to. Not to mention noting down large amounts of words and kanji throughout the day kicks arse for your writing skills.
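
For anyone wondering what the import file amounts to, here is a minimal sketch in Python. It assumes a plain tab-separated layout (word, reading, meaning), which Anki's text import can read; the function name `to_anki_tsv` and the sample entries are made up for illustration, and the file Rikaichan saves may be laid out differently.

```python
import csv
import io


def to_anki_tsv(entries):
    """Render (word, reading, meaning) tuples as tab-separated text, one card per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for word, reading, meaning in entries:
        writer.writerow([word, reading, meaning])
    return buf.getvalue()


# Hypothetical words jotted down during the day.
collected = [
    ("誤報", "ごほう", "misinformation; incorrect report"),
    ("号外", "ごうがい", "newspaper extra"),
]

tsv_text = to_anki_tsv(collected)
print(tsv_text, end="")
```

Writing `tsv_text` to a UTF-8 file gives something that can be imported with the first column as the front of the card and the other two on the back.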
#83
nadiatims Wrote:Coreth buddy, all that complicated stuff you're doing is largely unnecessary; it's no wonder it's taking you so long. Certainly go at your own pace, but I'd advise against giving yourself all that extra work. I almost never bother SRSing now, but when I did, I could easily add 50-100 or more a day depending on how many words I jotted down that day. It would take maybe 10-20 minutes to create the cards and then I would just review them in 5min sessions whenever I felt like it. Card creation time: a couple of seconds. Card review time: a couple of seconds. Just enter the words you collect into an empty text field such as here and use Rikaichan's 's' feature to save them to a text file for import into Anki. You'll make more progress rapidly reviewing 100 words in simple J->E fashion, even if you forget half of them (which in itself is no big deal because you didn't spend much time making them), than trying to nail down every nuance of 5 words. The more words you have loaded into your short-termish memory in this way, the greater the chance of hearing/reading them in the wild, shifting them further into your long-term memory through real use, and the better you'll get at understanding what you're reading and listening to. Not to mention noting down large amounts of words and kanji throughout the day kicks arse for your writing skills.
Yeah well I have some serious problems with OCD. I can't deal with stuff if I don't make a wiki for it.
wccrawford Wrote:That doesn't sound much different than my pace, even though my methods differ wildly. I use iKnow right now (again, I came back) because it seems to make them stick the best for me. I might learn 30 new words in a day, but then the next week is spent reviewing them and not learning other new ones.

As for making your own cards, I think that would be the only way that might stick as well as iKnow for me, but I just haven't got the patience.

In short, I'm saying: Don't fret it. Go your own pace and feel good about your accomplishments. Don't focus on what others are doing, or claim to be doing.
Oh right, I actually came on here to see if there were any other sites like Japanese Recall. I forgot. That's one, thanks. Edit: Oh wait you actually recommended that on page one but I guess I just didn't see it or something.
Edited: 2011-05-17, 8:17 am
#84
wccrawford Wrote:
caivano Wrote:
wccrawford Wrote:Don't focus on what others are doing, or claim to be doing.
I agree, but it is useful to try out the methods that people suggest.
I'll agree with 'can be useful'. It can also be a major waste of time, distract you from your studies, and lead you down a spiraling path to despair.
I don't know what stuff you have been trying out, but I've never experienced any of those. Maybe a little short term stress from pushing myself, but that's a fair trade off for long term improvements.

Even if something doesn't work so well I still learn from it in terms of Japanese and study methods. If it does work, my studying becomes more efficient and enjoyable.
#85
I just read through a wiki article (submitted by fakewookie in the What I Learned Today thread) and noted every word from the first paragraph that was unknown to me. Here is the list, along with the JDIC "Popular" (P) qualification and each word's Google hit count (in that particular form and only on .jp domains). I didn't add the meanings (so you can check for yourself how many you would know), but meaning is also a factor when building/choosing vocabulary.

How should I judge which are worth adding/remembering while keeping the process fairly simple and quick? How do you approach this problem? Note that until now I added almost every word I had encountered and didn't exclude any based on (P)/Google/meaning, but I don't know if it's a good strategy or whether it's hindering my progress toward basic fluency.

There are supposedly around 20k words in (P) and they account for 23 out of 33 words on this list, so if I were to exclude based on (P) I wouldn't know 10 words in this short paragraph. That could be enough to understand this article, but it doesn't mean the same goes for anything that is written using specific terms (here it's a lot about Emperors, eras and newspapers/publishing).

Google count also isn't a good measure (compounds + different readings + differences between spoken and written language), but it's a valuable variable nonetheless.

Meanings can also be tricky, because some words might be more popular in papers/web pages, some are more generic but infrequently used (a more popular form exists) and some are archaic forms (recently I had an episode with 拵える which I've been told is an archaic word, Goo seems to agree but JDIC fails to mention that and I acquired it from a JLPT2 list).

While writing this post I also discovered that both Rikaichan and Rikaikun in their newest versions have very different opinions about what is considered (P) by JDIC. Based purely on Rikais there were 12 (P) words in this list while on JDIC there are 23. This makes the job of figuring out what word to add even harder...

崩御 337,000
折 (P) 28,600,000
元号 (P) 764,000
めぐる (P) 24,500,000
誤報 (P) 2,860,000
同日 (P) 13,600,000
聖上 430,000
号外 (P) 4,970,000
及び (P) 190,000,000
選定 (P) 20,800,000
宮内省 2,430,000
辞意 (P) 713,000
表明 (P) 20,100,000
編輯 (P) 1,490,000
主幹 (P) 1,600,000
収拾 (P) 2,130,000
諸説 2,130,000
枢密院 89,100
もたらす (P) 14,700,000
定か (P) 4,170,000
一説 1,470,000
漏洩 29,000,000
内定 (P) 15,400,000
急遽 (P) 15,800,000
影法師 361,000
関係者 (P) 65,700,000
記載 (P) 134,000,000
番記者 196,000
張り付く 660,000
皇室 (P) 25,600,000
回顧 (P) 10,500,000
よれば (P) 18,400,000
ものの (P) 104,000,000

Maybe I should just write a program to automate this process, using various measures to qualify a word or word list, with warning signs if a word doesn't qualify because of something.
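
A rough sketch of what such a program might look like, in Python. It only parses lines in the "word (P) count" shape used in the list above and combines the two signals; the 1,000,000-hit threshold, the `Candidate` class and the `judge` rule are arbitrary assumptions for illustration, not a tested selection method.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    word: str
    popular: bool  # True if the line carries the JDIC (P) marker
    hits: int      # Google hit count on .jp domains


def parse_line(line):
    """Parse a line such as '折 (P) 28,600,000' or '崩御 337,000'."""
    parts = line.split()
    word = parts[0]
    popular = "(P)" in parts[1:-1]
    hits = int(parts[-1].replace(",", ""))
    return Candidate(word, popular, hits)


def judge(candidate, min_hits=1_000_000):
    """Return 'add', 'skip', or 'add' with a warning when the two signals disagree."""
    frequent = candidate.hits >= min_hits
    if candidate.popular and frequent:
        return "add"
    if candidate.popular or frequent:
        return "add (warning: signals disagree)"
    return "skip"


# Four lines copied from the list above.
sample = [
    "崩御 337,000",
    "折 (P) 28,600,000",
    "辞意 (P) 713,000",
    "漏洩 29,000,000",
]

results = {c.word: judge(c) for c in map(parse_line, sample)}
for word, verdict in results.items():
    print(word, verdict)
```

The warning case is the interesting one: 辞意 is marked (P) but has a low count, while 漏洩 has a high count but no (P), so those are the words a human would still want to eyeball.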
#86
Don't worry so much about it. You may as well add them all; with Rikaichan it only takes a second. If you're getting sick of a word you can always delete it later anyway, or make use of the leech settings. Otherwise just make an arbitrary guess based on your own feeling. If a word is useful it'll keep popping up, so you can always add it later.
#87
PensukeD: That's obviously got to be false. They say Shakespeare knew about 66,534 words, so that prof is saying the average English speaker knows more than Shakespeare ever knew, despite the fact the average English speaker can't even understand Shakespeare?

Coreth: Sounds like you have a bit of an attention problem. I would suggest first blocking the sites that are obviously a detriment, like YouTube, Facebook, Twitter. Then get rid of any Skinner-box games like WoW or Farmville if you play them. Then drop the cellphone, the TV. If all of that doesn't work, get rid of the computer.

All these multitasking, dopamine-driven 'activities' basically train your brain to be addicted to these things and hurt your ability to concentrate and maintain an attention span.

I know this sounds crazy, but I haven't had TV for 6 years. Yep, most people I tell this to consider me insane for not spending up to $100 per month on a commercial-ridden entertainment box that, despite having 1000 channels, still has nothing on that I really want to watch, ever. But most people are so addicted to TV, they can't imagine a life without it.

If you got rid of all these things and just had a book of Japanese and nothing else to do, you'd be amazed at how much you'd get done. This is how I imagine Heisig did his RTK run.

Also, I know exactly what you're going through in the OCD department. I have severe OCD to the point of being borderline housebound. :/
#88
KMDES Wrote:PensukeD: That's obviously got to be false. They say Shakespeare knew about 66,534 words, so that prof is saying the average English speaker knows more than Shakespeare ever knew, despite the fact the average English speaker can't even understand Shakespeare?
Err... i guarantee you i know a few words that Shakespeare didn't. Let's start with "television". You reckon Shakespeare would be able to read a modern scientific paper fluently? Why is it surprising that literature written over 400 years ago would be hard to read by modern readers? Shakespeare used to make up words left, right and centre. Some never stuck. Others were common words in his day but have since died out. Most of the problems i had with Shakespeare were due to obsolete expressions and cultural references that people in his day would have got, but make no sense without explanation now. Words themselves were a minor hindrance.
#89
thurd Wrote:While writing this post I also discovered that both Rikaichan and Rikaikun in their newest versions have very different opinions about what is considered (P) by JDIC. Based purely on Rikais there were 12 (P) words in this list while on JDIC there are 23. This makes the job of figuring out what word to add even harder...
P is irrelevant. It's only really helpful in helping pick among multiple definitions of a word (ie weeding out obscure and obsolete ones). What does it matter if a word is on a common words list or not? If the word was an important word in something you consider it important to be able to read... then you need to know it, right?

My strategy is to add words that appear twice on different days. That skips the most obvious one-off words.
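
In case it helps to see that rule written down, here is a small Python sketch of it. The `SightingLog` class and its `saw` method are hypothetical names; the only thing taken from the rule above is "seen on at least two different days".

```python
from collections import defaultdict
from datetime import date


class SightingLog:
    """Remember on which days each unknown word was encountered."""

    def __init__(self):
        self._days = defaultdict(set)

    def saw(self, word, day):
        """Record a sighting; return True once the word has shown up on two different days."""
        self._days[word].add(day)
        return len(self._days[word]) >= 2


log = SightingLog()
first = log.saw("元号", date(2011, 5, 16))      # first sighting
same_day = log.saw("元号", date(2011, 5, 16))   # seen again, but still the same day
next_day = log.saw("元号", date(2011, 5, 17))   # second distinct day
print(first, same_day, next_day)
```

Storing dates in a set means repeated sightings within one article or one day never count twice, which is the point of the filter.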
#90
http://www.opensourceshakespeare.org/stats/
#91
That, and it's pretty well known that Shakespeare had a much larger vocabulary, relatively speaking, than the average person, and linguists tend to flaunt said fact. So if he had a 66,000-word vocab, which was larger than most people's, yet the average person knows 80,000, the math doesn't really work out at all.

And also, people in his era didn't really speak the way they did in his plays, so it was just as confusing for them as it is for us. You really just need to think of the words in a different way to understand them anyway.
Edited: 2011-05-17, 4:39 pm
#92
nest0r Wrote:http://www.opensourceshakespeare.org/stats/
If i'm reading that right, there are only 28,829 unique words in shakespeare's plays. Where's 66,000 coming from?

This heavily depends on what you define a "word" as. IMHO, if it's got a dictionary entry, it's a word. I.e. "blood transfusion" is a word. The fact it is conventionally written with a space means nothing. If it were just a combination of two words, then saying "blood injection" wouldn't get you strange looks. If you're uncomfortable with that definition of a word, let's call it a "vocabulary item" (nestor's probably got a better word lol). Regardless, it is those that a speaker of the language must know, and they far outnumber simple space-delineated words.
#93
zigmonty Wrote:
thurd Wrote:While writing this post I also discovered that both Rikaichan and Rikaikun in their newest versions have very different opinions about what is considered (P) by JDIC. Based purely on Rikais there were 12 (P) words in this list while on JDIC there are 23. This makes the job of figuring out what word to add even harder...
P is irrelevant. It's only really helpful in helping pick among multiple definitions of a word (ie weeding out obscure and obsolete ones). What does it matter if a word is on a common words list or not?
It's a filter, that's all, for reducing the large number of unknown words you encounter to a smaller set to focus on for the moment. It's more important that this kind of filter is straightforward and easy to apply than that it is fantastically accurate at picking some theoretical absolute best set of words to look at. (...and this argument also leads to the conclusion that it doesn't matter whether you're looking at the latest EDICT (P) markings or the ones from the older version of the dictionary that Rikaichan uses.)
Quote:My strategy is to add words that appear twice on different days. That skips the most obvious one-off words.
That's another good filter, yes.
#94
zigmonty Wrote:
nest0r Wrote:http://www.opensourceshakespeare.org/stats/
If i'm reading that right, there are only 28,829 unique words in shakespeare's plays. Where's 66,000 coming from?

This heavily depends on what you define a "word" as. IMHO, if it's got a dictionary entry, it's a word. I.e. "blood transfusion" is a word. The fact it is conventionally written with a space means nothing. If it were just a combination of two words, then saying "blood injection" wouldn't get you strange looks. If you're uncomfortable with that definition of a word, let's call it a "vocabulary item" (nestor's probably got a better word lol). Regardless, it is those that a speaker of the language must know, and they far outnumber simple space-delineated words.
"How many words did Shakespeare know?
In his collected writings, Shakespeare used 31,534 different words. 14,376 words appeared only once and 846 were used more than 100 times. Using statistical techniques, it's possible to estimate how many words he knew but didn't use.

This means that in addition the 31,534 words that Shakespeare knew and used, there were approximately 35,000 words that he knew but didn't use. Thus, we can estimate that Shakespeare knew approximately 66,534 words.

According to one estimate, the average speaker of English knows between 10,000-20,000 words."
#95
KMDES Wrote:
zigmonty Wrote:
nest0r Wrote:http://www.opensourceshakespeare.org/stats/
If i'm reading that right, there are only 28,829 unique words in shakespeare's plays. Where's 66,000 coming from?

This heavily depends on what you define a "word" as. IMHO, if it's got a dictionary entry, it's a word. I.e. "blood transfusion" is a word. The fact it is conventionally written with a space means nothing. If it were just a combination of two words, then saying "blood injection" wouldn't get you strange looks. If you're uncomfortable with that definition of a word, let's call it a "vocabulary item" (nestor's probably got a better word lol). Regardless, it is those that a speaker of the language must know, and they far outnumber simple space-delineated words.
"How many words did Shakespeare know?
In his collected writings, Shakespeare used 31,534 different words. 14,376 words appeared only once and 846 were used more than 100 times. Using statistical techniques, it's possible to estimate how many words he knew but didn't use.

This means that in addition the 31,534 words that Shakespeare knew and used, there were approximately 35,000 words that he knew but didn't use. Thus, we can estimate that Shakespeare knew approximately 66,534 words.

According to one estimate, the average speaker of English knows between 10,000-20,000 words."
This sort of stuff reminds me why I like being an engineer. Estimates piled on estimates, none of which result in anything useful (except Shakespeare hero-worship). Maybe his active vocab was a bigger fraction of his total vocab than the average person's? Why is that a less reasonable assumption than that his vocab was far bigger than an average person's? Considering how many words he coined, are we assuming he was coining words into his passive vocabulary too? Who really cares?

Regardless, my point was that the estimate that english speakers know 80,000 words was probably counting compounds whereas the 66,000 for shakespeare is probably counting only actual space-delineated words.

I know far more than 20k english words. I know nearly half that many Japanese words, and my english vocab is a *lot* bigger than double my japanese vocab.
#96
I think it's ridiculous to keep memorizing words after you've reached a certain level. Once you're able to read most things new vocabulary should come from immersion, not from obsessively striving to reach a magic number of words in your SrS.
#97
Kuma01 Wrote:I think it's ridiculous to keep memorizing words after you've reached a certain level. Once you're able to read most things new vocabulary should come from immersion, not from obsessively striving to reach a magic number of words in your SrS.
I only partially agree with you. As most of us can plainly see with our native tongue, past a certain point you can easily pick up unknown words in context most of the time and stand a good chance of remembering them without having to see them all that much. However, this doesn't mean an SRS is no longer useful, it just means you need to adapt your SRS algorithm to fit the new situation. For example, perhaps at that point a better initial interval of 1wk or more instead of a day would be more appropriate.

Ultimately it's a question of how much time do you spend by SRSing the new word versus the expected time loss from needing to look it up in the future. The faster your fact adding mechanism and the more tuned your initial interval/eases are, the less risk you should be willing to take for forgetting the word.
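
To put that trade-off in concrete terms, here is a back-of-the-envelope sketch in Python. Every number in it is an illustrative assumption (seconds per review, number of reviews, how often you would otherwise have to look the word up), and the function names are made up; only the comparison itself comes from the paragraph above.

```python
def srs_cost(add_seconds, reviews, review_seconds):
    """Total time spent on a card: creating it once plus reviewing it several times."""
    return add_seconds + reviews * review_seconds


def lookup_cost(expected_lookups, lookup_seconds):
    """Time you expect to lose re-looking the word up if you never SRS it."""
    return expected_lookups * lookup_seconds


def worth_adding(add_seconds, reviews, review_seconds, expected_lookups, lookup_seconds):
    """True when SRSing the word is cheaper than the lookups it is expected to save."""
    return srs_cost(add_seconds, reviews, review_seconds) < lookup_cost(
        expected_lookups, lookup_seconds
    )


# Fast card creation: 3 s to add, 8 reviews at 4 s each, versus 3 lookups at 20 s each.
quick = worth_adding(3, 8, 4, 3, 20)

# Slow card creation: 540 s to add (roughly 45 minutes for five words), same reviews and lookups.
slow = worth_adding(540, 8, 4, 3, 20)

print(quick, slow)
```

With these made-up numbers the fast workflow costs 35 seconds against 60 seconds of lookups, while the slow one costs 572 seconds, which is why the speed of adding a card dominates the decision.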

If you're spending more than a few seconds to add a card then you're clearly not leveraging all the tools available right now (rikaichan mod, subs2srs, etc), let alone what people will make in the future.
Edited: 2011-05-17, 5:58 pm
#98
Kuma01 Wrote:I think it's ridiculous to keep memorizing words after you've reached a certain level. Once you're able to read most things new vocabulary should come from immersion, not from obsessively striving to reach a magic number of words in your SrS.
I don't think anyone here was claiming otherwise?
#99
thurd Wrote:(recently I had an episode with 拵える which I've been told is an archaic word, Goo seems to agree but JDIC fails to mention that and I acquired it from a JLPT2 list).
That kanji is rare, but the word is not archaic at all -- I don't see where Goo says it's archaic.
Reply
Even natives have to memorize new words, sometimes a few every day. I think SRS only makes this easier. In fact, more people need to use the SRS for their native language to ensure their active mental lexicon is large and stable. In the future if this becomes common, I think this will make discourse more flexible, creative and efficient, both in general and among particular discourse communities.
Edited: 2011-05-17, 7:38 pm