I just read through a wiki article (submitted by fakewookie in the What I Learned Today thread) and noted every word from the first paragraph that was unknown to me. Here is the list, along with each word's JDIC "Popular" (P) flag and its Google hit count (for that exact form, on .jp domains only). I didn't add the meanings (so you can check for yourself how many you know), but meaning is also a factor when building or choosing vocabulary.
How should I judge which words are worth adding and remembering while keeping the process fairly simple and quick? How do you approach this problem? Note that until now I have added almost every word I encountered and didn't exclude any based on (P), Google count, or meaning, but I don't know whether that's a good strategy or whether it's hindering my progress toward basic fluency.
There are supposedly around 20k words in (P), and they account for 23 of the 33 words on this list, so if I were to exclude based on (P) I wouldn't know 10 words in this short paragraph. That might still be enough to understand this article, but it doesn't mean the same goes for anything written using specialized terms (here it's a lot about Emperors, eras and newspapers/publishing).
Google count isn't a reliable measure either (compounds, different readings, and differences between spoken and written language all distort it), but it's a valuable variable nonetheless.
Meanings can also be tricky: some words are more common in papers or on web pages, some are generic but infrequently used (because a more popular form exists), and some are archaic. Recently I had an episode with 拵える, which I've been told is an archaic word. Goo seems to agree, but JDIC fails to mention it, and I acquired the word from a JLPT2 list.
While writing this post I also discovered that Rikaichan and Rikaikun, in their newest versions, disagree sharply with JDIC about what counts as (P). Going purely by the Rikais there are 12 (P) words in this list, while JDIC marks 23. This makes figuring out which words to add even harder...
崩御 337,000
折 (P) 28,600,000
元号 (P) 764,000
めぐる (P) 24,500,000
誤報 (P) 2,860,000
同日 (P) 13,600,000
聖上 430,000
号外 (P) 4,970,000
及び (P) 190,000,000
選定 (P) 20,800,000
宮内省 2,430,000
辞意 (P) 713,000
表明 (P) 20,100,000
編輯 (P) 1,490,000
主幹 (P) 1,600,000
収拾 (P) 2,130,000
諸説 2,130,000
枢密院 89,100
もたらす (P) 14,700,000
定か (P) 4,170,000
一説 1,470,000
漏洩 29,000,000
内定 (P) 15,400,000
急遽 (P) 15,800,000
影法師 361,000
関係者 (P) 65,700,000
記載 (P) 134,000,000
番記者 196,000
張り付く 660,000
皇室 (P) 25,600,000
回顧 (P) 10,500,000
よれば (P) 18,400,000
ものの (P) 104,000,000
Maybe I should just write a program to automate this process, using several measures to qualify a word or word list and raising a warning whenever a word fails to qualify on one of them.
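A rough sketch of what such a program might look like, in Python. It only reads lines in the same "word (P) count" format as the list above and flags each word on two measures. The names `Entry`, `parse_line` and `assess` and the `min_hits` cutoff of 1,000,000 are made up for illustration, not anything JDIC or Google defines, and it does not look anything up online.

```python
import re
from dataclasses import dataclass


@dataclass
class Entry:
    word: str
    popular: bool  # True if the line carries the JDIC (P) flag
    hits: int      # Google hit count as written in the list


# Matches lines such as "折 (P) 28,600,000" or "崩御 337,000".
LINE_RE = re.compile(r"^(\S+)\s+(\(P\)\s+)?([\d,]+)$")


def parse_line(line: str) -> Entry:
    """Parse one 'word [(P)] count' line into an Entry."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    word, p_flag, hits = match.groups()
    return Entry(word, p_flag is not None, int(hits.replace(",", "")))


def assess(entry: Entry, min_hits: int = 1_000_000) -> list[str]:
    """Return warning strings; an empty list means the word qualifies.

    min_hits is an arbitrary, hypothetical threshold; tune it to taste.
    """
    warnings = []
    if not entry.popular:
        warnings.append("not marked (P)")
    if entry.hits < min_hits:
        warnings.append(f"fewer than {min_hits:,} hits")
    return warnings


if __name__ == "__main__":
    # A few lines copied from the list above.
    sample = """崩御 337,000
折 (P) 28,600,000
辞意 (P) 713,000
漏洩 29,000,000"""
    for raw in sample.splitlines():
        entry = parse_line(raw)
        problems = assess(entry)
        status = "OK" if not problems else "WARN: " + "; ".join(problems)
        print(f"{entry.word}\t{status}")
```

Keeping each measure as a separate warning, rather than folding them into one score, makes it obvious why a word was flagged: 漏洩 fails only on (P), 辞意 only on hit count. Further checks (an archaic marker, a JLPT list lookup) could be added to `assess` in the same way.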