kanji koohii FORUM
Wordlists, Anki, and Exel ... - Printable Version




Wordlists, Anki, and Exel ... - Biene - 2011-04-01

Currently I'm distracting myself - again - from my Japanese studies by preparing a wordlist in Exel for my Dutch language study, and I would like your advice/help with this.

After searching the web (and Anki) for useful (word)lists of the basic vocabulary (~ 2000 - 4000 words), I found several decks and lists. Unfortunately none of them were what I was looking for. The decks either ignored the fact that you still need to know the gender of (most of) the "de" words (e.g. de vrouw (f), de lucifer (m)), ignored the gender of nouns altogether, or were prepared automatically and full of errors. The wordlists were also prepared automatically and still full of conjugated forms and names, and in addition they missed basic vocabulary, since they were compiled from newspaper articles.
So in the end I decided to prepare my own list in Exel, by filtering one of the better decks while going through the indices of three books (Nederlands voor buitenlanders, Langenscheidt Basiswortschatz Niederländisch, and Assimil Niederländisch) and adding missing vocabulary. I'm not well versed in Exel, but I figured that it would allow me to add the different tag fields I plan to have in the Anki deck later on, as well as let me sort the vocabulary to my liking.

So now my question(s):
(I use Exel 2002 but I have access to Exel 2007 if need be)

1) E.g. one column would hold the Dutch words "de sport", "de record", "de race", "het team", "het voetbal", "de voetbal", ..., while another column would hold the corresponding German words. Would it be possible to tell Exel to sort the Dutch column alphabetically while ignoring the "de" and/or the "het"? While sorting this column, Exel should of course keep each word together with its translation (e.g. "de sport" - "der Sport", ...) and its tags (e.g. "noun", "family", "nature", ...).

2) Following up on question #1, would it be possible to compare two wordlists (of different length) with each other? What I mean is: when I have my wordlist with the different fields (e.g.: "de sport" - "der Sport" - German-gender (m) - Dutch-gender (m) - noun) and want to arrange this list in the order of a word-frequency list (which contains more words than my basic wordlist), how would I go about this without messing up the different fields/columns?

I hope my questions are not too confusing, but I don't know how to describe better what I'm planning to do. Thanks for your help.


Wordlists, Anki, and Exel ... - prink - 2011-04-01

First of all, it's Excel. Not Exel.

I don't understand your question completely, but I have an answer for how to sort ignoring the de's and the het's. Just split the Dutch entry into two columns. Column A: de, Column B: sport, Column C: ";", Column D: translation. Select everything and then sort by column B. When you finish and are ready to import to Anki, copy your spreadsheet into Notepad++. You should end up with "de[tab]sport[tab];[tab]translation". Use find and replace: first replace "[tab];[tab]" with just a ";", then replace the remaining "[tab]" with a single [space], and it'll import into Anki nicely.
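
If the list gets long, the same find-and-replace step can be scripted. Here is a minimal Python sketch, assuming the pasted text has exactly the four tab-separated columns described above; the function name `to_anki_line` and the sample rows are made up for illustration. Any extra tag columns after the translation would also have their tabs turned into spaces, so those would need separate handling.

```python
def to_anki_line(tsv_line: str) -> str:
    """Turn 'de<TAB>sport<TAB>;<TAB>translation' into 'de sport;translation'."""
    # Replace the separator column first, so its tabs are not turned into spaces.
    line = tsv_line.rstrip("\n").replace("\t;\t", ";")
    # Any remaining tab sits between the article and the noun.
    return line.replace("\t", " ")


# Hypothetical sample rows, as they would look after pasting from the spreadsheet.
rows = ["de\tsport\t;\tder Sport", "het\tteam\t;\tdas Team"]
for row in rows:
    print(to_anki_line(row))
```

The order of the two replacements matters: if the single tabs were replaced first, the "[tab];[tab]" pattern would no longer be found.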

Good luck!


Wordlists, Anki, and Exel ... - IceCream - 2011-04-01

ok, there's probably loads of better way to do this, but this would be my way of getting there:

1. Select all the columns in your chart, and sort by the column with the Dutch word in alphabetical order. This should give you a list of everything beginning with "De" first, then "Der", then "Het", etc. Add an extra column on the left-hand side, type in "De", and copy it down to the relevant point, then do the same for "Der", etc.
Now select all the words in the column with the Dutch word, and press Ctrl+F. This should bring up the Find and Replace dialog. Go to the Replace tab. In the "Find what" field type "De " (with the trailing space), and leave the "Replace with" field blank. This should delete all the "De "'s from your column and leave the word at the start. Repeat for der and het.
Now you can sort all the columns by the alphabetical order of the dutch word again, and you should have the result you want.
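
For comparison, outside the spreadsheet the article never has to be removed at all: it can simply be skipped when building the sort key. A minimal Python sketch, assuming each row is a tuple with the Dutch entry first; the names `ARTICLES` and `sort_key` and the sample rows are only illustrative.

```python
ARTICLES = ("de ", "het ")  # Dutch articles to ignore while sorting


def sort_key(row):
    """Return the Dutch word without its leading article, lower-cased."""
    dutch = row[0].lower()
    for article in ARTICLES:
        if dutch.startswith(article):
            return dutch[len(article):]
    return dutch


# Hypothetical sample rows: (Dutch, German, tag)
rows = [
    ("de sport", "der Sport", "noun"),
    ("het team", "das Team", "noun"),
    ("de race", "das Rennen", "noun"),
    ("het voetbal", "der Fußball", "noun"),
]

# Each row moves as a whole, so translations and tags stay attached.
sorted_rows = sorted(rows, key=sort_key)
for row in sorted_rows:
    print(row)
```

Because whole rows are sorted, the translation and tag fields cannot drift away from their Dutch word.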

2. This is more difficult, since I don't have Excel, but you can try this... Add the frequency list to the right of everything else. Also add another column next to it with a corresponding number, 1,2,3,4, etc. running down the column.
Now select everything and sort again, this time by word frequency first, then by Dutch word. Hopefully what you'll end up with is a list where identical words line up next to each other.
If that does work, you then write the following formula in a new column (use the function button to find the right function name if your Excel uses different ones):
=IF(A1=E1,1,0)
Here A1 stands for the cell with the Dutch word and E1 for the cell with the frequency-list word; adjust the references to your own layout.
Copy this formula down the whole list. What you should end up with is a "1" in the box where the two words are the same, and a 0 where they aren't.
Next, sort the whole lot again, this time by the number in the column. Now all the 0's should be at the top. You can delete all of these, as they don't have corresponding words in your Dutch word list.
Sort the whole lot again, this time in the order of the numbers in the column you added at the start of this (1,2,3,4).
Delete all the columns you don't need now, and everything should be in the right order...
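
The same idea (give every frequency-list word a running number, then order the wordlist by that number) can be sketched in a few lines of Python. This is only an illustration under some assumptions: the frequency list holds bare words without articles, the Dutch entry is the first field of each row, and the names `order_by_frequency` and `strip_article` are made up. Unlike the steps above, words missing from the frequency list are not deleted but kept at the end.

```python
ARTICLES = ("de ", "het ")


def strip_article(dutch: str) -> str:
    """Drop a leading 'de ' or 'het ' so the word can be looked up."""
    lowered = dutch.lower()
    for article in ARTICLES:
        if lowered.startswith(article):
            return lowered[len(article):]
    return lowered


def order_by_frequency(rows, frequency_list):
    """Sort rows by the position of their Dutch word in frequency_list."""
    rank = {}
    for position, word in enumerate(frequency_list):
        rank.setdefault(word.lower(), position)  # keep the first occurrence
    unranked = len(rank)  # words not in the frequency list sort last
    return sorted(rows, key=lambda row: rank.get(strip_article(row[0]), unranked))


# Hypothetical data: the frequency list is longer than the wordlist.
frequency_list = ["team", "huis", "sport", "race"]
rows = [
    ("de sport", "der Sport"),
    ("de lucifer", "das Streichholz"),
    ("de race", "das Rennen"),
    ("het team", "das Team"),
]
ordered = order_by_frequency(rows, frequency_list)
for row in ordered:
    print(row)
```

Since Python's sort is stable, rows without a frequency rank keep their original relative order at the end of the list.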

it's a bit messy, but it should hopefully work if nobody else knows anything better... :)


Wordlists, Anki, and Exel ... - Biene - 2011-04-06

prink Wrote: First of all, it's Excel. Not Exel.
Haha, indeed it is! I never realized that it's actually Excel instead of Exel, since I (mis)pronounce the program name Excel entirely differently from the word excel. Now all the English comments about that program make much more sense.

Thanks, you two, for answering my question. I know I described it quite confusingly, but you managed to make some sense of it.

1) I had hoped to avoid using several columns for "de" and so on, but I guess there is no way around it, and your ideas sound very practical. Especially the idea of adding an extra column with ";". I never would have thought of this, but it should make things much simpler later on.

2) It certainly is more difficult than I had expected, and so far I haven't managed to get it to work yet, but I'm sure I'll get it to work by the weekend.

Again thanks a lot for your help.