I'm working on adding example words to the flashcards.
I'd rather keep the layout as simple as possible (in fact I believe it would be more efficient to have just one example word per card, but that can always be added as an option).
With this in mind, I wonder if it's important to distinguish whether the On or Kun reading is being displayed, for each word?
Nice idea! I'm already using Rikaisama when reviewing so I can see how useful such a feature would be. Besides, it would be a nice addition to silence RTK naysayers.
You could use hiragana for kun'yomi, katakana for on'yomi and periods or dots to separate the readings corresponding to each kanji.
For example:
人目:ひと・め
名目:メイ・モク
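A rough sketch of how that display convention could be produced, assuming the per-kanji readings are already available in hiragana and tagged as on or kun (the `format_reading` helper and its input shape are made up for illustration, not taken from the site's code):

```python
# Distance between the hiragana and katakana blocks in Unicode (ぁ U+3041, ァ U+30A1).
KANA_OFFSET = ord("ァ") - ord("ぁ")


def to_katakana(text):
    """Convert hiragana characters to katakana, leaving anything else untouched."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if "ぁ" <= ch <= "ゖ" else ch
        for ch in text
    )


def format_reading(parts, separator="・"):
    """Join per-kanji readings: katakana for on'yomi, hiragana for kun'yomi.

    parts is a list of (reading_in_hiragana, kind) pairs, kind being "on" or "kun".
    This input shape is an assumption made for the example.
    """
    shown = [to_katakana(r) if kind == "on" else r for r, kind in parts]
    return separator.join(shown)


print(format_reading([("ひと", "kun"), ("め", "kun")]))   # ひと・め
print(format_reading([("めい", "on"), ("もく", "on")]))   # メイ・モク
```

Passing `separator=""` gives the version without dots, if the separators turn out to take too much space.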
About how many or what words to display, it probably would be too complicated to implement, but it would be even more useful if you could display a list of words and then check the ones you want permanently displayed on your card. If the number of people who have chosen to display each word could be seen by the side of each word too, and words could be ordered by "popularity" (similarly to how kanji stories are ordered), that would be awesome.
Just my 2 cents.
Sebastian wrote:
You could use hiragana for kun'yomi, katakana for on'yomi and periods or dots to separate the readings corresponding to each kanji.
For example:
人目:ひと・め
名目:メイ・モク
Sounds good. What do other people think about the separators? For technical reasons I'd rather not use them, as it will make the layout more difficult (will take more space). On the other hand I was planning on using colour to emphasize the individual reading of the current kanji.
Sebastian wrote:
About how many or what words to display, it probably would be too complicated to implement, but it would be even more useful if you could display a list of words and then check the ones you want permanently displayed on your card. If the number of people who have chosen to display each word could be seen by the side of each word too, and words could be ordered by "popularity" (similarly to how kanji stories are ordered), that would be awesome.
That's what I had in mind.
I'm going to do this incrementally though so for this first implementation what I've got currently is one random Onyomi example word, and one random Kunyomi example word; both selected from "priority" entries in JDICT. So some kanji have no On, some have no Kun, and some have no words at all, which I think may be better than burdening oneself with uncommon words.
This may be appropriate if I add this first to the non-SRS mode. So each "cram" session can show up different words.
If I use the first results I can sort them by priority but they won't necessarily be the most common or logical examples because there is no manual selection yet. Hence random.
That's a step up from the "Vocab Shuffle" I did, because here at least you're supposed to know at least one kanji in each word.
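To make the selection described above concrete, here is a minimal sketch, assuming each candidate word is a record with a `kind` ("on" or "kun") and a `priority` flag. The field names and the sample flags are hypothetical, not the actual database layout or real dictionary data:

```python
import random


def pick_examples(entries, rng=random):
    """Pick one random on'yomi word and one random kun'yomi word.

    Only entries flagged as priority are considered. If a kanji has no
    priority word for a reading type, that slot is None.
    """
    picks = {}
    for kind in ("on", "kun"):
        candidates = [e for e in entries if e["priority"] and e["kind"] == kind]
        picks[kind] = rng.choice(candidates)["word"] if candidates else None
    return picks


# Hypothetical candidate words for 目; the priority flags are invented for the example.
entries_for_me = [
    {"word": "名目", "kind": "on", "priority": True},
    {"word": "目的", "kind": "on", "priority": True},
    {"word": "人目", "kind": "kun", "priority": False},
]

print(pick_examples(entries_for_me))
```

Here the kun slot comes back as `None` because the only kun word is not a priority entry, which matches the idea of showing nothing rather than an uncommon word.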
I like your layout a lot. I probably wouldn't add dividers. They're more stuff to parse, but I don't think it's a big deal either way. I guess using katakana for on-yomi is reasonable. I personally might use a different color, instead, but the difference is small, in my opinion.
Last edited by bertoni (2012 May 25, 3:56 pm)
Okay. What colour would you suggest? Something more contrasty like red?
No, red just means eyestrain, IMO. I'd use blue or brown, most likely.
ファブリス wrote:
Okay. What colour would you suggest? Something more contrasty like red?
Red, Orange or Green would be good for the keywords. As, it seems to be for not memorizing the reading, distinguishing between 音読み and 訓読み isn't really needed. It would make it more cluttered.
My only other suggestion is that it would be better to have 画数 smaller than the actual stroke count number. It seems as though the 5 is only a footnote.
Omoishinji wrote:
As, it seems to be for not memorizing the reading
Something I didn't explain: the plan is to hide both the definition, and the reading. They will show on hover or when clicked (on tablet).
The goal is to expose the reading in context rather than on its own. If it can't find a common On or Kun reading out of 20000+ words tagged as priority entries in JDICT why burden oneself with the reading?
Ideally, I'd also have the example words built on kanji already known. It's doable to some extent, and would be rather neat, but that's for later.
For now it will be enabled first on the free reviews, so as not to disrupt the existing mode, and to get more feedback and see what people want out of it.
Hmm, using hover and click sounds like a good way for people to get going. I'd like to have the ability to use the keyboard, too, as an accelerator. I'd like to upload my vocabulary cards when it's ready.
If you can use the Core 2k/6k/10k list as the primary words and English definitions, I think it'll be better than the lengthier definitions given in the current sample. With that, it won't look as cluttered if you used 2-3 kunyomi and 2-3 onyomi examples.
The database can even have priority based on CBs vocabulary frequency results. This'll ensure a more common sample word is used.
I am just puzzled about your example 目茶苦茶: why does it have this meaning?
Anyone knows?
All I could find is that "目茶" and "無茶" are Ateji with the meaning "absurd; ridiculous; nonsense;" - so I suppose that the whole expression is in fact oral with no connection to the meaning of the kanji? So in fact a pretty nonsensical example?
Randomly found this on ALC:
"めちゃくちゃいけてる女
fly-ass girl"
Gee. I've never ever seen this in English before. I did not know assez had this capacity of flying...
Nukemarine wrote:
If you can use the Core 2k/6k/10k list as the primary words and English definitions, I think it'll be better than the lengthier definitions given in the current sample. With that, it won't look as cluttered if you used 2-3 kunyomi and 2-3 onyomi examples.
The database can even have priority based on CBs vocabulary frequency results. This'll ensure a more common sample word is used.
Please note that, as Sebastian guessed, in the longer term the plan is to let the user pick what words they want. Once users can pick their example words, lists can be made to represent Core 2k, or anything else. JDICT provides a common, exhaustive set, with "preset" definitions that save users time.
Technically speaking the definitions displayed on there are a concatenation of several glosses. The first gloss (should be the one the most in use) would have been "(adj-na,n) (1) (uk) absurd; unreasonable; nonsensical; preposterous; incoherent;".
I purposely selected long definitions to see how it would fit in the Photoshop layout.
louischa wrote:
I am just puzzled about your example 目茶苦茶: why does it have this meaning?
Anyone knows?
It's a 四字熟語, a type of word that is made up of 4 kanjis and has a unique (often unrelated and/or idiomatic) meaning on its own. Here is the wikipedia article on it
I don't think 目茶苦茶 is a 四字熟語, it's just using 当て字 to spell out めちゃくちゃ. 四字熟語 is stuff like 異口同音 where the characters indicate the meaning, but it doesn't necessarily use existing vocabulary or follow the usual grammar conventions.
I'd use the longer definitions, personally. I thought that the design was such that people pick their own, though.
ファブリス wrote:
Omoishinji wrote:
As, it seems to be for not memorizing the reading
Something I didn't explain: the plan is to hide both the definition, and the reading. They will show on hover or when clicked (on tablet).
If you have to click/hover for each kanji, it could get tiring soon. What about a switch you can turn on/off?
Last edited by Sebastian (2012 June 01, 9:13 pm)
Hi, please could you tell me roughly how you identify the readings corresponding to each kanji? (I was working on something to do with that, and it's already got rather complicated, so reading this thread I wondered if there's something important I don't know about.)
@HelenF It's a kind of recursive algorithm which I pieced together after doing lots of searching on the web. I think there are some data files that can be found from some projects, and there were plans to include split readings in JMDICT; I don't know how that turned out. I'm afraid I put the code together in Perl several years ago and I wouldn't dare touch the code now
You could ask on the KANJIDIC mailing list, in which Jim Breen posts. Though a thorough search on the web might turn out better solutions or even data files that didn't exist when I first looked at this. ps: Back then I was also told here about some Japanese tools that can parse text and give the furigana, perhaps others can chime in? I don't know what it was, nor could I read the documentation >_>
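For anyone curious what a recursive split could look like, here is a minimal sketch. It is not the Perl code mentioned above, just one plausible approach: try each known reading of the first character as a prefix of the word's reading, then recurse on the rest. The `kanji_readings` table is a hypothetical stand-in for per-kanji data of the kind KANJIDIC provides, and sound changes (rendaku, sokuon and so on) are not handled.

```python
def split_reading(word, reading, kanji_readings):
    """Return one reading per character of `word`, or None if no split fits.

    kanji_readings maps a kanji to a set of candidate readings in hiragana
    (hypothetical sample data, not a real dictionary table).
    """
    if not word:
        # Success only if the reading has been fully consumed too.
        return [] if not reading else None
    head, rest = word[0], word[1:]
    # Characters missing from the table (e.g. okurigana) must match literally.
    candidates = kanji_readings.get(head, {head})
    # Try longer readings first; ties are broken alphabetically for determinism.
    for cand in sorted(candidates, key=lambda r: (-len(r), r)):
        if reading.startswith(cand):
            tail = split_reading(rest, reading[len(cand):], kanji_readings)
            if tail is not None:
                return [cand] + tail
    return None


kanji_readings = {
    "人": {"ひと", "じん", "にん"},
    "目": {"め", "もく"},
    "名": {"な", "めい"},
    "今": {"いま", "こん"},
    "日": {"ひ", "にち", "か"},
}

print(split_reading("人目", "ひとめ", kanji_readings))     # ['ひと', 'め']
print(split_reading("名目", "めいもく", kanji_readings))   # ['めい', 'もく']
print(split_reading("今日", "きょう", kanji_readings))     # None
```

The last case shows the limit of the approach: a word like 今日 (きょう) has no per-kanji split, so the function gives up and the word would need manual handling.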
Ah, what I've done is probably similar then.
I don't have much idea of how many of the words can be solved with an algorithm, how many others have uncommon variants that could still be argued to be individual kanji readings, and how many don't have individual kanji readings at all. I suspect quite a bit of manual checking would be involved in getting a good answer, so a central data table e.g. in JMdict would be a useful thing.
I think people were talking about using MeCab to generate furigana, but that works for morphemes, not the individual kanji. Split furigana appear in some printed material though.
I'll search some more.

