
Project help needed

#1
Michiel Kamermans, the author of the book An Introduction to Japanese Syntax Grammar & Language, is currently looking for help with one of his projects. It has to do with kanji, so I thought I'd ask here whether anybody would want to spend some time helping him out.

The project is about breaking kanji down into components and forming new kanji with them, in order to generate a frequency list that will be used for his new book. There is still much work to be done: 2,000 kanji still have to be processed; 1,500 are already done. You can read more about it on the project homepage. Once this is done, he will start writing a new book about vocabulary and kanji, which will also be free. It will cover the 3,500 most frequently used kanji, sorted from most to least common, with sentences and vocab and reading sections. The project homepage has more information about how you can participate, if you'd like to.

Knowing how supportive our community can be, I thought I'd post about this here to find some folks who might be willing to help. So, if you know about his free book, or have used it in the past but didn't donate or buy the physical copy, now is the time to pay him back. Or if you know someone who has time and would be willing to do something for him, spread the news. And if you have never heard of him or his book, this is the perfect time to take a look at it. It is freely available, and it's a beginner's grammar book, so take a look if you like.
Edited: 2011-07-03, 2:08 pm
#2
I'm not sure I understand this.
Isn't the task of breaking the kanji down into smaller components already done? Edict has this. Heisig did it as well. I don't understand what new thing is being accomplished here.
#3
There's also this:
http://kanji.zinbun.kyoto-u.ac.jp/projec...so-2022-jp
#4
Zarxrax Wrote:I'm not sure I understand this.
Isn't the task of breaking the kanji down into smaller components already done? Edict has this. Heisig did it as well. I don't understand what new thing is being accomplished here.
I asked him the very same question. The problem seems to be that, since he wants to distribute his next book for free, he can't use ready-made material, because he wouldn't be allowed to share it. There was also something about the display of kanji, which should be the same across all browsers and, of course, in the .pdf. This cannot be guaranteed with paid or CC-licensed fonts or databases. So he wants to create a DB of kanji, which are broken down and reassembled to form new kanji, with the help of some people. For a full explanation of how and why he does it, I suggest you ask him, and I'm sure he will explain it, and way better than I ever could.
Edited: 2011-07-03, 6:55 pm
#5
There's also KanjiVG. Their work is released under the Creative Commons Attribution-Share Alike 3.0 license, so I guess Michiel or people collaborating with the Kanji Composition project could use their data.

Quote:KanjiVG is a description of the sinographs (or kanji) used by the Japanese language. For each character, it describes its structure and components up to the strokes types. It also provides a SVG drawing of the character with the right stroke order and direction, allowing to easily create stroke order diagrams, fonts, stroke order animations, etc.
You can see an implementation of KanjiVG in a dictionary called Tagaini Jisho. In Tagaini Jisho you can, for example, use the component search input to look up a kanji by writing its components (女+子=好). You can also see detailed stroke order diagrams where you can click the different components of the kanji and open a new popup with that component's stroke order diagram and info.


[Image: kanjipopup.png]

[Image: component_popup.png]
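
To make the component search idea concrete, here is a minimal sketch of how a lookup like 女+子=好 could work. The `DECOMPOSITIONS` table and the `find_by_components` function are hypothetical names made up for illustration; this is not how Tagaini Jisho or KanjiVG actually store or query their data.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy decomposition table (an assumption for illustration only).
# Real data would come from a source such as KanjiVG, whose file format is
# not reproduced here.
DECOMPOSITIONS: Dict[str, FrozenSet[str]] = {
    "好": frozenset({"女", "子"}),
    "安": frozenset({"宀", "女"}),
    "字": frozenset({"宀", "子"}),
    "明": frozenset({"日", "月"}),
}


def find_by_components(components: str) -> List[str]:
    """Return every kanji whose decomposition contains all given components."""
    wanted = set(components)
    return sorted(
        kanji for kanji, parts in DECOMPOSITIONS.items() if wanted <= parts
    )


if __name__ == "__main__":
    # Look up kanji that contain both 女 and 子.
    print(find_by_components("女子"))
```

A set-based table like this ignores where a component sits inside the character and how often it repeats, which is probably fine for search but not for drawing stroke order diagrams.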

P.S.: I already checked a couple of kanji for Kanji Composition. I see no reason not to collaborate.
#6
The grammar book is sold commercially, and is also available on the web at no charge, but licensed as "You are not permitted to create derivative works, nor may you sell, or bundle with a sale for the purpose of enriching the sold product, the content from this website either in original or derived form."

If the plan for the new book is the same, this combination prevents the use of pretty much anything that's been made before. (Well, depending on whether the book would count as a "derivative work" of the source in question. But sounds like it probably would be one in this case.)
Edited: 2011-07-03, 9:03 pm
#7
I've been reading more on the superior benefits of unsuccessful retrieval, delayed feedback, and incremental feedback, in the context of my latest interest in playing with the dynamics of the fronts and backs of cards (cues, targets, feedback).

Something I thought of in this context of the radical decomposition that's connected to characters is, it would be interesting to make feedback incremental by radical for RTK. That is, you're not changing how you grade the card, but once you decide whether you failed, instead of immediately looking at the answer, you continue to try and generate it through the incremental hints, such as pieces of kanji (i.e. primitives/radicals). Then you fail it. Edit: For cards you get correct, as I've mentioned before, feedback is still optimal (i.e. for reconsolidation/augmentation/reinforcement), and I've read that delayed feedback in this case would be even better (3 second delay). Also, by primitives I meant the terms used in Heisig, which would be a handy alternative to radicals and also a possible bridge in implementing them both.
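
As a rough illustration of the incremental-by-component feedback described above, the sketch below reveals one more piece of a kanji at each step. The function name `incremental_hints` and the hard-coded component list are assumptions for the example; it is not the Anki plugin's code, just one way the idea might look.

```python
from typing import Iterator, List


def incremental_hints(components: List[str]) -> Iterator[str]:
    """Yield progressively fuller hints, revealing one more component per step."""
    total = len(components)
    for shown in range(1, total + 1):
        # Components not yet revealed are shown as "?" placeholders.
        hidden = total - shown
        yield " ".join(components[:shown] + ["?"] * hidden)


if __name__ == "__main__":
    # Hypothetical card: the answer 好 is built from 女 and 子.
    for hint in incremental_hints(["女", "子"]):
        print(hint)
```

The grade would already be decided before the first hint is shown, so the hints only change how the feedback arrives, not how the card is scored.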

Right now I'm playing with that incremental hint Anki plugin (original, as the advanced version doesn't work for me) to toy with different ideas of cue informativeness and incremental feedback. At the moment I like the idea of progressive component-based aspects for incremental feedback, and multimodality for cue modulation. Target dynamics are centered around ‘type’ at the moment (prosody, meaning, gist/comprehension, etc.).

By the way, on this same topic, while I'm also now looking into card durations (e.g. 10-15 seconds per card, max, this being connected in part to the Four Strands, and related to working memory), I've changed my perspective on whether one should work for answers rather than fail immediately (re: an article phauna posted in 2008). Studies since 2009 (Kornell, Metcalfe, et al., and ditto for feedback stuff) show that unsuccessful tests and incorrect answers are beneficial in the long run, when, of course, correct feedback/answers are given/occur over time.

References

“Blockers” do not block recall during tip-of-the-tongue states
Unsuccessful Retrieval Attempts Enhance Subsequent Learning
The Pretesting Effect: Do Unsuccessful Retrieval Attempts Enhance Learning?
Does Incorrect Guessing Impair Fact Learning?
Delaying feedback by three seconds benefits retention of face–name pairs: the role of active anticipatory processing
Scaffolding feedback to maximize long-term error correction

My impression of the pretesting stuff in comparison to other studies on studying/testing trials (as opposed to the foci of the above papers on whether it's intrinsically harmful/superior to studying alone) during learning phases is that it could be integrated by delayed/incremental feedback on initial encounters of new cards (i.e. just before presenting both sides of the card for initial study).
Edited: 2011-07-04, 1:30 pm
#8
Now this sounds great! You know what, nest0r, I like you and I hate you to death: you keep posting so many papers, how will I ever find the time to actually read them! Where might I get this hint-based system? Can you give me a link, please? And even though this is a bit off-topic, when will you start publishing your own books, or is there a blog or something with all your wisdom gathered in just one place?

Oh, and I will tell Mr. Kamermans about this topic; maybe he can shed some more light on his project and explain some things. Or at least nest0r will understand him.
Edited: 2011-07-04, 1:58 pm
#9
I was referring to a shared Anki plugin called ‘incremental hints’, but I think modifications or layout-based customizations will be necessary to meet goals such as words/radicals. As I mentioned, I couldn't get the advanced version of the above to work (which features more options); there's also a ‘hint-peeking’ plugin that doesn't work for me, and a ‘two-step answer’ option I haven't tried yet.

Tangentially, here's a nice accompaniment primer to the previously linked “Four Strands” paper:

Four Principles of Memory Improvement: A Guide to Improving Learning Efficiency

Abstract: Recent advances in memory research suggest methods that can be applied to enhance educational practices. We outline four principles of memory improvement that have emerged from research: 1) process material actively, 2) practice retrieval, 3) use distributed practice, and 4) use metamemory. Our discussion of each principle describes current experimental research underlying the principle and explains how people can take advantage of the principle to improve their learning. The techniques that we suggest are designed to increase efficiency—that is, to allow a person to learn more, in the same unit of study time, than someone using less efficient memory strategies. A common thread uniting all four principles is that people learn best when they are active participants in their own learning.

I like how it covers aspects of the benefits of spacing on other types of memory that are often overlooked (e.g. motor memory, mathematics). I glommed on to some new stuff related to the ‘region of proximal learning’ via this paper also, as it pertains to learning the easiest items first when allocating study time, related to my idea that we should be able to sort each review session by, say, iPlusN; originally it was just about making for smooth sessions where you can slowly increase intensity, but it also is related to maintaining flow, the balance between desirable and difficulty, and seems to have positive effects on metacognitive awareness, such as accurate judgments of learning, perceived rates of learning and persistence of study. While I was at it, I noted that in addition to feedback being good for successful retrieval and correctively for incorrect retrieval, and for correcting low confidence metacognitive errors, it's also apparently good for high confidence errors (‘hypercorrection’ being the term for this dynamic of correcting the latter).

Another expansion to that four principles paper is: A cognitive-science based programme to enhance study efficacy in a high and low risk setting

It hits on the same points but also notes the importance of multimodal learning, relational memory, and encoding specificity/transfer-appropriate processing.
Edited: 2011-07-04, 3:46 pm
#10
Nagareboshi Wrote:And even though this is a bit off-topic, when will you start publishing your own books, or is there a blog or something with all your wisdom gathered in just one place?
Yeah, we want an ebook by nest0r proposing a systematic, scientifically based method on how to study East Asian languages in an optimal way! First for the community and then for the world! Preferably one written in an engaging style, not as if it were written by a scientist:
Quote:Everyone knows that scientists write badly - everybody, that is, except scientists. They think they're merely being precise and orderly, and everyone else on the planet is either (a) illiterate, (b) sloppy, (c) a humanist, or (d) all of the above. (Ref. 1) In some cases, of course, the individual scientist is not well acquainted with the English language. (In the opinion of English scientists, this frequently explains the unintelligible papers of Americans.) Avoid these papers.

The scientist is, by his reliance on the passive voice, hobbled, leading to sentences like this one, in which the subject, a lumpy noun, is acted upon by pallid adjectives and wan verbs, all without ever saying exactly who the action is done by, so that the sentences get longer and longer as you read and never seem to end, even when there is clearly nothing more to say in the sentence, at which point the reader sometimes gets a meager little semicolon; this gives him a rest, so that he can go on and read another long phrase without really learning anything more, because the writer's hand has kept on moving even though his brain is disengaged.
(from Cosmos Online)

And yes, I'm serious.
#11
gdaxeman Wrote:
Nagareboshi Wrote:And even though this is a bit off-topic, when will you start publishing your own books, or is there a blog or something with all your wisdom gathered in just one place?
Yeah, we want an ebook by nest0r proposing a systematic, scientifically based method on how to study East Asian languages in an optimal way! First for the community and then for the world! Preferably one written in an engaging style, not as if it were written by a scientist:
Quote:Everyone knows that scientists write badly - everybody, that is, except scientists. They think they're merely being precise and orderly, and everyone else on the planet is either (a) illiterate, (b) sloppy, (c) a humanist, or (d) all of the above. (Ref. 1) In some cases, of course, the individual scientist is not well acquainted with the English language. (In the opinion of English scientists, this frequently explains the unintelligible papers of Americans.) Avoid these papers.

The scientist is, by his reliance on the passive voice, hobbled, leading to sentences like this one, in which the subject, a lumpy noun, is acted upon by pallid adjectives and wan verbs, all without ever saying exactly who the action is done by, so that the sentences get longer and longer as you read and never seem to end, even when there is clearly nothing more to say in the sentence, at which point the reader sometimes gets a meager little semicolon; this gives him a rest, so that he can go on and read another long phrase without really learning anything more, because the writer's hand has kept on moving even though his brain is disengaged.
(from Cosmos Online)

And yes, I'm serious.
Yeah, this would be really cool!
#12
gdaxeman Wrote:Yeah, we want an ebook by nest0r proposing a systematic, scientifically based method on how to study East Asian languages in an optimal way! First for the community and then for the world! Preferably one written in an engaging style, not as if it were written by a scientist:
I can see something like this brewing in the posts he writes. There's definitely something going on in that data center. I'm sure we'll see the scientifically-proven, most efficient way to learn come out of him one day.
#13
I just have to add my vote in there. I'd love to see an aggregate of all the interesting info posted here. No pressure ;)
#14
Haha. nest0r's writing style is confusing by original design, despite efforts to temper this over the years.

Yes, I'm working on... something. I don't know what, yet, so nest0r's in the dark on that. At the least, I'm thinking of retiring nest0r (except for tool-related comments—no, not comments about Khatzians, but about tools made by the resident programmer-gods). It becomes a process of diminishing returns at this point when I share updates since they're often so connected to ideas I've discussed before, but which are scattered.

When I do, that might give me more time to work on the RevtK wiki, or at least make arrangements so that information I've personally gathered can be more easily disseminated for integration with the other resources and perspectives generated and shared here.
Edited: 2011-08-03, 7:52 pm
#15
I think nest0r is Khatzumoto disguised. :O
#16
Sebastian Wrote:I think nest0r is Khatzumoto disguised. :O
That would explain why he quotes so many of us here...
#17
I am become the anti-Khatz.
#18
nest0r Wrote:I am become the anti-Khatz.
What would that be?

Makemoto? :o

(Brownie points for those who get the pun)
#19
Nagareboshi Wrote:Michiel Kamermans, the author of the book An Introduction to Japanese Syntax Grammar & Language, is currently looking for help with one of his projects. It has to do with kanji, so I thought I'd ask here whether anybody would want to spend some time helping him out.

The project is about breaking kanji down into components and forming new kanji with them, in order to generate a frequency list that will be used for his new book. There is still much work to be done: 2,000 kanji still have to be processed; 1,500 are already done. You can read more about it on the project homepage. Once this is done, he will start writing a new book about vocabulary and kanji, which will also be free. It will cover the 3,500 most frequently used kanji, sorted from most to least common, with sentences and vocab and reading sections. The project homepage has more information about how you can participate, if you'd like to.
I felt guilty for going so off topic in this thread, so I went to the website and added 2 kanji to feel better.

Everybody could go and do the same; it only takes a minute.
#20
nest0r Wrote:Haha. nest0r's writing style is confusing by original design, despite efforts to temper this over the years.
^-- This. I can't understand what nest0r says half the time. XD All I know is it sounds smart, and he makes lots of references to increasingly more difficult to read things beyond his own words.