By the way, I ran the complete digitized versions of Harry Potter in Spanish and French through a simple concordance and found that the Spanish version has only just over 8,000 different words (not word families) and the French has just over 9,000 different words (out of over 88,000 words in total that make up the book). Not really very many when you consider that Moby Dick has over 18,000 different words! This shows why you need such huge amounts of input before your reading starts to get much better.
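For anyone who wants to try the same kind of count, here is a minimal sketch in Python. It is not the concordance program used above, just an assumed equivalent: it lowercases the text, splits it into runs of letters, and counts total words versus different surface forms (so it does not group word families either). The function name `concordance_stats` and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

def concordance_stats(text):
    """Return (total_words, different_words, counts) for a text.

    Hypothetical helper: counts surface forms only, not word families.
    """
    # Lowercase so "El" and "el" count as the same word.
    # [^\W\d_]+ matches runs of letters (including accented ones),
    # which is only a rough tokenizer.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    return len(words), len(counts), counts

# Made-up sample sentence, not taken from any book.
sample = "El gato ve al gato. El perro ve al gato negro."
total, different, counts = concordance_stats(sample)
print(total, different)        # total words vs. different words
print(counts.most_common(3))   # the most frequent words
```

To run it on a whole book, read the file into a string and pass it in. The totals depend on how you tokenize, so don't expect them to match the figures above exactly.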
2009-04-27, 10:21 pm
2009-04-27, 10:41 pm
I sent you an email about it jbudding.
2009-04-27, 11:03 pm
Thought about picking up Harry Potter in Japanese but I think it's beyond my level right now, especially if there isn't any furigana on some of the kanji. I tried reading a lower level (elementary-ish) fantasy book and it blew me away with the varied vocabulary. Pretty sure I picked up a nerd book.
2009-04-28, 12:04 am
I'm sure that you would be able to find a copy with furigana though, right? I asked the nice man at my local Japanese magazine/book store and he said he would do some research for me and see if he can find a copy of Harry Potter with furigana. If he (and I) have any luck I'll let you know.
Edited: 2009-04-28, 12:07 am
2009-04-28, 12:15 am
i got that harry potter book...still cant read it lol
2009-04-28, 10:35 am
jbudding, how much of the book do you plan to type up? I don't really know anything about OCRs, but one was posted somewhere on this forum a while ago.
If that doesn't work maybe I could help out somehow.
2009-04-28, 11:22 am
I just found out from the Coscom website that they have upgraded the e-audio book version of KO to include the audio for the example sentences! I have applied for my upgrade. If you have the KO disc and don't have the upgrade, you may want to check this out. I think the KO sentences will be much more useful as SRS material if the audio turns out to be good. I have their katakana disc and the audio versions of the katakana words are pretty good (but no sentences on this one). They sound like a real voice, not a computer voice.
2009-04-28, 11:34 am
Wow. I wish I would have ordered the audio version instead of the physical books. Having the audio for all the example sentences would be awesome.
Edited: 2009-04-28, 11:34 am
2009-04-28, 12:26 pm
jbudding Wrote:I would love to cooperate with anyone who is interested in capturing the sentences from ハリー ポッターと賢者の石. I am using Audacity to capture the audio and I just started trying to type in the text. This of course is gruelling without a digitized version; trying to figure out the reading for kanji without furigana is extremely time consuming! Hey Grulul, want a partner?
I'm not following you... what reading for kanji do you have to figure out? Can't you just read from the book and listen to the audio? Were you referring to Kanji Odyssey or Harry Potter?
I don't think I'd be of too much help in mining HP, even though I read it in conjunction with the English version there's still some stuff that I can't make sense of and have to skip.
By the way, I don't actually own the book, I just have the e-book. So far I have only found a couple of wrong kanji and a couple of wrong kana (such as a ほ for a は).
2009-04-28, 4:16 pm
I've got more than 400 Harry Potter sentences in my deck, from the first book. Let me know if they can be of help. There's no English translation, though.
2009-04-28, 6:33 pm
I only had the hard copy of the book and the audio CDs, and I found it difficult to type in the kanji without furigana even though I could hear them: my ear is not good enough to pick out the sounds, type them into my IME and then find the corresponding kanji. Many of the kanji in my version of the book have furigana (except on the cover) but it is still extremely time consuming. However, even more fortunately, as of this morning, like you, I have an e-version, so I only have to make the sound bites and cut and paste the e-text into Anki!
2009-04-29, 12:45 am
jbudding Wrote:I just found out from the Coscom website that they have upgraded the e-audio book version of KO to include the audio for the example sentences! I have applied for my upgrade. If you have the KO disc and don't have the upgrade, you may want to check this out. I think the KO sentences will be much more useful as SRS material if the audio turns out to be good. I have their katakana disc and the audio versions of the katakana words are pretty good (but no sentences on this one). They sound like a real voice, not a computer voice.
Is this TTS audio? It sounds like it, but I can't be sure. If it's native reading, then it's an outstanding resource.
2009-04-29, 12:48 am
Personally, I do not mine sentences. I'm too lazy. Every time I try, I end up quitting. For others, it's a pleasure. So I probably can't see myself mining manga, drama scripts or novels despite the benefit I'd get.
However, using pre generated sentences has been beneficial to me, so I'm not hurting too bad.
2009-04-29, 1:21 am
I'm not sure what TTS is but if it's the computer generated talking, I agree, I find that hard to listen to. I have it on the "voice of japan" electronic dictionary and I think iKnow also uses this type of audio, at least it sounds funny to me. The audio clips from coscom at least for the katakana are individual mp3 files and they sound like people talking. The audio examples on their website also sound like a recording of some live person talking. Coscom just sent me an email today confirming that they are sending me my upgrade disc by airmail postage prepaid by them. I have never heard of such a good upgrade offer! I'll let you know how it turns out.
2009-04-29, 6:46 am
Sentences are like crutches. As soon as your legs heal, you must walk without them.
If you read and listen to more than the premade sentences, and by the way, more than the mined sentences too, you are on the right track.
It would be silly to think that you'll do well with only the sentences. Streamed text and audio are vital.
Edited: 2009-04-29, 6:46 am
2009-04-29, 7:20 am
I kind of agree with mentat_kgs. What's the point of "sentence mining" a whole novel? Read it, then read another one. The words you learned reading the first one will be there in the second one. (By the way, I hate that expression, "sentence mining", it sounds like learning with a bulldozer)
At some point there is no longer a need for SRSing everything, though it's great at first. Initially learning is hard, and as long as we can't read fast enough or understand spoken Japanese easily, there's no way we can expose ourselves to enough Japanese.
But when you can just read a novel at a normal speed and basically review thousands of words in a few hours without even noticing it, there is no need to be afraid of forgetting things anymore.
2009-04-29, 7:51 am
yeah, though i still suck majorly, i'd have to say i agree. that's why i decided to stop with learning resources soon. authentic reading materials, actually reading, and having fun are vital.
2009-04-29, 11:35 pm
Be wary of throwing out the baby with the bathwater. I'm using pre-made learning resources (RTK, Tae Kim, Smart.fm) to be sure. I've put in effectively about 650 study hours with them. However, during that time I've been doing the other part: listening to stuff I like, watching stuff I like and reading stuff I like.
Sentences are not crutches. Each and every sentence should be something that was not handled previously. The SRS, if used effectively, soon pushes those sentences off into the sunset. So if you view them as crutches, you get rid of them by getting them correct. If you're missing them, there must still be a use for them. I don't think anyone on this forum at least advocated doing only sentences and nothing besides that (such as watching, reading, and listening to real Japanese).
As for which items to mine? Well, depends on your mentality. Guys like Alyks and Khatzumoto like to mine real Japanese material. They'll find a word, open up a J-J dictionary and get a good sample sentence. For a guy like me, that's too much effort. It turns me off learning, but I'm not going to deride it as it works and works well. I use pre-generated material, then with my immersion part I don't worry about what I don't know.
As for when do you stop? Depends on the person. I can't see a reason FOR MYSELF to go past 3000 kanji, 1000 grammar sentences and 6000 vocabulary sentences. By my pace, that should add up to 1100 hours of study/review time (650 down). Couple that with a few thousand hours of listening and viewing on top of a few thousand pages of read material should put me well into the fluent stage. Someone else may want to do more studying while another will stop earlier and go with more immersion. YMMV.
2009-04-30, 2:00 am
Forgive me, I'm not as experienced here as some of the other posters, but I'm going to post anyway in light of Nukemarine's comments.
My opinion, if it matters, is that SRS/sentence mining is not a path to fluency, but rather a tool that if used properly can get you there faster. Proficiency comes from doing things in your target language. Reading manga/novels, watching TV, listening to podcasts, talking to people are all things that increase your Japanese ability. However while you're still learning the language, you'll necessarily have to hit the grammar books and dictionaries quite often--what the SRS does is make sure that you'll never have to look something up twice. That's it.
Now I see nothing wrong with using premade material at first, to learn common grammar and frequent vocab/phrases/idioms. You know you're going to have to look up high frequency stuff eventually, so you might as well learn it efficiently at the get-go. But (my opinion is that) as soon as you can start reading/watching/listening to native Japanese media or conversing, and at a pace that you can enjoy, you should start doing that and stop spending time on premade resources. SRS is a tool that makes immersion manageable, but one shouldn't forget what that tool is used for. I don't want to fall into the trap of thinking that "if I just collect 10,000 facts in my SRS, I'll finally be fluent!"
Does this ring true with anybody?
Edited: 2009-04-30, 2:08 am
2009-04-30, 2:42 am
So maybe I should download a pre-made deck with sentences, at least for the first 1000. What do you think? Would you be willing to share your decks Nukemarine?
2009-04-30, 2:47 am
I think I'm the middle ground between Mentat_kgs and Nukemarine, sort of. Similar to mafried I suppose.
I think SRSing is something which has a certain "prime time" where it's insanely useful. Before that and after that, it sort of loses its impact. When you know no grammar or words at all, you can certainly use it, but not with the efficiency of someone who knows tons of grammar and enough words to get i + 1 sentences which can be reviewed extremely fast with very high retention. In the same way, someone who can review everything they need simply through exposure will have more fun doing that, slowly making the SRS less effective than before.
What I suppose is that this prime time is really long though. 2-3 weeks into Japanese with proper SRSing gets you into this prime time mode, and it lasts for several years. Even after you pass JLPT1, SRS will be highly useful. The SRS won't stop being useful until you're almost fluent and can read a Japanese book with the same ease as you read an English one. That's why I don't use an SRS for English: I already know all the words needed for normal life, so I don't even know what I would put into an SRS, and when I run into words I don't know, I understand them from context 99% of the time.
I don't think people should even consider when to stop using the SRS. When you're good enough, you'll stop using it automatically because you won't find anything to put in it.
2009-04-30, 8:17 am
Mafried, I don't think we're disagreeing here. I've also posted that this is all about getting into experiencing real Japanese. The studying/reviewing helps you get more out of the time you spend experiencing real Japanese. Vocabulary is good to study. Studying vocabulary via sentences is good. Studying sentences via an SRS is good. These are not mutually exclusive either. You can do one while doing the other. I study about 2 hours a day, while having Japanese playing on my iPod or radio. That's still 22 hours left for more Japanese (ok, I'll sleep for 6 to 8 of those). I think the problem is that when we talk about one thing, it can seem like we're putting that above all else.
So let me be clear: Studying will not make you fluent. Experiencing real Japanese will get you fluent.
jorgebucaran Wrote:So maybe I should download a pre-made deck with sentences, at least for the first 1000. What do you think? Would you be willing to share your decks Nukemarine?
Everything I use has been put out there on the google documents. RTK, Tae Kim and iKnow sentence spreadsheets are all up and available to the public. The only thing I cannot share are all my RTK stories as I can't remember which are mine and which are KanjiCan stories.
2009-04-30, 9:42 am
To jump the topic real quick back to that discussion of the upgrade to KO2001...
Where does it say something about that? If this is going to come out real soon I think I will most definitely be purchasing it!
To input my two cents, here's what I think on the whole sentence SRSing issue.
First off, as everyone always says, just do what is fun. I happen to get great pleasure out of using KO 2001 for learning new kanji readings and compounds and then going out into the world and seeing them and thinking "Wow, I couldn't read that yesterday." KO 2001 progresses perfectly for me, in a nice logical order that makes things easy. That is why I SRS it... there is a ton of stuff there I never want to forget. Will I do the SRS for a long time? Absolutely. Especially if there is any point in my life where I cannot speak Japanese every single day, or see Japanese every single day, well then the SRS will come into its best use: helping me to remember.
As for mining from real materials... I don't see myself doing it. Sure I watch Japanese shows and read Japanese materials and interact with Japanese people, but I don't SRS any of that. I have some nice textbook material that is for the SRS, which I view as my foundation. That is teaching me all the essentials so I can go out and pick up the rest by myself. That's the way I think it's supposed to work.
Anyway, please someone answer my questions about KO! I'm very excited to hear about this!
2009-04-30, 9:52 am
There's a link in the top left yellow box on Coscom's homepage that goes here:
http://www.coscom.co.jp/japanesekanji/ka...index.html
On that page it says all the example sentences are voice-recorded.
There are more examples on the free sample pages:
http://www.coscom.co.jp/ebook/2001kanji/...1-top.html
http://www.coscom.co.jp/ebook/2001kanji/...2-top.html
(Click on a kanji, then click the Examples 1, 2, or 3 buttons)
Edited: 2009-04-30, 9:53 am
2009-04-30, 12:11 pm
Nukemarine Wrote:Mafried, I don't think we're disagreeing here...
I figured as much. In fact, my original post started with "Following on from Nukemarine," but I went off on a tangent and didn't want to put words in your mouth.
