
Sentence mining - collections?

#1
After reading about the 10,000 sentences method, the full immersion idea struck me as brilliant, especially after coming into contact with a similar concept while I was researching Lojban.

However, while I do not want to come off as lazy, the idea of mining for sentences fills me with dread.

Quite simply, I don't have the time! With all the commitments I already have, I want to spend the precious hours I allocate to studying Japanese each day on actually studying the language, not on searching for sentences that I could potentially study sometime in the future.

Considering that a few other people on this board are in a similar situation, here is what I wonder:

1) Is there an existing compilation of a few thousand sentences that I could use?
2) Would someone be willing to export the sentences from their SRS program and share them? (I would be very grateful!)
3) If the above do not work out, would someone be interested in collaborating on creating a library of such sentences for all to use?

Hope to hear some replies!
#2
Being a tyro at Japanese, I'll have to speak from my experience studying Spanish. I believe a lot of the benefit of a method like 10,000 sentences comes from finding the sentences and looking up the words yourself. It makes them stick better than just being handed the meaning. What this has to do with you is this: I would suggest trying to find a compilation of sentences, but don't miss out on the experience of looking the words up. However, if someone else disagrees with me, I'd suggest listening to them, since I don't have experience with actually doing this. I'd also suggest news sites and textbooks as good places to mine for sentences, as I often see others talking about getting them from there.
#3
I do see your point.

I agree that there is logic in your words, since even when going through RTK I entered all of the kanji into SuperMemo myself instead of just downloading a ready-made compilation.

Can other people comment on this? Is it essential to the process that one finds all of the sentences on their own, or would a pre-made collection suffice?
#4
jisho.org gives sentences from the tanaka corpus.
#5
Tanaka corpus = bad.

Hudzon, what do you consider "studying"? You say you want to spend your time "studying" and not "searching for sentences", but ideally you should be poring over lots of text every day, then taking things from what you saw that day and putting them into the SRS to reinforce them. The sentence and SRS method isn't a shortcut to learning; it's a way of reinforcing learning and developing a more natural "feel" for the language.

Also, everybody's levels are different. So an already-made list might not help you so much.

If you want to use sentences, just find Japanese stuff around your level you can use. Read it, learn it, take the sentences and put them into the SRS.

(That being said, you might want to check out the KO2001 thread on these boards and consider buying KO2001. If you're more of a higher intermediate, the Kanji In Context series might be better.)
#6
Tanaka corpus is a poor resource. The style of Japanese that it uses can be very stilted and unnatural. Try to mine sentences from primary sources like real books/magazines/movies/manga/audio clips. I wouldn't even trust a textbook, in this regard. The sentences are modified too much to conform to vocabulary being taught in whatever chapter they're included in, rather than to conform to the natural way Japanese is read or spoken.
#7
I'm not sure what you mean by 'actually studying'. I'm under the impression that the idea is largely to read books/manga/articles/ watch tv/movies/blah blah and take sentences from them, look up the words, and put the sentences in an SRS. So the entire time, you're either reading/listening to Japanese or looking up Japanese words. Which part of this ISN'T studying? The goal is to be able to read/listen/speak Japanese, so you should be reading and listening anyway. This is a method to get more out of that time. You can study with a textbook or whatever too, but geez, don't shirk the actual USE of the language you're trying to learn. A surgeon doesn't pass up the cadaver because he could be spending time 'actually' studying...

Otherwise, well. A lot of people seem to be doing some sort of pre-made set, so I assume they have success with it, but I'm not sure I get it. I customize my sentence mining to sentences I find interesting and/or would like to be able to say. Getting a bunch of textbook examples or something seems like it would result in either a lot of (personally) boring sentences, a lot of stiff/textbooky sentences, or both. Just grab sentences out of that manga/book/game/article you're reading as you go along. It doesn't actually take that much time, not when you have to spend so much time looking up words while you're reading *anyway*. And it's actually pretty fun. Whenever I see a sentence I really like, I have a sort of COMPULSION to SRS it, just because the sentence was so fun and I don't want to forget it.
Beyond that, I think it's also useful to find them yourself because of how your mind works. You can drill sentences in an SRS, and you'll get them, of course, but I think it's going to be slower, because you got them out of context in the first place. When you see a sentence you handpicked yourself, you see it and you think "Oh, I remember that line from such-and-such," which is a built-in association. Associations are how we remember things, so hurray. Further, you'll likely enough remember what situation it was said in, what character (if it was from fiction) said it, etc., so you get further context every time you read it on what it 'sounds' like, what saying that kind of thing says about the person saying it. These sentences were said by a tough guy. These kinds of sentences were only said by women. These sentences were said by children. These sentences show up in essay writing. You don't even have to think about it consciously, it's just there, every time you see the sentence, letting you know when and for whom certain expressions are appropriate. You'll get all of this eventually from reading anyway, but the point of this is to speed the process up, and you won't get ANY of that from premade stocks. Maybe the sentences seem to have some connotations to you, but how would you know? You don't actually know who said it. It's just a sentence floating out there :\

2 cents!

Edit: I was beaten, and beaten more succinctly! So hey, what they said.
Edited: 2008-05-11, 9:47 pm
#8
Instead of searching for lots and lots of sentences to use in the future, perhaps focus only on those that you'll learn in the next 3-4 days or so. Gather them from textbooks, phrase books, and sources that you'd like to be able to read (like all-Japanese websites), and pick them in themed, relevant groups. I'm only doing Heisig's book now (up to 508 kanji), but I think I'll do sentences the way I've just described once I start testing myself on them with an SRS like Anki.
#9
QuackingShoe Wrote:
Getting a bunch of textbook examples or something seems like it would result in either a lot of (personally) boring sentences, a lot of stiff/textbooky sentences, or both. Just grab sentences out of that manga/book/game/article you're reading as you go along. It doesn't actually take that much time, not when you have to spend so much time looking up words while you're reading *anyway*. And it's actually pretty fun. Whenever I see a sentence I really like, I have a sort of COMPULSION to SRS it, just because the sentence was so fun and I don't want to forget it.
Which is great, but I think a beginner wouldn't be able to handle most real books, manga, etc. RTK helps with kanji, which helps with vocab, but that's not all the language is made of. Looking up new words is easy, but what about the grammar, idioms, even word order... While textbooks are stiff and different from real Japanese, I think it's better to get a foundation (JLPT3ish) in the language before you jump into native-level stuff (for study and not just fun/immersion). Just my thoughts.


Anyway, I think sharing sentences is a nice idea. I also like the idea of a reading club...that would be awesome.
Edited: 2008-05-12, 2:08 am
#10
http://www.jlptstudy.com/4/index.html

Click on the grammar section. Each year has 40 sentences with a translation (that's correct 99% of the time; the sentences are so simple, though, that it's easy to see where something's been left out of the translation or inferred).

The grammar's pretty simple but there's a good range of vocab in there. I use Rikai-chan to work out the kanji for words I don't know and kanji-ify it as I go.
#11
I have no problem sharing my collection, anyone interested??

700 sentences, and going to add about 100 more tonight (overdue, lol)
#12
Hudzon Wrote:1) Is there an existing compilation of a few thousand sentences that I could use?


Hope to hear some replies!
Try the 2001.Kanji.Odyssey thread on this site!!
#13
You can also check the "JFE" thread. JFE stands for Japanese For Everyone and is a very good textbook. In that thread, you'll find a link to an Excel file containing approximately 1500 sentences taken from this textbook. The file was compiled by leosmith (thanks leo!!) and for me, a user of this textbook, it is really useful.
#14
I personally don't see a problem with a ready-made collection of sentences that

A: Build up your grammar, basic vocabulary and kanji readings

Such a source can be UBJG (about 2000 sentences), Kodansha's All About Particles (650 sentences), and/or Handbook of Japanese Verbs (550 sentences) and/or Handbook of Japanese Adjectives and Adverbs (150 sentences Adj., 400 Adverbs). You'll have the book, you'll still need to go through and add definitions (likely J/E), you still have to review via Anki.

B: Build up your KANJI

Such a source is KO2001 and Kanji in Context.

It's after all that that a person can go about building up their own sentences. By that point, it's going to be in areas that have to interest you. I honestly believe though that the basic sentences (2000 for grammar with added 3000 for Kanji) can be premade. It's still going to require TONS of effort to honestly utilize them. Why give a man grief that he wants to save the 200 hours it would take typing all that information in manually? He's still going to be writing it all out eventually, MULTIPLE TIMES!

I think though, the key is to find a GOOD set of pre-made sentences. The above books are pretty damn good sources. Likely, they will all be in usable spreadsheet format in the near future. We just have to make damn sure that for as long as possible only those that bought the books get to share the wealth.
#15
QuackingShoe Wrote:Edit: I was beaten, and beaten more succinctly! So hey, what they said.
You may have been beaten to it, but your reply was the one that convinced me to start SRS'ing stuff from manga. I've been considering SRS'ing for a while, but couldn't really see where to start... Starting from manga that I want to read anyhow should be a big help there.

Thank you.
#16
I am in the process of collecting all the sentences from ADBJG (A Dictionary of Basic Japanese Grammar). If you have access to Adobe Acrobat (not just the Reader), I can share the scanned PDF of the book with you; then you just have to copy and paste what you want. Failing that, I don't mind sharing when I am done, as I am putting them into a spreadsheet as I go for this purpose. I wholeheartedly agree that typing the sentences is a chore; I want to get straight into learning them.
#17
I would suggest both JFE and 2001.Kanji.Odyssey. Between the two of them you get lots of vocab, grammar, and around 4500 sentences that increase in difficulty. They are a great start, I think.
#18
Here's what I did:

1. I collected sentences from Tae Kim

2. If I was confused about a word, I would simply google that word. Then I would scope out sentences which contain that word, and 1 or 2 other new words. I would put that into my SRS

3. I used http://www.alc.co.jp very often. It's a very good site, and remember: if you don't think a sentence is native-like, google it and find out. I used it when I wanted to find more sentences that use a certain word.

4. Drilling the sentences from Japanesepod101. I'm talking about getting practically all of the sentences from each lesson.

I went from not speaking Japanese with my language partners to suddenly speaking it, however little. What I liked most about japanesepod101 was how it improved my grammar. If you look at my other posts, you can see how happy I am with the j pod.

5. Stuff I listened to in songs. I would look up the lyrics, and then get stuff from there

I really believe my method is good because you are learning sentences through a visual route and also through an auditory route (music/tv/j pod etc). So if you were a boxer fighting another boxer (the Japanese language), you would essentially be hitting it from all angles.
#19
Mighty sorry for suggesting anything to do with the Tanaka corpus. I used it very heavily at the beginning of my studies, and it had no ill effect on me because I started using it in the knowledge that it was quite bad (overuse of pronouns etc.). There are quite a few good sentences in it, though; I used it for learning JLPT3 vocabulary and it certainly helped :)
#20
Since it seems quite a few people want the PDF of ADBJG, I'll put it up on my website later when I get home from work, instead of having to email it each time (it's quite a big PDF).
#21
I've just started trying out the 万 sentence thing using KhatzuMemo since I don't have another SRS (although I'm trying to write my own no frills one).

As for mining, I don't think grabbing random sentences prepared by someone else would work for me. I'm finding I have to carefully select the sentences, ones that are at the right level of difficulty to push my limits but not flummox me. So whenever I encounter a good sentence, I write it on a paper list I have handy and when I next log on, I add the list. I only have 124 in my database after about a week, but as long as I'm making progress I'm happy.

I'm reading through an 旺文社 dictionary and adding sample sentences from that. There's a good mixture of styles from colloquial to literary, although perhaps an overuse of pronouns, but actually that's helpful because I learn how to use those parts of speech if I need them. Japanese is easy in that you can just leave out bits that are understood by context.

The other thing I've been doing is entering quizzes and drills into the SRS, not just sentences. E.g.,

question: 「消える」「消す」を入れなさい。 私は電灯を?ました。 雷で電灯が?ました。
answer: 私は 電灯を 消(け)しました。 雷で 電灯が 消(き)えました

question: きょうは四月1?、2?、3?、4?、5?、6?、7?、8?、9?、10?だ。
answer: ついたち/いちじつ、ふつか、みっか、よっか、いつか、むいか、なのか、ようか、ここのか、とおか

question: 「いる」 尊敬(そんけい)語?
answer: いらっしゃる
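
For anyone who wants to pass quiz cards like these around, here is a minimal Python sketch that turns question/answer pairs into tab-separated text, one card per line. It assumes your SRS can import tab-separated text (Anki can; I have not checked KhatzuMemo), and the function name `cards_to_tsv` is made up for illustration.

```python
def cards_to_tsv(cards):
    """Turn (question, answer) pairs into tab-separated text, one card per line.

    Assumption: the target SRS imports plain tab-separated text.
    """
    lines = []
    for question, answer in cards:
        # Tabs or newlines inside a field would break the one-card-per-line
        # layout, so replace them with spaces.
        fields = [f.replace("\t", " ").replace("\n", " ") for f in (question, answer)]
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


# Two of the cards from this post.
cards = [
    ("「いる」 尊敬(そんけい)語?", "いらっしゃる"),
    ("「消える」「消す」を入れなさい。 私は電灯を?ました。", "私は 電灯を 消(け)しました。"),
]

tsv_text = cards_to_tsv(cards)
print(tsv_text, end="")
```

Saving `tsv_text` to a UTF-8 text file gives something that can be shared or imported, without anyone having to retype the cards.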
#22
People for some reason just don't want to share their decks. Either they think their decks are filled with mistakes or they don't want to give away their hard work, I suppose. I already tried getting a deck on these forums and had no results.

Personally, I don't even think it's worth bothering (at least at my level) to mine sentences; it's far too time-consuming. I believe it is a far more productive use of time to actually just read Japanese stuff like novels than to spend half my time hunting...
#23
I posted my deck when it was at about 4000 cards; it's here: http://www.glowingfaceman.com/2008/02/fo...cards.html. But that's really mediocre compared to what it's like now (I'm constantly revising cards to make them better). I might post it again when I hit 10,000 (at 7338 now).

HOWEVER..... the big problem with using someone else's deck is this.....

when I add a word to my lexicon, I search for sentences using just that word and others I already know. So if I already know 学校 and 歩く, and I want to learn 学生, then 学生は学校に歩いた is a good sentence. But if YOU don't know 学校 and 歩く, then the card suddenly has three unknowns. And if I know a word particularly well, I won't even bother putting the reading on the "answer" side.
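
That selection rule can be sketched in a few lines of Python. This is only a rough illustration, not a tool I actually use: it assumes each candidate sentence has already been split into dictionary-form words (real Japanese text needs a tokenizer for that, and particles are simply treated as known words here), and the names `unknown_words` and `pick_sentences` are made up.

```python
def unknown_words(sentence_words, known_words):
    """Return the words of a sentence that are not known yet (in order, no repeats)."""
    unknown = []
    for word in sentence_words:
        if word not in known_words and word not in unknown:
            unknown.append(word)
    return unknown


def pick_sentences(candidates, target, known_words):
    """Keep only the sentences whose single unknown word is the target word."""
    return [
        words for words in candidates
        if unknown_words(words, known_words) == [target]
    ]


# Assumption: sentences are pre-split into dictionary forms (歩いた -> 歩く).
known = {"学校", "歩く", "は", "に"}
candidates = [
    ["学生", "は", "学校", "に", "歩く"],    # only 学生 is new: good card
    ["学生", "は", "図書館", "に", "歩く"],  # 学生 and 図書館 are new: too hard
    ["学校", "に", "歩く"],                  # doesn't contain 学生 at all
]

good = pick_sentences(candidates, "学生", known)
print(good)
```

Run with someone else's smaller `known` set, the same candidate list shrinks or empties out, which is exactly the problem with borrowed decks.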

As for sentence mining. It's not hard once you have the first thousand or so words, which you get somehow or other. Yahoo辞書 has plenty of sentences, in the "green" dictionary. Another great sentence source is Tae Kim, you should definitely read Tae Kim anyway since it's the grammar gospel. At the point I'm at, sentence mining is usually very easy, unless the word is really obscure. It'll be hard for you at first, because so many sentences contain so many unknown words. It'll get easier with time, though.

If you watch any Japanese drama, anime, movies, podcasts, etc., you'll naturally pick up vocabulary over time. Even if you do it with English subtitles. That should help quite a lot.
Edited: 2008-05-18, 5:05 am
#24
My deck only has like 300 sentences, I don't know how you guys found the time to input 7000!

Also most of my sentences are from copyrighted works, so I probably shouldn't post them on the interweb.
#25
hey glowing face man, how long do you spend in a day reviewing when you have 7000+ cards in your deck?
(PS your blog is indeed badass)