Sentence Search Engine

#1
I am looking for a more efficient way to find example sentences, and I was wondering if anyone knows of a search engine with more powerful search functions than those offered by JReK.

Basically, I'd like the option to search for words within a specified number of words of another word. If anyone has ever used Factiva for work or study then you'll know this is a very useful feature when trying to retrieve particular information.

This feature would be particularly useful when you're looking for example sentences with two or more words with a common theme that you would like to see used together. For example:

食堂 within n words of 食べる

津波 within n words of 避難

金融 within n words of 債務 within n words of 不履行

食中毒 within n words of 吐き気 within n words of 下痢
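
To make the "within n words of" idea concrete, here is a minimal Python sketch of a chained proximity check. It is only an illustration, not how Factiva or any particular search engine implements the feature. It assumes each sentence has already been split into a list of tokens (Japanese has no spaces, so in practice a morphological analyser would be needed) and that tokens match the search terms exactly. The function names `within_n_words` and `search` are hypothetical.

```python
from typing import List, Sequence


def within_n_words(tokens: Sequence[str], terms: Sequence[str], n: int) -> bool:
    """Return True if each term occurs within n tokens of the previous term.

    Mirrors a chained query such as "A within n words of B within n words of C".
    Only the distance matters, not the order in which the terms appear.
    """
    # Token positions at which each search term occurs in the sentence.
    positions = [[i for i, tok in enumerate(tokens) if tok == term] for term in terms]
    if any(not p for p in positions):
        return False

    # Walk the chain: keep positions of the current term that lie within
    # n tokens of some surviving position of the previous term.
    reachable = positions[0]
    for candidates in positions[1:]:
        reachable = [c for c in candidates if any(abs(c - r) <= n for r in reachable)]
        if not reachable:
            return False
    return True


def search(sentences: Sequence[Sequence[str]], terms: Sequence[str], n: int) -> List[Sequence[str]]:
    """Return the sentences in which the chained proximity query matches."""
    return [s for s in sentences if within_n_words(s, terms, n)]


# Assumption: sentences are pre-tokenized by hand for this example.
sentences = [
    ["津波", "警報", "が", "出", "た", "ので", "避難", "し", "た"],
    ["津波", "の", "映像", "を", "見", "た", "。", "その", "後", "、",
     "家族", "と", "避難", "訓練", "に", "参加", "し", "た"],
]
hits = search(sentences, ["津波", "避難"], n=6)
for sentence in hits:
    print("".join(sentence))
```

Only the first sentence is returned here, because 津波 and 避難 are six tokens apart in it and twelve tokens apart in the second.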

It would be a pretty efficient way to get more out of each example sentence. Does anyone know of any search engine where this is possible?
#2
Yes.
http://www.kotonoha.gr.jp/shonagon/

Edit: The official term for something like this is corpus by the way. 日本語 コーパス is what I put into Google to find that if you want to see what else is out there.
Edited: 2012-12-15, 9:59 pm
#3
This is really great. It's exactly what I was looking for. Thanks a lot!
#4
(2012-12-15, 8:12 pm)markaleksander Wrote: This feature would be particularly useful when you're looking for example sentences with two or more words with a common theme that you would like to see used together. For example:

食堂 within n words of 食べる

津波 within n words of 避難

金融 within n words of 債務 within n words of 不履行

食中毒 within n words of 吐き気 within n words of 下痢

Did you figure out how to do things like this with Kotonoha?

I guess one must do this using regular expressions, right? I don't suppose there's a page of some sample searches that you know?
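
A minimal sketch of the regular-expression idea, written in plain Python rather than in Kotonoha's own query syntax (whether the site accepts patterns like this is not confirmed here). It approximates "within n words" by allowing a limited number of characters between the two terms, and it does not let the gap cross a sentence-ending 。. The helper name `proximity_pattern` and the sample sentences are made up for illustration.

```python
import re


def proximity_pattern(first: str, second: str, max_gap: int) -> re.Pattern:
    """Match both terms with at most max_gap characters between them, in either order."""
    a, b = re.escape(first), re.escape(second)
    # The gap may contain any character except the full stop, so a match
    # never spans two sentences.
    gap = f"[^。]{{0,{max_gap}}}"
    return re.compile(f"{a}{gap}{b}|{b}{gap}{a}")


# Searching for the stem 食べ rather than 食べる also catches conjugated forms.
pattern = proximity_pattern("食堂", "食べ", 10)

# Made-up sample lines, not taken from any corpus.
lines = [
    "昼は大学の食堂でカレーを食べた。",
    "食堂は閉まっていた。家に帰ってから夕飯を食べた。",
    "カレーを食べた。",
]
matches = [line for line in lines if pattern.search(line)]
for line in matches:
    print(line)
```

Counting characters instead of words is a rough stand-in: it needs no tokenizer, but the right gap size depends on how long the intervening words are.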

Also, do you know if there's any way to search using parts of speech: something like (形容詞)リンゴを食べた? I often use searches like this in English with COCA, but not sure if this is possible in this Japanese corpus.
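
For what a part-of-speech query like (形容詞)リンゴを食べた means in practice, here is a small Python sketch that matches a pattern against tokens that already carry part-of-speech tags. It is an assumption-laden illustration, not the query language of COCA or of any Japanese corpus: the tags are written by hand, and `find_pattern` and the slot format are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Token = Tuple[str, str]  # (surface form, part of speech)
# A pattern slot constrains the surface form, the part of speech, or both.
# None means "anything is accepted" for that field.
Slot = Tuple[Optional[str], Optional[str]]


def slot_matches(token: Token, slot: Slot) -> bool:
    """Check one token against one slot of the pattern."""
    surface, pos = slot
    return (surface is None or token[0] == surface) and (pos is None or token[1] == pos)


def find_pattern(tokens: Sequence[Token], pattern: Sequence[Slot]) -> List[int]:
    """Return the start indices where consecutive tokens satisfy every slot."""
    hits = []
    for start in range(len(tokens) - len(pattern) + 1):
        window = tokens[start:start + len(pattern)]
        if all(slot_matches(t, s) for t, s in zip(window, pattern)):
            hits.append(start)
    return hits


# Hand-tagged example; the tags are illustrative, not output from a real analyser.
sentence = [
    ("彼", "代名詞"), ("は", "助詞"), ("赤い", "形容詞"), ("リンゴ", "名詞"),
    ("を", "助詞"), ("食べ", "動詞"), ("た", "助動詞"),
]
# (形容詞)リンゴを食べた: any adjective, followed by the literal words.
pattern = [(None, "形容詞"), ("リンゴ", None), ("を", None), ("食べ", None), ("た", None)]
print(find_pattern(sentence, pattern))  # [2]
```

The match starts at index 2, where 赤い fills the adjective slot. A corpus that supports this kind of search would do the tagging itself.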
#5
(2012-12-15, 9:55 pm)prink Wrote: Yes.
http://www.kotonoha.gr.jp/shonagon/

Edit: The official term for something like this is corpus by the way. 日本語 コーパス is what I put into Google to find that if you want to see what else is out there.

Thanks Prink, that works very well.
Edited: 2017-04-06, 8:35 am
#6
I've been using http://yourei.jp/?hl=ja for example sentences. I can't see that they support any advanced search operators, though.
#7
(2016-09-04, 3:28 pm)gaiaslastlaugh Wrote: I've been using http://yourei.jp/?hl=ja for example sentences. I can't see that they support any advanced search operators, though.

Thanks for the link. I'll add it to my resource list--always nice to have multiple sources!

Mostly looking for something with advanced search operators to be able to see collocates. For example, on COCA, I can search "NOUN eats NOUN" or somesuch for those specific examples. I'm thinking Kotonoha can do this... I'm just not sure how to use it yet. And honestly, my Japanese probably isn't to the level where this is super useful anyway, but eventually I'd like a resource like this available.
#8
(2016-09-04, 9:21 pm)Earthlark Wrote:
(2016-09-04, 3:28 pm)gaiaslastlaugh Wrote: I've been using http://yourei.jp/?hl=ja for example sentences. I can't see that they support any advanced search operators, though.

Thanks for the link. I'll add it to my resource list--always nice to have multiple sources!

Mostly looking for something with advanced search operators to be able to see collocates. For example, on COCA, I can search "NOUN eats NOUN" or somesuch for those specific examples. I'm thinking Kotonoha can do this... I'm just not sure how to use it yet. And honestly, my Japanese probably isn't to the level where this is super useful anyway, but eventually I'd like a resource like this available.

It sounds like Natsume might be close to what you are looking for: http://hinoki-project.org/natsume/
#9
http://nlt.tsukuba.lagoinst.info/
NINJAL-LWP for TWC (NLT)
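The University of Tsukuba's "Tsukuba Web Corpus" (about 1.1 billion words)
Quote: Next, let's display the characteristic collocations. In NLT you can sort not only by frequency but also by MI score. The MI score is a statistical measure that tends to be higher the more characteristic a collocation is. However, the scores of low-frequency collocations come out excessively high, so low-frequency items need to be excluded. Click [MI] in the header of the collocation panel, then right-click on the panel and choose [頻度20以上] (frequency 20 or more).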
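
As a rough illustration of why the frequency cut-off matters, here is a small Python sketch that ranks collocations by pointwise mutual information, log2(P(x, y) / (P(x) P(y))), after dropping pairs seen fewer than 20 times. This is the textbook MI formula; whether NLT computes its MI score in exactly this way is an assumption. All counts below are invented, and `mi_score` and `rank_collocations` are hypothetical names.

```python
import math
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def mi_score(pair_freq: int, freq_x: int, freq_y: int, corpus_size: int) -> float:
    """Pointwise mutual information: log2( P(x, y) / (P(x) * P(y)) )."""
    return math.log2((pair_freq * corpus_size) / (freq_x * freq_y))


def rank_collocations(
    pair_freq: Dict[Pair, int],
    word_freq: Dict[str, int],
    corpus_size: int,
    min_freq: int = 20,
) -> List[Tuple[Pair, float]]:
    """Drop pairs seen fewer than min_freq times, then sort by MI, highest first."""
    scored = [
        ((x, y), mi_score(f, word_freq[x], word_freq[y], corpus_size))
        for (x, y), f in pair_freq.items()
        if f >= min_freq
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented counts for a corpus of one million words.
corpus_size = 1_000_000
word_freq = {"津波": 500, "避難": 800, "する": 50_000, "襲来": 4}
pair_freq = {("津波", "避難"): 40, ("津波", "する"): 100, ("津波", "襲来"): 3}

for pair, score in rank_collocations(pair_freq, word_freq, corpus_size):
    print(pair, round(score, 2))
```

With these numbers the pair 津波 / 襲来 would get the highest MI of all despite occurring only three times, which is the overestimation the quote describes. The threshold of 20 removes it, leaving 津波 / 避難 ranked above 津波 / する.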