Whispered talk (ひそひそ話): Perceptual cues for decoding Japanese pitch accent

Reply #1 - 2013 March 31, 5:27 pm
mmhorii Member
From: SoCal(tech) Registered: 2009-07-28 Posts: 106

What perceptual cues do people use when decoding Japanese pitch accent? If the primary acoustic correlate is the fundamental frequency of speech, what happens when the fundamental frequency is replaced by noise?

The paper "Hearing pitch despite its absence in 'whispered' speech," presented at the 162nd Acoustical Society of America Meeting in 2011, investigates this question. Replacing the fundamental frequency in speech with noise produces audio reminiscent of whispered speech. Listeners can nevertheless perceive pitch accent in this "whispered" speech at better-than-chance levels. This finding suggests that weaker, as-yet-unidentified secondary acoustic cues also encode Japanese pitch accent.

Reply #2 - 2013 March 31, 6:00 pm
dizmox Member
Registered: 2007-08-11 Posts: 1149

I'm not sure what more one could say other than that the sound wave produced (whether whispered or spoken) has a different distribution of frequencies depending on mouth shape and the tension/shape of the vocal cords. I guess you could easily do a spectrum analysis to see the differences, but I don't think that would provide any insight into how to help beginners.
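For anyone curious what such a spectrum analysis might look like, here is a minimal sketch. It is not the method used in the paper. It assumes two short synthetic tones stand in for the real recordings, and the names `harmonic_tone` and `spectral_centroid` are made up for illustration. The spectral centroid (the magnitude-weighted mean frequency) is just one rough way to check whether one sample has relatively more energy in its higher harmonics.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but fine for short signals)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def spectral_centroid(samples, sample_rate):
    """Magnitude-weighted mean frequency in Hz."""
    mags = magnitude_spectrum(samples)
    n = len(samples)
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def harmonic_tone(f0, weights, sample_rate=8000, n=400):
    """Synthetic stand-in signal: harmonic h of f0 gets amplitude weights[h - 1]."""
    return [
        sum(w * math.sin(2 * math.pi * f0 * (h + 1) * t / sample_rate)
            for h, w in enumerate(weights))
        for t in range(n)
    ]


# Hypothetical stand-ins for two recordings: same harmonics, different balance.
dark = harmonic_tone(200, [1.0, 0.5, 0.25, 0.125])    # energy mostly in low harmonics
bright = harmonic_tone(200, [0.125, 0.25, 0.5, 1.0])  # energy mostly in high harmonics

print("dark centroid (Hz):  ", spectral_centroid(dark, 8000))
print("bright centroid (Hz):", spectral_centroid(bright, 8000))
```

With real audio you would load the samples from the recordings and use an FFT library instead of the naive DFT; the comparison itself stays the same.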

Last edited by dizmox (2013 March 31, 6:58 pm)

Reply #3 - 2013 March 31, 7:25 pm
mmhorii Member
From: SoCal(tech) Registered: 2009-07-28 Posts: 106

Who said anything about helping beginners? :)

Yeah, it's possible that people can tell the difference between

  kare wa tori ga ii (accent on ri)

and

  kare wa tori ga ii

from harmonics of the fundamental frequency that was removed. It's hard for me to tell the difference. I was curious if it was difficult for other people to discern a difference between the audio samples in the link above.

Reply #4 - 2013 March 31, 7:44 pm
dizmox Member
Registered: 2007-08-11 Posts: 1149

Yeah, I can, quite distinctly. The first sounds "higher pitched", so I guess there's comparatively more amplitude in the higher harmonics?

I guess there could be some minor differences in timing/overall amplitude too.

Last edited by dizmox (2013 March 31, 8:23 pm)

Reply #5 - 2013 March 31, 8:20 pm
fakewookie Member
From: London Registered: 2010-08-02 Posts: 362

I think I'm going to have nightmares.

Reply #6 - 2013 March 31, 8:27 pm
mmhorii Member
From: SoCal(tech) Registered: 2009-07-28 Posts: 106

@dizmox
You have great ears! Given that the accuracy rate in the experiment was 65%, and given that 21 Japanese-speaking people (I'm assuming they were native speakers) were given the test, it seems like some of the people couldn't hear the differences. I guess some people are missing the FFT module in their brains.
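A rough way to see what "better than chance" means for a 65% score is a one-sided binomial tail. This is only a sketch under loud assumptions: the number of trials is not stated in this thread, so `HYPOTHETICAL_TRIALS` is a made-up placeholder, and chance is assumed to be 50% (a two-way choice). The function name `p_at_least` is also just for illustration.

```python
from math import comb


def p_at_least(correct, trials, chance=0.5):
    """Probability of getting `correct` or more right out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# ASSUMPTION: the real trial count is not given here; 100 is a placeholder.
HYPOTHETICAL_TRIALS = 100
correct = round(0.65 * HYPOTHETICAL_TRIALS)

print("P(at least this many correct by guessing):",
      p_at_least(correct, HYPOTHETICAL_TRIALS))
```

A small tail probability says the pooled score is unlikely to be pure guessing, but it says nothing about whether every individual listener could hear the difference.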

Reply #7 - 2013 March 31, 8:31 pm
dizmox Member
Registered: 2007-08-11 Posts: 1149

I could just be fooling myself, since I knew the differences beforehand. 8-) I might be picking up other subtle auditory cues (other than frequency) that I can't pick out consciously, cuing my brain to fill in the missing frequency information.

Last edited by dizmox (2013 March 31, 8:36 pm)

Reply #8 - 2013 March 31, 8:51 pm
mmhorii Member
From: SoCal(tech) Registered: 2009-07-28 Posts: 106

@fakewookie
Ever hear of the experiment that Pavlov performed on dogs to induce neurosis? He taught dogs to discriminate between a circle and a square shape. He then slowly made the shapes look more and more alike, until it was really difficult to tell the shapes apart. Needless to say, the dogs were very unhappy.

Welcome to your new neurosis.

Reply #9 - 2013 March 31, 8:53 pm
Irixmark Member
From: Canada (加奈陀) Registered: 2005-12-04 Posts: 291

I can hear the difference quite clearly, although again I knew what to listen for.

I know nothing about vocal sound production and acoustics, but e.g. when you record musical instruments, you can "compress" the sound so that there are objectively no differences in volume (i.e. level of noise/sound produced), but pretty much anybody, musician or not, can hear what a "quiet part" is, or when there are dynamics, i.e. when the music goes from quiet to loud etc. How the instruments sound is strongly associated with a perceived level, but the brain is fooling itself into hearing the difference... perhaps something similar is at work here. Must be, though. Can you imagine you couldn't whisper in a tonal language?

Reply #10 - 2013 March 31, 10:56 pm
mmhorii Member
From: SoCal(tech) Registered: 2009-07-28 Posts: 106

When the signal-to-noise ratio drops, it makes sense that the listener uses whatever remaining cues are available and fills in perceptual gaps using context. For example, there is the McGurk effect, where lip-reading influences our auditory perception.

I also came across this link, which mentions a study positing some compensatory mechanisms used when whispering in Chinese:
http://www.sinosplice.com/life/archives … stop-tones