Download audio from WWWJDIC

cescoz Member
From: Italy Registered: 2008-01-22 Posts: 131

Hello guys!
Is there a way to download all the audio clips you need from the site in one go, with a script or something similar?
I need the audio for all the JLPT words to use with Anki, but fetching them one by one is impossible, even with an automatic grabber.
The site's files are awesome: the quality is perfect and each one takes only a few kilobytes.

ruiner Member
Registered: 2009-08-20 Posts: 751

Waaa, how long have they had audio up there??

Edit:  "Audio Clips for all EDICT Entries - April 2009

Audio examples for many of the 135,000+ EDICT entries, spoken by Japanese native speakers. These are linked to each entry when displayed on the screen."

(I see Tobbs answered my question already.)

http://www.csse.monash.edu.au/~jwb/wwwjdicaudio.html

Last edited by ruiner (2009 September 07, 10:46 am)

Tobberoth Member
From: Sweden Registered: 2008-08-25 Posts: 3364

ruiner wrote:

Waaa, how long have they had audio up there??

A few months. Jim Breen worked together with JapanesePod101.com to do it.

lauri_ranta Member
Registered: 2012-03-31 Posts: 139 Website

I used this shell script:

Code:

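# Audio endpoint on languagepod101.com
php=http://assets.languagepod101.com/dictionary/japanese/audiomp3.php
mkdir -p pod
IFS=$'\n'
# Entries with a kanji form and a reading: "KANJI [KANA] /gloss/"
for l in $(sed -En 's|^(.+) \[(.+)\] .*|\1 \2|p' edict.txt); do
  curl -s "$php?kana=${l#* }&kanji=${l% *}" > "pod/$l.mp3"
done
# Kana-only entries: "KANA /gloss/"
for l in $(sed -En 's|^([^[]+) /.*|\1|p' edict.txt); do
  curl -s "$php?kana=$l" > "pod/$l.mp3"
done
# Delete files whose size is exactly 52288 or 53303 bytes
# (stat -f %z is the BSD/macOS form; GNU stat uses -c %s)
for f in pod/*; do
  [[ $(stat -f %z "$f") =~ ^(52288|53303)$ ]] && rm "$f"
done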

It took about two days, and it downloaded audio files for 127677 words (or readings of words).
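
For anyone adapting the script above, here is a minimal sketch of the same idea split into two steps: parsing one EDICT line into its kana and kanji parts, and requesting the clip with curl's `-G --data-urlencode` options so the Japanese text is percent-encoded. The function names `parse_entry` and `fetch_clip` are made up for illustration; the URL and the `kana`/`kanji` parameters come from the script above, and the sample lines assume the usual `KANJI [KANA] /gloss/` layout of edict.txt.

```shell
#!/usr/bin/env bash
# Sketch only: parse_entry and fetch_clip are illustrative names, not part of any tool.
php=http://assets.languagepod101.com/dictionary/japanese/audiomp3.php

# Print "kana<TAB>kanji" for one EDICT line; the kanji field is empty for kana-only entries.
parse_entry() {
  local head="${1%% /*}" kanji='' kana   # head = everything before the first " /"
  if [[ $head == *' ['*']' ]]; then
    kanji=${head%% \[*}                  # text before " ["
    kana=${head#*\[}                     # text after "["
    kana=${kana%\]}                      # drop the closing "]"
  else
    kana=$head
  fi
  printf '%s\t%s\n' "$kana" "$kanji"
}

# Download one clip into DIR (default: pod). -G moves the --data-urlencode
# pairs into the query string, percent-encoded.
fetch_clip() {
  local kanji=$1 kana=$2 dir=${3:-pod}
  local args=(--data-urlencode "kana=$kana")
  if [[ -n $kanji ]]; then
    args+=(--data-urlencode "kanji=$kanji")
  fi
  mkdir -p "$dir"
  curl -sG "${args[@]}" -o "$dir/${kanji:+$kanji }$kana.mp3" "$php"
}

# Demo on two sample lines (no network access needed for this part).
parse_entry '食べる [たべる] /(v1,vt) to eat/'
parse_entry 'ああ /(int) Ah!/'

# To download for real, something like:
# while IFS= read -r line; do
#   IFS=$'\t' read -r kana kanji < <(parse_entry "$line")
#   fetch_clip "$kanji" "$kana"
# done < edict.txt
```

The kana field is printed first on purpose: `read` with a tab as `IFS` collapses a leading empty field, so an empty kanji field is only safe in the last position. Letting curl do the encoding avoids relying on how the shell and the server treat raw UTF-8 in a URL; whether the server actually requires it is not something this thread confirms.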

Edit: I uploaded the files to http://jptxt.net/edict-japanesepod-audio.tar (1.8 GB).

Last edited by lauri_ranta (2013 November 30, 5:11 pm)

Reply #5 - 2013 March 06, 1:35 pm
cb4960 Member
From: Los Angeles Registered: 2007-06-22 Posts: 917

lauri_ranta wrote:

It took about two days, and it downloaded audio files for about 128000 out of the 202000 words currently in EDICT.

You can also download the files from this thread:
http://forum.koohii.com/viewtopic.php?p … 12#p126412

That should take significantly less time than the two days mentioned above.
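
To keep only the clips for a particular word list (the JLPT vocabulary asked about at the top of the thread, say), something like the following could work once the files are on disk. It is a sketch under assumptions: `words.txt` is a hypothetical file with one word per line, `select_clips` is a made-up name, and clips are assumed to be named `KANJI KANA.mp3` or `KANA.mp3`, the way the shell script earlier in the thread names them. The downloadable archives may be laid out differently. It needs bash 4 or later for associative arrays.

```shell
#!/usr/bin/env bash
# Sketch only: select_clips, words.txt and the directory names are illustrative.

# Copy every clip in SRC whose kanji or kana part appears in WORDLIST into DEST.
select_clips() {
  local wordlist=$1 src=$2 dest=$3 f name word
  local -A wanted=()
  # Load the word list into a lookup table (handles a missing final newline).
  while IFS= read -r word || [[ -n $word ]]; do
    if [[ -n $word ]]; then
      wanted[$word]=1
    fi
  done < "$wordlist"
  mkdir -p "$dest"
  for f in "$src"/*.mp3; do
    [[ -e $f ]] || continue            # glob matched nothing
    name=${f##*/}                      # strip the directory
    name=${name%.mp3}                  # strip the extension
    # "KANJI KANA": check each half; "KANA": both expansions give the same string.
    if [[ -n ${wanted[${name% *}]} || -n ${wanted[${name#* }]} ]]; then
      cp "$f" "$dest/"
    fi
  done
}

# Example call (assumed paths):
# select_clips words.txt pod jlpt-audio
```

Matching on the kana half as well means a kana-only word in the list also pulls in every homophone that has a clip, so listing words in their kanji form gives a tighter selection.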
