JDIC sound files

#7
Try this Python script:

Code:
import urllib.parse
import urllib.request

URL = "http://assets.languagepod101.com/dictionary/japanese/audiomp3.php?kana=%(reading)s&kanji=%(kanji)s"

class Word:
    def __init__(self, reading="", kanji=""):
        self.reading = reading
        self.kanji = kanji

class Ripper:
    def __init__(self):
        self.words = []

    def load(self, filename):  # Load a tab-separated word list from file
        with open(filename, encoding="utf-8") as f:
            for line in f:
                # Strip the line ending so it doesn't end up in the kanji
                fields = line.rstrip("\r\n").split("\t")
                if len(fields) < 2:
                    continue  # Skip blank or malformed lines
                self.words.append(Word(fields[0], fields[1]))

        print("Loaded list at:", filename)

    def download(self):
        for word in self.words:
            # Percent-encode the kana and kanji so the query string is valid
            url = URL % {"reading": urllib.parse.quote(word.reading),
                         "kanji": urllib.parse.quote(word.kanji)}
            # Try and download this word
            with urllib.request.urlopen(url) as web_file:
                data = web_file.read()
            with open(word.kanji + ".mpga", "wb") as out:
                out.write(data)

            print("Downloaded " + word.kanji)

if __name__ == "__main__":
    ripper = Ripper()
    ripper.load("words.txt")
    ripper.download()
Just create a file called words.txt where each line has the reading, then a tab, then the kanji.
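
To make the words.txt layout concrete, here's a small sketch that writes a sample file and reads it back. The two entries and the helper names (write_sample, parse_words) are just made up for illustration; they aren't part of the script above.

```python
# Two made-up sample entries: the reading, a tab, then the kanji.
SAMPLE = "たべる\t食べる\nのむ\t飲む\n"

def write_sample(path="words.txt"):
    """Write the sample entries to a UTF-8 file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(SAMPLE)

def parse_words(text):
    """Return a list of (reading, kanji) pairs from tab-separated text."""
    pairs = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2:  # ignore blank or malformed lines
            pairs.append((fields[0], fields[1]))
    return pairs

if __name__ == "__main__":
    write_sample()
    with open("words.txt", encoding="utf-8") as f:
        print(parse_words(f.read()))
```

Save the file as UTF-8, otherwise the kana and kanji probably won't survive the round trip.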

For some reason, though, when I download the audio and try to play it, the file is silent, even when I fetch it with a browser.

I've probably just messed up a codec on my end, so it may still work for you.
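
If you want to rule out a broken download, one rough check is to look at the first bytes of the saved file: MP3 data normally starts with an "ID3" tag or with a frame sync (0xFF followed by a byte whose top three bits are set). This is only a heuristic sketch, looks_like_mp3 is a name I made up, and it can't tell you whether the audio itself is silent, only whether you got something MP3-shaped rather than, say, an error page.

```python
def looks_like_mp3(data):
    """Rough check: does this byte string start like an MP3 file?"""
    if data[:3] == b"ID3":  # ID3v2 tag at the start of the file
        return True
    # MPEG audio frame sync: 11 set bits (0xFF, then top 3 bits of next byte)
    return len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0

def check_file(path):
    """Read the first few bytes of a file and report the heuristic result."""
    with open(path, "rb") as f:
        return looks_like_mp3(f.read(4))

if __name__ == "__main__":
    print(looks_like_mp3(b"ID3\x03\x00"))      # ID3-tagged data
    print(looks_like_mp3(b"<html><body>"))     # an HTML page, not audio
```

Call check_file on one of the downloaded .mpga files; if it says False, the problem is probably in the download rather than in your player.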