
Make your own Pimsleur courses using this great program!

#1
Some people like Pimsleur courses; others (including me) think they are boring. But the idea itself is very effective. This program uses the "graduated-interval recall" method published by Pimsleur in 1967. It's like audio flashcards that appear in a special pattern designed to help you remember. Pimsleur courses use several techniques (they say some are patented), but this particular 1967 idea is now in the public domain, so this program can use it to help you learn your own choice of vocabulary or sentences.


[Image: pimsleurmaker.jpg]


It will let you make your own version of spaced repetition audio similar to what Pimsleur does. The program lets you play audio snippets (either words or phrases, it doesn't matter) at about the intervals Pimsleur published in his paper "A Memory Schedule" of: 5 seconds, 25 seconds, 2 minutes, 10 minutes, 1 hour, 5 hours, 1 day, 5 days, 25 days, 4 months, 2 years.
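The intervals listed above are close to successive powers of five, counted in seconds. Here is a minimal Python sketch that checks this. The power-of-five reading is only an observation about the numbers quoted here, not something taken from the program's code, and the month and year lengths are rough assumptions.

```python
# Intervals quoted above, converted to seconds.
# Assumption: a "month" is 30 days and a "year" is 365 days.
MINUTE, HOUR, DAY = 60, 3600, 86400
published = [
    5, 25, 2 * MINUTE, 10 * MINUTE, 1 * HOUR, 5 * HOUR,
    1 * DAY, 5 * DAY, 25 * DAY, 4 * 30 * DAY, 2 * 365 * DAY,
]

# Successive powers of five: 5**1, 5**2, ..., 5**11 seconds.
powers = [5 ** n for n in range(1, len(published) + 1)]

# Print each quoted interval next to the matching power of five.
for quoted, power in zip(published, powers):
    ratio = quoted / power
    print(f"quoted {quoted:>10} s   5^n {power:>10} s   ratio {ratio:.2f}")
```

Under these assumptions every quoted interval lands within roughly 30% of the matching power of five, which fits the "at about the intervals" wording.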

Of course, using this program to make your own Pimsleur lessons is not as easy as just popping a Pimsleur CD into your player. It takes some time to learn how to operate the program, especially for content entry. As your first step you need content. You'll need to decide what sentences you want to learn, and how many in each lesson. The latter point is worth emphasizing: if you've felt Pimsleur doesn't go fast enough, now you can choose how many new items to introduce per lesson. You can come up with the sentences (or words) on your own, use ones from an existing program such as KO2001, take them out of textbooks, etc. If you've found content that you've always wished was available in the Pimsleur instructional format of spaced repetition, now you can make it happen!

Next you need to get your sentences into sound files. The program has a text-to-speech synthesizer built in, but it doesn't sound that great. So you'll probably want to either find a friendly native speaker to record the sentences for you, record them yourself in your own voice, or extract them from another audio source if that is the basis of your lessons.

At this point the program can play your lessons directly, or it can create them in an exportable format such as MP3 so that you can carry them around with you just like the ordinary Pimsleur programs.


This program is audio-only, so you concentrate on pronunciation. Because you don't need to look at the screen, you can also listen during daily routines such as washing.

You can add words to your collection at any time, and this program can manage collections of thousands of words. It can also help you rehearse longer texts such as poems.

If possible, prepare some audio prompts such as "say again" and "do you remember how to say". These can be real recordings or synthesized text.
#2
What is the name of the program (did I miss it?).

Anyway, that's a pretty neat idea.
#3
Welcome cb4960 to my thread :)

This program is freeware, but I became afraid of posting any links! I became unable to distinguish between the legal and the illegal!

I'll send you its name and the link to your e-mail.

But you are one of the greatest programmers here, and I don't think that you need a simple program like this :P
Edited: 2009-06-15, 7:09 pm
#4
Looks like Gradint. I think the author says on his website it's freeware, but I forgot. I was thinking maybe to use it with stripped audio from tv/movies, but I'm far from being at the point it would be useful to me.

Is it fairly easy to use?
#5
5 words in 30 minutes?
#6
kanjiwarrior Wrote:I think the author says on his website it's freeware
Yes, I said before it's freeware.


kanjiwarrior Wrote:Is it fairly easy to use?
No, but it's not too complicated.


vosmiura Wrote:5 words in 30 minutes?
It usually tests old words before starting any new ones, to get you comfortable first.

Except in lesson 1, it must start with new ones because there aren't any old ones.

There are many other 'old' words later in the lesson. Only 5 new words, but lots of revision of old words. (you can increase the number of new words if you want.)

In lesson 1 there are no old words to fill the time between repetitions, so there are (unfortunately) gaps of silence. I don't know what to do about this without breaking the special graduated-interval pattern. Thankfully it's only a small problem because only lesson 1 is affected. If a user can be persuaded to bear with it for a while, it will get better later.

To see what kind of thing it is doing, have a look at the technical graph on the web page. Look at it full screen if possible. It shows 5 new words with lots of older words in between.
#7
Hmm, just looking at the set-up, one can make shadowing MP3s with this. You'd just have your clip play twice, as both the "question" and the "answer", which gives the same spacing for both.
#8
Although I haven't actually used it, this program has inspired me to write a similar but more straightforward program, not based on the so-called Pimsleur algorithm (or any other type of scheduling, for that matter) but geared more toward shadowing and question-answer approaches. Lesson format will be left up to the user, but preset lesson formats will be available (see below for examples of lesson formats). I also want the process of creating a lesson to be as automated as possible, allowing for easy use of things like Anki export files without needing to edit the file itself, instead letting the user extract just the needed parts. So if a file contained a field like "[sound:some_audio_file.mp3]", it should be easy to extract just the "some_audio_file.mp3" part by allowing the user to remove the "[sound:" and the trailing "]" from that entire column. Then later the user could optionally specify the location of said audio files if needed. All data will be entered into an Excel-like grid control to allow the user to add/remove/move/shuffle the data to suit individual needs. Naturally, the user can manually enter data using the same control. The program will utilize TTS should the user not have a "real" audio source to work with. TTS will also be used to create the silence (allowing the user to shadow or say the answer). Silence will be based on the length of the previous line multiplied by a user-defined value, with an optional minimum length. The user should be able to preview all or part of a lesson from the grid interface. As far as output goes, the user can choose to group the lesson into either one or multiple audio files based on some number of minutes or some number of lines. The project will employ mp3wrap (for merging MP3 files), ffmpeg (for audio format conversion), BASS (for audio playback), and SAPI (for TTS).
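To make the extraction step concrete, here is a minimal Python sketch of pulling the file name out of an Anki-style sound field. It is purely illustrative: the function name and the regular expression are assumptions, not code from the planned program.

```python
import re

# Matches an Anki-style sound field such as "[sound:some_audio_file.mp3]".
SOUND_FIELD = re.compile(r"\[sound:([^\]]+)\]")


def extract_audio_name(field):
    """Return the file name inside a "[sound:...]" field, or None if absent."""
    match = SOUND_FIELD.search(field)
    return match.group(1) if match else None


print(extract_audio_name("[sound:some_audio_file.mp3]"))
```

Applying this to every cell in a column has the same effect as stripping the "[sound:" prefix and the trailing "]" by hand.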

Example lesson format 1 (shadowing-type):
(1) Audio from target language (using provided audio file or TTS if file not provided)
(2) Silence (based on duration of (1) * multiplier)
(3) Audio from target language
(4) Silence (based on duration of (3) * multiplier)

Example lesson format 2 (Pimsleur-style):
(1) "Say" (narrator voice using TTS)
(2) Audio from native language
(3) Silence
(4) Audio from target language
(5) Silence
(6) Audio from target language
(7) Silence

Example lesson format 3 (Q & A):
(1) Question audio
(2) Silence
(3) Answer audio
(4) Silence
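The silence rule and the shadowing-type format can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the function names, the default multiplier of 1.5 and the default minimum of 1 second are all made up for the example.

```python
def silence_length(previous_seconds, multiplier=1.5, minimum=1.0):
    """Silence after a clip: clip length times a multiplier, never below the minimum."""
    return max(previous_seconds * multiplier, minimum)


def shadowing_lesson(clip_seconds, repeats=2, multiplier=1.5, minimum=1.0):
    """Lay out format 1: each clip is played `repeats` times, each time followed by silence."""
    plan = []
    for index, seconds in enumerate(clip_seconds):
        for _ in range(repeats):
            plan.append(("audio", index, seconds))
            plan.append(("silence", index, silence_length(seconds, multiplier, minimum)))
    return plan


# Two clips: one 2.0 s long, one very short clip that hits the minimum silence.
for step in shadowing_lesson([2.0, 0.4]):
    print(step)
```

The other formats would differ only in which steps are appended for each line (a narrator prompt, native-language audio, question and answer clips).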

I should have the initial version ready sometime this weekend.
Edited: 2009-06-16, 12:20 pm
#9
You might want to look at the custom podcast feature at smart.fm. It simply organizes the lessons studied on smart.fm into podcasts, repeating first the vocabulary word, followed by the sentence. Podcasts automatically export to iTunes.

I use it, but it is not perfect for shadowing. It would be ideal if it repeated the sentences once and the silence between sentences were extended by one second.

With the program you are writing, I imagine smart.fm audio files would be an ideal source of ready-made content.
#10
I've been busy as hell recently, but there's a new version of Sheets 2 Shadowing available here:

http://intheshiz.net/post/2009/06/17/S2S...pdate.aspx

This will let you create shadowing audio from any anki exported deck.

It pretty much already does (or will do) what cb4960 has posted above.

Edit: Smart.fm import is planned.
Edited: 2009-06-16, 9:32 pm
#11
mistamark Wrote:I've been busy as hell recently, but there's a new version of Sheets 2 Shadowing available here:

http://intheshiz.net/post/2009/06/17/S2S...pdate.aspx

This will let you create shadowing audio from any anki exported deck.

It pretty much already does (or will do) what cb4960 has posted above.

Edit: Smart.fm import is planned.
I tried this with my anki exported deck but it asserts when loading the MP3s.
#12
cb4960 Wrote:Although I haven't actually used it, this program has inspired me to write a similar but more straightforward program, not based on the so-called Pimsleur algorithm (or any other type of scheduling, for that matter) but geared more toward shadowing and question-answer approaches. Lesson format will be left up to the user, but preset lesson formats will be available (see below for examples of lesson formats). I also want the process of creating a lesson to be as automated as possible, allowing for easy use of things like Anki export files without needing to edit the file itself, instead letting the user extract just the needed parts. So if a file contained a field like "[sound:some_audio_file.mp3]", it should be easy to extract just the "some_audio_file.mp3" part by allowing the user to remove the "[sound:" and the trailing "]" from that entire column. Then later the user could optionally specify the location of said audio files if needed. All data will be entered into an Excel-like grid control to allow the user to add/remove/move/shuffle the data to suit individual needs. Naturally, the user can manually enter data using the same control. The program will utilize TTS should the user not have a "real" audio source to work with. TTS will also be used to create the silence (allowing the user to shadow or say the answer). Silence will be based on the length of the previous line multiplied by a user-defined value, with an optional minimum length. The user should be able to preview all or part of a lesson from the grid interface. As far as output goes, the user can choose to group the lesson into either one or multiple audio files based on some number of minutes or some number of lines. The project will employ mp3wrap (for merging MP3 files), ffmpeg (for audio format conversion), BASS (for audio playback), and SAPI (for TTS).
Great idea!


mistamark Wrote:I've been busy as hell recently, but there's a new version of Sheets 2 Shadowing available here:
Yes, mistamark. S2Shadowing is very good. I like it.

But I think cb4960's idea is to make the lesson format left up to the user. You can choose Shadowing, Pimsleur, Q&A audio flashcards, etc.
#13
vosmiura Wrote:
mistamark Wrote:I've been busy as hell recently, but there's a new version of Sheets 2 Shadowing available here:

http://intheshiz.net/post/2009/06/17/S2S...pdate.aspx

Edit: Smart.fm import is planned.
I tried this with my anki exported deck but it asserts when loading the MP3s.
Hey, thanks for trying it out!

I think that was a problem with the last version. Just to clarify, are you using the latest version? You may have to uninstall any previous version before you install this version.
When you load up the new version you should see a nice graphic on the main screen...

EDIT: I've fixed the previous version problem, the installer should now update any existing installation.
Edited: 2009-06-16, 10:02 pm
#14
ahibba Wrote:
cb4960 Wrote:Although I haven't actually used it, this program has inspired me to write a similar, but (...) back), and SAPI (for TTS).
Great idea!

mistamark Wrote:I've been busy as hell recently, but there's a new version of Sheets 2 Shadowing available here:
Yes, mistamark. S2Shadowing is very good. I like it.

But I think cb4960's idea is to make the lesson format left up to the user. You can choose Shadowing, Pimsleur, Q&A audio flashcards, etc.
Ahibba, re-reading cb4960's post, I think cb4960 and I had the same thoughts when we saw that app! I've got a podcast-implemented, audio-based SRS under development at the moment that will interface with Anki, and it will hopefully be an improvement on just the static/user-controlled making of Pimsleur-esque audio.
#15
mistamark Wrote:
vosmiura Wrote:
mistamark Wrote:I've been busy as hell recently, but there's a new version of Sheets 2 Shadowing available here:

http://intheshiz.net/post/2009/06/17/S2S...pdate.aspx

Edit: Smart.fm import is planned.
I tried this with my anki exported deck but it asserts when loading the MP3s.
Hey, thanks for trying it out!

I think that was a problem with the last version. Just to clarify, are you using the latest version? You may have to uninstall any previous version before you install this version.
When you load up the new version you should see a nice graphic on the main screen...

EDIT: I've fixed the previous version problem, the installer should now update any existing installation.
It was the first time I tried your tool, and I got a nice graphic of some waves on the main screen.

I'll try it again in a bit.
#16
vosmiura Wrote:It was the first time I tried your tool, and I got a nice graphic of some waves on the main screen.

I'll try it again in a bit.
I've uploaded a new version; this one will handle nonexistent MP3s in the import. I think that might have been the problem.

http://www.intheshiz.net/file.axd?file=2...dowing.exe
#17
Thanks. There is also an issue: if one of the rows in the file has some empty columns, it will not read the file.
#18
I wonder if these programs can follow a spaced repetition scheme? What I mean is, normal shadowing files like the ones cb4960 and mistamark are talking about and have created will be the audio equivalent of normal flashcards: you go through them all the same number of times.

The Gradint program does spaced repetition, but it seems to want to limit new material and force it in prior to repetition. Maybe a better algorithm would be to introduce new material whenever there are 10 seconds of blank space. That new material is then spaced out. Not sure how it would work. Heck, an algorithm may not even be needed, just a standard spacing routine for x number of sound files:

Ex (assuming each number is 10 seconds): 01, 02, 03, 01, 02, 03, 04, 05, 06, 04, 01, 02, 03, 05, 06, 07, etc.
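One way to play with this idea is a slot-filling routine, sketched below in Python. It is an assumption-heavy toy, not Gradint's algorithm: each slot stands for one 10-second unit, the repetition offsets (0, 3, 10) are invented for the example, and a new item is introduced at the earliest slot where all of its repetitions fit into blank space.

```python
def build_sequence(n_items, offsets=(0, 3, 10)):
    """Return a list of slots holding item numbers (None means silence).

    Each new item is introduced at the earliest position where all of its
    repetitions (start + each offset) land on free slots.
    """
    slots = {}
    start = 0
    for item in range(1, n_items + 1):
        # Move forward until every repetition of this item fits in a blank slot.
        while any(start + off in slots for off in offsets):
            start += 1
        for off in offsets:
            slots[start + off] = item
    if not slots:
        return []
    return [slots.get(i) for i in range(max(slots) + 1)]


print(build_sequence(4))
```

With four items the sequence begins 1, 2, 3, 1, 2, 3, 4, which matches the start of the example above; after that it depends entirely on the chosen offsets.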
Edited: 2009-06-17, 7:44 am
#19
louischa Wrote:Hi,

Just finished RTK1 and I am now considering making my own mp3s of vocabulary. My first step in Nihongo was with Pimsleur (which I really liked, but which do not take you very far in the language), and I just love to listen to mp3s whenever I step out for exercise. I find it a great way to practice production.

I noticed all the links in this thread are dead. Anyone can help? Any software available, or must I record my own files?

Cheers from Montreal (-30 C today!)
Maybe Audio Lesson Studio is what you're looking for. I wrote it a while back in response to this thread.
#20
The problem I have with GradInt is, I get an error message "There are no words to put in the lesson. Please add some words first" when I actually added them to the vocab.txt
I can restart the program as often as I want and delete, then save, then add, then save again, GradInt recognizes them only the very first time and then never again...

It also opens up with "Error in advanced.txt (SyntaxError: invalid syntax (Line 217) at line 419). Warning: No speech synthesizer installed for language 'jp' (did you read ALL the comments in vocab.txt?)"

I find the program very irritating and the UI is kind of killing me. I'd love to make audio SRS lessons, but I just can't get it done..
Edited: 2012-02-24, 3:59 pm
#21
I've been using Gradint for a while now (274th lesson today). I didn't add the words to vocab.txt; I created a whole bunch of files in the "samples" directory, one each for English (text, handled by the speech synthesiser) and Japanese (MP3).

The files and text were all exported from Anki decks.

So my samples directory looks something like this:

Code:
samples
|-- TaeKim
|   |-- 00 JapaneseClasses
|   |   `-- JapaneseClasses
|   |       `-- 0Vocab
|   |           |-- JapaneseClasses_0010_en.txt
|   |           |-- JapaneseClasses_0010_jp.mp3
|   |           |-- JapaneseClasses_0012_en.txt
|   |           |-- JapaneseClasses_0012_jp.mp3
|   |           |-- JapaneseClasses_0038_en.txt
|   |           |-- JapaneseClasses_0038_jp.mp3
|   |           |-- JapaneseClasses_0040_en.txt
|   |           |-- JapaneseClasses_0040_jp.mp3
It's all under a "TaeKim" directory because I originally used my Tae Kim grammar deck for the source material, and added in vocab from my Core6K deck (so I would get the vocab words before the sentence) and then I started some Japanese classes, so I made a directory that would come before all the other stuff for the vocab from the classes... I wouldn't recommend using my ordering :)

BUT what you should take from that is that if you put files in your samples directory that are named the same except they end in _en.txt or _en.mp3 for the English and _jp.mp3 for the Japanese (and _jp.txt if you have a working TTS for Japanese), Gradint will pick them up and put them in your lessons.
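If you want to check your file names before running a lesson, a small Python sketch like this can pair them up. It only illustrates the naming convention described above; the helper name and the regular expression are my own assumptions and this is not how Gradint itself scans the directory.

```python
import re
from collections import defaultdict

# Hypothetical pattern: <prefix>_en.txt, <prefix>_en.mp3, <prefix>_jp.mp3 or <prefix>_jp.txt
SAMPLE_NAME = re.compile(r"^(?P<prefix>.+)_(?P<lang>en|jp)\.(?P<ext>txt|mp3)$")


def pair_samples(filenames):
    """Return {prefix: {"en": name, "jp": name}} for prefixes that have both languages."""
    found = defaultdict(dict)
    for name in filenames:
        match = SAMPLE_NAME.match(name)
        if match:
            found[match["prefix"]][match["lang"]] = name
    # Keep only the prefixes where both an English and a Japanese file exist.
    return {p: langs for p, langs in found.items() if "en" in langs and "jp" in langs}


names = [
    "JapaneseClasses_0010_en.txt",
    "JapaneseClasses_0010_jp.mp3",
    "JapaneseClasses_0012_en.txt",  # no Japanese file in this list, so it is skipped
]
print(pair_samples(names))
```

Any prefix missing from the result is one where either the English or the Japanese half is absent or misnamed.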

I am currently working on writing an Anki plugin that will convert Anki cards into Gradint samples, and will also monitor the Gradint progress.txt file and automatically add Anki cards into rotation once the Gradint card gets to a certain number of repetitions.

As for the error, you're getting that because you don't have a speech synthesizer set up for Japanese - at least you don't have it set up so Gradint knows about it. Check vocab.txt and advanced.txt for information about that. I could never find a working Japanese speech synthesizer, so I can't help! I just used mp3 files from the Anki decks...
#22
This looks great :) I can't wait for that Anki plugin.
#23
Well... I'm new to Python coding (although I've coded in Many Other Languages for decades), I'm very new to Anki plugin coding (and there's not much documentation), and I don't get a lot of time for coding on personal projects...

...so you may have to wait :)

Add to that I will probably have to work around the messy solution I already have in place. The impetus for the plugin in the first place was that there were words that I have in my Gradint list that I forgot to unsuspend in my Anki deck. But I've stashed vocab in all sorts of odd places so I'm going to have to reverse engineer the connection between Anki and Gradint by hand...

I'm currently at the stage of feeling my way through Anki plugin programming. The next thing on my list is seeing what information I can get from the "current" deck, then can I suspend/unsuspend, can I get/set tags, etc., etc..
#24
So you basically write the English word into the _en.txt file, for example "the climate", while for the _jp you only need the MP3, right?

In the end, you had
001_jp.mp3 (containing the spoken 仕事)
001_en.txt (containing the written "the work")

is that correct?
#25
That's exactly right. It's fairly easy to do.

The first deck I did was the Tae Kim grammar deck. I exported "Facts in tab separated values", opened that up in a spreadsheet, and saved it as .csv, since Python has CSV handling functions.

I deleted all the columns I didn't want, then ran a script that read that CSV and:

* worked out a file name from the Tags field (Basic Grammar, Essential Grammar, etc, with numbering)

* took the English text and created <filename>_en.txt

* took the MP3 filename and copied it to <filename>_jp.mp3

Pretty easy.
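For anyone who wants a starting point, the steps above can be sketched roughly like this in Python. This is not my original script, just a minimal reconstruction under assumptions: the CSV is assumed to have exactly three columns (tag, English text, MP3 file name), and the function name and numbering format are made up.

```python
import csv
import shutil
from pathlib import Path


def export_samples(csv_path, media_dir, samples_dir):
    """Create <name>_en.txt and <name>_jp.mp3 for each CSV row.

    Assumed columns (hypothetical): tag, English text, MP3 file name.
    """
    media_dir, samples_dir = Path(media_dir), Path(samples_dir)
    samples_dir.mkdir(parents=True, exist_ok=True)
    counters = {}
    created = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for tag, english, mp3_name in csv.reader(handle):
            # Number the rows within each tag, e.g. BasicGrammar_0001.
            counters[tag] = counters.get(tag, 0) + 1
            name = f"{tag.replace(' ', '')}_{counters[tag]:04d}"
            # English side: a text file for the speech synthesiser.
            (samples_dir / f"{name}_en.txt").write_text(english, encoding="utf-8")
            # Japanese side: copy the recording under the matching name.
            shutil.copyfile(media_dir / mp3_name, samples_dir / f"{name}_jp.mp3")
            created.append(name)
    return created
```

You would call it with the path to the saved CSV, the folder holding the deck's MP3 files, and a target folder inside the samples directory.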

The only thing to look out for really is if you ever want to move samples files about. As the progress.txt file - which keeps track of what you've heard and how often - uses the file names, you need to use the script in samples/utils to move files, as it updates progress.txt accordingly. Otherwise you may break things.