
scanki: new smart.fm downloader

#1
compared with the smart.fm import plugin:

* multiple lists can be downloaded in a single operation
* vocab items and sentences are together in the same fact
* kana reading is preserved
* includes sentence <b> tags, vocab parts-of-speech, remote sound links, list id and per-list and overall indices
* doesn't exclude entries with non-unique kanji fields but with unique combinations of other fields

if none of that matters to you then you don't need this and you can continue to use Joe's great plugin

please note that this is a unix shell script not an anki plugin, so you'll need a unix-like environment to run it. linux is good, OSX should be fine, but for windows you'll probably need cygwin

http://sites.google.com/site/ankinihongo/home/scanki
Edited: 2011-01-27, 8:50 pm
#2
Cranki and scanki.
Hmmmm.... I... err, well.....
Those are two very scary adjectives.
#3
I think they're clever.
#4
OK, n00b question time! Admittedly, I'm a 'tard with Unix (and allergic to Apple products) so I'm having some trouble getting this badboy to cooperate in Windows. Specifically, the error I get in cygwin is:

Code:
warning: failed to load external entity "http://api.smart.fm/lists/19053/items.xml?per_page=100&page=1"
and so forth until it runs through the list. xmlstarlet doesn't seem to want to play nice. Any advice? I'm probably missing something relatively simple, but I'm stumped for the moment. (汗) These scripts will be absolutely godly once I can figure it all out, though!
#5
bodhisamaya Wrote:Cranki and scanki.
Hmmmm.... I... err, well.....
Those are two very scary adjectives.
shirokuro Wrote:I think they're clever.
thanks, they were my favourites, not sure which should be next though: franki, hanki (panki), yanki, clanki, lanki, spanki, swanki, manki, wanki?

Burritolingus Wrote:xmlstarlet doesn't seem to want to play nice
I haven't tried it under cygwin but xmlstarlet claims to work there, so it should be ok... is cygwin networking ok? can you wget the same url? you could also try the native windows version instead

btw, if you just want the core series, here's one I prepared earlier
Edited: 2010-02-27, 10:04 am
#6
cangy Wrote:
Burritolingus Wrote:xmlstarlet doesn't seem to want to play nice
I haven't tried it under cygwin but xmlstarlet claims to work there, so it should be ok... is cygwin networking ok? can you wget the same url? you could also try the native windows version instead

btw, if you just want the core series, here's one I prepared earlier
Hmm, wget works fine... Ahh well, was mostly just curious and felt the need to tinker. I'll just grab the kore text files for now. Thanks a bunch for all of these goodies, cangy!

Also, I vote for clanki.
#7
Burritolingus Wrote:Hmm, wget works fine...
in that case it's a trivial change to use wget for the download -- here you go
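
A minimal sketch of what such a change might look like, in case the link goes stale. This is not the actual scanki code: `list_url` and `fetch_page` are made-up helper names, and the URL pattern is simply copied from the error message in post #4. The idea is to let wget do the download and have xmlstarlet read the XML from stdin instead of opening the URL itself.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not taken from scanki.

# list_url LIST_ID PAGE: print the items URL for one page of a list.
# The pattern is copied from the warning shown in post #4.
list_url() {
    printf 'http://api.smart.fm/lists/%s/items.xml?per_page=100&page=%s\n' "$1" "$2"
}

# fetch_page LIST_ID PAGE: download one page with wget and write it to stdout.
fetch_page() {
    wget -q -O - "$(list_url "$1" "$2")"
}

# Example (needs network access and xmlstarlet on the PATH):
# count the elements in the first page of list 19053.
#   fetch_page 19053 1 | xmlstarlet sel -t -v 'count(//*)' -n
```

With no input file given, `xmlstarlet sel` reads from stdin, so the download step can be swapped out without touching the parsing step.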
#8
When I try to download audio with this, it gives me the following error:

Code:
../scanki-1.22.sh -s < ../core-lists.txt
[: 86: sound: unexpected operator
[: 86: true: unexpected operator
It still downloads all the vocab audio though. Well, 5,991 files. I don't know if there should be 6,000 or not.

Also, is it supposed to download the sentence audio? It only downloaded the vocab audio. I didn't use the -v option.

Thanks.
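
For anyone wanting to double-check a file count like the one above, a small sketch. It assumes the downloads are the only files in the directory you point it at; `count_files` is a made-up helper name.

```shell
#!/bin/sh
# count_files DIR: print the number of regular files under DIR.
# The tr strips the padding some wc implementations add.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Example: count everything in the current media directory.
#   count_files .
```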
#9
@fluxcapacitor:

change the first line of the script to:

Code:
#!/bin/bash
Note: bash is *not* the default shell on every system, so if you write a script for /bin/sh, don't use bash-specific syntax.

And the script has not been working correctly. It should download ~12,000 files, as there are two audio files for each item: one for the vocabulary and one for the example sentence.
Edited: 2010-03-04, 1:27 am
#10
cangy Wrote:
Burritolingus Wrote:Hmm, wget works fine...
in that case it's a trivial change to use wget for the download -- here you go
I had missed this post! Thanks a bunch, broheim.
#11
Thank you kriskelvin.
#12
kriskelvin Wrote:Note: bash is *not* the default shell on every system, so if you write a script for /bin/sh, don't use bash-specific syntax.
actually, due to a previous life as a sysadmin, I never write bash scripts -- those == must have snuck in from some other language...

fixed in the current version
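
For reference, a sketch of the portable form. The variable `mode` and the helper `is_sound` are made up for illustration, since line 86 of the script isn't quoted in this thread. POSIX only defines a single `=` for string comparison in `[`; `==` is an extension that bash accepts, while stricter /bin/sh implementations such as dash reject it with an "unexpected operator" error like the one in post #8.

```shell
#!/bin/sh
mode="sound"   # hypothetical variable, for illustration only

# is_sound VALUE: succeed if VALUE is the string "sound".
# Uses the POSIX '=' operator, so it works in any /bin/sh.
is_sound() {
    [ "$1" = "sound" ]
}

if is_sound "$mode"; then
    echo "sound download enabled"
fi
```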
#13
cangy Wrote:
Burritolingus Wrote:Hmm, wget works fine...
in that case it's a trivial change to use wget for the download -- here you go
edit: I'm a tard!

FYI - how to get this to work on Windows:

o Download Cygwin from here http://cygwin.com/setup.exe
o When you're at the stage in the installer where it asks you what packages to install, search for wget and click the box to say 'install'
o let the installer run (it takes a while)
o Next you need to get xmlstarlet from here: http://sourceforge.net/projects/xmlstar/
o it comes as a zip. open the zip and drag and drop the xml.exe to your cygwin bin folder eg: c:\cygwin\bin
o rename the xml.exe to xmlstarlet.exe as that's what the scanki script is expecting (or you can rename it in the script)

that should be that. the rest of the info is on the scanki dl page
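
To confirm that both tools are visible from the Cygwin shell after the steps above, something like this sketch may help. `have_tool` is a made-up helper name; the tool names are the ones the steps above install and rename.

```shell
#!/bin/sh
# have_tool NAME: succeed if NAME can be found as a command.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report on the two programs the steps above set up.
for tool in wget xmlstarlet; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: not found on PATH"
    fi
done
```

If xmlstarlet shows as not found, the usual suspects are the rename step or the exe landing in a folder other than the cygwin bin folder.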
Edited: 2011-01-29, 3:06 am
#14
mistamark Wrote:edit: I'm a tard!

FYI - how to get this to work on Windows:

o Download Cygwin from here http://cygwin.com/setup.exe
o When you're at the stage in the installer where it asks you what packages to install, search for wget and click the box to say 'install'
o let the installer run (it takes a while)
o Next you need to get xmlstarlet from here: http://sourceforge.net/projects/xmlstar/
o it comes as a zip. open the zip and drag and drop the xml.exe to your cygwin bin folder eg: c:\cygwin\bin
o rename the xml.exe to xmlstarlet.exe as that's what the scanki script is expecting (or you can rename it in the script)

that should be that. the rest of the info is on the scanki dl page
Once I've done all that, how am I supposed to make the plugin work?
I have no computer skills at all :/
As the plugin is not an Anki plugin file (.pyt or something like that, if I recall correctly), I suppose I should run it through xmlstarlet.exe or through cygwin.bat. However, I tried both with no results (I'm pretty sure I'm doing something wrong or missing a step). Do you mind explaining how to make it run?
I've already done all the steps in the quote.
#15
Make sure you put all the scanki files in the cygwin folder. You then need to start Cygwin to gain access to the 'bash' shell which is the environment the script is written for. There should be a terminal window reminiscent of DOS (cmd.exe). All you have to do to run the script is use the syntax described on the scanki page.

For example,
Code:
./scanki.sh [-v] < ./core2k-lists.txt > ./core2k.txt
(where -v is optional) would download core2k and dump it in a file called core2k.txt.
Edited: 2011-02-05, 4:49 am
#16
Shakunatz Wrote:< snip >
FYI - how to get this to work on Windows:

< snip >

Do you mind explaining how to make it run?
The rest of the instructions are at the scanki site, but for completeness:
TheSite Wrote:Usage: scanki [OPTION] < LISTSFILE

-s download sound
-v include vocab only (no sentences)
LISTSFILE: smart.fm list ids, 1 per line

Examples:

$ ./scanki [-v] < ./core2k-lists.txt > ./core2k.txt
or
$ mkdir core2k.media; cd core2k.media; ../scanki -s [-v] < ../core2k-lists.txt
What does this mean?

o Make a text file (in Notepad) with one smart.fm goal id per line, i.e. just the number, NOT the http://smart.fm/goals/ bit, e.g.:

73351
73438
73450

and save it as eg: c:\DownloadThese.txt

o next click on 'Cygwin' -> 'Cygwin Bash Shell' in the Start Menu in Windows.

o wait a few seconds while it loads up. You should see some green text and then a $

o [-1-] Now, I want to save the audio that's downloaded from smart.fm to my C: drive in 'Deckname.media' folder, so first make the directory by typing this (change the name to suit)
Code:
mkdir /cygdrive/c/Deckname.media
<then hit enter>

o [-2-] next, you need to set it as the current directory, so type:
Code:
cd /cygdrive/c/Deckname.media
<then hit enter>

o next we need to run scanki.sh (I have copied the scanki.sh file to my C: drive) and feed it the text file containing the list of smart.fm goal IDs, so type:

Code:
/cygdrive/c/scanki.sh -s < /cygdrive/c/DownloadThese.txt > /cygdrive/c/FromScanki.txt
<then hit enter>

o that should do the trick.

o after you've done this, you'll have a text file called 'FromScanki.txt' in your C drive and you can now import that into Anki like normal.
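
One possible snag with the Notepad step: Notepad normally saves Windows (CRLF) line endings, so each id may carry a stray carriage return when read inside Cygwin. Whether scanki copes with that isn't stated in this thread, so the following is only a precautionary sketch. `clean_ids` and the `.clean.txt` file name are made up for illustration.

```shell
#!/bin/sh
# clean_ids: read list ids on stdin, strip carriage returns,
# and keep only lines that consist entirely of digits
# (this also drops blank lines and anything with a URL in it).
clean_ids() {
    tr -d '\r' | grep -E '^[0-9]+$'
}

# Example, using the paths from the steps above:
#   clean_ids < /cygdrive/c/DownloadThese.txt > /cygdrive/c/DownloadThese.clean.txt
```

The cleaned file can then be fed to scanki.sh in place of the original one.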

-------

The '-s' means download the sound files.

If you don't want to download the soundfiles then don't make the folders (ie, don't do steps [-1-] and [-2-] ) and don't put the '-s' in.

If you only want to download the vocab (Smart.fm won't let you download the vocab audio anymore) put a '-v' instead of the '-s'

Good luck.
#17
Thank you a lot, you wrote a very detailed guide! You couldn't have been clearer ^.^ I'll test that this evening. Thank you again.