
Use subs2srs to Create Anki Decks Based on Your Favorite Movie or Show

smartazjb0y Wrote:
cb4960 Wrote:
smartazjb0y Wrote:EDIT: Well, I got the English subtitles to work, but when I try to make flashcards with both English and Japanese subtitles, it skips the first few Japanese lines. As in, I can see the lines in the actual sub file, but subs2srs skips those lines for some reason.
subs2srs will skip lines that begin with a "{" character. This is to prevent it from matching lines that only contain a single word or character with a fancy effect applied to it (usually karaoke effects). Here is an example (in this case each character changes color and rotates):

If you really want to see these lines, convert the subtitle file to .srt format.
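The skip rule described above can be illustrated with a minimal sketch. This is not subs2srs's actual code; `keep_line` and the sample lines are hypothetical, and the only behavior taken from the post is "skip text that begins with `{`". Note that ordinary dialogue with a leading style override gets skipped too, which may explain missing lines.

```python
def keep_line(text: str) -> bool:
    """Return False for subtitle text that begins with a '{' override block."""
    return not text.startswith("{")


# Hypothetical .ass dialogue text fields.
lines = [
    r"{\k20}か{\k20}ら",             # karaoke-style effect: skipped
    "結婚式場に現れた",                # plain dialogue: kept
    r"{\i1}His name is Iwase Ken",   # styled dialogue: also skipped
]

kept = [text for text in lines if keep_line(text)]
print(kept)
```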
Well, I've tried deleting all instances of { and I've tried converting to srt. There are a few problems, though. When I convert the Japanese subs to srt, the kanji are totally changed, so I just went into the ass file and deleted all the { characters. The English ones I just converted to srt. Now, when I use subs2srs with the Japanese subs alone, it works fine. But when I add in the English subs, the English subs are cut off at the end (there are more subs in the file, but they don't all appear in subs2srs), and the Japanese subs are mixed together, so there's more than one subtitle per line. I'm thinking there's way too much of a difference between the number of lines in the Japanese subtitles and in the English subtitles.
Does the following help at all?

subs2srs documentation Wrote:Note 1 (important): Subs1 is compared against Subs2 and not the other way around.

This means that if a subtitle file from Subs1 contains 300 lines and a subtitle file from Subs2 contains 310 lines, the maximum number of lines that will be processed is 300 (the number of lines from Subs1).

Of course, this also means that if a subtitle file from Subs1 contains 310 lines and a subtitle file from Subs2 contains only 300 lines, then 310 lines will be processed and 10 of those lines will be mismatched. See following note.

Note 2: If the timings of the lines from Subs1 and Subs2 are not perfectly 1:1 (which is very likely), the closest match from Subs1 to Subs2 will be used. This will sometimes result in mismatches. To help alleviate such problems, make sure "Fix mismatched lines" is checked.
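To make the two notes concrete, here is a rough sketch of "Subs1 drives the matching". It assumes the closest match is chosen by start time, which the documentation does not actually specify; `Line` and `pair_lines` are hypothetical names, not part of subs2srs.

```python
from dataclasses import dataclass


@dataclass
class Line:
    start: float  # start time in seconds
    text: str


def pair_lines(subs1, subs2):
    """For every Subs1 line, pick the Subs2 line with the closest start time."""
    if not subs2:
        return [(a.text, "") for a in subs1]
    pairs = []
    for a in subs1:
        closest = min(subs2, key=lambda b: abs(b.start - a.start))
        pairs.append((a.text, closest.text))
    return pairs


subs1 = [Line(1.0, "JP 1"), Line(4.0, "JP 2"), Line(9.0, "JP 3")]
subs2 = [Line(1.2, "EN 1"), Line(4.1, "EN 2")]

pairs = pair_lines(subs1, subs2)
for jp, en in pairs:
    print(jp, "|", en)
```

The number of pairs always equals the number of Subs1 lines. Here "JP 3" has no real counterpart, so it is paired with the nearest leftover ("EN 2"), which is the kind of mismatch Note 1 warns about.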
Reply
cb4960 Wrote:Does the following help at all?

Not really. I've switched it so that Subs1 is the JP subs and Subs 2 is the English Subs, and it's still the same problem. I've done it without "Fix mismatched lines" and it's still the same problem.

If anyone wants to mess around with the 2 subtitles files I have, I can send it.
Edited: 2010-05-01, 11:37 am
Reply
Asriel Wrote:>errtuさん、
that's quite a difficult question to answer, in my experience. This is because most fansubs (and professional subs, for that matter) will translate things in a way that sounds natural in English. Because of this, what they're actually saying at one point and what the translation says could be completely different.

What I do, when I use subs2srs or even just sentences in the wild, is that I don't use translations. I only save sentences/subs with an unknown word that I want to learn. If it's something like i+1, then I already understand the sentence, except for the 1 word, and don't need a translation.
thanks nukemarine, asriel and juniperpansy for your answers. my thoughts exactly. i don't think i'll be using it any time soon. fan translations aren't literal translations and i don't think i should make them my foundation for Japanese. i'm not dissing the guys that use it though. i think it saves a lot of time for whoever likes it
Edited: 2010-05-02, 9:34 pm
Reply
Hello,

I have just released version 19 of subs2srs.

Download subs2srs v19 via SourceForge

Many changes have been made to the Preview interface:

[Image: v19preview.png]

- You can now hand-pick exactly which lines will be processed (and go into the Anki import file) by setting them as active or inactive. This was a frequently requested feature. Active lines are displayed in pale green and inactive lines are displayed in pink. Using existing features such as the span options, prune options, and kanji-only options will also set lines to active or inactive. In the above screenshot, all lines with fewer than 9 characters and all lines without a kanji were set to inactive.

- All episodes may now be previewed. Before, only the first episode could be previewed.

- Added a find box to make it easier to search for a specific line in the current episode. The search will start from the currently selected line and will wrap around to the beginning if necessary. The search is not case-sensitive. Wild cards and regular expressions are not supported at this time.

Special search options:

Begin the search text with “a:” to search only active items. Example search: “a:search text”. Omit the search text to search for the next active item.

Begin the search text with “i:” to search only inactive items. Example search: “i:search text”. Omit the search text to search for the next inactive item.

- Added a statistics box that allows you to see the number of lines in the current episode and the total number of lines in all episodes. It includes a breakdown of the number of active and inactive lines.

- You may now edit the text for each line.

- Added a checkbox to turn off the image preview. Unchecking the box will speed up line selections.

- Added a Go! button that will process lines based on the user-provided active/inactive status.
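The active/inactive pruning and the find box behavior listed above can be sketched roughly as follows. This is an illustration only, not the program's code: `Row`, `prune`, and `find_next` are hypothetical, "kanji" is assumed to mean the basic CJK Unified Ideographs block, and the search is assumed to begin at the line after the selected one.

```python
import re
from dataclasses import dataclass

# Assumption: "kanji" means the basic CJK Unified Ideographs block.
KANJI = re.compile(r"[\u4e00-\u9fff]")


@dataclass
class Row:
    text: str
    active: bool = True


def prune(rows, min_chars=9):
    """Mark rows inactive if they are too short or contain no kanji."""
    for row in rows:
        row.active = len(row.text) >= min_chars and bool(KANJI.search(row.text))


def find_next(rows, query, selected):
    """Return the index of the next matching row after `selected`, or None."""
    wanted = None  # None = any row, True = active only, False = inactive only
    if query[:2].lower() == "a:":
        wanted, query = True, query[2:]
    elif query[:2].lower() == "i:":
        wanted, query = False, query[2:]
    needle = query.lower()  # the search is not case-sensitive
    for step in range(1, len(rows) + 1):
        i = (selected + step) % len(rows)  # wraps around to the beginning
        row = rows[i]
        if wanted is not None and row.active != wanted:
            continue
        if needle in row.text.lower():  # an empty needle matches any row
            return i
    return None


rows = [
    Row("結婚式場に現れた 哀れな男である"),
    Row("はい"),
    Row("男の名前は 岩瀬 健。"),
    Row("ありがとうございました"),
]
prune(rows)
print([row.active for row in rows])
print(find_next(rows, "a:", 0), find_next(rows, "i:", 0), find_next(rows, "男", 2))
```

An empty search after the "a:" or "i:" prefix falls out naturally here, because an empty string is contained in every line.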

Lyric (.lrc) files are now supported:

You can find .lrc files for a large number of anime related songs at kitsunekko.

The .lrc parser will try its best to remove the annoying metadata and advertisements found in many of the .lrc files found on the web.

Other minor things:

- Removed the first line comment in the Anki import file. For some reason Anki doesn't ignore this line on import.

- The .ass/.ssa parser now replaces "\N" style newlines with a space instead of just removing them as before.

- The .srt parser now removes the italic, bold, and underline style tags.

- Fixed the tab orders.
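The two parser changes above amount to small text clean-ups. A minimal sketch, assuming simple string replacement is enough (the real parsers may handle more cases); `clean_ass` and `clean_srt` are hypothetical helper names.

```python
import re

# Matches <i>, </i>, <b>, </b>, <u>, </u> in any letter case.
SRT_STYLE_TAG = re.compile(r"</?[ibu]>", re.IGNORECASE)


def clean_ass(text: str) -> str:
    """Replace .ass/.ssa "\\N" newlines with a space instead of dropping them."""
    return text.replace(r"\N", " ")


def clean_srt(text: str) -> str:
    """Remove italic, bold, and underline style tags from .srt text."""
    return SRT_STYLE_TAG.sub("", text)


print(clean_ass(r"結婚式場に現れた\N哀れな男である"))
print(clean_srt("<i>His name</i> is <b>Iwase Ken</b>"))
```

Replacing "\N" with a space rather than nothing keeps the two halves of an English line from running together.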

Things that didn't make it into this release:

- The Preview dialog still isn't supported for Linux/Mac.

- Some vobsubs are still garbled on Linux/Mac.

- You can't change the timings of each individual line.

- Still no parser for Transcriber files.

cb4960
Edited: 2010-05-04, 10:43 pm
Reply
subs2srs v19 Wrote:-Manually select active subs
-Search box
-Manually edit text
Sorry, I think I just peed myself in excitement. Plus, I have access to Windows 7, so I can use it properly now :D

thankyouthankyouthankyou
Reply
Um, how do you import subs2srs output into Anki? I've tried it, selected the Japanese deck model, and it only imported 2 huuuuge facts. This is what one card section of my output txt file looks like:

Special_001 001_0002_00.00.06.950 [sound:Special_001_00.00.06.650-00.00.12.200.mp3] <img src="Special_001_00.00.09.450.jpg"> A pathetic man who appeared in the wedding hall. 結婚式場に現れた 哀れな男である> His name is Iwase Ken <男の名前は 岩瀬 健。 I have seen hundreds of wedding ceremonies throughout the years <これまで 何百という 結婚式を見てきたが

How do I do it properly?
Reply
You're not going to want to use the "Japanese" deck model -- that's where you're screwing up.

The easiest way to do it is just to use the included sample deck that comes with subs2srs in the "Anki Deck Template" folder.

Subs2srs will give you a little alert box when it is done creating the output text file. Don't close this. It has a numbered list with field names that you'll need for later.

Open up the sample deck in the "Anki Deck Template" folder. Go to import from the file. Once you select the file and everything, it should ask you what fields get assigned to what section.
Go back to the aforementioned alert box and make everything line up.
(if the alert box says Field 1: Tag, then make sure you line up Field 1 to the "Tag" section. If Field 2 is supposed to be Subs1, then select Field 2 to be Subs1. It's pretty simple once you see it)

After this it should import fine.
If you use the sample deck, it should go a lot smoother for you.
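Before importing, it can help to check how many fields each line of the output file really has. The sketch below assumes the import file is tab-separated and that the sample row splits into six fields as shown; both are guesses based on the card quoted earlier, and `field_counts` is a hypothetical helper.

```python
import csv
import io
from collections import Counter


def field_counts(handle):
    """Count how many rows have each number of tab-separated fields."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return Counter(len(row) for row in reader if row)


# One card from the thread, with tabs placed where the fields are assumed to split.
sample = io.StringIO(
    "Special_001\t001_0002_00.00.06.950\t"
    "[sound:Special_001_00.00.06.650-00.00.12.200.mp3]\t"
    '<img src="Special_001_00.00.09.450.jpg">\t'
    "A pathetic man who appeared in the wedding hall.\t"
    "結婚式場に現れた 哀れな男である\n"
)

counts = field_counts(sample)
print(counts)
```

If every row reports the same field count, the numbered list in the alert box should line up one-to-one with the fields Anki asks you to map.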
Reply
Thanks! I'll tinker with it :)
Reply
Despite the daunting technology, I managed to open up the subs2srs app via X11 on my iMac (running Snow Leopard). My problem starts after that. I have loads of original Japanese DVDs, some of which have Japanese and English subs. However, I'm not quickly locating a way to extract the subs. I tried one free program called DSubtitler, but it had difficulty recognizing the Japanese. I gave up on that for a bit and have since searched the internet for a direct source of subs, but am not finding any for the ones I'm looking for (Linda, Linda, Linda, the drama Stand Up!!, Survive Style 5, Traveling With Yoshitomo Nara, to name a few). If I am able to locate subs, they're usually in English and not in Japanese. I'm hopeful that I'm just a bad Googler. I guess I'll see what's appealing over at Kitsunekko.
Reply
Oh noes, I found a bug!

I set everything up and hit Preview, and got some beautiful cards, but realized my video had hardsubs. I modified the clipping to drop 20 pixels off the bottom and hit the repreview button. Now, each card says it can't determine video resolution in an ugly error box. I started from scratch with 30 pixels and same thing. Dropped the cropping and repreview and it works fine again.

Edit: If you need more, like screenshots or anything, holler.

Win7
nVidia
Intel Quad-core (I think it was quad)

(I forget the specifics, but they usually don't matter.)
Edited: 2010-05-26, 3:24 pm
Reply
wccrawford Wrote:Oh noes, I found a bug!

Thank you for taking the time to report this.

I have determined the problem and luckily it's a quick fix. I will release an updated version sometime during the Memorial Day weekend.

It turns out that ffmpeg outputs the video resolution in a slightly different way for some video files. Normally it's something like "1024x576,". However, in some cases it is something like "1900x1080 [PAR 1:1 DAR 16:9],".
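A parser that tolerates both forms only needs to look for the WIDTHxHEIGHT pair and ignore whatever follows it. This is a sketch of that idea, not the actual fix that went into subs2srs; `video_resolution` is a hypothetical helper.

```python
import re

# Two runs of digits joined by "x"; anything after the pair is ignored.
RESOLUTION = re.compile(r"\b(\d{2,5})x(\d{2,5})\b")


def video_resolution(stream_line: str):
    """Return (width, height) from an ffmpeg video stream line, or None."""
    match = RESOLUTION.search(stream_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


plain = "Stream #0.0: Video: mpeg4, yuv420p, 1024x576, 23.98 tbr"
with_par = "Stream #0.0: Video: mpeg4, yuv420p, 1900x1080 [PAR 1:1 DAR 16:9], 23.98 tbr"

print(video_resolution(plain))
print(video_resolution(with_par))
```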
Reply
Awesome. No rush for me, though. I just used mogrify afterwards to do the crop, so I'm good for now.
Reply
Hello,

I have just released version 19.1 of subs2srs.

Download subs2srs v19.1 via SourceForge

- Fixed the bug that threw an exception when certain video files were cropped. (thanks wccrawford)

- The Open and New menu options will now close the preview dialog.

- Added a single command line argument to specify a .s2s file to use on startup. Note: you can create a .s2s file with the File | Save... menu option.

- Added a hidden video preview option on the Preview interface. To enable the video preview button on the video preview interface, create a file named “video_preview.txt” in the same directory as subs2srs.exe. The first line of the file must be the full path to the video player that you want to use. The second optional line is any arguments that you want to pass to the player.

Example video_preview.txt file:
Code:
C:\Program Files\vlc\vlc.exe
--one-instance --start-time=${s_total_sec} --stop-time=${e_total_sec} --video-x=0 --video-y=0 --width=${width} --height=${height} --video-title=${s_hour}:${s_min}:${s_sec}
You may use the following tokens in the video player arguments line:
Code:
Token             Description
${s_hour}         Start time hours
${s_min}          Start time minutes
${s_sec}          Start time seconds
${s_hsec}         Start time hundredths of seconds
${s_msec}         Start time milliseconds
${s_total_hour}   Total start time hours
${s_total_min}    Total start time minutes
${s_total_sec}    Total start time seconds
${s_total_hsec}   Total start time hundredths of seconds
${s_total_msec}   Total start time milliseconds
${e_hour}         End time hours
${e_min}          End time minutes
${e_sec}          End time seconds
${e_hsec}         End time hundredths of seconds
${e_msec}         End time milliseconds
${e_total_hour}   Total end time hours
${e_total_min}    Total end time minutes
${e_total_sec}    Total end time seconds
${e_total_hsec}   Total end time hundredths of seconds
${e_total_msec}   Total end time milliseconds
${d_hour}         Duration hours
${d_min}          Duration minutes
${d_sec}          Duration seconds
${d_hsec}         Duration hundredths of seconds
${d_msec}         Duration milliseconds
${d_total_hour}   Total duration hours
${d_total_min}    Total duration minutes
${d_total_sec}    Total duration seconds
${d_total_hsec}   Total duration hundredths of seconds
${d_total_msec}   Total duration milliseconds
${width}          Width of the video
${height}         Height of the video
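To show how such tokens might expand, here is a sketch using Python's `string.Template`, which happens to use the same `${name}` syntax. Only a subset of the tokens is computed, and the exact semantics (whole numbers, no zero padding) are assumptions rather than documented behavior; `time_tokens` is a hypothetical helper.

```python
from string import Template


def time_tokens(prefix: str, total_msec: int) -> dict:
    """Build a subset of the time tokens for one timestamp given in milliseconds."""
    return {
        f"{prefix}_hour": total_msec // 3_600_000,
        f"{prefix}_min": total_msec // 60_000 % 60,
        f"{prefix}_sec": total_msec // 1000 % 60,
        f"{prefix}_hsec": total_msec // 10 % 100,
        f"{prefix}_msec": total_msec % 1000,
        f"{prefix}_total_sec": total_msec // 1000,
        f"{prefix}_total_msec": total_msec,
    }


# A clip from 00:00:06.650 to 00:00:12.200 in a 416x176 video.
start_ms, end_ms = 6650, 12200
tokens = {
    **time_tokens("s", start_ms),
    **time_tokens("e", end_ms),
    **time_tokens("d", end_ms - start_ms),
    "width": 416,
    "height": 176,
}

arguments_line = (
    "--one-instance --start-time=${s_total_sec} --stop-time=${e_total_sec} "
    "--video-x=0 --video-y=0 --width=${width} --height=${height} "
    "--video-title=${s_hour}:${s_min}:${s_sec}"
)
# safe_substitute leaves any token it does not know about untouched.
args = Template(arguments_line).safe_substitute(tokens)
print(args)
```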
cb4960
Edited: 2010-05-29, 5:02 pm
Reply
ok, running on ubuntu 10.04
ffmpeg = 0.5.2
mono looks like this:

Mono JIT compiler version 2.4.4 (Debian 2.4.4~svn151842-1ubuntu4)
Copyright © 2002-2010 Novell, Inc and Contributors. http://www.mono-project.com
TLS: __thread
GC: Included Boehm (with typed GC)
SIGSEGV: altstack
Notifications: epoll
Architecture: x86
Disabled: none

I'm using mencoder to spit out idx/sub files

my video files are avi, fairly low quality

=====

1) every single sub .png is garbled - sometimes i can make out some of the characters but it's pretty much 100% garbage

2) if i try to generate snapshots it gets about halfway through and then crashes out:
FFmpeg version 0.5.2, Copyright © 2000-2009 Fabrice Bellard, et al.
configuration:
libavutil 49.15. 0 / 49.15. 0
libavcodec 52.20. 1 / 52.20. 1
libavformat 52.31. 0 / 52.31. 0
libavdevice 52. 1. 0 / 52. 1. 0
built on Jun 5 2010 18:57:48, gcc: 4.4.3
Input #0, avi, from '/home/michael/workspace/FightClub/Video/FIGHT_CLUB.avi':
Duration: 02:19:07.68, start: 0.000000, bitrate: 491 kb/s
Stream #0.0: Video: mpeg4, yuv420p, 416x176 [PAR 233:227 DAR 6058:2497], 23.98 tbr, 23.98 tbn, 23.98 tbc
Stream #0.1: Audio: mp3, 48000 Hz, stereo, s16, 48 kb/s
[imgconvert @ 0x9b3c190]PIX_FMT_YUV420P will be used as an intermediate format for rescaling
Output #0, image2, to '/home/michael/workspace/FightClub/Out/fc.media/fc_001_00.19.10.270.jpg':
Stream #0.0: Video: mjpeg, yuvj420p, 240x160 [PAR 186:115 DAR 279:115], q=2-31, 200 kb/s, 90k tbn, 23.98 tbc
Stream mapping:
Stream #0.0 -> #0.0
Press [q] to stop encoding
frame= 1 fps= 0 q=2.8 Lsize= -0kB time=0.04 bitrate= -4.2kbits/s
video:6kB audio:0kB global headers:0kB muxing overhead -100.374341%

** (subs2srs.exe:2108): WARNING **: CreateProcess: error creating process handle
Method System.Diagnostics.ProcessStartInfo:get_Arguments () emitted at 0xb3e74d70 to 0xb3e74d7e (code length 14) [subs2srs.exe]
Method System.Diagnostics.ProcessStartInfo:get_WorkingDirectory () emitted at 0xb3e74d80 to 0xb3e74d8e (code length 14) [subs2srs.exe]
Method System.ComponentModel.Win32Exception:.ctor (int,string) emitted at 0xb3e74d90 to 0xb3e74db2 (code length 34) [subs2srs.exe]
Method System.Runtime.InteropServices.ExternalException:.ctor (string) emitted at 0xb3e74dc0 to 0xb3e74de3 (code length 35) [subs2srs.exe]

it goes on for a little longer than that, i can attach a txt or something if you'd like

but yeah, a question on the operation of the program - is it possible to spit the subs out as raw text (as shown in the example shot on page 1)???
Reply
chamois,

1) This is a known issue for the Mono version of subs2srs. I'm not quite sure when I'll get around to fixing it though.

2) Does it happen for all video files or just when making snapshots for that particular one? Does it always happen at the same point in that particular video? In any case, subs2srs should operate a bit more gracefully under these circumstances. I'll put it on my TODO list.

3) Sorry, subs2srs will never support conversion of Vobsubs (.idx/.sub) to text. You might be able to use SubRip for this purpose. Though I'm not sure how well it works with Japanese Vobsubs. With subs2srs, the only way to get text is to use .srt, .ass, .ssa, or .lrc subtitle files as input.
Reply
1) *sad face* I guess i'm going looking for a windows machine this afternoon. sorry i didn't realise it was a known issue, i saw it on asriel's post a page or 2 back, but thought it was a 1 off.
EDIT: so i'm running through the same disc now with sub rip and it looks like heaps of the images are garbled. something to do with a "zero space" between the lines in the stream? idk, weird...

2) it was with 1 particular file, the only one i've tried so far. i'll rip another disc today and see if it's that file type, the file or the program/dependencies.


3) lols, this has been the hardest part of the whole thing - actually extracting subs on a linux box. every piece of software i've tried has given me problems, even getting a mencoder string working well was tricky. i've been hesitant to just give up and use wine, but i guess it's the only way to go. sigh, you've let me down, ubuntu :(

but yeah, i'm going to go to mum's place and rip this disc on her windows box so i can get to srsing it, and i'll try and troubleshoot all of this stuff later. will report back :D
Edited: 2010-06-05, 7:13 pm
Reply
Hello,

I have just released version 19.2 of subs2srs.

Download subs2srs v19.2 via SourceForge

I made a couple of modifications to the Extract Audio from Video Tool:

[Image: v192extractaudiotool.png]

- Added the ability to select the audio stream. Useful for those pesky dual audio releases.

- Added the ability to specify a span. Useful for chopping off those pesky intros/outros.

Other stuff:

- In the Advanced Subtitle Options Interface's Prune tabs, increased the upper limit of the "Exclude lines ..." options to 99999.

cb4960
Edited: 2010-06-05, 7:18 pm
Reply
chamois Wrote:1) *sad face* I guess i'm going looking for a windows machine this afternoon. sorry i didn't realise it was a known issue, i saw it on asriel's post a page or 2 back, but thought it was a 1 off.
EDIT: so i'm running through the same disc now with sub rip and it looks like heaps of the images are garbled. something to do with a "zero space" between the lines in the stream? idk, weird...

but yeah, i'm going to go to mum's place and rip this disc on her windows box so i can get to srsing it, and i'll try and troubleshoot all of this stuff later. will report back :D
1) i've managed to pull perfectly fine images directly from that disc after ripping it with dvd decrypter and then using subrip to extract them if that's any help for getting to the root of that problem

windows issues were just as bad - .net issues, other random ripping issues, i gave up :(

so anyway, i think i'm just going to take a leaf from ice cream's book and just make a "call and response" style deck, something that subs2srs doesn't really lend itself to. what i really need is something that can spit out screen caps and x-seconds of audio based on a timecode i supply (in bulk form) but that really goes beyond the scope of what subs2srs is there for so i'll just be going manual styles :D
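For the "screen cap plus x seconds of audio from a timecode" idea, one possible manual route is to generate ffmpeg commands from a list of timecodes. The sketch below only builds the argument lists and does not run anything; the file names and timecode are made up, and `clip_commands` is a hypothetical helper. It relies on ffmpeg's `-ss` (seek), `-t` (duration), `-vframes` (frame count), and `-vn` (no video) options.

```python
def clip_commands(video: str, start: float, duration: float, stem: str):
    """Build ffmpeg argument lists for one snapshot and one audio clip."""
    snapshot = ["ffmpeg", "-ss", str(start), "-i", video,
                "-vframes", "1", f"{stem}.jpg"]
    audio = ["ffmpeg", "-ss", str(start), "-i", video,
             "-t", str(duration), "-vn", f"{stem}.mp3"]
    return snapshot, audio


# (start in seconds, clip length in seconds); hypothetical values.
timecodes = [(1150.27, 5.0)]

commands = []
for number, (start, length) in enumerate(timecodes, start=1):
    commands.extend(clip_commands("FIGHT_CLUB.avi", start, length, f"fc_{number:03d}"))

for command in commands:
    print(" ".join(command))
```

Each list could then be handed to `subprocess.run` once the output looks right.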

thanks for your help none the less.
Reply
I have a .mkv file with dual audio, Chinese and English, but I can't choose an audio stream in either subs2srs or the extract audio option. subs2srs returns an error message when trying to start.
Any ideas?
Got it working by deleting an audio-channel with
http://www.bunkus.org/videotools/mkvtool...ml#windows
Edited: 2010-06-18, 5:59 am
Reply
Hello,

I have just released version 19.3 of subs2srs.

Download subs2srs v19.3 via SourceForge

Changes:

- Added better support for multi-channel audio.
- Upgraded to latest ffmpeg.

These changes should reduce the frequency of this annoying error message: “Failed to extract the audio from the video. Make sure that the video does not have any DRM restrictions.”

cb4960
Reply
You know I've always wanted to use this software but for some odd reason I couldn't get it to work fully. I gotta try it again soon.
Reply
HerrPetersen Wrote:I have a .mkv file with dual audio, Chinese and English, but I can't choose an audio stream in either subs2srs or the extract audio option. subs2srs returns an error message when trying to start.
Any ideas?
Got it working by deleting an audio-channel with
http://www.bunkus.org/videotools/mkvtool...ml#windows
Do you happen to remember what the error said?

Enter the following command in the command prompt and paste the results here:
ffmpeg -i your_video_file.mkv

This will tell me a little more about the audio streams.
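For anyone who wants to read that output themselves, the audio streams are on the lines that start with "Stream #". A small sketch that picks them out, assuming the line format shown elsewhere in this thread (newer ffmpeg builds print "#0:1" instead of "#0.1", which the pattern also accepts); `audio_streams` is a hypothetical helper. Note that ffmpeg prints this information to stderr.

```python
import re

STREAM = re.compile(
    r"Stream #(\d+)[.:](\d+)[^:\n]*: (Audio|Video|Subtitle): (.*)"
)


def audio_streams(ffmpeg_output: str):
    """Return (stream index, description) for each audio stream listed."""
    return [
        (int(m.group(2)), m.group(4).strip())
        for m in STREAM.finditer(ffmpeg_output)
        if m.group(3) == "Audio"
    ]


sample_output = """\
Input #0, avi, from 'FIGHT_CLUB.avi':
Duration: 02:19:07.68, start: 0.000000, bitrate: 491 kb/s
Stream #0.0: Video: mpeg4, yuv420p, 416x176 [PAR 233:227 DAR 6058:2497], 23.98 tbr
Stream #0.1: Audio: mp3, 48000 Hz, stereo, s16, 48 kb/s
"""

streams = audio_streams(sample_output)
print(streams)
```

A dual-audio file should list two Audio lines, each with its own stream index.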

Thanks.
Reply
Here a link to the output:
http://www.mediafire.com/imageview.php?q...4z&thumb=4

subs2srs just told me to fix the line with the video-file, I do not remember the exact error message.

EDIT: Just redid the process and I get: "please correct the error on this form"
Edited: 2010-06-19, 4:33 am
Reply
Right now I am using subs2srs quite a lot in order to build a database. Most of the time it works great. However, I ran into another problem with one specific file:

It is an .avi file with ac3 encoded audio. subs2srs says something like "make sure audio stream is not drm-protected".
I used a program to convert the ac3 to mp3, but when using subs2srs, audio vs subs are not in synch, however when playing the vid with a player the subs are in synch.

EDIT: the problem with the AC3 sound happened on one more occasion :( Kinda strange, because I already used an earlier version of subs2srs for exactly this movie file.
Edited: 2010-06-19, 10:08 am
Reply
I'm wondering if how it breaks audio up into smaller segments can be improved.

Currently the file name is pretty long. However, to be of use on iPod devices, one would need to manually go into each file's properties and change the "song title". If one doesn't do that, then I think iPods will ignore files with the same song title.

It's this reason I'm currently using subs2srs to rip audio from the videos (works great with bulk projects), but then individually I have to use Audacity to split up the audio. Audacity copies over the Artist, Album title and Genre info from the larger file then adds a " - xx" to both the file name and "song title". This process is fairly tedious.

More to the point: when subs2srs splits an audio file, can it not only name the splits but also fill in the song properties based on sequence instead of time?

Say we have a main video with file names "Around 40 ep 01.avi" through "Around 40 ep 10.avi". Subs2srs already asks what the naming would be. In this example I'd put "Around 40" with the sequence starting at 01. So based on that it'll automatically do the following.

Artist: "Around 40" (all files will have this be the same).

Album: "Around 40-05"; This should match the large video file name, sequentially numbered for each video being ripped. So number five file name should be "Around 40-05.mp3" (all split files of the same episode should have this be the same in properties)

Title: "Around 40-08-12"; This should match the split audio file name, sequentially numbered for each portion being split. So the file name of this split audio should be "Around 40-08-12" which is the 8th video being ripped, and the 12th file split.
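The naming scheme proposed above is easy to state as code. This sketch only computes the three strings; it does not write any tags to files, and `clip_tags` is a hypothetical helper rather than anything subs2srs provides.

```python
def clip_tags(series: str, episode: int, clip: int) -> dict:
    """Build Artist/Album/Title values from sequence numbers instead of times."""
    album = f"{series}-{episode:02d}"
    return {
        "artist": series,                 # same for every file
        "album": album,                   # same for every clip of one episode
        "title": f"{album}-{clip:02d}",   # unique per clip
    }


tags = clip_tags("Around 40", 8, 12)
print(tags)
```

Because the title is built from the episode and clip numbers, no two clips share a "song title", which is the property the iPod seems to need.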
Reply