aligner: Fully automatic subtitle correction

#26
Has anybody succeeded in generating a Mac OS binary?
#27
(2017-02-21, 2:37 am)jmignot Wrote: Has anybody succeeded in generating a Mac OS binary?

It doesn't seem that anyone has. Don't worry, It's Pretty Easy™. Unfortunately, I can't cross-compile from Linux to macOS without some special setup and my own copy of macOS (why, just why?), which I don't have. You will have to
That's it. If you run into trouble, you can post the errors here so I can try to help.
#28
(2017-02-22, 11:18 am)kaegi Wrote:
(2017-02-21, 2:37 am)jmignot Wrote: Has anybody succeeded in generating a Mac OS binary?

It doesn't seem that anyone has. Don't worry, It's Pretty Easy™. Unfortunately, I can't cross-compile from Linux to macOS without some special setup and my own copy of macOS (why, just why?), which I don't have. You will have to
That's it. If you run into trouble, you can post the errors here so I can try to help.

Thank you for the instructions. I will try this when I have some time.
Just out of curiosity, what is that software I am supposed to install first? I have read that Rust is actually a language. Is your program written in that language?
#29
Yes, my program is written in Rust and therefore needs a Rust compiler to build. Cargo is the build system and package manager, which (among other things) manages the dependencies of Rust projects.
Edited: 2017-02-23, 3:39 am
#30
Thank you!
#31
Pretty sweet. I fixed the timing on a few shows previously manually - Non Non Biyori, White Album 2, and now Fate/Zero using the tool, and put them up on Kitsunekko.
It's like 99% good, only missing a few lines for some reason in the middle of some episodes, but way better than manual labor. I haven't had issues dealing with Openings/Endings so far. I thought I had to delete them, but it seems to work either way.
#32
Glad to hear it's working for you.

I'm pretty interested in the "missing a few lines for some reason in the middle of some episodes"-part, so if you have some subtitles that show this problem, I'd like to see what I can do about it. I put up a link previously in this thread where you can upload them.
#33
To the one who just uploaded a pair of "Hunter x Hunter" subtitles: what is wrong with these subtitles? With a split-penalty of 0.4, aligner gets the 6 opening lines and the three extra lines in the middle of the episode right... I can't spot anything wrong with the corrected file by comparing it to the reference file (I don't have the episode, so maybe I'm missing something obvious).
#34
(2017-03-01, 5:09 am)kaegi Wrote: To the one who just uploaded a pair of "Hunter x Hunter" subtitles: what is wrong with these subtitles? With a split-penalty of 0.4, aligner gets the 6 opening lines and the three extra lines in the middle of the episode right... I can't spot anything wrong with the corrected file by comparing it to the reference file (I don't have the episode, so maybe I'm missing something obvious).
Sorry, that was me. I did try high and low split-penalties but I guess I didn't try low enough. I see it works now, thanks for checking!
#35
(2017-02-26, 3:39 pm)kaegi Wrote: Glad to hear it's working for you.

I'm pretty interested in the "missing a few lines for some reason in the middle of some episodes"-part, so if you have some subtitles that show this problem, I'd like to see what I can do about it. I put up a link previously in this thread where you can upload them.

I checked back at the original Japanese subs, and it seems that a minute of dialogue was just not subbed for the episode. The Aligner tool did a good job though, and was able to keep the proper jp lines when the unsubbed scenes ended. Seriously this tool is amazing.
I haven't really learned how to mess with the split penalty parameter, though. I haven't had any issues with the shows I re-timed with the default parameters, at least from what I've seen.

Oh, I also made a video tutorial on how I automate the naming and process to fix a long show. (Ex. Rurouni Kenshin)

#36
(2017-03-04, 10:45 pm)vladz0r Wrote:
(2017-02-26, 3:39 pm)kaegi Wrote: Glad to hear it's working for you.

I'm pretty interested in the "missing a few lines for some reason in the middle of some episodes"-part, so if you have some subtitles that show this problem, I'd like to see what I can do about it. I put up a link previously in this thread where you can upload them.

I checked back at the original Japanese subs, and it seems that a minute of dialogue was just not subbed for the episode. The Aligner tool did a good job though, and was able to keep the proper jp lines when the unsubbed scenes ended. Seriously this tool is amazing.
I haven't really learned how to mess with the split penalty parameter, though. I haven't had any issues with the shows I re-timed with the default parameters, at least from what I've seen.

Oh, I also made a video tutorial on how I automate the naming and process to fix a long show. (Ex. Rurouni Kenshin)


Wow, thanks! YouTube/video tutorials have a very wide reach these days, so I think this will be helpful for a lot of people.

You might want to check out some scripting magic, where you can combine many steps into one. Example with bash, which more or less combines all steps in your tutorial:

Code:
for i in {01..03}; do
  aligner *" $i "* *$i.srt corrected$i.srt
done


Bash resolves this to the final commands in the following steps:

Code:
# {01..03} will expand to 01 02 03

for i in 01 02 03; do
  aligner *" $i "* *$i.srt corrected$i.srt
done

# for i in 01 02 03; do command; done  means "write that command with i=01, i=02, i=03"

aligner *" 01 "* *01.srt corrected01.srt
aligner *" 02 "* *02.srt corrected02.srt
aligner *" 03 "* *03.srt corrected03.srt

# The * is a "wildcard": bash replaces the pattern with all file names that match it.
# So *" 01 "* will match, for example, "YourSub 01 [ABCD].ass" (anything with " 01 " in between).
# The quotes are needed so that bash knows the spaces belong to the first argument.

aligner "YourSub 01 [ABCD].ass"                              timed01.srt corrected01.srt
aligner "CompletelyOtherNameButSameFormat 02 OtherInfo.srt" "timed 02.srt" corrected02.srt
aligner "YourSub 03 [EFGH].srt"                             "timed03.srt" corrected03.srt

The exact syntax of number ranges, for-loops and wildcards might be (slightly) different in batch than bash.
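One caveat with the wildcards above: if a pattern matches no file or more than one file, aligner gets the wrong number of arguments. Below is a minimal sketch of a slightly more defensive loop. The function names `only_match` and `align_episodes` are made up for this illustration, and the file layout (reference name contains " NN ", mistimed name ends in "NN.srt") is the same assumption as in the example above.

```bash
# Print the single existing path passed in; fail if the wildcard matched
# no file (bash then passes the pattern through unexpanded) or several files.
only_match() {
  [ "$#" -eq 1 ] && [ -e "$1" ] && printf '%s\n' "$1"
}

# align_episodes DIR FIRST LAST   (FIRST and LAST are plain integers, e.g. 1 3)
# The body runs in a subshell "( ... )" so its variables stay local.
align_episodes() (
  dir=$1 n=$2 last=$3
  while [ "$n" -le "$last" ]; do
    i=$(printf '%02d' "$n")            # 1 -> 01, like {01..03}
    n=$((n + 1))
    ref=$(only_match "$dir"/*" $i "*) &&
    inc=$(only_match "$dir"/*"$i.srt") || {
      echo "episode $i: need exactly one reference and one mistimed file, skipping" >&2
      continue
    }
    aligner "$ref" "$inc" "$dir/corrected$i.srt"
  done
)
```

With the functions defined, `align_episodes . 1 3` should behave like the loop above for well-named files. Note that once `corrected01.srt` exists in the same directory, `*01.srt` matches two files, so a second run skips that episode instead of passing both files to aligner.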

You can play around with the split-penalty this way:
Code:
aligner ref.srt inc.srt corrected.srt --split-penalty 0.9
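
To compare several split penalties side by side, the command above can be wrapped in a small loop. This is only a sketch: the function name `try_penalties` and the output naming scheme `corrected-<penalty>.srt` are invented for the example, and the penalty values are arbitrary.

```bash
# try_penalties REF INC PENALTY...
# Runs aligner once per penalty and writes each result to its own file,
# so the outputs can be compared afterwards.
try_penalties() (
  ref=$1 inc=$2
  shift 2
  for p in "$@"; do
    aligner "$ref" "$inc" "corrected-$p.srt" --split-penalty "$p"
  done
)
```

For example, `try_penalties ref.srt inc.srt 3 2 1 0.5` would produce `corrected-3.srt`, `corrected-2.srt`, `corrected-1.srt` and `corrected-0.5.srt`.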
Edited: 2017-03-05, 6:52 am
#37
Has anyone succeeded in compiling from source? Using
Code:
$ cargo install aligner

I'm getting the same error while compiling one of the libraries it uses (image v0.12.4) on both my Win 10 x64 machine and my Arch Linux x64 VM.

Code:
error[E0004]: non-exhaustive patterns: type image::ImageFormat is non-empty
  --> .cargo\registry\src\github.com-1ecc6299db9ec823\image-0.12.4\./src\dynimage.rs:611:11
   |
611 |     match format {
   |           ^^^^^^
   |
help: Please ensure that all possible cases are being handled; possibly adding wildcards or more match arms.
  --> .cargo\registry\src\github.com-1ecc6299db9ec823\image-0.12.4\./src\dynimage.rs:611:11
   |
611 |     match format {
   |           ^^^^^^

error: aborting due to previous error
EDIT:
Okay well apparently it works when I git clone the github repo and just run `cargo build --release` in that directory so I guess I'll just go with this.

EDIT 2: I did this with the latest commit as of this post (6f4e7158c328b1c81739f3cbea75410e7f90c632), just in case anyone is wondering.
Edited: 2017-05-05, 9:29 pm
#38
(2017-05-05, 5:40 pm)karageko Wrote: Has anyone succeeded in compiling from source? Using
Code:
$ cargo install aligner

I'm getting the same error while compiling one of the libraries it uses (image v0.12.4) on both my Win 10 x64 machine and my Arch Linux x64 VM.

Yes, that library seems to break with the latest version under certain circumstances. I reported the error here and here. If either of them gets approved, the error will be resolved.

Thank you for informing me of this problem!
#39
Okay, compiling works again. It took longer than I anticipated, but thanks to the cooperation of emk, the problem is fixed!
#40
Hey kaegi, I'm trying to retime the subs for Ping Pong, using the [Leopard-Raws] subs from kitsunekko and the [deanzel] English release. Here are the English reference subs and the incorrectly timed JP subs for the first episode.

http://www.mediafire.com/file/7k63z7q21g6gfy7/eng01.ass
http://www.mediafire.com/file/ydcndcg7hl...fore01.srt

No matter what split penalty I use, it never lines up quite right.  The best I can get it is the first line in the corrected file starting at 00:23.65 whereas in the (correctly timed) English file it starts at 00:23.10.  Any ideas?
#41
(2017-06-19, 12:44 pm)Xavier22 Wrote: Hey kaegi, I'm trying to retime the subs for Ping Pong, using the [Leopard-Raws] subs from kitsunekko and the [deanzel] English release. Here are the English reference subs and the incorrectly timed JP subs for the first episode.

http://www.mediafire.com/file/7k63z7q21g6gfy7/eng01.ass
http://www.mediafire.com/file/ydcndcg7hl...fore01.srt

No matter what split penalty I use, it never lines up quite right.  The best I can get it is the first line in the corrected file starting at 00:23.65 whereas in the (correctly timed) English file it starts at 00:23.10.  Any ideas?

The English subtitle has many extra lines compared to the Japanese subtitles. By deleting the lines that begin with '*' you can get a slightly better alignment (though that is hard to check without the video). It's probably still not in the 'acceptable' range. I think you will have to adjust the sub/video-sync with your movie player in this case - being 'roughly correct' is a huge help nonetheless.
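
Deleting those lines by hand gets tedious, so here is a rough sketch of how it could be scripted. It assumes the usual .ass event layout, where the subtitle text is the tenth comma-separated field of a `Dialogue:` line, and it reads "lines that begin with '*'" as "dialogue whose text starts with '*'". The function name `strip_star_lines` is made up for the example.

```bash
# strip_star_lines IN.ass > OUT.ass
# Drops every Dialogue line whose text field starts with '*'.
# Assumed field order: Layer,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text
strip_star_lines() {
  awk -F',' '!(/^Dialogue:/ && $10 ~ /^\*/)' "$1"
}
```

Usage would be something like `strip_star_lines eng01.ass > eng01-nostar.ass`, with the filtered file then passed to aligner as the reference. Lines where the '*' comes after a style override tag would not be caught by this simple pattern.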

Keep in mind: if you frequently need to re-adjust (every minute or so), then the Japanese subtitle doesn't fit the video. The algorithm can only micro-correct subtitles (meaning something like a 0.1s longer black screen between scenes) on a very limited scale.
#42
I just wanted to drop by again and say how awesome this program is. I've returned to working on aligning the jp subtitles I had for Odoru Daisousasen to go with the eng subtitles.

It just makes this job SO much easier.
#43
I just wanted to say thank you to the creator of this program.
#44
This is brilliant kaegi, a few curls and a bash script and one whole series is done easily. Crazy.
Thank you!
I've timed shows by hand (like a decade ago) and I just cannot explain how cool I find this tool.

A hint for people on Linux distros that use yum/dnf: rust and cargo are available in the standard repos, so:
sudo yum install rust cargo
cargo install aligner
export PATH=$PATH:/home/YOURUSERNAME/.cargo/bin
#45
Hey kaegi, found an error. Here are the logs from cmd.

E:\Anime\Code Geass - Hangyaku no Lelouch R2\Subs>aligner ref01.ass old01.ass new01.ass
EE: error: operation on file 'old01.ass' failed
EE: caused by: invalid utf-8: invalid byte near index 0
EE: note: run program with `env RUST_BACKTRACE=1` for a backtrace

Here are the files:
ref01.ass: http://www.mediafire.com/file/gzg6qzkrjaax7qu/ref01.ass
old01.ass: http://www.mediafire.com/file/66tzgj993u4ut7k/old01.ass

The "old01.ass" file is just a renamed file from the POPGO subs for Geass R2 on kitsunekko. Note that I can fix this issue by opening the old01.ass file in Aegisub and changing literally anything, like adding a letter to one line, and then it processes properly. So it's not really a big deal but I thought you might be interested in seeing it.
#46
(2017-09-03, 10:40 am)Xavier22 Wrote: Hey kaegi, found an error.  Here are the logs from cmd.

E:\Anime\Code Geass - Hangyaku no Lelouch R2\Subs>aligner ref01.ass old01.ass new01.ass
EE: error: operation on file 'old01.ass' failed
EE: caused by: invalid utf-8: invalid byte near index 0
EE: note: run program with `env RUST_BACKTRACE=1` for a backtrace

Here are the files:
ref01.ass: http://www.mediafire.com/file/gzg6qzkrjaax7qu/ref01.ass
old01.ass: http://www.mediafire.com/file/66tzgj993u4ut7k/old01.ass

The "old01.ass" file is just a renamed file from the POPGO subs for Geass R2 on kitsunekko.  Note that I can fix this issue by opening the old01.ass file in Aegisub and changing literally anything, like adding a letter to one line, and then it processes properly.  So it's not really a big deal but I thought you might be interested in seeing it.

Yeah, the problem here is that old01.ass is UTF-16 encoded, while the program assumes UTF-8 (the quasi-standard encoding). Aegisub probably recognizes it correctly as UTF-16 but saves it as UTF-8. I'll look into how other programs like Aegisub handle it...
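
As a workaround outside of aligner, the file can be converted to UTF-8 before processing. The following is a minimal sketch, assuming the UTF-16 file starts with a byte order mark (which would fit the "invalid byte near index 0" message) and that `iconv` is installed. The function names `has_utf16_bom` and `to_utf8` are invented for this example.

```bash
# has_utf16_bom FILE: succeeds if FILE starts with a UTF-16 byte order mark
# (bytes FF FE for little endian, FE FF for big endian).
has_utf16_bom() {
  bom=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$bom" = "fffe" ] || [ "$bom" = "feff" ]
}

# to_utf8 IN OUT: write a UTF-8 copy of IN to OUT.
# Converts only when a UTF-16 BOM is found; otherwise the file is copied as-is.
to_utf8() {
  if has_utf16_bom "$1"; then
    iconv -f UTF-16 -t UTF-8 "$1" > "$2"
  else
    cp "$1" "$2"
  fi
}
```

For the files above this would be `to_utf8 old01.ass old01-utf8.ass`, followed by the usual aligner call on the converted file. A UTF-16 file without a BOM would not be detected by this check.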
#47
The latest version 0.1.6 now supports choosing an encoding and prints a better error message if the encoding is wrong!

For example, choosing UTF-16 for the reference subtitle file looks like this:

Code:
aligner reference.ass incorrect.ass correct.ass --encoding-ref utf-16
Edited: 2017-09-03, 1:40 pm
#48
Thanks. Always a great program. I have a huge backlog of shows with subs I need to time and this is the #1 way to do that.
#49
Works great, although I usually need to go in and edit subs manually due to different decisions around subbing music, particularly OP/ED. Thanks for the great tool!
#50
(2017-11-28, 11:53 pm)NinKenDo Wrote: Works great, although I usually need to go in and edit subs manually due to different decisions around subbing music, particularly OP/ED. Thanks for the great tool!

Thanks! Obligatory question: have you tried different split penalties (3, 2, 1, 0.5, ...)? Sometimes the default value is just a little bit too rigid...