bombpersons Wrote:
blackmacros Wrote:
I searched and found this just now: http://www.my-guides.net/en/content/view/167/26/ after nonpoint's post. Doesn't look too difficult, although I don't know how it will handle Japanese subs (with the whole "type in every character from the subs file so that it can OCR it"... that might take a while).
I've done this before. With Final Fantasy Advent Children it took me about 4-5 hours to OCR the subs.
ghinzdra Wrote:
Does anyone know where to find a kanji character matrix?
It seems there are some hanzi (Chinese) character matrices out there, but I'm looking for a Japanese one. (And I can't even find the Chinese one... it could be of some use, as I bet the guys who wrote these must have an idea about where to find a Japanese one.)
Mh.. This might be faster. Is there a way to save a matrix / make one? If just one of us makes one, someone can upload it and hopefully this will be much faster.
As I said, it's not THAT easy.
First you must be aware of what exactly a kanji matrix is: something like 8000 characters.
Through KO and others we all know that with about 1200 characters you get 85 percent coverage, but for 99 percent it's about 4000. On top of that you've got to include italic and other styles, which can easily double the 4000 figure, hence 8000. Obviously the most intelligent solution would be to write a matrix for the most frequent characters and after that type in the missing characters as they turn up in the subtitles. We're still in for typing about 3000 characters.
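To make the coverage argument concrete, here is a rough sketch of how one could measure it on an actual subtitle text: count how often each character appears, rank them, and see how many of the top-ranked ones are needed to reach a given coverage. The function names and the sample string are made up for illustration, and none of this is a SubRip feature. The last lines just redo the arithmetic with the figures quoted above; 4000 minus 1200 is my guess at where the "about 3000" comes from.

```python
from collections import Counter


def coverage_by_rank(text):
    """Rank characters by frequency and return cumulative coverage after each one."""
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    ranked = [ch for ch, _ in counts.most_common()]
    cumulative = []
    running = 0
    for ch in ranked:
        running += counts[ch]
        cumulative.append(running / total)
    return ranked, cumulative


def chars_needed(cumulative, target):
    """Smallest number of top-ranked characters whose coverage reaches target."""
    for i, cov in enumerate(cumulative, start=1):
        if cov >= target:
            return i
    return len(cumulative)


# Hypothetical sample; in practice this would be the text of a subtitle file.
sample = "日本語の字幕の字幕の日本"
ranked, cumulative = coverage_by_rank(sample)
print(len(ranked), "distinct characters")
print(chars_needed(cumulative, 0.90), "characters needed for 90% coverage")

# Arithmetic with the figures from the post (assumed, not measured here).
core, full = 1200, 4000
print(full - core)  # characters left to type after a 1200-character matrix
print(full * 2)     # matrix size if every character also needs a second style
```

Run on a real pile of subtitles, this would show whether a small shared matrix plus typing in the rare characters by hand is realistic for a given DVD.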
What's more, since kanji are combinations of different parts, the OCR can have trouble identifying a character and may select just a part of it. That's why SubRip includes a new feature to enlarge the grid if needed. It still slows down the process, though.
Now the best part:
Do you think that every company uses the same character set/style?
Do you think that the OCR software is particularly clever?
You must have easily figured out where I'm going. There is a very high risk that we need several matrices... in the best case one per company... in the worst case one per DVD...
So yes, you can save the matrix, but if we need one matrix per DVD we're screwed.
Anyway, as I said earlier, none of my dubbed DVDs have faithful subtitles... so this only helps if the native DVDs have exact subtitles; otherwise it's both excruciating and useless.
That's why I said it's very likely a dead end. I just want to check out every possibility before posting a tutorial.
Edited: 2009-08-19, 6:46 am