s0apgun
鬼武者 ᕦ(ò_óˇ)ᕤ
From: Chicago
Registered: 2011-12-24
Posts: 453
皆さん、こんにちは! (Hello, everyone!)
Today I present a simple alternative to subs2srs, purely for reading practice.
The problem: finding interesting, challenging reading material beyond n+1 sentences and graded readers that is available as digital text, so it can be parsed quickly with Rikai-chan.
My solution: use the .SRT (subtitle) files from whatever anime or drama you're currently watching. I'm currently watching Uchuu Kyoudai, so I'll use that for the example.
When you open a .SRT file in your text editor, it will look something like this:
1
00:01:18,275 --> 00:01:21,344
(南波六太)<宇宙飛行士候補生→
2
00:01:21,344 --> 00:01:25,515
通称アスキャンとなった私達は
他の国の候補生達との→
3
00:01:25,515 --> 00:01:30,504
合同基礎訓練が行われるアメリカ・ヒューストンへとやってきた>
4
00:01:30,504 --> 00:01:36,593
<ケンジ せりかさん 北村さん
新田 JAXAのオカンこと→
Lots of good stuff in there, right? Lots of junk too, huh...
Okay, so all you have to do is use the Find and Replace function! Remove all the numbers, symbols, and line breaks, and you are left with just a beautiful wall of Japanese.
In the "Find what:" field, enter a number or symbol (copy and paste it from the file), leave the "Replace with:" field empty, and replace all. To remove paragraph breaks, enter ^p in the "Find what:" field. After that, my example looks something like this:
(南波六太)宇宙飛行士候補生 通称アスキャンとなった私達は他の国の候補生達との合同基礎訓練が行われるアメリカ・ヒューストンへとやってきた>ケンジ せりかさん 北村さん新田 JAXAのオカンこと...
This turns a roughly 30-page .SRT file into a 3-page Word document, about the length of three NHK News articles, which you can read in one sitting. From here you can copy and paste the wall of Japanese and e-mail it to yourself, then open it in your browser and use your Rikai plugin for fast dictionary look-ups. I know my method is crude, but I hope it helps someone on here! Thanks for reading, and let me know what you think.
Note: I used Microsoft Word for the ^p thing... not sure if that works with other text editors.
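For anyone without Word, the same clean-up can be sketched in a few lines of Python. This is only a rough equivalent of the find-and-replace steps described above, not a polished tool. The function name `srt_to_wall` and the list of symbols to delete are assumptions for illustration, and a subtitle line made up only of digits would be dropped along with the cue numbers.

```python
import re

# Cue number lines ("1", "2", ...) and timecode lines, as in the sample above.
INDEX_LINE = re.compile(r"^\d+$")
TIMECODE_LINE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")

# Assumption: these are the only symbols worth deleting; extend the string as needed.
JUNK_SYMBOLS = "→<><>"


def srt_to_wall(srt_text: str, junk: str = JUNK_SYMBOLS) -> str:
    """Return the subtitle text as one unbroken string, without numbers or timecodes."""
    delete_junk = str.maketrans("", "", junk)
    kept = []
    # A UTF-8 byte order mark would stop the first cue number from matching.
    for line in srt_text.lstrip("\ufeff").splitlines():
        line = line.strip()
        if not line or INDEX_LINE.match(line) or TIMECODE_LINE.match(line):
            continue  # skip blank lines, cue numbers, and timecodes
        kept.append(line.translate(delete_junk))
    # Joining with an empty string removes the line breaks, like replacing ^p.
    return "".join(kept)
```

You would call it on the contents of a file, for example `srt_to_wall(open("episode01.srt", encoding="utf-8").read())`. The file name and encoding here are placeholders; some Japanese subtitle files use another encoding such as Shift_JIS.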
Note: subs2srs rules... this is just a mediocre option for relevant reading practice outside of SRS.
Edit: The site I used to grab the Japanese .SRT files is http://kitsunekko.net/subtitles/japanese/ There is also software that can extract embedded .SRT files from your media.
Last edited by s0apgun (2013 March 06, 4:41 am)
shinsen
Member
Registered: 2009-02-18
Posts: 181
s0apgun wrote:
Note: I used Microsoft Word for the ^p thing... not sure if that works with other text editors.
It's easy to do this in the Unix terminal, e.g. on OS X. To simply remove timecodes (this is a long command that should be on one line):
To remove timecodes and linebreaks:
Instead of emailing the text just start a simple webserver in the same directory:
And access the text in the browser at localhost:8000/episode01.txt
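As a rough illustration of the web server idea, here is a minimal sketch using only Python's standard library. It assumes Python 3.7 or later; the helper name `make_server` is hypothetical, and binding to 127.0.0.1 only is my own choice so the files are not exposed to the rest of the network. The standard library also offers the same thing as a one-liner, `python3 -m http.server 8000`.

```python
import functools
import http.server


def make_server(directory: str = ".", port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Create a server that hands out the files in `directory` on localhost."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # 127.0.0.1 keeps the server reachable from this machine only.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


# To run it from the folder that holds episode01.txt (stop with Ctrl+C):
#     make_server().serve_forever()
```

With the server running, a file named episode01.txt in that folder is available at localhost:8000/episode01.txt.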
I do something like this for subtitle files. If a subtitle is in SRT, I first convert it to .ASS (heheheheh) format using Subtitle Edit, which gives me subs in a nice tabular format:
Dialogue: Marked=0,0:00:25.98,0:00:29.97,Default,NTP,0000,0000,0000,!Effect,(山里) 随分とまた\Nメルヘンチックな夢ですね。
Dialogue: Marked=0,0:00:29.97,0:00:34.48,Default,NTP,0000,0000,0000,!Effect,(志岐 貴) 意識の抑圧が緩和され\N精神年齢が下がってるんだ。
I then import this as a CSV file into Excel or Google Docs, and then select just the column containing Japanese text. I paste this text into a file to get myself a nice, readable transcript with all of the timing info stripped out. With just a little additional formatting using Find/Replace in Notepad++, I can get something like this: http://www.gaiaslastlaugh.com/n/akumu1.html
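If you would rather skip the spreadsheet step, the text column can also be pulled out with a short script. This is just a sketch under one assumption: that the text is always the 10th comma-separated field, as in the two sample lines above. The function names `dialogue_text` and `transcript` are made up for illustration.

```python
from typing import List, Optional

# Assumption: the text is the 10th field (index 9), after Marked, Start, End,
# Style, Name, three margins, and Effect, as in the sample Dialogue lines.
TEXT_FIELD_INDEX = 9


def dialogue_text(line: str) -> Optional[str]:
    """Return the subtitle text of one Dialogue line, or None for other lines."""
    if not line.startswith("Dialogue:"):
        return None
    # Limiting the number of splits keeps commas inside the text intact.
    fields = line[len("Dialogue:"):].split(",", TEXT_FIELD_INDEX)
    if len(fields) <= TEXT_FIELD_INDEX:
        return None
    # \N marks a line break inside one subtitle; drop it to rejoin the sentence.
    return fields[TEXT_FIELD_INDEX].replace("\\N", "").strip()


def transcript(ass_text: str) -> List[str]:
    """Collect the text of every Dialogue line, one subtitle per list entry."""
    texts = (dialogue_text(line) for line in ass_text.splitlines())
    return [text for text in texts if text]
```

One difference from a plain CSV import: because the split stops after nine commas, a subtitle that itself contains a comma stays in one piece instead of spilling into extra columns.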
I can then listen to the ripped track on my iPhone, and check out the transcript if I can't for the life of me understand a given sentence. Plus, when at home, I can use Rikai-chan on my HTML file. I enjoy that a lot more than using subs2srs.
-J-
shinsen
Member
Registered: 2009-02-18
Posts: 181
astendra wrote:
Why start a web server when you can just do: file:///path/episode01.txt
This will work, although in Chrome you'll first need to go into settings and allow the Rikaikun plugin to access local files. Also, Dropbox may be a good option for easy access to these files on the go.
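If typing out the full file:/// address by hand is a nuisance, Python can build it for you. A tiny sketch, where the helper name `local_file_url` and the file name are placeholders:

```python
from pathlib import Path


def local_file_url(path: str) -> str:
    """Turn a file path into a file:/// URL that can be pasted into a browser."""
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(path).resolve().as_uri()


# Example: print(local_file_url("episode01.txt"))
```

Non-ASCII characters in folder or file names come out percent-encoded, which browsers accept.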
Last edited by shinsen (2013 April 30, 9:46 am)