Using RegEx to get rid of kana in () in articles

kanji koohii FORUM (http://forum.koohii.com), Forum: Learning Japanese (http://forum.koohii.com/forum-4.html), Forum: Learning resources (http://forum.koohii.com/forum-9.html), Thread: /thread-6499.html
Asriel - 2010-10-10

@aphasiac Cool, I'll check it out. I was just thinking to myself, "That method seems to make sense, but what if you're using things that don't have that format?" Then again... the point is to delete the furigana anyway... Meh, I'll still check it out.

Asriel - 2010-10-10

Alright, the plugin aphasiac linked to basically just takes a page and turns it into a "kids' newspaper" format (i.e. 学校(がっこう)), but then you have to download this add-on to make it work like real furigana: https://addons.mozilla.org/en-US/firefox/addon/6812/

I'm guessing that this is what Chrome etc. do by default. So for what it's worth, everyone, if you want to read a real newspaper (not a kids' one), just download the two plugins mentioned so far. Plus, you can also choose how many kanji they add furigana to.

rich_f - 2010-10-10

Okay, I ran into one last snag. Yomiuri Online's website has some pages where the stuff in () uses EN (half-width) parentheses. So I changed the regex to:

\([\u3040-\u30FF]+\)

And it worked like a charm. I'll edit the first post so it's all in one place.

For me, the whole point of the exercise is to find newswriting targeted at a lower grade level, so I went for kids' newspaper stuff. I don't want furigana or any sort of reading hints, hence the regex attack on readings.

I looked at Chrome, but when you copy/paste from Chrome into EPP, it looks like a dog's dinner. It's just easier to go from Firefox minus any extensions, copy/paste to EPP, remove the readings with regex, then dump the result into Evernote, read the article, and mark up/copy anything I need to add to my Anki deck for vocab.

Asriel - 2010-10-10

Yeah, sorry, I didn't mean to derail anything... Just kind of skimmed the topic and blog post and thought "what if?" and that's it. Carry on.
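
For anyone who would rather run rich_f's regex from a script than from a text editor, here is a rough Python sketch. The first pattern is the one posted above. The second pattern, which also accepts full-width parentheses, is an assumption on my part about what other pages use, and the names `ASCII_READING`, `ANY_READING`, and `strip_readings` are made up for illustration.

```python
import re

# Pattern as posted in the thread: half-width parentheses around one or more
# characters from U+3040-U+30FF (the hiragana and katakana blocks).
ASCII_READING = re.compile(r"\([\u3040-\u30FF]+\)")

# Assumption: a variant that also accepts full-width parentheses, for pages
# that use those instead of the half-width ones.
ANY_READING = re.compile(r"[(（][\u3040-\u30FF]+[)）]")


def strip_readings(text: str, pattern: re.Pattern = ANY_READING) -> str:
    """Delete every parenthesised kana reading matched by `pattern`."""
    return pattern.sub("", text)


if __name__ == "__main__":
    sample = "学校(がっこう)に行(い)く"
    print(strip_readings(sample))  # prints 学校に行く
```

Parentheses that contain anything other than kana (kanji, Latin letters, digits) are left alone, since the character class only covers the two kana blocks.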