Joined: Feb 2008
Posts: 1,322
Thanks: 0
@aphasiac Cool, I'll check it out. I was just thinking to myself, "That method seems to make sense, but what if you're using things that don't have that format?"
Then again...the point is to delete the furigana anyway...
Meh, I'll still check it out.
Joined: Jul 2007
Posts: 1,879
Thanks: 19
Okay, I ran into one last snag. Yomiuri Online's website has some pages where the text in () uses English (ASCII) parentheses.
So I changed the regex to: \([\u3040-\u30FF]+\)
And it worked like a charm.
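For anyone who would rather script this step than run it in an editor, here is a minimal Python sketch of the same idea. The first pattern is the regex quoted above. The second pattern, which also accepts full-width parentheses, and the helper name `strip_readings` are my own assumptions for illustration, not something from the original setup.

```python
import re

# The regex from the post: a run of hiragana/katakana (U+3040 to U+30FF)
# wrapped in ASCII parentheses.
ASCII_READING = re.compile(r"\([\u3040-\u30FF]+\)")

# Assumption: a combined variant that accepts either ASCII or full-width
# parentheses, so one pass can handle pages that mix both styles.
ANY_READING = re.compile(r"[(（][\u3040-\u30FF]+[)）]")


def strip_readings(text, pattern=ANY_READING):
    """Return text with parenthesized kana readings removed.

    Hypothetical helper: only parentheses whose contents are entirely
    kana are removed, so ordinary parenthetical remarks are kept.
    """
    return pattern.sub("", text)


if __name__ == "__main__":
    print(strip_readings("漢字(かんじ)を読(よ)む"))
    print(strip_readings("新聞（しんぶん）"))
```

Because the character class only covers kana, something like 東京(日本) is left alone, which is usually what you want when the goal is to drop readings and nothing else.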
I'll edit the first post so it's all in one place.
For me, the whole point of the exercise is to find newswriting targeted at a lower grade level, so I went for kids' newspaper stuff. I don't want furigana or any sort of reading hints, hence the regex attack on readings.
I looked at Chrome, but when you copy/paste from Chrome into EPP, it looks like a dog's dinner. It's just easier to go from Firefox without any extensions, copy/paste into EPP, remove the readings with regex, then dump the result into Evernote, read the article, and mark up or copy anything I need to add to my Anki deck for vocab.
Edited: 2010-10-10, 11:10 am
Joined: Feb 2008
Posts: 1,322
Thanks: 0
Yeah, sorry, I didn't mean to derail anything. I just kind of skimmed the topic and blog post and thought 'what if?' and that's it.
Carry on.