I think we should get our terminology straight. I should not have used the term 外字 (gaiji). While it's true that the characters in question (睜 and 戢) are outside of the jouyou character set, and are therefore "technically" 外字, that doesn't mean Shift_JIS can't display any 外字 at all. Presumably (and I don't know the history of East Asian font encoding methods) these two kanji were simply considered too rare to warrant inclusion in the Shift_JIS character set.

If, for instance, you were to download this text (take the final slash off in your URL bar or it will 404) and view it in a text editor, you would find these codes. See pic. Interestingly, if you view the HTML version (again, remove the final slash), it renders the character 睜 as a PNG instead of highlightable text. Yomichan isn't at fault; it just regurgitates what you put into it. See pic. Since Yomichan can render UTF-8, it can render these two characters as well. One last pic.

I would like to see Yomichan be able to interpret the "code" so that I don't have to arduously replace it by hand with the intended character. At this point, I'm sure you're thinking "It's very probable that these words don't even have Edict entries," and you'd be right: the kanji and the words are just too archaic. The alternative, however, is to use that website to turn every Shift_JIS text into a PDF and then highlight and copy the proper characters, or to use Denshi Jisho and painstakingly isolate the author's intended kanji one radical at a time.

Beggars can't be choosers, and Yomichan deserves all the credit in the world for streamlining vocabulary mining. Still, considering the popularity of Aozora Bunko and the sheer quantity of texts they provide, a world where Yomichan plays nice with the format of their books is a better world for Japanese learners everywhere.
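
To make the request a bit more concrete, here is a rough sketch of what "interpreting the code" could look like. It assumes the codes are notes wrapped in full-width brackets, starting with ※［＃ and ending in a JIS X 0213 plane-row-cell triple just before the closing ］, which I have not checked against every text. The names `GAIJI_NOTE`, `kuten_to_char` and `replace_gaiji` are made up for illustration and are not part of Yomichan. The conversion itself relies on Python's built-in `euc_jis_2004` codec.

```python
import re

# Assumed shape of a note: ※［＃ ... plane-row-cell］ (full-width brackets).
# Only the trailing plane-row-cell numbers are used; the description is ignored.
GAIJI_NOTE = re.compile(r"※［＃[^］]*?(\d)-(\d{1,2})-(\d{1,2})］")


def kuten_to_char(plane, row, cell):
    """Convert a JIS X 0213 plane-row-cell code to a Unicode string."""
    if plane not in (1, 2):
        raise ValueError("JIS X 0213 only has planes 1 and 2")
    # In EUC-JIS-2004, row and cell are each offset by 0xA0;
    # plane 2 is marked with a leading 0x8F byte.
    raw = bytes([0xA0 + row, 0xA0 + cell])
    if plane == 2:
        raw = b"\x8f" + raw
    return raw.decode("euc_jis_2004")


def replace_gaiji(text):
    """Replace every recognizable note in text with the character it names."""
    def substitute(match):
        plane, row, cell = (int(group) for group in match.groups())
        try:
            return kuten_to_char(plane, row, cell)
        except ValueError:
            # Unknown or out-of-range code: leave the note untouched.
            return match.group(0)

    return GAIJI_NOTE.sub(substitute, text)
```

You would decode the downloaded file from Shift_JIS first and then run `replace_gaiji` over the resulting string. Notes that don't fit the assumed pattern, or whose numbers don't map to anything, are left as they are, so nothing gets silently mangled.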
Edited: 2011-03-06, 7:38 pm
