I don't think the text size matters, but I'm sure the number of the text's morphemes we're matching does. So for your purposes I think it works.
It seems to use a combination of factors, including longest match. I had just Googled it (and found a variation for English using word boundaries), experimented, and it worked well for my purposes of finding unknowns and adding visual formatting. I felt that the conditions of what types of words would be unknown for most users (edit: their role as function words), the contiguity of morphemes, and Rikaisan's dictionary segmentation would take care of any rare errors.
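To illustrate the longest-match idea, here's a minimal sketch (my own, not Rikaisan's actual code): building an alternation regex with the candidates sorted longest-first, so the engine prefers the longest word. The word list and text are just made-up examples, and the `\b` word boundaries are the English variation mentioned above.

```python
import re

# Hypothetical word list for illustration only.
words = ["in", "inside", "side"]

# Sort longest-first so the alternation tries longer candidates before
# shorter ones; re tries alternatives left to right, so without this a
# short prefix like "in" could win in text that has no word boundaries.
ordered = sorted(words, key=len, reverse=True)

# The \b anchors are the English word-boundary variation; for Japanese,
# which has no spaces, you'd drop them and rely on longest-first ordering.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, ordered)) + r")\b")

print(pattern.findall("look inside the box"))  # ['inside']
```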
The spaces thing was incidental but I thought I'd share in this thread. ;p
Edited: 2011-06-29, 11:47 am
