
Mighty Morphin Morphology

Yes, because set theory is so fascinating to just browse through for people who click on a link that sounds like a children's show. ;p

Though I did eventually look at the Wikipedia entries for the Symmetric Differences/Intersection/Union thing. ;p I have to admit there's something sexy about that set stuff.

Do you think you could add more options for unknowns/iPlusN to the GUI db manager? I hesitate to make suggestions, but it could be as simple as adding sentence-level and paragraph-level information to the results in terms of location in the original text, and exporting/importing into Anki.

I think sentence level and paragraph level would be taken care of by using sentence punctuation in conjunction with line breaks. If there are multiple sentences before a single line break, that's presumably a paragraph, no? If you kept two tallies, a paragraph made up of a single sentence would occasionally end up with the same number twice.

I'm sure you can think of better stuff? Right now I have about 5 ideas of applying your GUI that are floating around, but they're roundabout because I'm imagining duplicating these functionalities through regex and/or importing into Anki just to get specific information before re-exporting.

Edit: Perhaps also more export formatting options? Shrug.

By the way, here's what I have so far, on the small scale:

Input, say, an article. Run it through MorphMan as database A and use known.db as B, then do A-B to get the unknowns. Copy that list and turn into regex and apply to original file:

(Using UltraEdit with the Perl regex engine, this seems to work):

Find: (Word1|Word2|Word3)

Replace: <b>\1</b> (or whatever formatting you like as a visual aid; that's a backslash and a one, in case it's not showing up properly)

Then you have the original with the unknowns formatted, so you can go through with Rikaisan and create cards with their source sentences, or just get a good impression of the unknowns, or what have you.
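
For anyone without UltraEdit, the same find/replace could probably be done in a few lines of Python. This is only a rough sketch: the function name and the `template` parameter are made up for illustration, and it matches plain surface strings, so it won't catch inflected forms the way MorphMan's morpheme matching would.

```python
import re

def highlight_unknowns(text, unknowns, template=r"<b>\1</b>"):
    """Wrap every occurrence of an unknown word in the given markup."""
    if not unknowns:
        return text
    # Longest first, so a longer word wins over a shorter word it contains.
    ordered = sorted(set(unknowns), key=len, reverse=True)
    # re.escape keeps characters like "." in a word from acting as regex syntax.
    pattern = re.compile("(" + "|".join(re.escape(w) for w in ordered) + ")")
    return pattern.sub(template, text)

sample = "猫が魚を食べる。"
print(highlight_unknowns(sample, ["魚", "食べる"]))
# prints: 猫が<b>魚</b>を<b>食べる</b>。
```

The word list here is typed in by hand; in practice it would be whatever you copied out of the A-B result.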

If there's also a regex or some such for finding sentences containing those words, counting the unknowns, and adding that tally to the end of each sentence, and doing the same for paragraphs...
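
A single regex probably can't do the counting, but a short script could. Here's a minimal sketch, assuming one paragraph per line and sentences ending in 。！？ or .!? (both are my assumptions, as are the function name and the bracketed tally format). It counts occurrences of unknown strings, not distinct words.

```python
import re

# Assumed sentence shape: a run of text followed by its end punctuation.
SENTENCE = re.compile(r"[^。！？.!?]+[。！？.!?]*")

def annotate(text, unknowns):
    """Append an unknown-word tally to each sentence and each paragraph."""
    ordered = sorted(set(unknowns), key=len, reverse=True)
    pattern = (
        re.compile("|".join(re.escape(w) for w in ordered)) if ordered else None
    )
    out = []
    for paragraph in text.splitlines():
        if not paragraph.strip():
            continue  # skip blank lines between paragraphs
        sentences = [s.strip() for s in SENTENCE.findall(paragraph) if s.strip()]
        counts = [len(pattern.findall(s)) if pattern else 0 for s in sentences]
        tagged = " ".join(f"{s} [{n}]" for s, n in zip(sentences, counts))
        out.append(f"{tagged} [paragraph: {sum(counts)}]")
    return "\n".join(out)

article = "猫が魚を食べる。犬は寝る。\n魚は泳ぐ。"
print(annotate(article, ["魚", "食べる"]))
```

From there the tallies could be used to pick out the i+1 sentences (those tagged [1]) before importing anything into Anki.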

Otherwise I might end up breaking up all sentences/paragraphs, importing to Anki, running MorphMan to find unknowns/iPlusN and attach the tallies to the ends of the sentence fields, exporting back to text and fixing any formatting errors. From there I could add formatting for clicking from one i+N to another...

There's also adding clozes for unknowns in collocations, but that's neither here nor there.

Know what else would be interesting? A Firefox plugin that consults the known.db to darken or highlight knowns or unknowns when you visit any webpage? Perhaps like FoxReplace you could hit a hotkey to trigger this rather than auto-enable on page load.

There's also that stuff that lets you mouseover and select page elements, perhaps that could be used to refine which areas of a page you want to process.

Edit 2: Ooh, you know what? With FoxReplace and also the Vocabulary Highlighter plugin, you can import lists in xml format. So you could turn dbs or lists of unknowns/knowns into xml format and have those load when you're browsing! With FoxReplace I bet you could really get flexibility with the visual formatting, though I think Vocab Highlighter has various options also. But if you also have interval levels, you can have different shades... Okay that might be overkill. ;p
Edited: 2011-06-28, 10:40 am