Thanks overture, I think I finally got it working.
overture2112 Wrote:
Great! I will do that. ^^

Splatted Wrote:
If you want to try this now, open up the DB in a spreadsheet program or something (since it's just a TSV), use that to sort the rows by the frequency column, delete stuff that doesn't appear often enough, then save it (perhaps to a new file) and use that.

overture2112 Wrote:
The databases actually contain frequency information. The format is basically a TSV file with a 4-column morpheme entry plus a number for how many times it was seen. I'll consider adding some frequency filter to MorphMan to make use of it.

I don't think there's any need for anything complicated here. Just picking out a batch of the most common words (e.g. the 1000 most common, or all the words that appear 3 or more times, etc.) and then sorting that into the easiest-to-learn order would make a huge difference.
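For anyone who would rather script the spreadsheet steps than do them by hand, here is a rough Python sketch of the same idea. It is not part of MorphMan. It assumes the layout described in the quotes (tab-separated rows with the count as the last column) and also assumes there is no header row. The function name `filter_by_frequency` and the file names in the usage note are made up for illustration.

```python
def filter_by_frequency(src_path, dst_path, min_count=3, top_n=None):
    """Copy rows seen at least min_count times to dst_path, most frequent first.

    Assumption: each row is tab-separated and its last column is an integer
    count, as described in the thread. Returns the number of rows written.
    """
    with open(src_path, encoding="utf-8") as f:
        # Skip blank lines; split each remaining line on tabs.
        rows = [line.rstrip("\n").split("\t") for line in f if line.strip()]

    # Drop entries that do not appear often enough.
    kept = [row for row in rows if int(row[-1]) >= min_count]

    # Sort by the frequency column, highest count first.
    kept.sort(key=lambda row: int(row[-1]), reverse=True)

    # Optionally keep only the N most common entries.
    if top_n is not None:
        kept = kept[:top_n]

    # Write to a new file so the original DB is left untouched.
    with open(dst_path, "w", encoding="utf-8") as f:
        for row in kept:
            f.write("\t".join(row) + "\n")

    return len(kept)
```

Something like `filter_by_frequency("words.db", "words_common.db", min_count=3)` would keep everything seen 3 or more times, and `top_n=1000` would cap it at the 1000 most common. Both file names are placeholders, so substitute your own. Sorting the result into easiest-to-learn order is a separate step that this sketch does not attempt.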
