
For FLTR users, this program will conjugate verbs for you

#1
For those that don't know, FLTR is a reader used as a tool for language learning. The basic premise is that you label words, and the system highlights these words depending on the degree to which you've grown familiar with them. https://code.google.com/p/fltr/

Anyway, FLTR can be annoying to use because it forces you to label every form of a verb separately, with no way to do them all at once. A second problem is that manually adding words to your word list can be a pain, as the save format is a little strange.

So I've created a little program that automatically generates the majority of conjugations in the required format. All you have to do is copy/paste the generated information into the FLTR ***_Words.csv file. This was originally intended for personal use, but I might as well share it somewhere, in case one of the few other people that uses this program stumbles across it.

One note: I haven't bothered with the irregular verbs する and 来る, or with the irregular form 行って. Other than that, all other verbs should be supported, including the 「いる・える」 う verb exceptions (う verbs that look like る verbs).
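To give an idea of what generating conjugations from the dictionary form involves, here is a minimal Python sketch. It is not the code of the program posted in this thread (which is written in Java); the table, the function name `conjugate`, and the `is_ichidan` flag are all made up for illustration. Like the program, it makes no attempt at する, 来る, or 行く.

```python
# Hypothetical sketch, not the actual conjugator from this thread.
# Each う-verb ending maps to: (a-row, i-row, e-row, te-form, ta-form).
GODAN = {
    "う": ("わ", "い", "え", "って", "った"),
    "く": ("か", "き", "け", "いて", "いた"),
    "ぐ": ("が", "ぎ", "げ", "いで", "いだ"),
    "す": ("さ", "し", "せ", "して", "した"),
    "つ": ("た", "ち", "て", "って", "った"),
    "ぬ": ("な", "に", "ね", "んで", "んだ"),
    "ぶ": ("ば", "び", "べ", "んで", "んだ"),
    "む": ("ま", "み", "め", "んで", "んだ"),
    "る": ("ら", "り", "れ", "って", "った"),
}


def conjugate(dictionary_form, is_ichidan):
    """Return a few basic forms of a regular verb.

    The caller says whether the verb is a る verb (ichidan), because
    う verbs ending in いる/える (such as 帰る) cannot be told apart
    from る verbs by spelling alone.
    """
    stem, ending = dictionary_form[:-1], dictionary_form[-1]
    if is_ichidan:
        if ending != "る":
            raise ValueError("a る verb must end in る")
        return {
            "plain": dictionary_form,
            "polite": stem + "ます",
            "negative": stem + "ない",
            "te": stem + "て",
            "past": stem + "た",
            "potential": stem + "られる",
        }
    if ending not in GODAN:
        raise ValueError("not a dictionary-form verb ending: " + ending)
    a_row, i_row, e_row, te, ta = GODAN[ending]
    return {
        "plain": dictionary_form,
        "polite": stem + i_row + "ます",
        "negative": stem + a_row + "ない",
        "te": stem + te,
        "past": stem + ta,
        "potential": stem + e_row + "る",
    }
```

For instance, `conjugate("帰る", False)` treats 帰る as an う verb, while `conjugate("食べる", True)` treats 食べる as a る verb. A real tool would generate many more forms per verb than the six shown here.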

V. 1.1.

- Added feature that allows automatically updating your ***_Words.csv file.
- Added potential, causative and passive past forms.

- Fixed issue whereby ぬ verbs were mistakenly omitted from v. 1.0.
- Fixed issue of う verb potential forms not being generated.

Download: https://sourceforge.net/projects/fltrjap...onjugator/



generate verbs from the dictionary form:

[Image: ifMrmQN.png]


export the information to your FLTR ***_Words.csv file:

[Image: x3sKaDe.png]


the conjugations show up in FLTR:

[Image: HPU7tBh.png]




I have also created a program that imports data from spreadsheets, and adds them directly to FLTR. However this is a little specific to me, so I haven't uploaded it. Regardless, I will if anyone is interested.
Edited: 2015-01-21, 7:41 am
Reply
#2
Neat.

Is it possible to just change the code to use base rather than inflected forms for morpheme equality testing? MorphMan does a similar thing for highlighting words various colors in Anki depending on maturity but I'd like a general tool, and having MorphMan feed FLTR a list of base forms + maturity sounds easier than reinventing the wheel.
Reply
#3
overture2112 Wrote:Neat.

Is it possible to just change the code to use base rather than inflected forms for morpheme equality testing? MorphMan does a similar thing for highlighting words various colors in Anki depending on maturity but I'd like a general tool, and having MorphMan feed FLTR a list of base forms + maturity sounds easier than reinventing the wheel.
I'm not quite sure what you mean by that. Could you explain what you're looking for in more detail?
Reply
#4
Happy to see others using FLTR :)

For me, I put the definitions of verbs in differently. Say for the word '行きます', I would put 'to go' as the definition of '行き' (since it is the word stem/base) and 'polite present' for 'ます' instead of making a new word for each conjugation.

Thank you for sharing your program. It will be nice to check whether I'm reading the conjugations right with your program.
Reply
#5
Cronos Wrote:
overture2112 Wrote:Neat.

Is it possible to just change the code to use base rather than inflected forms for morpheme equality testing? MorphMan does a similar thing for highlighting words various colors in Anki depending on maturity but I'd like a general tool, and having MorphMan feed FLTR a list of base forms + maturity sounds easier than reinventing the wheel.
I'm not quite sure what you mean by that. Could you explain what you're looking for in more detail?
^--- Yes, this sounds like it could be useful. Well, I haven't used FLTR but there are many reasons I didn't use LWT for very long. The program didn't know anything about the language itself, and it required you to add spaces to break kanji words up, which meant it didn't work at all for mixed (English and Japanese) text. And they didn't seem to care that it was cumbersome to use for Asian text. A reader with actual integration with Morphman would be such a boon.
Reply
#6
After reading this thread, I wanted to give FLTR a try.
I downloaded the .zip file (Mac OS version) but after decompressing it I was unable to launch the application (I got a message saying it is damaged and cannot be opened). Downloading again did not help.
I noted that my version of Java is 7, not 6 as specified for Mac OS users.
However, I do not want to return to a previous version because other applications are using the current one. Can two different Java versions coexist on the same machine? Or could the problem be due to something else?
Did anyone experience similar problems?

Edited
Problem solved: Mac OS users with Java 7 installed should download the "linux" version of FLTR. This was mentioned on the download page but I missed it at first.
Edited: 2015-01-17, 3:11 am
Reply
#7
jmignot Wrote:After reading this thread, I wanted to give FLTR a try.
I downloaded the .zip file (Mac OS version) but after decompressing it I was unable to launch the application (I got a message saying it is damaged and cannot be opened). Downloading again did not help.
I noted that my version of Java is 7, not 6 as specified for Mac OS users.
However, I do not want to return to a previous version because other applications are using the current one. Can two different Java versions coexist on the same machine? Or could the problem be due to something else?
Did anyone experience similar problems?
I can't imagine why Java 6 would be compatible, but not 7. That seems strange. The website does say it's for OS X. I don't know much about Mac, but are you using a different Mac version?
Reply
#8
Cronos Wrote:I can't imagine why Java 6 would be compatible, but not 7. That seems strange. The website does say it's for OS X. I don't know much about Mac, but are you using a different Mac version?
I have just downloaded the current Mac version from the FLTR web page, and my MacOS and Java versions are up to date, since I do upgrades whenever requested (except the latest OS X 10.10 "Yosemite", which is still widely criticized for causing instabilities of the wifi connections).
Reply
#9
I too am checking out FLTR after reading your post. Could you consider putting your source code on Github or such? It's a little less... creepy than MediaFire.

So, how do you compile and run your verb conjugator code? I think I compiled everything right with `$ javac -encoding utf8 Program2.java GUI.java`. This, on Win7, Java8, creates GUI$1.class, GUI$2.class, GUI.class, and Program2.class, but running `java` on any of these class files (with and without ".class" extension) just produces an error, "Error: Could not find or load main class ...". Sorry, my Java-build-fu is nil.
Reply
#10
Daichi Wrote:^--- Yes, this sounds like it could be useful. Well, I haven't used FLTR but there are many reasons I didn't use LWT for very long. The program didn't know anything about the language itself, and it required you to add spaces to break kanji words up, which meant it didn't work at all for mixed (English and Japanese) text. And they didn't seem to care that it was cumbersome to use for Asian text. A reader with actual integration with Morphman would be such a boon.
What's Morphman do?
Reply
#11
aldebrn Wrote:
Daichi Wrote:^--- Yes, this sounds like it could be useful. Well, I haven't used FLTR but there are many reasons I didn't use LWT for very long. The program didn't know anything about the language itself, and it required you to add spaces to break kanji words up, which meant it didn't work at all for mixed (English and Japanese) text. And they didn't seem to care that it was cumbersome to use for Asian text. A reader with actual integration with Morphman would be such a boon.
What's Morphman do?
Yeah my Java build knowledge is useless as well. Try opening this project folder in Netbeans. Unzip it and place it in Documents > NetbeansProjects. https://www.mediafire.com/?d9n6f23wbj54am4
Reply
#12
Before installing the patch, I wanted to test the native application first.
It turns out that, with the default configuration, each kanji is treated as a word. How can this be avoided?
Reply
#13
The idea is that the user defines words as they go. In the beginning, FLTR just treats every kanji/hiragana as a separate character. You can highlight a word, add it to the list of known/learning words, and from then on it will display the word, rather than separate characters. so 自転車 instead of 自, 転, 車.
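The behaviour described above can be pictured with a small Python sketch. This is only a guess at the idea, not FLTR's actual algorithm: it assumes a greedy longest-match over the user's word list, and the function name `segment` is invented for the example.

```python
# Hypothetical sketch of "known words display whole, everything else
# falls back to single characters". Greedy longest-match is an
# assumption; FLTR's real matching rules may differ.
def segment(text, known_words):
    longest = max((len(word) for word in known_words), default=1)
    tokens = []
    pos = 0
    while pos < len(text):
        # Try the longest possible candidate first, down to 2 characters.
        for size in range(min(longest, len(text) - pos), 1, -1):
            candidate = text[pos:pos + size]
            if candidate in known_words:
                tokens.append(candidate)
                pos += size
                break
        else:
            # No known word starts here: emit a single character.
            tokens.append(text[pos])
            pos += 1
    return tokens
```

With an empty word list, `segment("自転車", set())` returns the three separate characters. Once 自転車 has been added to the list, the same call returns it as one token.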

You can also define a default dictionary to look up new words. By highlighting however many characters, it will automatically open a webpage to that dictionary to show you the word/meaning. By default I think it's google translate, but you can easily change that to Jisho or something.
Reply
#14
Cronos Wrote:The idea is that the user defines words as they go. In the beginning, FLTR just treats every kanji/hiragana as a separate character. You can highlight a word, add it to the list of known/learning words, and from then on it will display the word, rather than separate characters. so 自転車 instead of 自, 転, 車.

You can also define a default dictionary to look up new words. By highlighting however many characters, it will automatically open a webpage to that dictionary to show you the word/meaning. By default I think it's google translate, but you can easily change that to Jisho or something.
I see. Thanks !
Reply
#15
Cronos Wrote:For those that don't know, FLTR is a reader used as a tool for language learning. The basic premise is that you label words, and the system highlights these words depending on the degree to which you've grown familiar with them. https://code.google.com/p/fltr/
(…)
I have also created a program that imports data from spreadsheets, and adds them directly to FLTR. However this is a little specific to me, so I haven't uploaded it. Regardless, I will if anyone is interested.
This is Windows only, right (.exe file in download) ?
Reply
#16
jmignot Wrote:
Cronos Wrote:For those that don't know, FLTR is a reader used as a tool for language learning. The basic premise is that you label words, and the system highlights these words depending on the degree to which you've grown familiar with them. https://code.google.com/p/fltr/
(…)
I have also created a program that imports data from spreadsheets, and adds them directly to FLTR. However this is a little specific to me, so I haven't uploaded it. Regardless, I will if anyone is interested.
This is Windows only, right (.exe file in download) ?
The link in the original post (first post in this thread) is the .exe, and I included the source files.

The link I posted in this message was to the netbeans project folder, which I assume you could open in netbeans regardless of platform.
Edited: 2015-01-17, 12:41 pm
Reply
#17
Cronos Wrote:The link in the original post (first post in this thread) is the .exe, and I included the source files.

The link I posted in this message was to the netbeans project folder, which I assume you could open in netbeans regardless of platform.
Sorry, but this is the first time I've heard of netbeans :-(

I went to their home page but I am not sure of what I should download.
Furthermore, I am afraid that this installation might interfere with the current Java install on my Mac.
Is it harmless ?
Thanks for helping.
Reply
#18
A Java guru friend from over at Kotoba Miners enlightened me on building and running Java apps from the command line (though they said to use Eclipse/NetBeans for anything bigger).

0) If you don't want to bother with all the following, and trust me to not have messed with the code, download the resulting cross-platform (Mac/Win/Linux) executable JAR file and double-click it: https://gist.github.com/fasiha/c6f9fc306...ugator.jar

But if you want to build it yourself,

0b) Check that you have a Java Development Kit (JDK) installed or install one: "java -version" and "javac -version" should print out some version information.

1) In the "source" directory, make a directory called "program2" (case sensitive; this is the package name declared at the head of the two .java files)

2) Move/copy the .java files (and I guess the GUI.form file) to this "program2" directory

3) From the "source" directory, run "javac program2/*.java". This compiles the source code and creates some .class files in "program2".

4) Still in the "source" directory, run "java program2.Program2". This launches the app.

5) Profit!!! (I typed in "観測する" under "Kanji", "かんそくする" under "Hiragana", and "1" for "Rating" and clicked the very large button to get all the verb conjugations!)

6) Make a JAR (Java archive) so you don't have to do this later, and so people without the JDK (and just the JRE) can run it: still in the "source" directory, run "jar cfe FLTRJapaneseVerbConjugator.jar program2.Program2 program2".



As a grammar novice, I have to ask: how reliable are these conjugations? Are there many irregular verbs? I'll try cross-referencing this tool's output with a list of 100 verbs' conjugations by Waespym: https://docs.google.com/spreadsheets/d/1...1739289745 (forum post at http://forums.kotobaminers.org/threads/1...tions.806/).
Edited: 2015-01-18, 1:19 am
Reply
#19
Coming back to the original post: doesn't the proposed strategy, if used systematically, produce huge vocabulary files?
Then how about future exports to Anki or another SRS application?
Should one filter out all extra lines at that stage or is there a better way?
Reply
#20
That's just how FLTR works. If you want those conjugations highlighted, you're going to need them in that word file. I can't imagine encountering any problems because of the size of the file. I made a dummy file with 5000 verbs' worth of lines, which is about 150,000 lines. That equates to about 25 MB. You could have a vocabulary of 10,000 nouns/adjectives/etc. and 2000 verbs and that would be about 75,000 lines (12.5 MB).
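For anyone who wants to redo the estimate for their own vocabulary, here is a back-of-the-envelope Python sketch. The only inputs are the dummy-file figures quoted above (5000 verbs, about 150,000 lines, about 25 MB); the function name `estimate` is made up, and one line per noun/adjective is an assumption.

```python
# Rough estimate derived from the dummy-file figures in the post above.
LINES_PER_VERB = 150_000 // 5_000   # about 30 generated lines per verb
MB_PER_LINE = 25 / 150_000          # average size of one line, in MB


def estimate(non_verbs, verbs):
    """Return (line count, size in MB), assuming one line per non-verb."""
    lines = non_verbs + verbs * LINES_PER_VERB
    return lines, lines * MB_PER_LINE
```

`estimate(10_000, 2_000)` gives 70,000 lines and a little under 12 MB, which is in the same ballpark as the rounded figures quoted above.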

As for exporting to Anki, hmm. I guess that really depends on just what kind of cards you wanted to make. I, personally, would add the verb (probably just the plain form) to Anki when I first encountered it, and then add the verb and its conjugations to FLTR.

aldebrn:

I accounted for the vast majority of irregularities, including, to my knowledge, all of the いる・える 「う」verb exceptions. One thing I didn't bother including was the する・来る exceptions, as well as the 行って・行った exceptions. That's something I should get around to; it should take 10 minutes. I've also realized I need to add some more past forms for causative/passive/potential, as well as the causative-passive form and its shortening.

Hmm, it would be a good idea to add a button that automatically appends the data to your text file so you don't have to do it manually. I'll upload the updated version sometime tomorrow probably.



As a general rule of thumb, if you're unsure, look it up before you trust the program. But it should be right for 99.9% of verbs you learn. If it isn't, the only side effect will be that something won't be highlighted when you read text in FLTR.

Let me know if you notice any errors aside from the aforementioned.
Edited: 2015-01-19, 3:24 pm
Reply
#21
I have added some features.

v. 1.1

- Added feature that allows automatically updating your ***_Words.csv file.
- Added potential, causative and passive past forms.

- Fixed issue whereby ぬ verbs were mistakenly omitted from v. 1.0.
- Fixed issue of う verb potential forms not being generated.

https://sourceforge.net/projects/fltrjap...onjugator/
Edited: 2015-01-21, 7:41 am
Reply
#22
Cronos Wrote:As for exporting to Anki, hmm. I guess that really depends on just what kind of cards you wanted to make. I, personally, would add the verb (probably just the plain form) to Anki when I first encountered it, and then add the verb and its conjugations to FLTR.
I liked the idea that, at any time, one could decide to create an Anki deck directly by importing the ***_Words.csv file.
Perhaps an option could be added to automatically generate a copy of that file in which only plain forms of verbs would be included. This can likely be achieved by using some regex command as well.
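A rough Python sketch of such a filter is below. It rests on assumptions that are not confirmed anywhere in this thread: that each line of the words file holds one entry, that fields are tab-separated, and that the term is the first field. It also uses a set of known inflected forms (for example, the ones the conjugator generated) instead of a regex, because a pattern on the ending alone cannot separate plain forms from inflected ones: a potential form such as 飲める ends in る just like a dictionary form. The name `drop_inflected` is invented for the example.

```python
# Hypothetical sketch; the delimiter and column position are assumptions
# about the words file, not documented facts.
def drop_inflected(lines, inflected_forms, delimiter="\t", term_column=0):
    """Keep every line whose term is not a known inflected form."""
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        term = fields[term_column] if len(fields) > term_column else ""
        if term not in inflected_forms:
            kept.append(line)
    return kept
```

Nouns, adjectives, and plain-form verbs pass through untouched, so the result could be imported into an SRS without the extra conjugation lines.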
Reply
#23
It seems to me that, however useful the conjugator may be, it leaves the basic problem unsolved, namely the fact that FLTR will treat all entries corresponding to one single verb (how about i-adjectives?) as independent.

As a result, if I understand correctly, only the forms that have actually been encountered in some text will be promoted to higher ratings, while others will still be marked as unknown. This is, in my opinion, less than satisfactory.

In summary, FLTR itself would really need to include a proper handling of conjugated forms in the first place, or at least to allow for extra "plug-ins" to be developed for particular languages. I have not used it beyond a quick test but the above looks like a serious shortcoming.

Does anyone know if this is better taken care of by alternative software (LWT or LingQ)?
Reply
#24
jmignot Wrote:It seems to me that, however useful the conjugator may be, it leaves the basic problem unsolved, namely the fact that FLTR will treat all entries corresponding to one single verb (how about i-adjectives?) as independent.

As a result, if I understand correctly, only the forms that have actually been encountered in some text will be promoted to higher ratings, while others will still be marked as unknown. This is, in my opinion, less than satisfactory.

In summary, FLTR itself would really need to include a proper handling of conjugated forms in the first place, or at least to allow for extra "plug-ins" to be developed for particular languages. I have not used it beyond a quick test but the above looks like a serious shortcoming.

Does anyone know if this is better taken care of by alternative software (LWT or LingQ)?
Yeah, I guess it all comes down to the fact that FLTR was never made for a particular language, so it's missing a lot of features (and the devs stopped working on it). I think LWT is much the same. LingQ is all right, but it costs money to use effectively, and it can be a pain having to use it through a web browser. Not sure how LingQ handles verb/adjective forms.

Honestly I am half tempted to create my own, specific to Japanese.

Anyway, the way I would go about updating the rating would be to open the .csv file, find the entry and delete it, then re-enter it with a different rating. That or I might write something that allows you to change it from the program. I could also add a feature that creates a file with just the plain forms, as well as one that handles adjective forms.
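As a sketch of what changing the rating "from the program" could look like, the Python below rewrites the rating of every form of a verb in one pass, so that all conjugations move together. The file layout is an assumption (tab-separated fields, term first, rating last), and the name `set_rating` is invented; neither comes from FLTR's documentation.

```python
# Hypothetical sketch; column positions and delimiter are assumptions
# about the words file.
def set_rating(lines, forms, new_rating, delimiter="\t",
               term_column=0, rating_column=-1):
    """Return the lines with the rating replaced for every listed form."""
    updated = []
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) > 1 and fields[term_column] in forms:
            fields[rating_column] = str(new_rating)
        updated.append(delimiter.join(fields))
    return updated
```

Passing in the full set of forms generated for one verb would promote the plain form and all of its conjugations at once, rather than one entry at a time.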
Edited: 2015-01-22, 12:45 pm
Reply
#25
LingQ is only a tiny bit better than FLTR or LWT in this regard. It runs MeCab on the input text but doesn't reduce the word to its stem nor does it do any de-conjugation. So the definitions it shows for a "word" are often totally incorrect, e.g., in 「家 で 仕事 を する ように なり」, MeCab says なり is the (infinitive) form of なる, but LingQ doesn't understand that so the definitions it shows are the ones people have made for なり (particle, noun, etc.). I don't mind paying $10/mo and being online to use LingQ, and really like the approach and some of the material on it, but its Japanese NLP is too primitive to be usable. And that's understandable because Japanese NLP is hard.

But I definitely think you can make a much better app that's tailored to Japanese. I've been working on one recently. It uses Ve as a linguistic frontend to MeCab. Ve is free open source software by Kimtaro who runs jisho.org, and indeed, Ve is exactly what parses sentences on beta.jisho.org (try pasting the above sentence into beta.jisho.org). In fact, I've tweeted LingQ asking them to integrate beta.jisho.org into LingQ because it is exactly what they need: very smart parsing and recombination of morphemes (raw mecab output) into Japanese words, de-conjugation of verbs, integration with all the dictionaries, etc. Kimtaro is working on a REST API for jisho.org which will be a huge help for writing Japanese LingQ clones, but right now it's not ready (tweet him asking about it :)). The one thing which is very nice on beta.jisho but that Kimtaro hasn't yet open sourced is the verb de-conjugator, which helpfully suggests things like "嫌いたくなくて looks like an inflection of 嫌う, with these forms: Nai-form. It indicates the negative form of the verb"---he will release it when he has time to do some cleanup with it. Maybe in the meantime Cronos's software here can be used for de-conjugation? But with Ve and MeCab, you would just (haha, "just") need to write the code to interface with JMdict and the other dictionaries, plus all the front-end stuff. I'm writing it as a webapp like LingQ (with mobile-ready datastores so you can use it on a phone while offline, and it'll sync up with the server when it gets network back), but obviously you can easily make a non-internet version too.
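One simple way a conjugator's output could be reused for de-conjugation is to invert it: generate every form of every known verb, then look inflected forms up in the resulting table. The Python sketch below shows only that inversion step. The input mapping and the name `build_reverse_index` are hypothetical, and this is unrelated to how Ve, MeCab, or jisho.org work internally.

```python
from collections import defaultdict


# Hypothetical sketch: turn {dictionary form: [inflected forms]} into
# {any form: set of possible dictionary forms}.
def build_reverse_index(forms_by_base):
    index = defaultdict(set)
    for base, forms in forms_by_base.items():
        index[base].add(base)  # the dictionary form maps to itself
        for form in forms:
            index[form].add(base)
    return dict(index)
```

The values are sets because one surface form can belong to more than one verb: written in kana, いって is the て-form of both いう and いく, so a lookup has to return both candidates and leave the choice to context.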

Sorry for this incoherent post, but to summarize it, I definitely think there's a big opportunity to make a much-improved version of LingQ for Japanese, powered by high-quality NLP tools that are all free and open source. I'd love to help someone work on this or to bounce ideas and suggestions off.

Edit: I'd discussed this topic a few days ago: http://forum.koohii.com/showthread.php?p...#pid217019
Edited: 2015-01-22, 4:05 pm
Reply