Wow, this is pretty amazing. Thank you very much!
2012-10-07, 11:24 am
2012-10-07, 6:56 pm
For those having trouble launching on a Mac, I spoke to the developer through email and getting it to start is easy to do through a terminal. Navigate to the folder that has KanjiTomo and type the following:
java -jar KanjiTomo.jar -run
The "-run" flag at the end was the key.
2012-10-08, 4:19 am
There should be an option to automatically send OCR'ed text to the clipboard. It would make it really easy to use kanjitomo with other programs. Wakan, for example, monitors the clipboard and allows you to save words to your vocab list, which you can export to Anki.
2012-10-08, 10:10 am
jimmyellinger Wrote: For those having trouble launching on a Mac, I spoke to the developer through email and getting it to start is easy to do through a terminal. Navigate to the folder that has KanjiTomo and type the following:
java -jar KanjiTomo.jar -run
The "-run" flag at the end was the key.
There are two other flags that should be considered when launching from the command line:
First, the default memory limit is too low if you open large image files. The "-Xmx1000m" option is recommended. This may seem like a large value, but decompressed images can take lots of memory, and if there are multiple images in a directory, some of them are pre-loaded to speed up page turning.
Second, "-server" is recommended if a JDK is installed. This is not critical, but it might be an improvement on some systems.
So the final command line would be:
java -server -Xmx1000m -jar KanjiTomo.jar -run
(this can be saved to launch.sh)
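For what it's worth, a launch.sh along those lines might look like the sketch below. It assumes the script sits in the same folder as KanjiTomo.jar and is run from there; the function name kanjitomo_cmd and the KANJITOMO_HEAP variable are made up for this example and are not part of KanjiTomo.

```shell
#!/bin/sh
# launch.sh: a sketch of a launcher, assumed to sit next to KanjiTomo.jar.

# Print the java command line for a given heap limit (default 1000m).
# KANJITOMO_HEAP below is a hypothetical override, not a KanjiTomo setting.
kanjitomo_cmd() {
    printf 'java -server -Xmx%s -jar KanjiTomo.jar -run' "${1:-1000m}"
}

# Launch only if the jar is in the current folder; otherwise say so.
if [ -f KanjiTomo.jar ]; then
    # Word splitting of the command string is intended here.
    exec $(kanjitomo_cmd "$KANJITOMO_HEAP")
else
    echo "KanjiTomo.jar not found in $(pwd)" >&2
fi
```

After "chmod +x launch.sh" it can be started with "./launch.sh", or with a bigger heap as "KANJITOMO_HEAP=2000m ./launch.sh".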
2012-10-08, 10:27 am
shinsen Wrote: There should be an option to automatically send OCR'ed text to the clipboard. It would make it really easy to use kanjitomo with other programs. Wakan, for example, monitors the clipboard and allows you to save words to your vocab list, which you can export to Anki.
This would be possible. But if you run the program in automatic mode, almost anything under the mouse cursor would produce some output, so maybe it would be better to have a hotkey that sends the results to the clipboard.
2012-10-08, 1:03 pm
Kurotowa Wrote: This would be possible. But if you run the program in automatic mode, almost anything under the mouse cursor would produce some output, so maybe it would be better to have a hotkey that sends the results to the clipboard.
Ah, indeed.
Another thing I'm curious about is how hard it would be to improve the forward looking algorithm. For example, wwwjdic will parse 酔いつぶれちゃって as 酔いつぶれる (to drink oneself unconscious) but kanjitomo will only detect the first kanji. Sure, you can find your translation in the list of possible compounds that kanjitomo displays but a little extra forward parsing would make it smoother. Maybe this could be done with MeCab, or something.
2012-10-13, 5:22 am
shinsen Wrote: Another thing I'm curious about is how hard it would be to improve the forward looking algorithm. For example, wwwjdic will parse 酔いつぶれちゃって as 酔いつぶれる (to drink oneself unconscious) but kanjitomo will only detect the first kanji. Sure, you can find your translation in the list of possible compounds that kanjitomo displays but a little extra forward parsing would make it smoother. Maybe this could be done with MeCab, or something.
It needs to be a little conservative when selecting the match length, because matches that are too long might produce wrong results. For example, an early version of the program read ひまになる as ひまにあかす = "to spend all of one's free time" (three characters matched). I am still tweaking the match algorithm; hopefully I will find a good compromise.
Note that if it selects too few characters, you can always select more from the character lists at the top. Maybe I could add another hotkey to match more characters.
I have not been aware of MeCab, thanks for the tip, I'll check it out.
2012-10-13, 11:43 am
I have a problem with the automatic mode: how can I scroll in the main window, or copy results to the clipboard? As soon as the mouse pointer moves away from the kanji characters, the program starts searching again and the content of the window is erased.
Did I miss something?
If not, how about a hot key to freeze the current output temporarily, so that actions can be performed as in a normal window?
Edited: 2012-10-13, 11:43 am
2012-10-14, 8:56 am
jmignot Wrote: I have a problem with the automatic mode: how can I scroll in the main window, or copy results to the clipboard? As soon as the mouse pointer moves away from the kanji characters, the program starts searching again and the content of the window is erased.
It will not start searching again if you move directly from the kanji to the main window without stopping the cursor.
jmignot Wrote: If not, how about a hot key to freeze the current output temporarily, so that actions can be performed as in a normal window?
You can freeze the output with Alt+X, but first you must set ENABLE_HOTKEYS=1 in the config.txt file.
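If you would rather flip that setting from a terminal than in a text editor, something like the following should do it. This is only a sketch: it assumes config.txt holds plain KEY=VALUE lines (as the ENABLE_HOTKEYS=1 example suggests), and set_config is a made-up helper name, not something that ships with KanjiTomo.

```shell
# set_config KEY VALUE FILE
# Replace an existing KEY=... line in FILE, or append one if it is missing.
# Assumes plain KEY=VALUE lines and a VALUE without the "|" character.
set_config() {
    key=$1
    value=$2
    file=$3
    if grep -q "^$key=" "$file" 2>/dev/null; then
        # Rewrite through a temp file; works with both GNU and BSD (macOS) sed.
        sed "s|^$key=.*|$key=$value|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (from the KanjiTomo folder, with the program closed):
#   set_config ENABLE_HOTKEYS 1 config.txt
```

Keeping a backup copy of config.txt before editing it this way is probably a good idea.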
2012-10-20, 11:44 am
Thanks. I was not fast enough!
Now, as already pointed out, an option to copy results from the main window to the clipboard could be very helpful.
2012-11-25, 8:50 am
I have uploaded a new version (0.9.7) to www.kanjitomo.net
These are the new features:
- clipboard support
- hotkey for adding more characters to the search
- fullscreen mode
- two-page spread mode
- file history
Results can be copied to the clipboard with the hotkey Alt+Z. In config.txt it's possible to select which components (kanji, kana or description) are copied.
The default hotkey for adding more characters to the search is Alt+W. Extra characters can be removed with Alt+Q.
As in the last version, hotkeys must be enabled by setting ENABLE_HOTKEYS=1 in config.txt. If you have defined your own hotkeys with an old version of the program, you can import your configuration from Settings -> Import settings.
The rest of the new features can be used after opening a file from the menu. The number of files saved to the history can be set with the FILE_HISTORY_SIZE parameter.
Edited: 2012-11-25, 2:01 pm
2012-12-04, 10:45 am
I gave this a try today and I'm really impressed! It's really accurate, especially considering how horrible OCR often is. On first glance it looks like it's definitely accurate enough to be of great practical value as a reading help for texts without furigana. I was also surprised to see that it even works with emulators (I tried Dolphin with passive input) since DirectX/OpenGL/etc programs often conflict with screen-capturing programs like this one.
I expect this will greatly boost my possibilities for learning through reading. Kurotowa, thanks for programming and freely distributing such an awesome piece of software!
Besides functions that were already mentioned like manually drawing a boundary box, one feature that would make this even better for me would be integrated kanjidic2 (or any other good kanji data source) information (at least meanings, perhaps also some other stuff like readings, stroke count etc.) for detected kanji. Perhaps in a tooltip, or an info popup when double clicking in one of the kanji selection lists. There would hardly be a need for any other dictionary anymore!
Edited: 2012-12-04, 11:00 am
2013-02-03, 5:12 pm
I came up with a half-baked 'real-time' mode of playing Final Fantasy on my iPhone and using KanjiTomo for words I didn't know:
1. Download Final Fantasy from iTunes store.
2. Turn on PhotoStream on your iPhone and computer.
3. Set iPhone to Japanese language.
4. Launch Final Fantasy and take screenshots when you need help with text (press home + sleep at the same time).
5. Have KanjiTomo and iPhoto running on the computer. As new photos pop up, you can use KanjiTomo to look up words you don't know.
2013-02-09, 2:13 am
very very nice man,
2013-02-10, 9:19 am
I have uploaded a new version (0.9.8) to www.kanjitomo.net
In this version I have added support for ENAMDICT (dictionary for Japanese names). You can access it by clicking the Names button or by hotkey Alt+D.
I have also improved detection of white characters over complex backgrounds (like in visual novels). Black/White RGB levels are exposed in config file (TEXT_COLOR_*_LEVEL); default values should be fine for most cases, but they can be modified if some characters cause problems. TEXT_COLOR_DEFAULT can be used to set the default text color.
Many people have requested an option to manually draw boundary boxes; I agree that this would be useful and I'm planning to add it to the next version.
2013-02-10, 9:24 am
Does anyone know a good resource that would list Japanese names by their approximate frequency of use? This would help when sorting the results.
I have found these sites, but I wonder if there are better ones:
http://www.alles.or.jp/~tsuyama/name.htm
http://5go.biz/sei/cgi/ninki1.htm
http://5go.biz/sei/cgi/ninki2.htm
2013-02-10, 12:10 pm
I just tried this program and I have to say it's pretty awesome. It worked well on recognizing characters from a light novel page scan. Though, I was wondering if the dictionary searches could also take conjugations into account like Rikaisama, Yomichan, etc. do, so that if a verb is conjugated, it can still find it. I know that those other programs find the definition and append a little note saying that it was conjugated. I think that even with just 4 characters, the program could make a good guess.
Also, I don't know if you have an interest in this, but being able to save the definitions into a tab-separated file would really be a boon to Anki users and others who want to review the words they didn't know while they were reading. Right now I have to cut, break up into columns and then paste the definitions into a spreadsheet. Maybe have a "Save" hotkey to just dump the current definition into a text file. The text file name could be set in preferences. The format could be "KANJI<tab>READING<tab>DEFINITION" in its simplest form.
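To show the file format I mean, here is a rough shell sketch that appends one entry per line. It does not talk to KanjiTomo at all; append_entry is a made-up helper, and the three fields would still have to be typed or pasted in by hand.

```shell
# append_entry FILE KANJI READING DEFINITION
# Appends one "KANJI<tab>READING<tab>DEFINITION" line to FILE.
append_entry() {
    file=$1
    # Tabs or newlines inside a field would break the columns,
    # so each one is replaced with a space first.
    kanji=$(printf '%s' "$2" | tr '\t\n' '  ')
    reading=$(printf '%s' "$3" | tr '\t\n' '  ')
    definition=$(printf '%s' "$4" | tr '\t\n' '  ')
    printf '%s\t%s\t%s\n' "$kanji" "$reading" "$definition" >> "$file"
}

# Usage:
#   append_entry words.tsv 酔いつぶれる よいつぶれる "to drink oneself unconscious"
```

A file built this way should import into Anki as tab-separated text, with one note per line.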
2013-02-10, 12:27 pm
Is there an option to disable the dictionary lookups and OCR more of the sentence?
It would be nice to be able to take snippets of full sentences from manga, light novels, and possibly even visual novels. (There's a tool to do this on Windows but not on Mac.)
2013-02-11, 4:21 pm
PotbellyPig Wrote: Also, I don't know if you have an interest in this, but being able to save the definitions into a tab-separated file would really be a boon to Anki users
Yes, I have already been thinking about this. It should not be hard to implement.
PotbellyPig Wrote: I was wondering if the dictionary searches could also take conjugations into account like Rikaisama, Yomichan, etc. do.
Implementing conjugations would be more difficult, so it's not at the top of my todo list, but I would like to look into it at some point.
2013-02-11, 4:37 pm
tokyostyle Wrote: Is there an option to disable the dictionary lookups and OCR more of the sentence?
It would be possible to extend the number of matched characters a bit, but most likely I'm not extending it to whole sentences in the near future. At the moment, the focus of this program is to be an interactive tool to be used while reading; four characters are usually enough for that purpose.
OCR is run in parallel, so having more than four characters would also be slow, because most people have a dual- or quad-core CPU.
2013-02-12, 1:06 am
Kurotowa Wrote: OCR is run in parallel, so having more than four characters would also be slow, because most people have a dual- or quad-core CPU.
It's extremely fast and smooth on my Mac retina. Very impressive stuff. Are you going to make the underlying library available at all? I don't really mean open source, but something that could be licensed even? There's a huge need for this as most of the options, both commercial and open source, suck, especially if you need non-Windows support.
2013-02-13, 5:02 pm
tokyostyle Wrote: Are you going to make the underlying library available at all? I don't really mean open source, but something that could be licensed even? There's a huge need for this as most of the options, both commercial and open source, suck, especially if you need non-Windows support.
It is possible that I will release the OCR package as a library at some point (I have not yet decided what kind of license it would have), but my priority right now is completing the program. There are still some major features KanjiTomo is lacking and I would like to implement them first before thinking about the future too much.
2013-02-13, 8:19 pm
Kurotowa Wrote: It is possible that I will release the OCR package as a library at some point (I have not yet decided what kind of license it would have), but my priority right now is completing the program. There are still some major features KanjiTomo is lacking and I would like to implement them first before thinking about the future too much.
You are a genius! Of course, I find myself wanting little things like easier copy/paste, fewer NullPointerExceptions, etc., but that's because this is such a great product that it's becoming practically indispensable.
How come it's so much better than Kanjikit OCR? How come it's so much better than MODI? How come it's so much better than Capture2Text? Where did this technology come from? Aliens?
2013-04-21, 8:20 am
I have been working on an option to manually draw boxes around characters. It has taken more time than I expected, but I have now written most of the required code for version 0.9.9.
In open file mode, you can now simply click and drag a box around characters. This doesn't work if a file is not open, but there is a new Zoom Frame where it's possible to draw boxes. Just click the Zoom button and drag the window over the text by the title bar or with the middle mouse button.
I have not yet published the new version on the KanjiTomo web page, but if you want to try it out, you can download a beta version from: http://www.kanjitomo.net/KanjiTomo_0.9.9_beta.zip
The following features will be implemented, but they are not in the beta version:
- drawing boxes around horizontal or white text
- ability to change the number of characters detected
- hotkey to open zoom frame
2013-06-16, 10:07 am
First, a BIG thanks for the program. I just recently did a review of several paid OCR programs that claim Japanese support, and yours easily beat all of them for accuracy. Which suggests that they ought to be contacting you if they are going to continue to sell their programs, but I suspect that most of them are just fine with the current level of support since they are looking for features to claim rather than actual functionality.
In any case, I have been very happy with your 0.9.9 beta, but am wondering if there is any way to set it up so the display of the characters is larger. On my laptop screen I can get close enough to make out the smaller differences between two complex kanji, but when I work with my main computers (both of which are hooked up to HDTVs as monitors), I find myself having to walk up to the screen to check out the characters if I'm double checking the automatic selection. In Chrome I can just set the webpage display to 150% when reading sites, but obviously that doesn't help here. An option to choose a larger "font" size for the Japanese font would be helpful.
I would also add a second vote for the capability to turn off the dictionary lookup and just do "pure" OCR, if that's possible. I understand the main point of the tool is to help with learning, but it seems like I am using the program (rightly or wrongly) as often to just grab a couple of lines to work with later as to try and puzzle them out while I'm scanning.
Lastly, and I suspect your location has something to do with this, would you consider adding a donation link to your website? Given the effort you've put into this, it seems appropriate to drop a few bucks toward thanking you :-)
Thanks again for a great tool!

