Reply #1 - 2013 July 09, 4:13 pm
falsinsoft New member
Registered: 2013-07-09 Posts: 9

Hi all

I'm a student of Japanese and also a programmer. I recently found the MeCab library while searching for a way to "parse" Japanese text (basically, to put spaces between words). I found the library very useful and decided to develop a free tool based on it to help me and other students like me. I also made some additions of my own, and I plan to improve it in the future. For now I have a first beta version, and I need the help of some Japanese experts on the forum to test it. Currently the tool has two main sections.

1 - Parse Japanese text using the MeCab engine and show the text with furigana
2 - converter between romaji, hiragana and katakana made by me

Regarding point one, I have the following problem: MeCab splits Japanese text into chunks. I would like to obtain Japanese text with spaces between words, but MeCab splits too finely. For example, a verb is split into its stem and its suffix. In this case I want to show the two chunks as a single word, as it really is. I can add rules to join such chunks, but I don't know the rules since I'm at beginner level. If someone wants to help, they should enter various texts and, wherever a space is not wanted, report to me the rule to apply and the test text, so that I can reproduce the issue and verify that the fix is correct.
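To make the kind of joining rule being asked for more concrete, here is a minimal sketch. It does not call MeCab: the token tuples are hand-written in a (surface, part of speech, subtype) shape that mimics IPADIC-style tags, and the rule itself (glue auxiliaries, the particle て/で and dependent verbs onto a preceding verb or adjective) is only an assumption, not the tool's actual logic. The names `attaches_to_previous` and `merge_chunks` are made up for the example.

```python
# Hand-written tokens shaped like (surface, pos, subtype). They imitate
# IPADIC-style tags but are NOT produced by a real MeCab call.
SAMPLE = [
    ("私", "名詞", "代名詞"),
    ("は", "助詞", "係助詞"),
    ("寿司", "名詞", "一般"),
    ("を", "助詞", "格助詞"),
    ("食べ", "動詞", "自立"),
    ("まし", "助動詞", "*"),
    ("た", "助動詞", "*"),
]


def attaches_to_previous(surface, pos, subtype):
    """Assumed rule: which chunks should be glued onto a verb or adjective."""
    if pos == "助動詞":  # auxiliary verbs such as ます, た, ない
        return True
    if pos == "助詞" and subtype == "接続助詞" and surface in ("て", "で"):
        return True  # the te-form particle only, not から, けど, ...
    return pos == "動詞" and subtype in ("接尾", "非自立")  # dependent verbs


def merge_chunks(tokens):
    """Join chunks into display words; each word keeps the POS of its head."""
    words = []  # list of [surface, head_pos]
    for surface, pos, subtype in tokens:
        head_is_inflecting = bool(words) and words[-1][1] in ("動詞", "形容詞")
        if head_is_inflecting and attaches_to_previous(surface, pos, subtype):
            words[-1][0] += surface
        else:
            words.append([surface, pos])
    return [word[0] for word in words]


print(" ".join(merge_chunks(SAMPLE)))
```

Because a merged word keeps the part of speech of its first chunk, a chain such as 食べ + まし + た collapses into one word. Real text will need more rules than this (for example noun suffixes and copulas), which is exactly where test sentences from experienced readers would help.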

Regarding point two, I tried to apply all the conversion rules I know between kana and romaji, but I find it hard to tell whether I handled the extended katakana correctly. For me this is the most difficult part, since there are variations depending on the romanization system used (Hepburn and others). Error reports are welcome here as well.
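As an illustration of where the extended katakana make things tricky, here is a rough sketch of a katakana to romaji converter. It is not the converter in the tool: the table is deliberately tiny, the function name `katakana_to_romaji` is made up, and the choices (Hepburn-style spellings, macrons for ー, "tch" for ッ before "ch") are assumptions that another romanization system would change.

```python
# Tiny illustrative table; a real converter needs the full kana set.
TABLE = {
    "ア": "a", "イ": "i", "ウ": "u", "エ": "e", "オ": "o",
    "コ": "ko", "チ": "chi", "テ": "te", "ト": "to",
    "ヒ": "hi", "フ": "fu", "パ": "pa", "マ": "ma", "ル": "ru", "ン": "n",
    # Extended katakana: two characters that map to one syllable.
    "ファ": "fa", "フィ": "fi", "フェ": "fe", "フォ": "fo",
    "ティ": "ti", "ディ": "di", "チェ": "che", "シェ": "she", "ジェ": "je",
}
MACRON = {"a": "ā", "i": "ī", "u": "ū", "e": "ē", "o": "ō"}


def katakana_to_romaji(text):
    out = []
    double_next = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "ッ":  # small tsu: double the next consonant
            double_next = True
            i += 1
            continue
        if ch == "ー":  # long vowel mark: put a macron on the last vowel
            if out and out[-1][-1] in MACRON:
                out[-1] = out[-1][:-1] + MACRON[out[-1][-1]]
            else:
                out.append("-")
            i += 1
            continue
        pair = text[i:i + 2]
        if pair in TABLE:  # try the two-character (extended) entry first
            romaji = TABLE[pair]
            i += 2
        else:
            romaji = TABLE.get(ch, ch)  # unknown characters pass through
            i += 1
        if double_next:
            romaji = ("t" if romaji.startswith("ch") else romaji[0]) + romaji
            double_next = False
        out.append(romaji)
    return "".join(out)


print(katakana_to_romaji("パーティー"))
```

Trying the two-character entry before the single-character one is what keeps ティ from being read as "te" followed by a stray small イ.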

The tool can be downloaded from the following link:

https://dl.dropboxusercontent.com/u/64769600/jtool.zip

The file is around 30MB since it includes the IPADIC dictionary. Simply unzip it into a folder and launch j-tool.exe

Thank you to everyone who is willing to help me :)

Reply #2 - 2013 July 10, 4:53 am
falsinsoft New member
Registered: 2013-07-09 Posts: 9

The next step, once the text parsing has been "stabilized", will be to add a feature to import content from the EDICT Japanese dictionary. Once a dictionary database has been created, it will be possible to show the translation of each single word, maybe below the furigana...

Obviously, any suggestions for useful features to add are welcome. ;)
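For the planned EDICT import, a parser for a single dictionary line might look roughly like the sketch below. It assumes the classic EDICT line layout of "WORD [READING] /gloss/gloss/", with the reading omitted for kana-only entries; the sample lines are typed by hand for illustration, and `parse_edict_line` is a made-up name. The original EDICT file is distributed in EUC-JP, so a real import would also need to decode it before splitting it into lines.

```python
import re

# Assumed layout: WORD, an optional [READING], then /gloss/gloss/.../
EDICT_LINE = re.compile(
    r"^(?P<word>\S+)(?: \[(?P<reading>[^\]]+)\])? /(?P<glosses>.*)/$"
)


def parse_edict_line(line):
    """Return a dict for one entry, or None if the line does not match."""
    match = EDICT_LINE.match(line.strip())
    if match is None:
        return None
    word = match.group("word")
    return {
        "word": word,
        # Kana-only entries carry no bracketed reading; fall back to the word.
        "reading": match.group("reading") or word,
        "glosses": [g for g in match.group("glosses").split("/") if g],
    }


# Hand-typed sample line, for illustration only.
print(parse_edict_line("食べる [たべる] /(v1,vt) to eat/"))
```

Entries parsed this way could be stored in a lookup table keyed by word, so that each parsed word in the text can be matched to its glosses.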

Reply #3 - 2013 July 10, 5:32 am
buonaparte Member
Registered: 2010-11-25 Posts: 797

falsinsoft wrote:

2 - converter between romaji, hiragana and katakana made by me

Have a look at this
http://www.sharktime.com/us_wReplace.html

wReplace is useful for language learning. It allows you to convert between different notations/writing systems, and to approximately phonetically transcribe text. Possible applications:

    Japanese, text conversion both ways:
        Romaji ↔ Hiragana,
        Romaji ↔ Katakana,
        Katakana ↔ Hiragana.
    Russian; Cyrillic, conversion into Latin phonetic transcription ISO 9-1995.

wReplace can be used free of charge.

It can convert HUUUGe texts in a jiffy.
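The Katakana ↔ Hiragana direction is the easy one, because the two blocks are laid out in parallel in Unicode. A minimal sketch, assuming only the basic ranges ぁ..ゖ and ァ..ヶ matter (the function names are made up, and this says nothing about how wReplace itself works):

```python
# Hiragana ぁ (U+3041) .. ゖ (U+3096) and katakana ァ (U+30A1) .. ヶ (U+30F6)
# sit exactly 0x60 code points apart.
OFFSET = 0x60
HIRA_TO_KATA = {code: code + OFFSET for code in range(0x3041, 0x3097)}
KATA_TO_HIRA = {code: code - OFFSET for code in range(0x30A1, 0x30F7)}


def hiragana_to_katakana(text):
    """Shift hiragana up into the katakana block; other characters pass through."""
    return text.translate(HIRA_TO_KATA)


def katakana_to_hiragana(text):
    """Shift katakana down into the hiragana block; ー and kanji are untouched."""
    return text.translate(KATA_TO_HIRA)


print(hiragana_to_katakana("ひらがな"))
print(katakana_to_hiragana("カタカナ"))
```

Katakana that have no hiragana counterpart in this range, such as ヷ or the long vowel mark ー, are simply left as they are.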

Reply #4 - 2013 July 10, 6:53 am
falsinsoft New member
Registered: 2013-07-09 Posts: 9

Hi

Thank you for your reply, even if I do not understand its meaning very well. I have already made a conversion engine between kana and romaji, so I don't need one. Regarding the basic rules, I'm quite sure I implemented them correctly. My doubts concern the different ways to "translate" sounds not originally present in the Japanese language. This is covered, from what I understood, by the use of the extended katakana as reported in the link below:

http://en.wikipedia.org/wiki/Hepburn_romanization

I'm not sure I have applied all the rules correctly, so I'm looking for someone able to test the tool and report problems to me...

Last edited by falsinsoft (2013 July 10, 6:59 am)

Reply #5 - 2013 July 10, 3:28 pm
falsinsoft New member
Registered: 2013-07-09 Posts: 9

If someone is interested, here is a screenshot of the tool (don't ask me what is written in Japanese, since I got this random text from an online site :) )

https://lh4.googleusercontent.com/-52LJGWk3__w/Ud3DjEVmqoI/AAAAAAAABAI/xvuwLEWF9CU/s640/j-tool-1.0-beta-1.jpg

Last edited by falsinsoft (2013 July 10, 3:30 pm)

falsinsoft New member
Registered: 2013-07-09 Posts: 9

Hi all

The first official 0.1 version of this tool has been released. Please, if some Japanese expert can, test it and report to me anything incorrect. New features will be added in the future. ^_^

Here is the info page:

http://falsinsoft-software.blogspot.com … -tool.html

Here is the download link:

http://www.softpedia.com/get/PORTABLE-S … Tool.shtml

Screenshot

http://3.bp.blogspot.com/-JfBItuZryQ4/UjYcbrMvAPI/AAAAAAAABB8/5t-9F7BtEqo/s640/j-tool-.01-2.jpg

wahnfrieden Member
From: Boston Registered: 2008-08-19 Posts: 56

Haven't tried it but:

- You're not annotating numbers correctly, you're annotating each digit separately as if it were "one one" instead of "eleven".

- Furigana (the transliterating kana) traditionally goes on top of the kanji, when written left-to-right like this. Having it below feels unnatural.
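On the number point, one way to avoid the "one one" reading is to treat each run of digits as a single number before generating the kana. The sketch below is only an illustration under several assumptions: it covers 0 to 9999, uses the common readings よん, なな and きゅう, ignores counters (which change readings, for example 4月 is しがつ), and the names `read_number` and `number_readings` are made up.

```python
import re

DIGITS = ["", "いち", "に", "さん", "よん", "ご", "ろく", "なな", "はち", "きゅう"]
# Sound changes that a plain digit + unit concatenation would get wrong.
IRREGULAR_HUNDREDS = {3: "さんびゃく", 6: "ろっぴゃく", 8: "はっぴゃく"}
IRREGULAR_THOUSANDS = {3: "さんぜん", 8: "はっせん"}


def unit(digit, suffix, irregular):
    """Reading of one decimal place, e.g. unit(3, 'ひゃく', ...) for 300."""
    if digit == 0:
        return ""
    if digit in irregular:
        return irregular[digit]
    if digit == 1:
        return suffix  # 10 is じゅう, not いちじゅう
    return DIGITS[digit] + suffix


def read_number(n):
    """Kana reading for an integer from 0 to 9999."""
    if not 0 <= n <= 9999:
        raise ValueError("only 0..9999 is handled in this sketch")
    if n == 0:
        return "ゼロ"
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    tens, ones = divmod(rest, 10)
    return (unit(thousands, "せん", IRREGULAR_THOUSANDS)
            + unit(hundreds, "ひゃく", IRREGULAR_HUNDREDS)
            + unit(tens, "じゅう", {})
            + DIGITS[ones])


def number_readings(text):
    """Pair every run of digits in the text with its reading as one number."""
    return [(run, read_number(int(run))) for run in re.findall(r"\d+", text)]


print(number_readings("11月"))
```

The key step is `re.findall(r"\d+", text)`, which keeps "11" together instead of annotating each "1" on its own.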
