Hello everyone,
Does anyone know how to compare several Japanese texts and find sentences that match between them (and, of course, see what the matches are)?
I have an archive of about 100 texts and I want to compare each one against all the others to find sentences that are exactly the same. It would be nice to be able to set some parameters, such as a minimum sentence length and whether to look for exact or merely similar matches.
I have found a piece of software called Corsis (formerly Tenka text), but its documentation is not yet complete and I have not been able to use it for this purpose.
I also tried some plagiarism-detection tools, but they generally either don't handle Japanese characters or are too expensive.
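To make the goal concrete, here is a rough Python sketch of the kind of comparison I have in mind. It is only an illustration under assumptions: the function names (split_sentences, find_shared_sentences), the parameters (min_length, threshold) and the file names are made up for this example, sentences are split naively on 。！？ and line breaks, and "similar" is measured with the standard library's difflib.SequenceMatcher rather than anything specific to Japanese.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def split_sentences(text):
    """Naive split: a sentence ends at 。, ！, ？ or a line break."""
    parts = re.findall(r"[^。！？\n]+[。！？]?", text)
    return [p.strip() for p in parts if p.strip()]


def find_shared_sentences(texts, min_length=5, threshold=1.0):
    """Compare every text with every other text, sentence by sentence.

    texts      -- dict mapping a (hypothetical) file name to its content
    min_length -- ignore sentences shorter than this many characters
    threshold  -- 1.0 means exact matches only; lower values also accept
                  similar sentences (difflib similarity ratio)
    Returns a list of (name_a, sentence_a, name_b, sentence_b, ratio).
    """
    sentences = {
        name: [s for s in split_sentences(text) if len(s) >= min_length]
        for name, text in texts.items()
    }
    matches = []
    for name_a, name_b in combinations(sentences, 2):
        for sent_a in sentences[name_a]:
            for sent_b in sentences[name_b]:
                if threshold >= 1.0:
                    ratio = 1.0 if sent_a == sent_b else 0.0
                else:
                    ratio = SequenceMatcher(None, sent_a, sent_b).ratio()
                if ratio >= threshold:
                    matches.append((name_a, sent_a, name_b, sent_b, ratio))
    return matches


if __name__ == "__main__":
    # Made-up sample data standing in for the real archive.
    sample = {
        "a.txt": "今日は晴れです。明日は雨でしょう。",
        "b.txt": "昨日は曇りでした。明日は雨でしょう。",
    }
    for match in find_shared_sentences(sample):
        print(match)
```

With around 100 texts the all-pairs loop is probably still workable for exact matching, but the similarity mode compares every sentence with every other sentence and may be slow on long texts.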
Thanks for any help and suggestions.
Cheers.
