Bookscan (have Japanese books OCR'd cheap! Anyone used this?)

slivir Member
From: Japan Registered: 2009-01-26 Posts: 84

I saw a story on this on the TV earlier this year. The businesses are usually just a few guys in a small office space. They tear the binding off and run all the pages through an automatic scanner. Seems like pretty easy work.

Publishers are also pretty steamed about it, and I think they intend to crack down on the service before it really even gets underway. With the sudden surge in popularity of iPods/iPads etc., you'd think they'd pull their heads out and start selling electronic versions themselves.

Ryuujin27 Member
Registered: 2006-12-14 Posts: 824

slivir wrote:

I saw a story on this on the TV earlier this year. The businesses are usually just a few guys in a small office space. They tear the binding off and run all the pages through an automatic scanner. Seems like pretty easy work.

Publishers are also pretty steamed about it, and I think they intend to crack down on the service before it really even gets underway. With the sudden surge in popularity of iPods/iPads etc., you'd think they'd pull their heads out and start selling electronic versions themselves.

Ah ha, but they are. It was in the Asahi Shimbun (朝日新聞) not too long ago. A Kindle-like app is supposed to be released by Kinokuniya early next year.

Reply #28 - 2011 April 19, 1:21 am
vosmiura Member
From: SF Bay Area Registered: 2006-08-24 Posts: 1085

I don't know if publishers are cracking down on these but... now they seem to have 40+ day waiting lists.

Reply #29 - 2011 April 19, 1:46 am
jbudding Member
From: Las Vegas, Nevada Registered: 2007-03-24 Posts: 52

I bought a Plustek OpticBook 3600 scanner to scan books. I have had it for a couple of years, and aside from the regular scanning tasks I do for my personal office, I have scanned several entire books (from my own collection). The glass goes right to the edge, so you can just hang the book over the edge without stressing the binding, and you don't get the ugly spine shadow or distorted text you often see in cheap scans. Since you don't have to open the book more than 90 degrees to scan a page, it even works on pretty old books with fairly weak bindings. Unfortunately, the ABBYY lite OCR that came with it does not do Asian text, but it worked great for Dutch, French, German, and Spanish. It did affect the layout a bit, but the result was still quite acceptable for reading. Still, when a service charges 5 bucks, it would not be worth scanning a whole book yourself, since this takes quite a bit of time.

Reply #30 - 2013 October 04, 3:38 pm
Zarxrax Member
From: North Carolina Registered: 2008-03-24 Posts: 949

Bringing a thread back up from the dead here.
I'm considering using 1dollarscan for a Japanese textbook I have.
I'm wondering if anyone has any experience with this type of material and how it might turn out. Should 300dpi be good enough, or would I need to go with 600dpi? I just worry because Japanese text can be extremely tiny and detailed, so maybe 300dpi isn't enough.
And how might I expect the OCR to turn out? Since it's just normal-sized, clean-looking, left-to-right text (albeit with furigana), I'm assuming it should OCR pretty well. Is that a safe assumption?
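
For a rough feel of what 300dpi versus 600dpi means for tiny text, here is a small sketch that converts a font size in points to a glyph height in pixels. The 10 pt body and 5 pt furigana sizes are assumptions for illustration, not measurements from any particular textbook, and `glyph_height_px` is a made-up helper name.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 of an inch


def glyph_height_px(font_size_pt: float, dpi: int) -> float:
    """Approximate pixel height of a glyph of the given point size at a scan resolution."""
    return font_size_pt / POINTS_PER_INCH * dpi


# Hypothetical sizes: 10 pt body text and 5 pt furigana (assumed, not measured).
for label, size_pt in [("body", 10.0), ("furigana", 5.0)]:
    for dpi in (300, 600):
        height = glyph_height_px(size_pt, dpi)
        print(f"{label:8s} {size_pt:4.1f} pt @ {dpi} dpi -> about {height:.0f} px tall")
```

Doubling the dpi doubles the pixel height of every glyph, so the question is mostly whether the smallest characters (the furigana) still get enough pixels at 300dpi for your eyes and for the OCR engine.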

Reply #31 - 2013 October 04, 5:41 pm
PotbellyPig Member
From: New York Registered: 2012-01-29 Posts: 337

This is an interesting thread. I'm glad you revived it. When I OCR Japanese light novels, I've found that 300 dpi is plenty. I think you have to expect some mistakes in the character recognition no matter what. I've tried e.typist 14 and Real Reader Lite 8.0, and both make errors here and there. I'm not sure whether there is anything better on the market for Japanese text that those scanning companies would use. If you try one of those services, see if they will send both an image-based PDF and the OCRed text, so that you can refer back to the image when there is a mistake.

I never thought about using a service before. I usually import the books from Japan, then go to Kinko's, where they have a machine that breaks the binding cleanly. Afterwards I scan the pages with a Fujitsu ScanSnap and use Real Reader Lite 8.0 for the OCR. If you want to do it this way, the expensive part is the sheet-fed scanner: a ScanSnap is about $450.

If you could buy from Amazon JP and have the book sent directly to the scanning company, that would be good. The savings on international shipping would quickly add up as well. How much do they charge?

Zarxrax Member
From: North Carolina Registered: 2008-03-24 Posts: 949

Thought I would give a review of 1dollarscan.com, now that they have finished with a book I sent to them.
First of all, 1dollarscan is based in California, so this one is primarily for people who live in the USA and already have some books in hand. I imagine the ones in Japan have similar options and quality though.
I had them scan a Japanese textbook for me, which contains English, Japanese, and furigana, all written left to right.

I chose the 600dpi option (because I was concerned with being able to read it clearly when I zoom in) as well as the high quality touch up option (which is supposed to provide better OCR quality).

As far as the OCR goes for Japanese text, it was riddled with mistakes. Some pages seemed almost completely lost, while other pages had near perfect accuracy (except for furigana, which always ended up mangled).
Interestingly, the English OCR is almost consistently worse than the Japanese.

All in all, I think the basic scanning service ($1 per set) is a great value if you are just looking to get a book scanned.
The high resolution option worked as expected, but at $2 a set, it can really pump the price up. I honestly think I would have been just fine with the standard resolution of 300dpi.
And then finally, the "high quality touch up" option at $2 a set was just a complete waste of money. You might want to try the basic OCR option instead, which is only $1 per set. Or perhaps try running your own OCR software on the PDF you get from them.
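
To see how the options stack up, here is a rough cost sketch built from the per-set prices quoted above. Two things are assumptions on my part: that each option is charged on top of the $1 base price, and the number of pages in a "set", which is left as a parameter (the 100 in the example is only a placeholder). `estimate_cost` is a made-up helper, not anything the service provides.

```python
import math

BASE_PER_SET = 1.00  # basic scan price per set, as quoted in the post

# Option prices per set, as quoted in the post.
# Assumption: each option is added on top of the base price.
ADDONS_PER_SET = {
    "high_res_600dpi": 2.00,
    "touch_up": 2.00,
    "basic_ocr": 1.00,
}


def estimate_cost(pages: int, pages_per_set: int, addons: tuple = ()) -> float:
    """Estimate the total in dollars; a partial set is billed as a whole set (assumed)."""
    sets = math.ceil(pages / pages_per_set)
    per_set = BASE_PER_SET + sum(ADDONS_PER_SET[name] for name in addons)
    return sets * per_set


# Hypothetical 300-page book with an assumed 100 pages per set.
print(estimate_cost(300, 100))
print(estimate_cost(300, 100, ("high_res_600dpi", "touch_up")))
print(estimate_cost(300, 100, ("basic_ocr",)))
```

Under those assumptions the two $2 options together make the job five times the basic price, which is why skipping them matters more on a thick book.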


Here is example text from 1 page:

風日iに(おli坊を人れる runa hot bath; fill up the bath-
tub with hot water
ホテルの部尽に反ると、すぐにお風呂にお湯を人れ
たヲ Assoon as 1 got back to my hotel room, 1 ran a hot
bath.
? ろ O ; T  .1',
!重Udの(お)湯を務とす pullthe plug out of the bathtub
(and drain the water)
ふ ろ     わ
風日を沸かす preparethe bath
かえ ふ ろ 1 まい b ろ わ
帰ったらすぐにお風呂に入りたいから、お風呂を沸
かしておいてね。 1want to take a bath right after 1
get home, 50 will you prepare it for me?
ふ ろ    わ
思ちがiりている theba出 isrea釘
風日を(水で)うめる makethe bath water less hot (by
adding cold water)
ふ ろ    はい
風日に入る takea bath
o    ろ       ゆ ぷ ね
風目/湯船につかる 50akin a bathtub
さ む ひ      あつ        ふ ろ
こんな寒い日は、熱いお風呂にゆっくりっかりたい。
On a cold day like this, 1 feellike soaking in a nice
hotbath.
ふ ろ       あ
風呂から上がる getout of the bath
ひ と ふ ろ あ
一風日浴びる takea bath
ひ と か ろ あ
一風呂浴びてビールにしよう  1think 1'11 take a
bath and then have some beer.
ふ ろ    あ せ な が
風呂で汗を流す wa5hoff one'5 5weat in the bath
ゆ き
お湯を冷ます 1et(hot) water coo1 down
ゆ き
お湯が冷める hotwater coo15 down

お湯をうめる addcold water (to the bath) to lower the
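
If you want to put a number on how bad a page like this is, one rough approach is to hand-correct a line or two and compare. The sketch below uses Python's standard `difflib` for that. The OCR line is taken from the sample above; the reference line is only a guess at what the book actually says, and `similarity` is a made-up helper name. The ratio it returns is a similarity score, not a proper character error rate.

```python
from difflib import SequenceMatcher


def similarity(ocr_text: str, reference: str) -> float:
    """Return a 0..1 similarity ratio between OCR output and a hand-corrected reference."""
    return SequenceMatcher(None, ocr_text, reference).ratio()


ocr_line = "風日を沸かす preparethe bath"         # copied from the OCR sample
reference_line = "風呂を沸かす prepare the bath"  # guessed correction, not from the book

print(f"similarity: {similarity(ocr_line, reference_line):.2f}")
```

Running this over a handful of corrected lines from different pages would give a rough picture of which pages are near perfect and which are lost.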

Last edited by Zarxrax (2013 October 18, 1:37 pm)

Vempele Member
Registered: 2013-06-16 Posts: 615

Wow, the OCR was so bad that it even broke the forum?!

rahsoul Member
Registered: 2012-02-29 Posts: 63

Vempele wrote:

Wow, the OCR was so bad that it even broke the forum?!

LOL  This is the best deterrent from using a product I have ever seen.

Reply #35 - 2013 October 18, 1:37 pm
Zarxrax Member
From: North Carolina Registered: 2008-03-24 Posts: 949

Wow. I switched it from code tag to a quote tag, for the sake of the forum :p

Reply #36 - 2013 October 18, 1:58 pm
pmnox Member
From: USA Registered: 2010-11-08 Posts: 221

Which service on http://www.bookfire.net/ will let me scan books shipped directly from Amazon?
I see a lot of services there, and I'm not sure which one to use.

EDIT:
I'm going to try http://www.s-s-sendai.info/
Wish me luck xD

Last edited by pmnox (2013 October 18, 3:21 pm)

pmnox Member
From: USA Registered: 2010-11-08 Posts: 221

pmnox wrote:

Which service on http://www.bookfire.net/ will let me scan books shipped directly from Amazon?
I see a lot of services there, and I'm not sure which one to use.

EDIT:
I'm going to try http://www.s-s-sendai.info/
Wish me luck xD

It worked great. xD

PotbellyPig Member
From: New York Registered: 2012-01-29 Posts: 337

pmnox wrote:

It worked great. xD

The Sendai place worked well? I have to try it then. Do you just have to add an ID number to the shipping address so that Sendai will know who the book belongs to? I would like to try this with a light novel. How much did it cost in total (with OCR)?

pmnox Member
From: USA Registered: 2010-11-08 Posts: 221

PotbellyPig wrote:

pmnox wrote:

It worked great. xD

The Sendai place worked well? I have to try it then. Do you just have to add an ID number to the shipping address so that Sendai will know who the book belongs to? I would like to try this with a light novel. How much did it cost in total (with OCR)?

The first time it's free. You will receive the file in PDF format (images).

Scanning manga costs 100 yen; scanning a regular book costs 150 yen.
According to http://www.bookfire.net/, OCR costs an additional 100 yen.

Btw.
hillsscan24: doesn't work. The interface is broken.

bookscan_jp: doesn't support Amazon without a 10k yen per month subscription.

densyohonke: has a poor web interface; you can't create an account there. You have to type in all your data every time you use their service.

As far as I've seen, almost every service offers one free scan for one volume. It's possible to buy used books on Amazon as well as new ones.

In your case the total would be 250 yen. With the first-time-free service, it might be just 100 yen for OCR. Btw, OCR is just a software solution, so it should be possible to get some software that does that for you.
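
The arithmetic above, as a small sketch for anyone planning a bigger order. The prices are the ones quoted in this post; the assumptions are that OCR is charged per volume and that the free first scan covers only the scanning fee, not the OCR. `order_total_yen` is a made-up helper name.

```python
SCAN_YEN = {"manga": 100, "book": 150}  # per-volume scan prices quoted above
OCR_YEN = 100                           # per-volume OCR add-on (per bookfire.net, as quoted)


def order_total_yen(kind: str, volumes: int, with_ocr: bool = False,
                    first_scan_free: bool = False) -> int:
    """Total in yen; assumes the free first scan waives one scan fee but not OCR."""
    paid_scans = volumes - 1 if first_scan_free and volumes > 0 else volumes
    total = paid_scans * SCAN_YEN[kind]
    if with_ocr:
        total += volumes * OCR_YEN
    return total


print(order_total_yen("book", 1, with_ocr=True))                        # one light novel with OCR
print(order_total_yen("book", 1, with_ocr=True, first_scan_free=True))  # same, first scan free
```

With these assumptions a single regular book with OCR comes to 250 yen, or 100 yen if the free first scan applies, matching the figures above.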

Last edited by pmnox (2013 October 21, 11:13 am)

pmnox Member
From: USA Registered: 2010-11-08 Posts: 221

pmnox wrote:

PotbellyPig wrote:

pmnox wrote:

It worked great. xD

The Sendai place worked well? I have to try it then. Do you just have to add an ID number to the shipping address so that Sendai will know who the book belongs to? I would like to try this with a light novel. How much did it cost in total (with OCR)?

The first time it's free. You will receive the file in PDF format (images).

Scanning manga costs 100 yen; scanning a regular book costs 150 yen.
According to http://www.bookfire.net/, OCR costs an additional 100 yen.

Btw.
hillsscan24: I was able to register after a few tries. They provide OCR for books of up to 500 pages, priced by how many days you are willing to wait:
15-business-day delivery (within 15 business days): 180 yen per book
Standard delivery (within 5 business days): 240 yen per book
Express delivery (within 72 hours): 360 yen per book
Super-express delivery (within 24 hours): 480 yen per book

bookscan_jp: doesn't support Amazon without a 10k yen per month subscription.

densyohonke: has a poor web interface; you can't create an account there. You have to type in all your data every time you use their service.

As far as I've seen, almost every service offers one free scan for one volume. It's possible to buy used books on Amazon as well as new ones.

In your case the total would be 250 yen. With the first-time-free service, it might be just 100 yen for OCR. Btw, OCR is just a software solution, so it should be possible to get some software that does that for you.

I wrote the order number in the name field (氏名)

So far I've had 2 volumes scanned.
I have ordered 4 more, and I'm planning to scan at least 4 more.

Last edited by pmnox (2013 October 21, 12:31 pm)