
Bookscan (have Japanese books OCR'd cheap! Anyone used this?)

#26
I saw a story on this on the TV earlier this year. The businesses are usually just a few guys in a small office space. They tear the binding off and run all the pages through an automatic scanner. Seems like pretty easy work.

Publishers are also pretty steamed at it and I think they intend to crack down on the service before it really even gets underway. With the sudden surge in popularity of iPods/iPads etc., you'd think they'd pull their heads out and start selling electronic versions themselves.
#27
slivir Wrote:I saw a story on this on the TV earlier this year. The businesses are usually just a few guys in a small office space. They tear the binding off and run all the pages through an automatic scanner. Seems like pretty easy work.

Publishers are also pretty steamed at it and I think they intend to crack down on the service before it really even gets underway. With the sudden surge in popularity of iPods/iPads etc., you'd think they'd pull their heads out and start selling electronic versions themselves.
Ah ha, but they are. It was in the Asahi Shimbun (朝日新聞) not too long ago. An app like Kindle is supposed to be released by Kinokuniya early next year.
#28
I don't know if publishers are cracking down on these but... now they seem to have 40+ day waiting lists.
#29
I bought a Plustek OpticBook 3600 scanner to scan books. I have had it for a couple of years, and aside from the regular scanning tasks I do for my personal office, I have scanned several entire books (from my own collection). The glass goes right to the edge, so you can hang the book over the edge without affecting the binding and without the ugly spine shadow or distorted text you often see in cheap scans. Since you don't have to open the book more than 90 degrees to scan a page, it works even on pretty old books with fairly weak bindings. Unfortunately, the lite ABBYY OCR software that came with it does not do Asian text, but it worked great for Dutch, French, German, and Spanish; it did affect the layout a bit, but the result was still quite acceptable for reading. Still, at 5 bucks a book it would not be worth scanning a whole book yourself, since it takes quite a bit of time.
#30
Bringing a thread back up from the dead here.
I'm considering using 1dollarscan for a Japanese textbook I have.
I'm wondering if anyone has any experience with this type of material and how it might turn out. Should 300dpi be good enough, or would I need to go with 600dpi? I just worry because Japanese text can be extremely tiny and detailed, so maybe 300dpi isn't enough.
And how might I expect the OCR to turn out? Since it's just normal-sized, clean-looking, left-to-right text (albeit with furigana), I'm assuming it should be able to OCR pretty well. Is that a safe assumption?
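As a rough back-of-the-envelope check on the resolution question: a glyph set at p points is p/72 inches tall, so it spans about p/72 × dpi pixels in the scan. The sketch below assumes 10 pt body text and 5 pt furigana, which are illustrative guesses and not measurements from any particular textbook.

```python
POINTS_PER_INCH = 72  # typographic points per inch


def glyph_height_px(point_size: float, dpi: int) -> float:
    """Approximate height in pixels of a glyph of the given point size."""
    return point_size / POINTS_PER_INCH * dpi


# Hypothetical sizes (assumptions): 10 pt body text, 5 pt furigana.
for label, size in [("body", 10.0), ("furigana", 5.0)]:
    for dpi in (300, 600):
        print(f"{label:8s} at {dpi} dpi: {glyph_height_px(size, dpi):.1f} px")
```

Under those assumed sizes, furigana would get roughly 21 pixels of height at 300dpi and roughly 42 at 600dpi, so the same arithmetic can be redone with sizes measured from the actual book.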
#31
This is an interesting thread. I'm glad you revived it. When I OCR Japanese light novels, I've found that 300 dpi is plenty. I think you have to expect some mistakes in the character recognition no matter what. I've tried e.Typist 14 and Real Reader Lite 8.0 and they both have errors here and there. I'm not sure whether there is anything better on the market than those two for OCRing Japanese text that those scanning companies would use. If you try one of those services, see if they will send both an image-based PDF and the OCRed text so that you can refer back to the image if there is a mistake.

I never thought about doing this before. I usually import the books from Japan. I then go to Kinkos, where they have a machine to break the binding cleanly. Afterwards I scan the pages with a Fujitsu ScanSnap and use Real Reader Lite 8.0 to OCR. If you want to do it this way, the expensive part is a sheet-fed scanner like the ScanSnap, which is about $450.

If you could buy from Amazon JP and send directly to the scanning company, that would be good. The savings on international shipping would quickly add up as well. How much do they charge?
#32
Thought I would give a review of 1dollarscan.com, now that they have finished with a book I sent to them.
First of all, 1dollarscan is based in California, so this one is primarily for people who live in the USA and already have some books in hand. I imagine the ones in Japan have similar options and quality though.
I had them scan a Japanese textbook for me, which contains English, Japanese, and furigana, all written left to right.

I chose the 600dpi option (because I was concerned with being able to read it clearly when I zoom in) as well as the high quality touch up option (which is supposed to provide better OCR quality).

As far as the OCR goes for Japanese text, it was riddled with mistakes. Some pages seemed almost completely lost, while other pages had near perfect accuracy (except for furigana, which always ended up mangled).
Interestingly, the English OCR is almost consistently worse than the Japanese.

All in all, I think the basic scanning service ($1 per set) is a great value if you are just looking to get a book scanned.
The high resolution option worked as expected, but at $2 a set, it can really pump the price up. I honestly think I would have been just fine with the standard resolution of 300dpi.
And then finally, the "high quality touch up" option at $2 a set was just a complete waste of money. You might want to try the basic OCR option instead, which is only $1 per set. Or perhaps try running your own OCR software on the PDF you get from them.


Here is example text from 1 page:
Quote:風日iに(おli坊を人れる runa hot bath; fill up the bath-
tub with hot water
ホテルの部尽に反ると、すぐにお風呂にお湯を人れ
たヲ Assoon as 1 got back to my hotel room, 1 ran a hot
bath.
? ろ O ; T .1',
!重Udの(お)湯を務とす pullthe plug out of the bathtub
(and drain the water)
ふ ろ わ
風日を沸かす preparethe bath
かえ ふ ろ 1 まい b ろ わ
帰ったらすぐにお風呂に入りたいから、お風呂を沸
かしておいてね。 1want to take a bath right after 1
get home, 50 will you prepare it for me?
ふ ろ わ
思ちがiりている theba出 isrea釘
風日を(水で)うめる makethe bath water less hot (by
adding cold water)
ふ ろ はい
風日に入る takea bath
o ろ ゆ ぷ ね
風目/湯船につかる 50akin a bathtub
さ む ひ あつ ふ ろ
こんな寒い日は、熱いお風呂にゆっくりっかりたい。
On a cold day like this, 1 feellike soaking in a nice
hotbath.
ふ ろ あ
風呂から上がる getout of the bath
ひ と ふ ろ あ
一風日浴びる takea bath
ひ と か ろ あ
一風呂浴びてビールにしよう 1think 1'11 take a
bath and then have some beer.
ふ ろ あ せ な が
風呂で汗を流す wa5hoff one'5 5weat in the bath
ゆ き
お湯を冷ます 1et(hot) water coo1 down
ゆ き
お湯が冷める hotwater coo15 down

お湯をうめる addcold water (to the bath) to lower the
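
One rough way to put a number on "riddled with mistakes" is a character error rate: the edit distance between the OCR output and the printed text, divided by the length of the printed text. The sketch below takes one line from the sample above; the reference string is my guess at what the page actually says, not a transcription checked against the book.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(ocr: str, reference: str) -> float:
    """Edit distance divided by the length of the reference text."""
    return edit_distance(ocr, reference) / len(reference)


ocr_line = "風日に入る takea bath"    # one line from the OCR sample
reference = "風呂に入る take a bath"  # assumed printed text (my guess)
print(edit_distance(ocr_line, reference))
print(round(char_error_rate(ocr_line, reference), 3))
```

Running the same comparison over a few hand-checked pages would give a fairer picture than a single line, since accuracy varied a lot from page to page.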
Edited: 2013-10-18, 1:37 pm
#33
Wow, the OCR was so bad that it even broke the forum?!
#34
Vempele Wrote:Wow, the OCR was so bad that it even broke the forum?!
LOL This is the best deterrent from using a product I have ever seen.
#35
Wow. I switched it from code tag to a quote tag, for the sake of the forum :p
#36
Which service on http://www.bookfire.net/ will allow me to scan books shipped directly from Amazon?
I see a lot of services there, I'm not sure which one to use.

EDIT:
I'm going to try http://www.s-s-sendai.info/
Wish me luck xD
Edited: 2013-10-18, 3:21 pm
#37
pmnox Wrote:Which service on http://www.bookfire.net/ will allow me to scan books shipped directly from Amazon?
I see a lot of services there, I'm not sure which one to use.

EDIT:
I'm going to try http://www.s-s-sendai.info/
Wish me luck xD
It worked great. xD
#38
pmnox Wrote:
pmnox Wrote:Which service on http://www.bookfire.net/ will allow me to scan books shipped directly from Amazon?
I see a lot of services there, I'm not sure which one to use.

EDIT:
I'm going to try http://www.s-s-sendai.info/
Wish me luck xD
It worked great. xD
The sendai place worked well? I have to try it then. Do you just have to add an id number to the shipping address so that sendai will know who the book belongs to? I would like to try this with a light novel. How much did it cost total (with OCR)?
#39
PotbellyPig Wrote:
pmnox Wrote:
pmnox Wrote:Which service on http://www.bookfire.net/ will allow me to scan books shipped directly from Amazon?
I see a lot of services there, I'm not sure which one to use.

EDIT:
I'm going to try http://www.s-s-sendai.info/
Wish me luck xD
It worked great. xD
The sendai place worked well? I have to try it then. Do you just have to add an id number to the shipping address so that sendai will know who the book belongs to? I would like to try this with a light novel. How much did it cost total (with OCR)?
The first time it's free. You will receive the file in PDF format (images).

Scanning manga costs 100 yen; scanning a regular book costs 150 yen.
According to http://www.bookfire.net/, OCR costs an additional 100 yen.

Btw.
hillsscan24: doesn't work. The interface is broken.

bookscan_jp: doesn't support Amazon without a 10k yen per month subscription.

densyohonke: has a poor web interface, and you can't create an account there. You have to type in all your data every time you use their service.

As far as I've seen, almost every service offers one free scan for one volume. It's possible to buy used books on Amazon as well as new ones.

In your case the total would be 250 yen. With the first-time-free service it might be just 100 yen for OCR. Btw, OCR is just a software solution, so it should be possible to get some software that does that for you.
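
To spell out that arithmetic, here is a minimal sketch using only the per-volume prices quoted in this post. Whether the free first scan also covers OCR is not confirmed, so the sketch assumes OCR is always charged; the function and variable names are made up for illustration.

```python
# Per-volume prices in yen, as quoted in this post.
SCAN_PRICE_YEN = {"manga": 100, "regular": 150}
OCR_PRICE_YEN = 100


def order_total_yen(kind: str, with_ocr: bool, first_scan_free: bool = False) -> int:
    """Total for one volume.

    Assumption: the free first scan waives only the scanning fee,
    so OCR is still charged when requested.
    """
    scan = 0 if first_scan_free else SCAN_PRICE_YEN[kind]
    ocr = OCR_PRICE_YEN if with_ocr else 0
    return scan + ocr


print(order_total_yen("regular", with_ocr=True))                        # 250
print(order_total_yen("regular", with_ocr=True, first_scan_free=True))  # 100
```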
Edited: 2013-10-21, 11:13 am
#40
pmnox Wrote:
PotbellyPig Wrote:
pmnox Wrote:It worked great. xD
The sendai place worked well? I have to try it then. Do you just have to add an id number to the shipping address so that sendai will know who the book belongs to? I would like to try this with a light novel. How much did it cost total (with OCR)?
The first time it's free. You will receive the file in PDF format (images).

Scanning manga costs 100 yen; scanning a regular book costs 150 yen.
According to http://www.bookfire.net/, OCR costs an additional 100 yen.

Btw.
hillsscan24: I was able to register after a few tries. They provide OCR for books up to 500 pages, with the price depending on the number of days you are willing to wait:
15-business-day delivery (within 15 business days): 180 yen per volume
Standard delivery (within 5 business days): 240 yen per volume
Express delivery (within 72 hours): 360 yen per volume
Super-express delivery (within 24 hours): 480 yen per volume

bookscan_jp: doesn't support Amazon without a 10k yen per month subscription.

densyohonke: has a poor web interface, and you can't create an account there. You have to type in all your data every time you use their service.

As far as I've seen, almost every service offers one free scan for one volume. It's possible to buy used books on Amazon as well as new ones.

In your case the total would be 250 yen. With the first-time-free service it might be just 100 yen for OCR. Btw, OCR is just a software solution, so it should be possible to get some software that does that for you.
I wrote the order number in the name field (氏名)

So far I got 2 volumes scanned.
I have ordered 4 more and I'm planning to scan at least 4 more.
Edited: 2013-10-21, 12:31 pm
#41
Necro'd cause I want to get some more up to date info.

Has anyone been using these services recently? I'm trying to find a service that will do 600dpi, full color, with the option of scanning the book covers as well but so far I haven't found an active service on http://www.bookfire.net/ that will let you do that.

The book I want scanned is a novel with occasional illustrations (this book predates the term light novel but it is essentially that); those illustrations in particular I want scanned at as high a resolution as possible. I have a physical copy but I don't want to destroy it to scan it in properly after all the trouble of shipping it overseas. I also would rather not purchase my own scanning equipment for something I don't intend to do regularly.

EDIT: Also, I don't care about OCR although I'm guessing any service that would offer what I want is gonna offer OCR anyway.
Edited: 2017-10-07, 6:39 pm
#42
Just something about OCR only: what I personally find works well is to take a .pdf file with me to my office and OCR it there using the full Adobe Acrobat that I have at work. (You open the file in Adobe Acrobat and select "recognize text". You can specify the language, e.g., English (US), Japanese, etc.)

This OCR'ing works really well for both English and Japanese. I OCR'd a pdf copy of the book A Handbook of Japanese Grammar Patterns* (almost 800 pages) and I can now search the .pdf by kana/kanji. It works great. I love having this heavy book on my laptop/other devices so I can carry it around with me.

I don't do this at home because I don't want to pay $400+ for the full Adobe Acrobat.


*I bought a physical copy of the book a while back and recently someone on this forum helpfully posted a link to a site that had a .png image for each individual page. I downloaded all these .png's and then, using CutePDF, was able to batch convert them to .pdf files and merge them all into one big .pdf file.
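
One thing to watch when merging per-page files is page order: a plain alphabetical sort puts page10 before page2. I don't know how CutePDF orders its input, so this is only a sketch of a numeric-aware sort you could use to check or rename files first; the file names are hypothetical.

```python
import re


def natural_key(name: str):
    """Split a file name into alternating text and number chunks so that
    'page10.png' sorts after 'page2.png'."""
    parts = re.split(r"(\d+)", name)
    # With a capturing group, odd indices are always the digit runs.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


# Hypothetical file names; the real download may be named differently.
pages = ["page10.png", "page2.png", "page1.png"]
print(sorted(pages))                   # plain string sort
print(sorted(pages, key=natural_key))  # numeric-aware sort
```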