Since Feb 2005 / Last update: Jan. 23, 2013
Project page at Freecode.
[June 15, 2012]
WeOCR-toolkit ver.0.14 has been released.
[Sept. 7, 2011]
To reduce confusion and improve
compatibility with some recent applications,
the language codes have been changed from ISO 639-2 to ISO 639-3.
This change affects the server search functions.
[Sept. 7, 2011]
A new WeOCR-compatible server
OCRextrACT has arrived.
[May 3, 2010]
WWWJDIC for Android -
Android frontend for Jim Breen's WWWJDIC
is using WeOCR services. (Thanks, Nikolay.)
[Feb 8, 2010]
C'est What? -
A mobile OCR & translation app for iPhone and Android
is using WeOCR services. (Thanks, David.)
[Jan 13, 2010]
one of our WeOCR servers,
now supports "single character recognition" mode.
[Jan 13, 2010]
Kanji Yomi and
are both using WeOCR services. (Thanks, inda3)
[Aug 23, 2009]
WordSnap OCR - A proof-of-concept application for word input
using the camera on Android has been released.
It uses WeOCR services. (Thanks, Spiros.)
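For client authors affected by the language code change of Sept. 7, 2011, the following is a minimal sketch of normalizing a three-letter code before using it in a server search. It assumes the only codes needing translation are the ISO 639-2 "bibliographic" variants; whether any WeOCR server actually used those variants is an assumption. The table is deliberately incomplete and the function name `to_iso639_3` is made up for this example. Codes that are not in the table are passed through unchanged, which is not correct for ISO 639-2 collective codes.

```python
# ISO 639-2 "bibliographic" codes whose ISO 639-3 code is different.
# Only a few well-known pairs are listed; this is not the full table.
B_TO_ISO639_3 = {
    "ger": "deu",  # German
    "fre": "fra",  # French
    "chi": "zho",  # Chinese
    "dut": "nld",  # Dutch
    "gre": "ell",  # Modern Greek
    "cze": "ces",  # Czech
}


def to_iso639_3(code):
    """Return the ISO 639-3 form of a three-letter language code.

    Hypothetical helper: codes missing from the table are assumed to be
    identical in both standards (true for "jpn", "eng", "heb", ...).
    """
    code = code.strip().lower()
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        raise ValueError("not a three-letter language code: %r" % code)
    return B_TO_ISO639_3.get(code, code)
```

For example, `to_iso639_3("ger")` gives `"deu"`, while `to_iso639_3("jpn")` is returned as is.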
You can help visually disabled people through WeOCR. See here:
WeOCR is a platform for Web-enabled OCR
(Optical Character Reader/Recognition) systems
that enables people to use character recognition over networks.
A WeOCR server receives document images from users,
recognizes the text in the images, and returns the recognition results to the users.
WeOCR does not have its own character recognition engine.
Instead, it is intended to accommodate various character recognition engines.
WeOCR provides a simplified user interface
so that more people can benefit from OCR easily.
Although some people may worry about the privacy of their documents,
we think there are still many applications of
OCR in which privacy does not matter.
We hope WeOCR will expand the range of OCR applications further.
- Design the architecture of WeOCR.
- Develop a toolkit that enables OCR developers and researchers
to build their own Web-based OCR sites easily.
- Encourage people to develop OCRs for various languages
and to open them to the public
either as a Web service or as Free Software.
- Make some useful tools and libraries for Web-based OCR systems.
WeOCR-toolkit has the following features.
- Receive a document image from each client computer,
pass the image to the back-end OCR engine,
generate HTML data from the result data,
and send the data back to the client.
- Uncompress the incoming image file if required.
- Limit the size of the input data to protect the server
from excessively large inputs.
- Examine the integrity of image file headers.
- Convert the input image into a common image format (PNM).
- Limit the number of jobs to prevent the server from
processing too many documents at once
and to maintain acceptable server response.
- Terminate the OCR engine after a specified time has passed,
if the engine continues running (in vain) due to
unexpected input data or bugs in the engine.
- Support server search function using spec files in XML.
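Several of the features above can be illustrated with a short Python sketch. This is not the WeOCR-toolkit code; it only shows one way to apply a size limit, uncompress gzipped input, check image headers, limit concurrent jobs, terminate a slow engine, and produce HTML. The limits, the function names, and the engine command line are all assumptions, the job limit here works only within a single multi-threaded process, and the conversion to PNM is omitted.

```python
import gzip
import html
import io
import subprocess
import threading

MAX_INPUT_BYTES = 2 * 1024 * 1024  # assumed size limit
MAX_JOBS = 4                       # assumed number of concurrent jobs
ENGINE_TIMEOUT_SEC = 30            # assumed time limit for the engine

# Leading "magic" bytes of a few image formats (not exhaustive).
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"P4": "pbm",
    b"P5": "pgm",
    b"P6": "ppm",
}

_job_slots = threading.BoundedSemaphore(MAX_JOBS)


def prepare_image(data):
    """Apply the size limit, uncompress gzip input, and check the header.

    Returns a (format_name, image_bytes) pair or raises ValueError.
    """
    if len(data) > MAX_INPUT_BYTES:
        raise ValueError("input too large")
    if data[:2] == b"\x1f\x8b":  # gzip magic number
        # Read at most one byte past the limit, so that a small compressed
        # file cannot expand into a huge one in memory.
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as f:
            data = f.read(MAX_INPUT_BYTES + 1)
        if len(data) > MAX_INPUT_BYTES:
            raise ValueError("uncompressed input too large")
    for magic, fmt in MAGIC.items():
        if data.startswith(magic):
            return fmt, data
    raise ValueError("unrecognized image header")


def run_engine(command, image):
    """Run the OCR engine (a hypothetical command) on the image bytes."""
    if not _job_slots.acquire(blocking=False):
        raise RuntimeError("server busy")
    try:
        # subprocess.run kills the child process when the timeout expires
        # and then raises subprocess.TimeoutExpired.
        done = subprocess.run(command, input=image, capture_output=True,
                              timeout=ENGINE_TIMEOUT_SEC, check=True)
        return done.stdout.decode("utf-8", errors="replace")
    finally:
        _job_slots.release()


def render_html(text):
    """Wrap the recognized text in HTML, escaping special characters."""
    return "<pre>" + html.escape(text) + "</pre>"
```

A request handler would call `prepare_image`, then `run_engine` with whatever command starts the back-end engine, and finally `render_html` on the result. A CGI-style deployment with one process per request would need a cross-process mechanism, such as lock files, in place of the semaphore.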
The license is the
Apache License, Version 2.0.
(An MIT-X derivative applies to weocr-toolkit-0.12 and older.)
You do not need to open the source code of your
OCR engine to the public if you do not wish to.
- Deploy more WeOCR servers. (ASAP)
- Advertisement! (ASAP)
- Encourage researchers/developers to provide their own WeOCR services. (ASAP)
- Find open source OCRs for various languages. (midterm)
- Improve the UI. (midterm)
- Write documentation. (midterm)
- Develop an OCR for Japanese. (ASAP)
- ... etc.
- [Jun 15, 2012]
- WeOCR-toolkit ver.0.14 has been released.
- [Apr 26, 2009]
- WeOCR-toolkit ver.0.13 has been released.
- [Sep 26, 2008]
- [Sep 9, 2008]
- [Aug 19, 2008]
- WeOCR-toolkit ver.0.12 has been released.
- [May 12, 2007]
- WeOCR-toolkit ver.0.11 has been released.
- [Apr 8, 2007]
- [Feb 12, 2007]
- The server search CGI can now produce server lists in XML.
This should be useful for various web applications using WeOCR.
To get XML output, pass the parameter "fmt=xml" to the CGI.
- [Aug 18, 2006]
- An automatic spec collector is now up and running.
Once your server is registered in the server list,
your spec file will be examined periodically (twice a day)
and used for updating the list.
- [Jun 26, 2006]
- WeOCR-toolkit ver.0.10 has been released.
- [Jun 9, 2006]
- Hebrew OCR (hocr) has been added (see
- [Jun 7, 2006]
- [Feb 26, 2006]
- WeOCR-toolkit ver.0.10beta has been released.
- [Feb 19, 2006]
- The OCR engine at ocr1/e1 has been updated to ocrad-0.14.
- [Jan 22, 2006]
- WeOCR-toolkit has been released (at last).
- [Jan 18, 2006]
- The project has been renamed, since the previous name ocrweb was
too popular in another community.
- [Nov 6, 2005]
- A new server with GOCR has been released.
- [Oct 14, 2005]
- The OCR engine at ocr1 has been updated to ocrad-0.13.
- A filter for adaptive thresholding has been added.
- [Oct 7, 2005]
- [Sep 22, 2005]
- JPEG (JFIF) support has been added.
- [Aug 28, 2005]
- Some modifications to internal code. (No new features.)
- [Jun 10, 2005]
- The OCR engine used at ocr1 has been updated to ocrad-0.12,
which runs much faster.
- ocr1 now accepts gzipped image files as well as raw files.
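As a small illustration of the "fmt=xml" parameter mentioned in the Feb 12, 2007 entry, the sketch below adds that parameter to a search URL. The address `example.org/cgi-bin/search.cgi` and the `lang` parameter are placeholders invented for this example; the actual CGI location and its other parameters are not given on this page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_xml_format(search_url):
    """Return search_url with the query parameter fmt=xml set.

    Any existing "fmt" parameter is replaced; other parameters are kept.
    """
    parts = urlsplit(search_url)
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "fmt"]
    query.append(("fmt", "xml"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical address and parameter, used only for illustration.
print(with_xml_format("http://example.org/cgi-bin/search.cgi?lang=jpn"))
# prints http://example.org/cgi-bin/search.cgi?lang=jpn&fmt=xml
```

The response could then be read with any XML parser; its element names are not described here, so no parsing code is shown.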
feature requests, questions, bug reports, or other comments.
Note that, as a rule, no reply will be sent.
Answers to some common questions may appear on the website.
Optical Character Recognition, WeOCR, OCR Web, OCRWeb, Web OCR, WebOCR,
online OCR, free OCR
© 2005-2013 Hideaki Goto