Operation OCR Rebirth
The OCR engines of today are getting somewhat long in the tooth. Though they do an outstanding job today, perhaps future engines will be even better.
By Chris Riley
To date, I've talked about optimizing OCR technologies. I've explained
different OCR techniques and have lectured purchasers about how to buy OCR. But
I've yet to do one of the favorite jobs of any technologist: predict the future.
What does the future hold for optical character recognition and data capture
technologies?
Stagnation IS Progress
If you had asked a computer science professor 12 years ago how OCR works, their
answer, with the exception of document analysis, would still be 100% applicable
today. Innovative OCR development is at a halt. I need to be absolutely precise
about what I'm referring to. There has been (and is) a lot of development in how
OCR is used, especially in semi-automated data capture. But in terms of the core
process of converting image to text, there has been very little progress. There
are a couple of reasons for this:
1. There aren't that many players. Arguably, there are only four
general-purpose commercial OCR engines. Thus the knowledge gene pool is small.
2. There has been litigation threatening the core of OCR processes that
would forever change how current engines function.
These four engines make up the vast majority of all the OCR boxed products
available. There are a handful of specialized engines, but their use is rare.
The core capabilities of these four engines were developed between 1995 and 1998
and have been largely untouched since, either because the original development
teams are no longer in place or because the code changed hands through
acquisition and much of the knowledge was lost. What this means is that for
these four engines, improvement can happen ONLY at a higher level, in document
analysis, image enhancement, and dictionaries.
That was the history lesson; now here is the exciting news. Since these
engine cores were created, programming techniques have drastically improved,
paving the way for new approaches and the ability to start from scratch at less
than the 50 man-year cost it took to get where we are today. I believe things
will remain largely unchanged over the next five years, with only moderate
enhancements in accuracy for particular use cases. In five to seven years, hold
on, because amazing things will come to excite even the least technical of us.
These four cores will likely die, but their death will give birth to something
even better.
The Technology World Is Now Ready
Adoption of OCR has been very interesting. Either it's a “members only”
technology that you could buy only if you were in the enterprise content
management space and had learned of the technology through some secret member
initiation, OR the technology was bundled with your scanner and you use it, but
you don't really know what “IT” is. This gap has to be bridged at some point.
The way that is going to happen is through greater awareness and better market
education. Because the bulk of these technologies are European, the market
education has been tailored only for the technical crowd. European development
companies are highly product-development driven and tend not to have a great
grasp of market education. When their U.S. sales and marketing counterparts get
involved, they are left in the dark as to the “why” and are left with only
tricky marketing and classic non-technical sales techniques.
The U.S. technology community, computer hardware, and the market are all
getting closer to being able to embrace this technology and explain it to the
typical user. What happens naturally as technologies mature is that the more
tenured bits become a commodity and a standard in any related product, while the
more technical bits (semi-automated data capture) stay a B2B product but become
easier to use and place fewer demands on technical expertise. Both trends point
to a dramatic decrease in price and an increase in adoption rate.
In 1995, OCR development was already pushing hardware to the limits, thus
limiting developers’ ability to expand, but this is no longer the case.
Development firms can consider more processing-intense approaches to OCR that
can be implemented faster and ultimately be more accurate. I envision some
fresh blood arriving in the OCR technology market to create a newer, more
innovative, adaptable, and technically advanced approach to OCR and
semi-automated data capture. There is technology that seven years ago was the
material of computer science students only. Today it could be deployed to build
faster, more accurate, and most importantly, adaptable OCR engines that would
influence all processes that use them.
Stubborn Computer Scientists
While our technical elite tend to be thought of as innovative and creative,
they all suffer from the same human problem of stubbornness. Without an
influx of new scientists interested in OCR, the same approach has been deployed
over and over and treated as a matter of fact. A new generation of developers is
out there with perhaps more elegant approaches that will benefit all of us who
want to convert paper to text. Hints of this are already appearing with some new
startups, and not in the typical spots: innovative development has been going on
in Michigan, California, and Toronto. These small, truly innovative teams may
have the killer of the OCR engine as we know it today.
It's Coming; I Promise
I don't want to wait either. The technologies currently available are
impressive and worth the effort, but I can't wait for the day that the OCR
grandpas step aside and new ways of OCRing come about. I predict that, as this
development unfolds, the larger technology giants will join in, adding to the
knowledge pool and to the rapid development of the new and cool, hopefully
without creating a new acronym along the way.
Chris Riley (chris.riley@livinganalytics.com)
is founder of Living@nalytics (www.livinganalytics.com) where he uses
his in-depth knowledge of data capture technologies to advise clients and
proselytize the value of these tools.
Chris was recently the featured speaker for our webinar on March 5: Tips and
Tricks to Help You Automate your Office Documents (for Effective Data Capture).
Listen at www.aiim.org/webinararchive.