Speed kills. OCR slower.
Going slow is sometimes fast: why slowing down your recognition rate can lead to faster capture.
Want to enter data faster? OCR slower.
Speed kills. We all like speed. Fast cars. Fast lines. Fast computers. Fast
. . . well, insert your own favorite. In my experience, that never-ending
hourglass icon as you wait and wait and wait some more for your computer to
execute a request is a huge source of computer rage. Despite the
phenomenal jumps in processing power, our expectations of what a computer can do
(FAST!) are a moving target, and never satisfied.
In the world of optical character recognition (OCR), the typical user or buyer
will always ask how fast the technology is: how quickly will it read a
page, a character, and so on? This is a virtually unanswerable question without
testing on their real documents and hardware. Like new buyers, current
owners of OCR are constantly trying to find ways to make it faster and more
accurate. This does not seem like an unreasonable goal given our need for
speed and quality.
What most users don’t realize is that (usually) making OCR faster makes it
less accurate, and thus actually REDUCES the speed of the overall process.
Think of the OCR process as a room with five computer-generated data entry
clerks. Our computer people are, well, computer people, and can do things
at the speed of electricity (and, of course, faster than real people). In
this room, each of the five data entry clerks specializes in a different aspect of
entering information from paper or an image. One will be very good at
entering numbers but make errors on alphabetical text. Another will be good at
spelling, so if there is a question about a character it can look
at the whole word to determine what the character is, and so on. The
resulting text from this room of data entry clerks is a comparison of all
five clerks’ individual results. Each character is checked, and when the majority
agrees on what a particular character is, that character becomes part of the final
text.
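The room of clerks can be sketched in a few lines of Python. This is only an illustration of the majority-vote idea described above, not how any particular OCR engine is built: the function name `vote`, the `?` placeholder for unresolved characters, and the sample readings are all made up for the example, and it assumes every clerk returns a reading of the same length so characters line up by position.

```python
from collections import Counter


def vote(readings):
    """Combine same-length readings character by character.

    A character is accepted only when a strict majority of clerks agree on it.
    Otherwise a '?' is written and the position is flagged for human review.
    """
    if not readings:
        raise ValueError("need at least one reading")
    length = len(readings[0])
    if any(len(r) != length for r in readings):
        raise ValueError("readings must be aligned to the same length")

    chars, uncertain = [], []
    for position, column in enumerate(zip(*readings)):
        char, count = Counter(column).most_common(1)[0]
        if count * 2 > len(readings):  # strict majority
            chars.append(char)
        else:
            chars.append("?")
            uncertain.append(position)
    return "".join(chars), uncertain


# Hypothetical readings of the same field by five specialized clerks.
five = [
    "Invoice 1001",  # the number expert
    "Invoice l00l",  # reads digits as letters
    "Invoice 1OO1",  # confuses 0 with O
    "lnvoice 1001",  # confuses I with l
    "Invoice 1001",
]
two = [five[1], five[2]]  # the number expert has been removed

print(vote(five))  # ('Invoice 1001', [])
print(vote(two))   # ('Invoice ????', [8, 9, 10, 11])
```

With all five clerks, each individual mistake is outvoted. With only two, every disagreement becomes a character that a human has to look at.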
Now, one major disadvantage of this room is that there is only ONE keyboard
(CPU), or some number of keyboards smaller than the number of people in the room, so
the clerks are forced to take turns. If you want to make the process
faster, the key to speeding them up would be to remove a few clerks.
Because we only have two keyboards, let’s keep only two clerks. Now we are
hauling, but our resulting accuracy took a hit of at least two thirds.
Because we removed the number expert, for example, we are now making far more
recognition errors where there are numbers, confusing 1’s with I’s and 0’s with
O’s. Our data entry clerks are faster, but the final OCR result is
less accurate, and because the two are less confident without
their peers, they report back more uncertainty about their own results.
Because the OCR result from our computer people is less accurate, we have to use
more real human time to check the result. What people often forget is that no
matter how fast you want your computer people to be, even at their slowest they
will be faster than a human doing the same task. So now the time it takes to OCR, review,
and save a document is substantially more than it was.
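To make the trade-off concrete, here is a back-of-the-envelope calculation in Python. Every number in it is invented purely for illustration (OCR seconds per page, errors per page, and human seconds per correction are assumptions, not measurements), and the model is deliberately simple: total time per page is OCR time plus human correction time.

```python
def total_seconds_per_page(ocr_seconds, errors_per_page, review_seconds_per_error):
    """Total capture time for one page: machine time plus human correction time."""
    return ocr_seconds + errors_per_page * review_seconds_per_error


# Hypothetical "fast" setup: quick OCR, but many characters to correct by hand.
fast = total_seconds_per_page(ocr_seconds=1, errors_per_page=60,
                              review_seconds_per_error=5)

# Hypothetical "slow" setup: OCR takes four times longer, with a third of the errors.
slow = total_seconds_per_page(ocr_seconds=4, errors_per_page=20,
                              review_seconds_per_error=5)

print(fast)  # 301
print(slow)  # 104
```

Under these assumed numbers, the three extra seconds of OCR buy back more than three minutes of human review per page. Your own ratio will differ, which is why testing on real documents matters.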
This analogy is not too far off. As OCR technology has progressed over the
years, it has not been a process of making old algorithms faster. It has been a
process of building on existing algorithms, each new algorithm making the technology
that much more accurate, but slower. Additionally, OCR is an extremely
CPU-intensive process: each step of OCR will use 99 percent of any CPU, CPU
core, or thread. The modern OCR engine is the culmination of at least 50
man-years of development and research, and the methods of the very first OCR
engine by Ray Kurzweil are still present in the latest and greatest.
Some companies are only concerned with quick and dirty searchable text from
full-page OCR. For these companies I would agree that, with proper
scanning, the fastest engine with the fastest settings will do the trick. The
situation described above of course only applies to companies that verify
their OCR results or, more typically, have one pass done by humans, a second by OCR,
and a third review step wherever there are mismatches or questions. In both
of these scenarios the data entry process is usually slowed by increased OCR
speed. Companies should consider putting more, not fewer, computer-generated
data entry experts into the process. Yes, this will slow down the
OCR, but it will dramatically speed up the quality assurance step. I would
even argue for deploying a second pass of OCR on the document with different
settings of the same engine or software (the only proper way to do voting).
In almost all the OCR solutions out there it is possible to disable portions of
the engine to make the process faster. It is typically also possible to
find settings that are disabled, enable them, and make the OCR even slower but
more accurate. At a low volume you may not notice the impact as much, but
you would be surprised at the number of service bureaus I’ve worked with, processing
a hundred thousand plus pages a day, where slowing down the OCR engine shortened
the overall entry process.
So even though I realize how sexy speed is, you might just be hurting your
goal by speeding up your OCR engine, or buying an engine purely based on
speed. Take a step back and look at the whole picture.
Chris Riley (chris.riley@livinganalytics.com) is founder of
Living@nalytics (www.livinganalytics.com), where he uses his
in-depth knowledge of data capture technologies to advise clients and
proselytize the value of these tools.
Chris recently was the featured speaker for our
webinar on March 5, Tips and Tricks to Help You Automate your Office Documents
(for Effective Data Capture). Listen at
www.aiim.org/webinararchive.