Speed kills. OCR slower.

Going slow sometimes is fast. Why slowing down your recognition rate can lead to faster capture.

 

Want to enter data faster? OCR slower
Speed kills. We all like speed. Fast cars. Fast lines. Fast computers. Fast . . . w . . ., well, insert your own favorite. In my experience, that never-ending hourglass icon as you wait and wait and wait some more for your computer to execute a request is a huge source of computer rage. Despite the phenomenal jumps in processing power, our expectations of what a computer can do (FAST!) are a moving target, and are never satisfied.

In the world of optical character recognition (OCR), the typical user or buyer will always ask how fast the technology is. How quickly will it read a page, a character, and so on? This question is virtually unanswerable without testing on their real documents and hardware. Like new buyers, current owners of OCR are constantly trying to find ways to make it both faster and more accurate. That does not seem like an unreasonable goal given our need for speed and quality.

What most users don’t realize is that (usually) making OCR faster makes it less accurate, and thus actually REDUCES the speed of the overall process.

Think of the OCR process as a room with 5 computer-generated data entry clerks. Our computer people are, well, computer people, and can do things at the speed of electricity (and, of course, faster than real people). In this room, each of the 5 data entry clerks specializes in a different aspect of entering information from paper or an image. One will be very good at entering numbers, but will make errors on alphabetical text. Another will be good at spelling, so if there is a question about a character it can look at the whole word to determine what the character is, and so on. The resulting text from this room of data entry clerks is a comparison among all five clerks' individual results. Each character is checked, and when the majority agrees on what a particular character is, that character becomes part of the final text.
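To make the room of clerks concrete, here is a minimal Python sketch of that per-character majority vote. It is an illustration of the analogy only, not how any particular OCR engine works internally. The function name `vote`, the sample readings, and the simplification that every clerk returns text of the same length are all my own assumptions.

```python
from collections import Counter


def vote(readings):
    """Combine equal-length readings of the same text by per-character majority vote.

    Returns the voted text and the positions where no strict majority agreed.
    Hypothetical helper for illustration; real engines align and weigh results
    in far more sophisticated ways.
    """
    if not readings:
        raise ValueError("need at least one reading")
    if any(len(r) != len(readings[0]) for r in readings):
        raise ValueError("this sketch assumes all readings have the same length")

    majority = len(readings) // 2 + 1  # strict majority of the clerks
    voted, uncertain = [], []
    for position, chars in enumerate(zip(*readings)):
        char, count = Counter(chars).most_common(1)[0]
        voted.append(char)
        if count < majority:
            uncertain.append(position)  # a human will have to look at this one
    return "".join(voted), uncertain


# Five clerks, each with a different weakness (made-up sample data).
five_clerks = ["Order 1001", "Order l00l", "0rder 1001", "Order 1001", "Order 1O01"]
print(vote(five_clerks))  # ('Order 1001', [])

# Only two clerks: nobody can break a tie, so three positions need review.
two_clerks = ["Order l00l", "0rder 1001"]
print(vote(two_clerks)[1])  # [0, 6, 9]
```

With all five clerks in the room, every individual mistake is outvoted. With two, every disagreement becomes a question for a human.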

Now, one major disadvantage of this room is that there is only ONE keyboard (CPU), or at least fewer keyboards than there are people in the room, so the clerks are forced to take turns. If you want to make the process faster, the obvious way to speed things up is to remove a few clerks. Because we only have two keyboards, let’s only have two clerks. Now we are hauling, but our resulting accuracy took a hit of at least two thirds. Because we removed the number expert, for example, we now make far more recognition errors where there are numbers, reading 1’s as I’s and 0’s as O’s. Our data entry clerks are faster, but the final OCR result is less accurate and has more errors, and because the two remaining clerks are less confident without their peers, they report more uncertainty about their own results. Because the OCR result from our computer people is less accurate, we have to use more real human time to check it. What people often forget is that no matter how fast you want your computer people to be, they will always be faster than a human at the same task. So now the time it takes to OCR, review, and save a document is substantially more than it was.
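The arithmetic behind that last point can be sketched in a few lines of Python. Every number below is invented purely for illustration (seconds per page, errors per page, seconds a person needs to fix one error), and the function name `total_seconds` is hypothetical. The sketch also assumes OCR and human review happen one after the other. Your own figures will differ, which is exactly why you should test on your real documents.

```python
def total_seconds(pages, ocr_sec_per_page, errors_per_page, review_sec_per_error):
    """Total capture time: machine OCR time plus human correction time.

    Assumes OCR and review run one after the other; all inputs are
    illustrative assumptions, not measurements.
    """
    ocr_time = pages * ocr_sec_per_page
    review_time = pages * errors_per_page * review_sec_per_error
    return ocr_time + review_time


# "Fast" settings: 1 second per page, but 60 errors per page to correct.
fast = total_seconds(pages=1000, ocr_sec_per_page=1, errors_per_page=60, review_sec_per_error=5)

# "Slow" settings: 3 seconds per page, but only 20 errors per page.
slow = total_seconds(pages=1000, ocr_sec_per_page=3, errors_per_page=20, review_sec_per_error=5)

print(fast)  # 301000
print(slow)  # 103000
```

In this made-up case the OCR step is three times slower, yet the whole job finishes in about a third of the time, because human correction dominates the total.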

This analogy is not too far off. As OCR technology has progressed over the years, it has not been a process of making old algorithms faster. It has been a process of building on existing algorithms, with each new algorithm making the technology that much more accurate, but slower. Additionally, OCR is an extremely CPU-intensive process: each step of OCR will use 99 percent of any CPU, CPU core, or thread. The modern OCR engine is the culmination of at least 50 man-years of development and research, and the methods of the very first OCR engine by Ray Kurzweil are still present in the latest and greatest.

Some companies are only concerned with quick and dirty searchable text from full-page OCR. For these companies I would agree that, with proper scanning, the fastest engine with the fastest settings will do the trick. The situation above of course only applies to companies that wish to verify their results, or that more typically have one pass done by humans, a second by OCR, and a third review step wherever there are mismatches or questions. In both of these scenarios, the data entry process is usually slowed by increased OCR speed. Companies should consider putting more, not fewer, computer-generated data entry experts in the process. Yes, this will slow down the OCR, but it will dramatically speed up the quality assurance step. I would even argue for deploying a second pass of OCR on the document with different settings of the same engine or software (the only proper way to do voting).
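As a rough sketch of that review step, the Python below compares the text from two passes and lists only the spots where they disagree, so a reviewer looks at the mismatches instead of rereading the whole page. It uses the standard library's `difflib`. The function name `mismatches` and the sample strings are assumptions of mine, and the two inputs could just as easily be a human-keyed pass and an OCR pass.

```python
import difflib


def mismatches(pass_a, pass_b):
    """Return (position in pass_a, text from pass_a, text from pass_b) for each disagreement.

    Hypothetical helper: anything both passes agree on is skipped,
    so only the differences go to a human reviewer.
    """
    matcher = difflib.SequenceMatcher(None, pass_a, pass_b, autojunk=False)
    return [
        (a_start, pass_a[a_start:a_end], pass_b[b_start:b_end])
        for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes()
        if tag != "equal"
    ]


# Made-up output from two passes over the same line of an invoice.
first_pass = "Invoice 1001 Total 50.00"
second_pass = "Invoice 1O01 Total 50.00"

print(mismatches(first_pass, second_pass))  # [(9, '0', 'O')]
```

The fewer disagreements the passes produce, the shorter the review queue, which is where slower, more accurate settings earn their time back.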

In almost all of the OCR solutions out there, it is possible to disable portions of the engine to make the process faster. It is typically also possible to find settings that are disabled by default, enable them, and make the OCR even slower, but more accurate. At a low volume you may not notice the impact as much, but you would be surprised at the number of service bureaus I’ve worked with, processing a hundred thousand plus pages a day, where slowing down the OCR engine shortened the overall data entry process.

So even though I realize how sexy speed is, you might just be hurting your goal by speeding up your OCR engine, or buying an engine purely based on speed.  Take a step back and look at the whole picture.

Like this article? Agree? Disagree? Continue the discussion at Information Zen.


Chris Riley (chris.riley@livinganalytics.com) is founder of Living@nalytics (www.livinganalytics.com), where he uses his in-depth knowledge of data capture technologies to advise clients and proselytize the value of these tools.

Chris was recently the featured speaker for our March 5 webinar, "Tips and Tricks to Help You Automate your Office Documents (for Effective Data Capture)." Listen at www.aiim.org/webinararchive.