AIIM — The Enterprise Content Management Association

The Long Tail of Document Imaging and Its Impact on Business Scanning

How to avoid the potential hazards involved when poorly-trained or untrained staff attempt to “wing it” by using scanning features available on increasingly affordable scanners and multifunctional peripherals.

Mar 24, 2010


“The Long Tail” is a phrase coined by Chris Anderson, the editor-in-chief of Wired magazine, to describe the impact of Internet technology on retail businesses. Anderson first published his theory in Wired, and then greatly expanded upon the topic in a best-selling book entitled “The Long Tail: Why the Future of Business is Selling Less of More”.

Briefly, the idea is that whereas brick-and-mortar retailers can only afford to stock popular items, e-retailers are able to sell an almost infinite catalog of goods. Their catalogs (and often their inventory) exist only online, so products can be purchased as needed and sent to buyers without having to stock massive and costly warehouses. After interviewing a number of Internet retailers of music and books, Anderson was surprised to discover that almost 100 percent of the items in their massive catalogs had at least one purchase each year. More importantly, they could expect to make a profit from every item.

Organizational Document Patterns 
A similar “long-tail” effect can be seen in business and institutional documents. In most organizations, only a small number of documents are created or received in high volumes; typically they are associated with formal business processes. For example, in an insurance company, applications, renewal notices and claims each arrive in large quantities. In most cases, these large-volume documents have already been automated with imaging, recognition and workflow, as the savings on handling and storage justify the investments required.

However, many other types of documents do not receive the same attention, due to their lower volume, distributed delivery or creation points, or ad-hoc handling. These documents still need to be processed and kept, and represent a significant knowledge base. But there are challenges – proper processing and filing may require significant special knowledge, and ensuring business continuity is expensive.

When attempts are made to bring these documents into the existing Enterprise Content Management system, the usual approach is to merge them into the existing imaging infrastructure. Unfortunately, production-imaging products are not easy for general staff to use on a casual basis, and production scanning staff may not have the knowledge to properly index such a varied collection. Even when simplified imaging interfaces are provided, end users resist having to deal with the extra steps needed to process and file the result.

As a result, the “casual” scanning of documents such as correspondence, complaints, and similar work-products ends up being a neglected portion of the document and records management operation. The “long tail” components of an organization’s knowledge can therefore be under-managed. If information arrives or is generated in an electronic format, it is comparatively easy to capture, although electronic filing and processing may still present challenges. However, if it arrives on paper, then the additional task of digitizing it presents itself, and here we find danger.

Trends in Document-Imaging Technology 
As with so many other digital technologies, the cost of desktop scanners and scanning software has been decreasing, while at the same time the capabilities and features of production products have been added to low-end products. Many scanners with color and image enhancement, for instance, can be found for under $1,000, and most come with a basic imaging application in the box. While the imaging software provided is often a “lite” version, for a small fee one can upgrade to a “pro” version with enhanced capabilities. This makes it easy to generate TIFF and PDF files and deliver the results to network shares, databases and ECM solutions like Microsoft SharePoint.

In parallel with this shift in the scanner market, multifunctional peripherals (MFPs) have also become popular office tools, due to their space and cost savings. Combining the features of scanners, copiers, fax machines, and laser printers, they are now available in a range of sizes. Most of the digital copiers shipping today can be enabled with one or more of these additional options. MFP scanning can be integrated in a number of ways, including a user interface at the control panel. Increasingly, manufacturers of these devices and their software are introducing the image enhancement, electronic document management system (EDMS) integration and indexing features that we are familiar with in dedicated scanning solutions.

A final trend we will consider is the move from computer-connected interfaces to network-connected interfaces for image-capture equipment. Access to the digitizing capabilities of a single device can be granted to many, and the digitized image can be moved electronically anywhere in the connected world without manual intervention. The tools for document imaging have, as a result of these trends, been made available to general office workers. In short, the ability to digitize paper has entered the long tail’s simple and widely available production mode. These trends are enabling staff to easily acquire the tools to digitize their paper, often without going through IT.

Organizational Results - With Potentially Catastrophic Consequences
In his book, Chris Anderson describes how desktop-publishing tools and Web distribution have changed book publishing. Anyone can write and format a book using word-processing and book-formatting tools. Books can be sold through Web catalogs and marketed via online comments and word-of-mouth. Small-run books can be delivered using print-on-demand services, or as electronic documents or e-books.

All of this technology democratizes access to the audience of readers and provides a forum for all viewpoints and arcane knowledge. On the downside, creativity is not evenly distributed; not everyone is capable of creating the next masterpiece or best-seller. So the quality of all these new choices varies. The effect on the online marketplace is a clutter of entries of differing value, which is why we depend so much on expert advice and search-engine rankings. If, in spite of these aids, we make a poor choice, there still may be a cost.

Likewise, we can be assured that a general user base of employees empowered by inexpensive tools will start scanning. We can also safely assume that the growth of inexpensive and powerful document imaging technology will follow the same pattern as happened in the past with PCs, Web browsers and websites, and cell phones. But long-tail changes introduce many problems to an organization – even some that are potentially catastrophic. Think about the many perils of legal and governmental compliance alone, for instance, and you can start to lose some real sleep.

Critical Issues for Information Managers
What problems does this world-shift bring to managers in areas such as IT, Records, Compliance, and Legal within the organization?

  1. Quality. Centralized-scanning and data-entry staff can easily be trained and their output reviewed. Their systems can be engineered to prevent mistakes. But when anyone can scan, the results will vary. Consider that casual scanning and indexing is inevitably going to be done by employees who: (a) have little or no training, or (b) do it as a side task. Accordingly, their work is going to be subject to variations and potential errors. An unreadable record is useless, and a misfiled record is as good as lost.
  2. Support. Technology staff will find the proliferation of equipment and software, often from a range of vendors, a true challenge to support. Regrettably, they are not likely to be able to avoid the task, as these tools become key components of business processes. With so many different tools, the technical knowledge to address problems is impossible to establish and maintain. Testing compatibility in preparation for significant software introductions and upgrades can delay important changes for many months.
  3. Cost Control. I have observed in my own practice that personal scanners quickly become vanity items, just as personal printers have in the past. They may no longer be expensive, but there still are costs. A personal scanner with supporting software currently averages about $1,000, plus annual maintenance costs of about 20 percent. These figures can usually be approved by lower-level managers, and spread across an organization they can add up to a significant expenditure. Yet, by their nature, these devices tend to be used no more than 30 minutes a day (if that much). Even with that little usage, they require regular cleaning and other maintenance – to say nothing of additional energy costs.
  4. Information Control. The information being scanned gets stored according to individual preferences, just as happens with Microsoft Office files, email and other systems. Some scanners may tie the images into internal software systems, but many images go to local or network directories, with individualized naming conventions and folder hierarchies. Locating the document image later can be difficult, even if the person who stored it is still with the organization. We ought not to forget the care and handling of the paper originals, either. They may still need to be kept after digitization; certainly they should be kept until the digitized version is reviewed. Record-keeping procedures, therefore, need to be amended to specify which version is the official record in conformance with legal and regulatory constraints, and to ensure that each version is disposed of if and when the appropriate time is reached.
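
To make the Cost Control point concrete, the arithmetic can be sketched in a few lines of Python. The $1,000 unit price, the 20 percent annual maintenance and the 30 minutes of daily use come from the figures above; the fleet of 200 scanners, the three-year period and the eight-hour workday are purely illustrative assumptions, and the function names are hypothetical.

```python
def fleet_cost(num_scanners, unit_price=1000.0, maintenance_rate=0.20, years=3):
    """Purchase cost plus annual maintenance for a fleet of personal scanners.

    unit_price and maintenance_rate follow the article's figures;
    num_scanners and years are assumptions chosen by the caller.
    """
    purchase = num_scanners * unit_price
    maintenance = purchase * maintenance_rate * years
    return purchase + maintenance


def utilization(minutes_per_day=30, workday_minutes=480):
    """Fraction of the workday a scanner is actually in use.

    30 minutes is the article's upper estimate; the 480-minute
    (eight-hour) workday is an assumption.
    """
    return minutes_per_day / workday_minutes


if __name__ == "__main__":
    total = fleet_cost(num_scanners=200, years=3)
    print(f"Three-year cost for 200 scanners: ${total:,.0f}")
    print(f"Share of the workday in use: {utilization():.1%}")
```

Under these assumptions, 200 individually approved purchases add up to $320,000 over three years for devices that sit idle for roughly 94 percent of the workday. Energy and cleaning costs are not included.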

Methods for Controlling/Influencing Uncontrolled Scanning 
Casual scanning is not easily opposed or forbidden, since users can find the tools in consumer electronics stores and online. There are, however, some approaches that can reduce the problems that it introduces. Ultimately, responsibility for quality has to rest with the people who own the documents and are digitizing them, but some expert guidance will help them.

  1. Hardware and Software Standards. Organizations should consider imposing standards regarding what scanning hardware and software may be used. This simplifies support and allows purchasing authorities to obtain quantity discounts.
  2. Data and Image-Quality Standards. Guide your users by establishing quality standards for images, including density, readability and acceptable formats.
  3. Pre-Use Capture. One way to ensure that imaged documents are correct is to engineer capture processes so that digitization occurs before the document is utilized. By forcing staff to use the digital image, you can be assured that it was examined by a knowledgeable party. In cases where that can’t be done, the group doing the scanning must be required to review everything.
  4. Mandated Use of Document Management. If there is an EDMS, try to mandate that all scanned documents be placed there. With new electronic tools such as SharePoint that use WebDAV or have other simplified integration, adding a document requires less effort. That effort can be reduced further by using capture data to automate the e-filing.
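
As a rough illustration of items 2 and 4 above, the following Python sketch checks a scan against a quality standard and then derives a consistent filing location from the capture data, so that no one has to invent a file name or folder. Everything specific here is an assumption made for the example: the `ScanRecord` fields, the 200 dpi minimum and the folder layout are hypothetical, and only the TIFF and PDF formats come from the discussion above. It does not call SharePoint or any other EDMS; the resulting path would be handed to whatever upload mechanism the organization uses.

```python
import re
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath

ACCEPTED_FORMATS = {"tiff", "pdf"}  # formats mentioned in the article
MIN_DPI = 200                       # assumed minimum resolution


@dataclass
class ScanRecord:
    """Capture data recorded at scan time (hypothetical fields)."""
    doc_type: str
    received: date
    reference: str
    file_format: str
    dpi: int


def quality_problems(scan):
    """Return a list of ways the scan violates the assumed standard."""
    problems = []
    if scan.file_format.lower() not in ACCEPTED_FORMATS:
        problems.append(f"format {scan.file_format!r} is not accepted")
    if scan.dpi < MIN_DPI:
        problems.append(f"resolution {scan.dpi} dpi is below {MIN_DPI} dpi")
    return problems


def _slug(text):
    """Lowercase the text and collapse other characters into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def filing_path(scan):
    """Build a <doc-type>/<year>/<date>_<reference>.<ext> location."""
    name = (f"{scan.received.isoformat()}_{_slug(scan.reference)}"
            f".{scan.file_format.lower()}")
    return PurePosixPath(_slug(scan.doc_type)) / str(scan.received.year) / name


if __name__ == "__main__":
    scan = ScanRecord("Customer Complaint", date(2010, 3, 24),
                      "Case 4471/B", "PDF", 300)
    print(quality_problems(scan))  # an empty list means the scan passes
    print(filing_path(scan))
```

Because the path is computed from the index data rather than typed by the user, two employees scanning the same kind of document end up filing it in the same place, which addresses the individualized naming conventions described earlier.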

The Bottom Line
All managers should recognize that simpler, less-costly technology will increase the flow of digitized documents within the organization. This can improve your document and record keeping, but without a careful eye it can add clutter, disorganization and unnecessary costs, and it raises the very real possibility of losing data that could be critical to legal discovery and other strategic needs.

Bernard Chester, CDIA+, edp, ICP, is a vendor-neutral ECM consultant. He has been helping people succeed with ECM for the past 20 years and is an active contributor to AIIM and ARMA standards. He can be reached at BChester@IMERGEConsult.com.
