Digital Imaging Standard

Capture Community Wiki

Community Topic(s): Capture

Keywords: imaging, documentmanagement, capture

NOTE: The AIIM Standards Committee, C24, Document Imaging is seeking the assistance of the AIIM Capture Community to complete a section of a standard they are working on for digital imaging. Please feel free to edit the following.

2. System design considerations

As with any project, the place to start is with the system design. There are many elements that must be considered in order to achieve an effective imaging system design. While not every item discussed in this section needs to be considered, it is important to understand the elements of the system that may affect system performance and scalability.

Purposes of Digital Imaging

One of the main attributes and benefits of a digital imaging system is the ability to provide distributed access to document information by a worldwide audience. It also provides a way to preserve information contained in documents in other formats.

Reasons for converting paper documents to images include:

1. Establish redundancy – Paper can decay, be lost or stolen, and is susceptible to fire, flood, insects, etc. Digital imaging makes a digital copy available.

2. Ensure document security – Losing documents may be costly to any organization as it may cause a loss of productivity or an embarrassing leak. Having information in a secured environment prevents ad-hoc deletions and can reduce risk. Documents are online and can require usernames and passwords for access.

3. Enable access to content anywhere – Providing secure image access over the web allows users to find the image they need whenever desired.

4. Increase the readership of the document – Easy access to images means that images, or links to images, may be sent to users anywhere in the world. Users are no longer tethered to a physical document for the information their jobs require.

5. Reduce copying and shipping costs – Copying paper documents and shipping them by mail or other carrier are significant costs in any paper-intensive operation. Scanning reduces the need for physical copies; if a physical copy is needed, it may be printed at the user's location instead of being shipped to the user.

6. Reduce on-site and off-site paper storage costs – Reducing the amount of paper in an organization lowers its storage costs. Storing and preserving images also has a cost, but that cost can be reduced with sound records management and retention policies.

7. Provide the foundation of a sound records management solution – Having one digital image instead of multiple paper copies allows the records manager to establish the item of record and ensure that records are truly deleted when the retention policy has been met.

8. Reduce costs of sorting/routing, indexing, data entry, and redaction by providing the foundation for automatic data recognition solutions (OCR/ICR, barcode, MICR, CAR/LAR, etc) and auto-redaction.

End Users

As with any computer software solution, the end user must always be considered. Failure to discuss how the documents are used in the organization today may result in a failed implementation.

End users need to be confident in the system's ability to access documents. They also need to have confidence that the documents are properly secured. The challenge with an imaging solution is to ensure that secure access to images is fast and easy, so that users can quickly find the desired scanned document without a cumbersome process. Replacing paper or effecting any change may be difficult, but making sure end users are aware of the change and considering how images will affect their jobs should help them perform those jobs successfully.

End user design considerations include:

1. Computer Operating System – Can the user quickly log in and access data? Is single sign-on available so that the user's credentials are cached and a user name and password do not have to be entered every time the system is accessed?

2. Application Integration –
a) User – Does the imaging solution work with other applications that users must access? Will the user need to have the imaging solution and other applications displayed as separate applications, or is application integration possible? Will the imaging system provide the means for improved user productivity?
b) Automation – Does the imaging solution provide a secure API for access by "backroom" processing applications? Can the solution handle the high volumes of queries and retrievals that such applications can generate? Can the solution provide extensible, structured storage for large amounts of metadata (e.g., full page recognition, image transactional type, indices, data fields), or provide connections to other systems for such storage?

3. Monitor Size – Is the typical monitor large enough for the user to read the data on the image? What default image zoom reduces the need to scroll vertically or horizontally across the image?

4. Printing – Will printing be allowed? If yes, what will be the mechanism to ensure the printed copy is secure? Will color printing be allowed? Is the ability to audit the print function required?

5. Software access – Are the searching capabilities easy to use? Must the users choose several options to get to an image? Are there default values and saved searches available? Does access to the content need to be audited?

6. External Devices – Will end users be able to save images to a USB drive or an external hard drive? Will the user be able to save a copy to a local hard drive?

7. Content/Metadata Editing – Will end users be able to manipulate the content or metadata? Does the system provide version control?

Considering the end user in the design of a digital imaging project will assist in user acceptance and help ensure the success of the project.
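
The auditing questions raised above (for printing and for access to content) can be illustrated with a minimal audit trail sketch. This is not a prescribed design: the class names `AuditEvent` and `AuditLog`, the action labels, and the sample user and document IDs are all hypothetical, and a production system would persist events to tamper-resistant storage rather than a list in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One audited operation on a document (hypothetical record layout)."""
    user: str
    action: str          # e.g. "view", "print", "export"
    document_id: str
    timestamp: datetime  # recorded in UTC


@dataclass
class AuditLog:
    """In-memory audit trail, for illustration only."""
    events: list = field(default_factory=list)

    def record(self, user: str, action: str, document_id: str) -> AuditEvent:
        event = AuditEvent(user, action, document_id, datetime.now(timezone.utc))
        self.events.append(event)
        return event

    def by_action(self, action: str) -> list:
        """Return every recorded event with the given action label."""
        return [e for e in self.events if e.action == action]


log = AuditLog()
log.record("jsmith", "view", "DOC-0001")
log.record("jsmith", "print", "DOC-0001")
prints = log.by_action("print")  # all print events, e.g. for a print audit report
```

Recording the action type alongside the user and document makes it possible to answer the print and access questions separately from the same log.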

Audience

Separate from the end user experience is the audience that may need access to the images. Documents created or used by one department may need to be used by many different departments in the organization or externally.

1. Security – Who must see the document for his or her job? Does the user need view-only access? Should the user have the ability to annotate or add notes to an image? Should the user be able to modify the metadata?

2. Redaction – Are there parts of the image that may only be seen by a small group of users? Redacting or obscuring all or part of an image may be needed to protect personally identifiable information. This is particularly important for government organizations that must release data and for any organization that deals with private, restricted or confidential information.

3. Role – What is the role of each group of users? Content creator? Information user? Information manager? Security officer concerned with access to the information? Understanding each group's need for the data helps ensure the appropriate information is available and reduces risk by eliminating unnecessary document access.
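
The security and role questions above can be sketched as a simple role-to-permission mapping. The role names, the `Action` values, and the particular permissions granted to each role are assumptions chosen for illustration; each organization would define its own.

```python
from enum import Enum, auto


class Action(Enum):
    VIEW = auto()
    ANNOTATE = auto()
    EDIT_METADATA = auto()
    VIEW_UNREDACTED = auto()


# Hypothetical roles and permission sets, for illustration only.
ROLE_PERMISSIONS = {
    "information_user": {Action.VIEW},
    "content_creator": {Action.VIEW, Action.ANNOTATE},
    "information_manager": {Action.VIEW, Action.ANNOTATE, Action.EDIT_METADATA},
    "security_officer": {Action.VIEW, Action.VIEW_UNREDACTED},
}


def is_allowed(role: str, action: Action) -> bool:
    """Return True if the role grants the action; unknown roles get no access."""
    return action in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("information_user", Action.VIEW))           # view-only access
print(is_allowed("information_user", Action.EDIT_METADATA))  # not granted
```

Denying access by default for unknown roles reflects the goal of reducing risk by eliminating unnecessary document access.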

Image Formats

When a piece of paper is scanned, the resulting electronic document may be created in many different formats. There is a wide variety of image formats, and each has its own set of characteristics.

For example, TIFF and PDF can each contain image data encoded in different embedded formats, and both allow multiple pages in a single file.

In addition, the format of the resulting image affects storage requirements, image quality, and system performance. Some file compression types lose image quality while reducing storage requirements. Those that do not lose image quality are considered "lossless." Those that do lose image quality are "lossy." Refer to ISO 12033 for details on selecting the appropriate file format.
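
To make the storage impact concrete, the uncompressed size of a scanned page can be estimated from its dimensions, resolution, and bit depth. The sketch below is only an estimate under stated assumptions: the function name and the example page (letter size at 300 dpi) are illustrative, row padding and file headers are ignored, and actual file sizes depend on the compression applied.

```python
def uncompressed_bytes(width_in: float, height_in: float, dpi: int, bits_per_pixel: int) -> int:
    """Estimate the raw (uncompressed) size of a scanned page in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Letter-size page (8.5 x 11 inches) scanned at 300 dpi
bitonal = uncompressed_bytes(8.5, 11, 300, 1)   # 1 bit per pixel (black and white)
color = uncompressed_bytes(8.5, 11, 300, 24)    # 24 bits per pixel (color)

print(bitonal)  # 1051875 bytes, about 1 MB
print(color)    # 25245000 bytes, about 25 MB
```

The 24-fold difference between the two results is why bit depth and compression choice matter so much when sizing storage for a large volume of pages.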

When OCR is a part of the process, TIFF Group 4 is the ideal input format for OCR accuracy. The post-OCR output can be any digital or image format available in the OCR application.

Standard and industry-standard image file formats and file compression schemes to consider include:

Output Formats:
1. TIFF (Tagged Image File Format) –
a) TIFF multibit
b) TIFF Group 4 single bit

2. BMP

3. LZW (a lossless compression scheme used within formats such as TIFF and GIF)

4. PNG

5. JPEG2000

6. JPEG

7. GIF

8. JBIG/JBIG2

9. PDF, PDF/A, PDF/E

Replace list above with table


Metadata and Indexing

The ability to quickly locate and access the images must be considered in system design. Added metadata and indexing enhance the user's ability to quickly locate the desired image and increase the chances of a successful imaging solution implementation.
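
As a rough illustration of how indexed metadata speeds retrieval, the sketch below maps each metadata field and value to the set of images that carry it, so a search becomes a set intersection rather than a scan of every image. The class name `MetadataIndex`, the field names, and the sample image IDs are hypothetical; a real system would normally rely on a database or search engine.

```python
from collections import defaultdict


class MetadataIndex:
    """Maps (field, value) pairs to the set of image IDs carrying them."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, image_id: str, metadata: dict) -> None:
        """Index an image under each of its metadata fields (case-insensitive)."""
        for field_name, value in metadata.items():
            self._index[(field_name, str(value).lower())].add(image_id)

    def find(self, **criteria) -> set:
        """Return IDs of images matching every given field=value pair."""
        matches = [
            self._index.get((field_name, str(value).lower()), set())
            for field_name, value in criteria.items()
        ]
        return set.intersection(*matches) if matches else set()


index = MetadataIndex()
index.add("img-001", {"doc_type": "invoice", "vendor": "Acme", "year": 2010})
index.add("img-002", {"doc_type": "invoice", "vendor": "Globex", "year": 2010})
index.add("img-003", {"doc_type": "contract", "vendor": "Acme", "year": 2009})

print(index.find(doc_type="invoice", vendor="acme"))  # {'img-001'}
```

The more fields that are captured at indexing time, the more precisely a user can narrow a search, at the cost of additional data entry or recognition effort.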

Imaging Architecture

The hardware and network components used in a digital imaging system include:

Create a table with definition, use, required/optional, characteristics

1. Server Hardware

2. Web Server

3. Capture devices

4. User workstation

5. Storage and backup

6. Network

7. Printers

8. Remote Access

9. Handheld Devices

1. – System backups ensure that images will not be lost, and hot sites or server clusters help keep images available. (To fit under Imaging Architecture.)
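
A backup only protects against loss if the copy is intact. One common way to check this, sketched below, is to compare a cryptographic checksum of the original image with that of its backup. The function names and file names are hypothetical, and the example writes small temporary files purely to demonstrate the comparison.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, backup: Path) -> bool:
    """Return True if the backup is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(backup)


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "scan.tif"          # stand-in for a scanned image
    backup = Path(tmp) / "scan_backup.tif"
    original.write_bytes(b"example image bytes")
    backup.write_bytes(original.read_bytes())
    intact = backup_matches(original, backup)  # True: copies are identical

    backup.write_bytes(b"corrupted")           # simulate a damaged backup
    corrupted_detected = not backup_matches(original, backup)
```

Storing the checksum with the image's metadata at capture time would allow later verification without rereading the original.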


The wiki text is available under the Creative Commons Attribution License agreement.