Data Mapping Nuts and Bolts

As the amount of electronically stored information (ESI) grows within an organization, it becomes critical to document the location, accessibility and characteristics of that information for the purposes of risk management, litigation preparedness and ediscovery. The process by which organizations catalog their information is known as data mapping.


Data mapping is gaining increasing attention as organizations seek to comply with the amended Federal Rules of Civil Procedure (FRCP) while streamlining their ediscovery processes to decrease response times, lower costs, limit potential sanctions for underproduction of ESI and ultimately improve litigation success.

As organizations move from ad hoc paper-and-pencil processes and spreadsheets to dedicated data mapping solutions, it is important to review the specific rationale for data mapping and the capabilities required to create, maintain and use a data map effectively.

Rationale for Data Mapping
The rationale for data mapping is derived from several rules in the FRCP. In particular, Rule 26(a)(1)(A) directly specifies the need for a data map. Additionally, data mapping eases the burden of fulfilling other FRCP rules by streamlining the ediscovery process. These rules include:

  1. Data map for delivery to opposing party: FRCP Rule 26(a)(1)(A) specifies that parties must provide each other with “a copy — or a description by category and location — of all documents, electronically stored information, and tangible things that the disclosing party has in its possession, custody, or control and may use to support its claims or defenses.” Additionally, information must be provided on “each individual likely to have discoverable information.”
  2. Meet and confer preparation: FRCP Rule 26(f) specifies that opposing parties must meet within 99 days of the onset of litigation to discuss how ediscovery will be handled. This includes: (a) what ESI is to be covered; (b) how ESI is stored; (c) how the information will be produced; (d) accessibility of the information; and (e) issues related to privileged ESI.
  3. Not reasonably accessible argument support: FRCP Rule 26(b)(2)(B) allows organizations to withhold ESI from sources they identify as “not reasonably accessible because of undue burden or cost,” unless the court orders otherwise for good cause.
  4. Safe harbor and sanction avoidance: Rule 37(e) provides a safe harbor from sanctions for ESI that is “lost as a result of the routine, good-faith operation of an electronic information system.”

At a minimum, data mapping solutions should enable compliance with the above requirements. In the past, organizations have sought to meet these requirements manually using a paper-and-pencil or spreadsheet approach. However, more automated approaches using purpose-built data mapping applications are beginning to gain significant traction, especially in large enterprises. The reason for the move to automated solutions is simple: they decrease the work involved and increase the accuracy of the data map. Automated, scalable, multi-user, workflow-enabled data mapping tools that integrate with end-to-end ediscovery solutions can save thousands of hours of effort.

Data Map Definition
Distilling the key requirements of data mapping yields a simple, easy-to-understand definition:

A data map is a listing of the organization’s ESI by category, location, and custodian or steward, including how it is stored, its accessibility, and associated retention policies and procedures.

Historically, even large companies used spreadsheets to address these requirements. However, as litigation and organizations have grown more complex and the amount of electronic data has grown exponentially, the business case for dedicated software platforms that allow multiple authorized parties to interact with and update the data map has become clear. Today’s litigation simply requires richer, more dynamic, more detailed, and more accurate maps of enterprise data.

Data Map

ESI
- Category
- Repository/Location
- Custodian/Steward
- Storage Method
- Accessibility
- Retention Policies
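
The attributes listed above map naturally onto a simple record type. The following is a minimal sketch in Python, not a description of any particular product: the class name `DataMapEntry`, its field names, the helper `describe_by_category_and_location`, and the sample values are all illustrative assumptions. The helper shows how such records could be rolled up into the kind of "description by category and location" that Rule 26(a)(1)(A) calls for.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataMapEntry:
    """One row of a data map (hypothetical schema; field names are assumptions)."""
    category: str          # type of ESI, e.g. "Email"
    location: str          # repository that holds the ESI
    custodian: str         # custodian or steward responsible for it
    storage_method: str    # how the ESI is stored
    accessible: bool       # whether the source is considered reasonably accessible
    retention_policy: str  # associated retention/disposition rule


def describe_by_category_and_location(entries):
    """Group repository locations under each ESI category, sorted for stable output."""
    grouped = defaultdict(set)
    for entry in entries:
        grouped[entry.category].add(entry.location)
    return {category: sorted(locations)
            for category, locations in sorted(grouped.items())}


# Invented sample data, for illustration only.
SAMPLE_MAP = [
    DataMapEntry("Email", "Mail server", "IT messaging team",
                 "Live mailbox store", True, "Delete after 3 years"),
    DataMapEntry("Email", "Backup tapes", "IT operations",
                 "Offline tape", False, "Rotate after 90 days"),
    DataMapEntry("Contracts", "Document management system", "Legal department",
                 "Versioned repository", True, "Keep 7 years after expiry"),
]

if __name__ == "__main__":
    for category, locations in describe_by_category_and_location(SAMPLE_MAP).items():
        print(f"{category}: {', '.join(locations)}")
```

A spreadsheet row carries the same fields; the advantage of a typed record is that every entry is forced to state its accessibility and retention policy rather than leaving those columns blank.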

Implementation
Data mapping implementation is a multi-disciplinary process involving legal, IT, and records management staff. For the best result, each of these stakeholders must be actively involved in the following high-level steps for creating and maintaining the data map:

  1. Present and review legal requirements that must be met
  2. Review current IT data and repository maps; update as necessary
  3. Review retention and disposition schedules; update as necessary
  4. Interview custodians and stewards to establish real information flow and disposition (e.g. saving email data to local hard drives and CDs)
  5. Create a data map either manually using a spreadsheet or by using automated software
  6. Keep the data map evergreen by establishing a process by which changes in the underlying information are reported and incorporated into the data map
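
As a rough illustration of the final step, the comparison between what the data map records and what the organization currently reports can be expressed as two set differences. This is only a sketch under stated assumptions: the function name `reconcile_data_map`, the result keys, and the repository names are hypothetical, and a real process would also track changes to custodians and retention rules.

```python
def reconcile_data_map(mapped_locations, reported_locations):
    """Compare repositories recorded in the data map with a fresh inventory.

    Both arguments are iterables of repository names (hypothetical inputs,
    e.g. exported from the data map and from an asset inventory).
    """
    mapped = set(mapped_locations)
    reported = set(reported_locations)
    return {
        # Repositories that exist but are not yet in the data map.
        "missing_from_map": sorted(reported - mapped),
        # Repositories in the data map that are no longer reported.
        "no_longer_reported": sorted(mapped - reported),
    }


# Invented sample data, for illustration only.
MAPPED = ["Mail server", "Document management system", "Legacy file share"]
REPORTED = ["Mail server", "Document management system", "Hosted collaboration site"]

if __name__ == "__main__":
    for change, locations in reconcile_data_map(MAPPED, REPORTED).items():
        print(f"{change}: {locations}")
```

Entries flagged as no longer reported should be reviewed rather than deleted automatically, since retired repositories may still hold ESI subject to retention or preservation obligations.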

The complex nature of data mapping within organizations means this process may take several months, or even more than a year, to complete. However, dedicated data mapping software can streamline and accelerate the process by reducing manual effort and enabling real-time updates.

Integration with Basic Systems and End-to-End Ediscovery
Data maps, by their nature, should be integrated with an organization’s basic systems and processes, whether manually or via systems integration. Basic integration enables organizations to keep the key data maintained by the data map up to date, minimizing the administrative effort needed on a case-by-case basis to meet FRCP requirements. Basic integrations include:

  • Repositories, Custodians, and Stewards: Directory servers, HR systems and IT / asset management systems provide the baseline data for repositories and the custodians and stewards. Repositories include all types of devices that hold information, including servers, data management systems, workstations, laptops, PDAs, cell phones, portable media, and 3rd-party hosted data. Descriptions of accessibility should be associated with the repositories.
  • Retention / Disposition Schedules: Retention and disposition schedules should be entered for each type of ESI. Additionally, it is important to understand how the information is actually used inside the organization, especially if that use deviates from the stated retention and disposition plan. For example, if email is officially stored on the email server or in a document management system, but users also copy it to their local hard drives outside of the organization’s retention policy, then that email must be accounted for as well.
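
The email example above amounts to comparing where the retention schedule says a category of ESI lives with where custodians report actually keeping it. The sketch below shows one way to flag such deviations; the names `ObservedCopy`, `find_unscheduled_copies`, `APPROVED`, and `OBSERVED`, along with all sample values, are hypothetical and used only for illustration.

```python
from typing import NamedTuple


class ObservedCopy(NamedTuple):
    """A location where a custodian reports keeping ESI (hypothetical record)."""
    category: str
    location: str
    custodian: str


def find_unscheduled_copies(approved_locations, observed_copies):
    """Return observed copies stored outside the locations the schedule covers.

    approved_locations maps an ESI category to the set of locations that the
    retention schedule accounts for; categories not listed have no approved
    locations, so every observed copy in them is flagged.
    """
    return [
        item for item in observed_copies
        if item.location not in approved_locations.get(item.category, set())
    ]


# Invented sample data, for illustration only.
APPROVED = {"Email": {"Email server", "Document management system"}}
OBSERVED = [
    ObservedCopy("Email", "Email server", "Custodian A"),
    ObservedCopy("Email", "Local hard drive", "Custodian A"),
    ObservedCopy("Email", "CD", "Custodian B"),
]

if __name__ == "__main__":
    for item in find_unscheduled_copies(APPROVED, OBSERVED):
        print(f"{item.category} held by {item.custodian} on {item.location}")
```

Each flagged copy is a candidate for a new data map entry, a policy change, or both.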

In addition to integration with basic systems, other integrations can enable a more complete end-to-end ediscovery solution spanning the entire Electronic Discovery Reference Model (EDRM). In the most efficient scenario, these systems will all exist on a single platform, allowing for seamless integration and eliminating data transfers between systems. Key capabilities that can be integrated with data mapping include:

  • Legal Hold Notice Management: Organizations can benefit from data mapping solutions that provide integrated distribution and tracking of preservation orders and electronic interviews. By providing both capabilities within a centralized end-to-end ediscovery solution, organizations can defensibly demonstrate their compliance with established guidelines and procedures. This functionality allows the data map to be better used in the identification phase of the Electronic Discovery Reference Model (EDRM).
  • Unstructured Content Archiving and Legal Hold: As an inherently long-term data store, archival software should be integrated with data mapping and legal hold notice solutions to provide more efficient identification, collection, preservation, analysis, and review. Archives can provide an ideal basis for a single-platform end-to-end solution due to the scalability requirements for large-scale information management.
  • Early Case Assessment (ECA), Review: Organizations can use data mapping in an integrated solution to more quickly identify potentially relevant ESI for ECA and review. ECA is used to estimate the risk of prosecuting or defending a case early in the litigation process. By using a single-platform, end-to-end ediscovery solution with ECA, the overall ECA process can be accelerated, including the upstream collection and processing stages. By enabling more visibility earlier in the process, companies can make better legal decisions and lower their litigation costs.
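
To make the identification use case above concrete, the sketch below shows how a data map could be queried when a legal hold names specific custodians: it selects the entries those custodians are responsible for and separates sources recorded as not reasonably accessible. It is a simplified illustration, not the behavior of any specific ediscovery product; `Entry`, `scope_legal_hold`, `ENTRIES`, and the sample values are assumptions.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """A pared-down data map row (hypothetical fields)."""
    category: str
    location: str
    custodian: str
    accessible: bool


def scope_legal_hold(entries, hold_custodians):
    """Select data map entries for the custodians named in a hold.

    Custodian names are matched case-insensitively. Results are split by the
    accessibility flag recorded in the data map.
    """
    names = {name.lower() for name in hold_custodians}
    in_scope = [entry for entry in entries if entry.custodian.lower() in names]
    return {
        "accessible": [e for e in in_scope if e.accessible],
        "not_reasonably_accessible": [e for e in in_scope if not e.accessible],
    }


# Invented sample data, for illustration only.
ENTRIES = [
    Entry("Email", "Mail server", "Custodian A", True),
    Entry("Email", "Backup tapes", "Custodian A", False),
    Entry("Contracts", "Document management system", "Custodian B", True),
]

if __name__ == "__main__":
    scope = scope_legal_hold(ENTRIES, ["custodian a"])
    for status, rows in scope.items():
        print(status, [row.location for row in rows])
```

Keeping the accessibility flag in the same record is what lets one query support both the preservation notice and any later argument that a source is not reasonably accessible.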

[Figure: integrated data map illustration]

Summary
Advanced data mapping can greatly help larger organizations effectively operate in the increasingly complex information governance environment that resulted from the amended FRCP. With new data mapping software solutions, organizations have more choices to manage their information to meet both baseline and enhanced requirements. Moreover, by understanding the fundamental requirements driving data mapping, organizations can deploy a solution and develop processes that minimize costs on a case-by-case basis, decrease response times, limit potential sanctions for underproduction of ESI, and ultimately improve their litigation success over time.

John Wang is Product Manager for Unified Archive at ZL Technologies, Inc. He has over 15 years of experience in enterprise software, with particular expertise in ediscovery, archiving, compliance, information management, and information security. His ediscovery experience includes billion-document information management and ediscovery projects. John has led technology innovation and industry best practices in the areas of ediscovery and search through his leadership role in the EDRM Project, information retrieval research in the TREC Legal Track, and research discussions in the ABA Electronic Discovery and Digital Evidence Committee.