Realize a Rapid Return on Investment by Eliminating "Edge Content"
It's expensive to keep all that content on shared drives; here's what to do about it.
Despite the effectiveness of content/document/records management solutions,
only five to 10 percent of unstructured files are actually stored in managed
repositories. The remainder represents a huge portion of your organization’s
knowledge. The few proactive organizations that have tried to apply records
management best practices to shared drives find the effort exhausting. Shutting
down a shared drive is almost as unachievable a goal as creating the paperless
office. Even though the cost of storage continually drops, the cost of managing
this information is growing, particularly for IT staff.
Question: Why should you clean up and migrate off your shared drives onto a
content repository?
Answer: There’s a lot of useless stuff out there, and it costs money to keep
it there.
Let’s look at the three kinds of content on most shared drives.
1. Roughly a quarter to a third of storage space is eTrash—stuff
that has no value to the organization and consists of:
- Temporary files created from crashed applications,
process logs, and automatic backup functions.
- Applications and install files that no longer work.
- Duplicates – but remember that in most cases, one of
the copies should be kept and you will need to find a way to identify which
one.
- Backup – a duplicate or copy of something that is
important.
- Zip files – users zip both files and folders to make
backups or to compress files for emailing. Both of these activities often
result in duplicates because originals are not deleted.
- Expired records – records that have reached their desired or required
retention period but have not been purged. Purging is easier said than done:
many organizations err in keeping everything forever because they are afraid
of destroying files that should be kept longer. The law doesn’t expect
perfection when it comes to records retention, only reasonable effort and good
intentions.
2. Another 25 to 33 percent of storage consists of blocked files that are
probably valuable but that should not be migrated. This information may need to
be preserved, but not in a content repository.
- Grouped documents – nested or grouped folders are sets
of related documents that are linked by their relative position in a folder
hierarchy. They rely on a file being in the right location in the file system:
HTML presentations, compound documents, CAD files, databases, or applications.
It is not that they can’t exist in a content repository; they just cannot be
migrated as they are to a repository since all the links will no longer
work.
- Saved programs – users store applications, install
files, and system files on shared drives for a number of reasons.
Applications, however, need all of their code and referenced files available
in an active state in order to work. A content repository is not that kind of
environment.
- Stored databases – people create individual or workgroup databases that
also need to be in a live state, with access to all referenced tables and
resources, in order to work.
- Archived emails – email inboxes get too big, so staff archive messages to
laptops. Laptops get lost or stolen so users archive on shared network drives as
personal folders, commonly called “PST files” for their “.pst” suffix. These
files can become huge. At a recent client, they constituted one percent of files
by type, but they hogged 80 percent of storage space. All that important
information was “too important” to delete and is now sitting in a PST file,
where it is not easily retrieved, shared, or reused. The PST files themselves
should not be migrated—just the content within.
- Organized media – iPod libraries and company picnic photos are all
worth keeping. However, because of their file size and lack of corporate
value, they should not be migrated or stored on servers that are backed up and
restored in an emergency.
3. Inventory-resistant “stuff.” This is good, valuable, discoverable content
that hasn’t been captured yet because it is hard to categorize and migrate. In
this category, you find:
- Risky content – confidential information such as employee or customer
Social Security numbers or credit card numbers.
- Vital content – content needed to run the
organization, but intermingled with less important content.
- Legal hold – content that should be preserved for litigation purposes.
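Of the categories above, duplicates are the easiest to attack with a script. The following is a minimal Python sketch, not a feature of any particular product. It assumes the shared drive is mounted at a path the script can walk, and the names file_digest and find_duplicates are hypothetical. It only groups byte-identical files; deciding which copy to keep still takes human judgment.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are byte-identical."""
    # Pass 1: group by size, so only same-sized files need to be hashed.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable or vanished file; skip it

    # Pass 2: within each size group, group by content hash.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            try:
                by_hash[file_digest(path)].append(path)
            except OSError:
                continue
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling find_duplicates on a folder returns one list per set of identical files. Zip archives and renamed backups will not match their originals byte for byte, so they need separate handling.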
What are the costs?
Here are some costs for keeping
files on your shared drives.
- Staff productivity costs – information workers waste
3.5 hours each week on searches that don’t turn up the right information. This
is partly due to poorly indexed and tagged documents, and partly due to the
time required to search through up to three times as much stuff as they need
to. For 1,000 staff earning $60K per year, that adds up to $5.25M annually,
assuming a 40-hour work week.
- Producing information for e-discovery – companies
spend roughly $200 per GB for each e-discovery culling case. At one case per
day against 1 terabyte of storage, that comes to $73M per year.
- Network storage costs – a logical and necessary expense, but small in
comparison. Companies spend roughly $10 to $30 per gigabyte per month to save,
back up, and restore content. 1 terabyte = $184,320 per year, a figure that
corresponds to $15 per gigabyte per month.
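The three totals above can be reproduced with simple arithmetic. The Python sketch below is only a worked check, and the function names are hypothetical. The article does not state every input, so the following are assumptions chosen because they reproduce the published totals: a 40-hour work week, 1,000 GB per terabyte and 365 cases per year for e-discovery, and 1,024 GB per terabyte at $15 per gigabyte per month for storage.

```python
def wasted_search_cost(staff, salary, wasted_hours_per_week, hours_per_week=40):
    """Annual salary cost of time lost to fruitless searches.

    hours_per_week=40 is an assumption; the article does not state it.
    """
    return staff * salary * wasted_hours_per_week / hours_per_week


def ediscovery_cost(gigabytes, cost_per_gb, cases_per_year):
    """Annual e-discovery culling cost across all cases."""
    return gigabytes * cost_per_gb * cases_per_year


def storage_cost_per_year(gigabytes, cost_per_gb_month):
    """Annual cost to save, back up, and restore the given volume."""
    return gigabytes * cost_per_gb_month * 12


if __name__ == "__main__":
    # 1,000 staff at $60K, 3.5 wasted hours per week
    print(f"${wasted_search_cost(1000, 60_000, 3.5):,.0f}")   # $5,250,000
    # 1 TB taken as 1,000 GB, $200 per GB, one case per day
    print(f"${ediscovery_cost(1000, 200, 365):,.0f}")         # $73,000,000
    # 1 TB taken as 1,024 GB, $15 per GB per month (assumed rate)
    print(f"${storage_cost_per_year(1024, 15):,.0f}")         # $184,320
```

Swapping in your own headcount, salary, and storage volume gives a rough first estimate for a business case.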
Recommendations: create an information management plan
Organizations need to establish an information management plan that
not only incorporates business goals and current technologies but also
identifies the unstructured information stored on shared drives:
- The first step is to know what you have.
- Don’t make it your goal to migrate all your content to
a repository.
- Provide a means for staff to continue to store all
types of information, but not in the repository.
- Consider records management, disaster recovery
requirements, legal preservation, and IT concerns when cleaning up and
migrating data.
- Before you delete or migrate anything, make sure you have a good
understanding of your legal preservation requirements. You don’t want to
delete information without appropriate protections in place.
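Knowing what you have can start with a simple inventory by file type. The following Python sketch is one illustrative way to do it, assuming the shared drive is mounted at a path the script can read. The function names inventory_by_extension and share_report are hypothetical. It compares each file type's share of the file count with its share of storage, the kind of comparison that exposes a handful of PST files consuming most of a drive.

```python
import os
from collections import defaultdict


def inventory_by_extension(root):
    """Return {extension: (file_count, total_bytes)} for files under root."""
    stats = defaultdict(lambda: [0, 0])
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            # Lowercase so ".PST" and ".pst" are counted together.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # unreadable or vanished file; skip it
            stats[ext][0] += 1
            stats[ext][1] += size
    return {ext: tuple(v) for ext, v in stats.items()}


def share_report(stats):
    """Return (extension, pct_of_files, pct_of_bytes) rows, largest storage first."""
    total_files = sum(count for count, _ in stats.values())
    total_bytes = sum(size for _, size in stats.values())
    rows = []
    for ext, (count, size) in stats.items():
        pct_files = 100 * count / total_files if total_files else 0.0
        pct_bytes = 100 * size / total_bytes if total_bytes else 0.0
        rows.append((ext, pct_files, pct_bytes))
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

An extension-based inventory is only a first pass. It says nothing about whether a file is a record, is under legal hold, or contains confidential data, so it informs the plan rather than replacing it.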
Brian Tuemmler is a director of The Gimmal Group.