Finding the Right Information. Right Now.

The role of metadata, taxonomies, classification, schemas and thesauri


When we need information, typically we need it NOW. We don’t have time to search through every folder and document. The paper-centric world required only that we know the general location of what we needed; as long as we knew the right file drawer, file cabinet, or file room, it was relatively easy to locate the right document. The general nature of a good filing system allowed us to segment our documents in a hierarchical manner by placing smaller folders within larger ones, which could then be arranged via topics, dates, or other criteria.

Despite the fact that organizations today often have most of their information stored in an electronic format, it’s often difficult to locate the exact documents needed. The repositories that contain our information may not be in the same building where we work or even in the same country. It can be difficult if not impossible to know the exact location of the information we’re looking for in an electronic environment. Therefore, it is imperative that we organize our information. Simply replicating the paper-centric file folder model, while helpful, is not enough to truly make finding documents easier.

Luckily, three elements can simplify the process: metadata, taxonomies or classification schemas, and thesauri.

Metadata
Probably the best-known metadata standard is ISO 15836, Information and documentation – The Dublin Core Metadata Element Set. This standard establishes a set of elements for describing a document, but it is not limited to documents; the Dublin Core elements can be used to describe virtually anything. Many years ago, AIIM produced a technical report, ANSI/AIIM TR40, Suggested Index Fields for Documents in Electronic Image Management (EIM) Environments. This technical report established index fields, or metadata, that may be used when indexing electronic images or any other type of document. When documents are identified with accurate metadata, they are more easily retrieved from the repositories.
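To make the idea concrete, here is a minimal sketch of how documents described with Dublin Core elements might be retrieved by their metadata. The fifteen element names come from the standard; the helper functions (make_record, find) and the sample documents are assumptions invented for illustration, not part of any standard or product.

```python
# The 15 elements of the Dublin Core Metadata Element Set (ISO 15836).
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def make_record(**fields):
    """Build a metadata record, rejecting names outside the element set."""
    unknown = set(fields) - DUBLIN_CORE_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return dict(fields)


def find(records, **criteria):
    """Return the records whose metadata matches every given criterion."""
    return [r for r in records
            if all(r.get(k) == v for k, v in criteria.items())]


# Hypothetical sample documents, invented for illustration.
records = [
    make_record(identifier="DOC-001", title="Vendor contract",
                creator="Legal", date="2009-03-14", type="Text"),
    make_record(identifier="DOC-002", title="Site survey scan",
                creator="Facilities", date="2009-05-02", type="Image"),
]

matches = find(records, creator="Legal")
print([r["identifier"] for r in matches])
```

Rejecting unknown field names at indexing time is one simple way to keep metadata consistent, which is what makes retrieval reliable later.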

While the Dublin Core Metadata Element Set is probably the most widely used metadata schema, many others are available. Additionally, an organization may determine that it needs to modify an existing metadata schema to meet its specific needs. An ISO standard under development, ISO/CD 11864, Guidelines for the Creation of a Metadata Crosswalk, will define a method by which an online metadata crosswalk may be designed, built, and implemented.

Taxonomies
A taxonomy improves the metadata by establishing parent-child relationships between the topics that are included in the schema. There are several standards in this area, including:

  • ISO 2788, Documentation – Guidelines for the Establishment and Development of Monolingual Thesauri, which provides recommendations on establishing consistent practices in indexing documents.
  • ISO 5964, Guidelines for the Establishment and Development of Multilingual Thesauri, which extends the guidance in ISO 2788 to multilingual thesauri.
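The parent-child relationships that a taxonomy establishes can be sketched in a few lines. The topic names and the two helper functions below are hypothetical, chosen only to show how a hierarchy lets you move from a specific topic up to its broader categories, or from a category down to everything filed beneath it.

```python
# Hypothetical taxonomy: each topic maps to its parent (None marks the root).
PARENT = {
    "Records": None,
    "Finance": "Records",
    "Invoices": "Finance",
    "Purchase orders": "Finance",
    "Human resources": "Records",
    "Personnel files": "Human resources",
}


def path_to_root(topic):
    """Return the chain of topics from the root down to `topic`."""
    chain = []
    while topic is not None:
        chain.append(topic)
        topic = PARENT[topic]
    return list(reversed(chain))


def descendants(topic):
    """Return every topic filed beneath `topic`, at any depth."""
    found = []
    for child, parent in PARENT.items():
        if parent == topic:
            found.append(child)
            found.extend(descendants(child))
    return found


print(" > ".join(path_to_root("Invoices")))
print(descendants("Finance"))
```

A search for everything under "Finance" can then include documents tagged with any of its child topics, which a flat list of keywords cannot do.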

Thesauri
A thesaurus improves the use of metadata and taxonomies by adding value to the terms through the relationships it identifies, such as equivalent (synonym), broader, narrower, and related terms.
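As a rough illustration of how those relationships help at search time, the sketch below maps a non-preferred term to its preferred term and then adds related terms to the query. The entries, the "use" and "related" keys, and the expand function are all assumptions made up for this example.

```python
# Hypothetical thesaurus entries. "use" points a non-preferred term at its
# preferred term; "related" lists associated terms worth searching as well.
THESAURUS = {
    "bill": {"use": "invoice"},
    "invoice": {"related": ["purchase order", "receipt"]},
    "purchase order": {"related": ["invoice"]},
    "receipt": {"related": ["invoice"]},
}


def expand(term):
    """Return the preferred term followed by its related terms."""
    entry = THESAURUS.get(term, {})
    preferred = entry.get("use", term)
    related = THESAURUS.get(preferred, {}).get("related", [])
    return [preferred] + related


print(expand("bill"))
```

Someone who searches for "bill" is thereby led to documents indexed under "invoice", even though the two words never appear together in the metadata.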

It’s important to remember that the better the metadata are, and the more structured the taxonomy and thesauri are, the easier it will be to find the information you need. As you are determining these for your organization, make sure you consult the standards and get information owners and stakeholders involved in the process so that the metadata and taxonomy will suit their needs. Once you have identified your metadata and defined your taxonomy, make sure that you educate everyone in your organization on them and enforce their use.

Betsy Fanning is AIIM’s director of Standards and Member Services.