Metadata best practices

Metadata documents your data so that users can understand what the data is, how it was created, who created it, and how it may be used. Metadata standards differ among research domains and data types. At the very least, best practice is to provide a minimal set of metadata with your datasets.

Why create metadata?

Metadata provides a description of your datasets, contact information for responsible parties, and information about terms of use. Data producers typically supply all of this information when sharing data with collaborators; metadata formalizes it and links it to your datasets. Metadata may also be required by your funding agency when preparing data for sharing or preservation. It is considered best practice to provide at least a minimal set of metadata for your data products.

What type of metadata?

Metadata standards have proliferated over the past few decades alongside digital datasets. The standards are now so numerous and varied that the question "What type of metadata?" cannot be answered adequately without understanding the specifics of the data objects in question, the parameters for sharing, and the venue from which the data will be shared. Many research domains, research data repositories, and funding agencies have specific requirements for metadata and data documentation. That said, the basic elements that most researchers consider "must-haves" for data documentation have been formalized as elements of most metadata standards.

Dublin Core

The DMSG recommends, at a minimum, using the Dublin Core Metadata Element Set (DCMES) to create metadata for your datasets and data collections. The DCMES is a simple set of 15 elements that help identify and contextualize information objects. In addition, the DMSG would be happy to work with individual researchers or research groups to identify metadata schemas that meet the more specific documentation needs of particular data types.

The elements of the DCMES are:

  1. Title - the title of the data object
  2. Creator - the person or entity responsible for creating the data object
  3. Subject - subject terms or keywords that describe the data object
  4. Description - a brief description, or abstract, of the data object
  5. Publisher - the entity responsible for making the data object available
  6. Contributor - a person or entity who contributed to the creation of the data object
  7. Date - date of creation, publication, or revision of the data object
  8. Type - the type of object. For data this would typically be "dataset"
  9. Format - a description of the format or file type(s) of the data object
  10. Identifier - a permanent identifier used to locate and identify the data object
  11. Language - the language(s) used within the data object (if applicable)
  12. Source - a relational element describing the lineage of the data object
  13. Relation - a relational element describing the relationship of this data object to other objects, collections, or entities
  14. Coverage - describes the spatial and temporal context of the data object
  15. Rights - describes any rights, restrictions, or terms of use
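
As a rough illustration of how the 15 elements above might be recorded in practice, the following Python sketch builds a simple Dublin Core XML record and reports which elements have not been filled in. It uses only the standard library and the DCMES 1.1 namespace. The `metadata` wrapper element, the function names, and all of the example values are hypothetical, chosen for illustration; a repository or funder may expect a different container format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 DCMES elements, in the order listed above.
DCMES_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "language", "source", "relation", "coverage", "rights",
)


def build_dc_record(fields):
    """Return an XML element with one dc:* child per supplied value.

    `fields` maps element names to a string or a list of strings,
    since any DCMES element may be repeated.
    """
    unknown = set(fields) - set(DCMES_ELEMENTS)
    if unknown:
        raise ValueError(f"not DCMES elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("metadata")  # hypothetical wrapper element
    for name in DCMES_ELEMENTS:
        values = fields.get(name, [])
        if isinstance(values, str):
            values = [values]
        for value in values:
            child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            child.text = value
    return record


def missing_elements(fields):
    """List the DCMES elements that have no value in `fields`."""
    return [name for name in DCMES_ELEMENTS if not fields.get(name)]


# Hypothetical example values, for illustration only.
example = {
    "title": "Example stream temperature measurements",
    "creator": "Example Researcher",
    "subject": ["hydrology", "water temperature"],
    "date": "2020-01-01",
    "type": "Dataset",
    "format": "text/csv",
    "rights": "CC BY 4.0",
}

record = build_dc_record(example)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
print("Not yet provided:", missing_elements(example))
```

The list of missing elements is only a prompt for the data producer; it is not a validation failure, because a record can be useful without every element.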

While not all of these metadata elements are necessary, the more information you can provide to potential data users, the better. Additional documentation about the Dublin Core Metadata Element Set can be found below:

  • ANSI/NISO Z39.85-2007 - the ANSI/NISO standard that describes the DCMES; a good starting point for a general description of the metadata elements and their use
  • Dublin Core Metadata Initiative - the organization that developed, maintains, and updates the DCMES, among other activities