Publish & Preserve

Data publication is the process of preparing and disseminating research findings to the scientific community. Data preservation covers the actions and procedures required to keep datasets for a period of time, including data archiving and/or submission to a trusted data repository.

Publishing and preserving your data increases access to your datasets and is recognized as good practice by researchers and institutions. Datasets can be published as scholarly products, either linked to journal articles or as standalone data objects with scholarly value. When published with a digital object identifier (DOI) or permanent URL, datasets can be easily discovered and cited in other researchers' work, increasing the value and impact of your research and supporting research integrity.

Sometimes it is not possible to publish data, for example because of privacy restrictions on a dataset or an embargo on unpublished research; however, it is still important to preserve and archive your data for long-term access.

When preparing data for publication and preservation, it is important to take the following into consideration:

  • File formats for long-term access: The file format in which you save your data will influence the ability to share and re-use your data.  You will need to plan for both hardware and software obsolescence.  Save datasets in multiple open, documented formats, when possible, to ensure long-term preservation.
  • Metadata standards: Metadata describes data in a standardized way and gives it context, including the who, what, where, and when of data creation and the methods of use. It also provides the means for discovery (including a bibliographic citation) and reuse.
  • Copyright, privacy, and confidentiality: It is important to establish ownership of the data before you preserve and publish. There are also ethical concerns surrounding data, and it is important to maintain the confidentiality of research subjects. CMU participates in the Collaborative IRB Training Initiative (CITI). Make sure you have considered the implications of sharing data, especially for research involving human subjects.
  • Publisher and funder requirements:  Some publishers and funders have specific requirements for long-term access to research data.  It is important to understand the requirements prior to publication.
  • Repositories: There are subject-specific and institutional repositories available for depositing and publishing data. Tools such as Databib can help you identify appropriate places to archive or publish.
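
As an illustration of the file format and metadata points above, the following is a minimal Python sketch that saves one small dataset in two open, documented formats (CSV and JSON) and writes a metadata file alongside them. It is only a sketch under stated assumptions: the function names (export_dataset, sha256_of), the file names, and the metadata fields (title, creator, created, description) are hypothetical and do not follow any formal metadata standard. A repository or funder may require a specific schema, so check their requirements before depositing.

```python
import csv
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, for later integrity checks."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def export_dataset(rows, fieldnames, metadata, out_dir, stem="dataset"):
    """Write rows as CSV and JSON, plus a JSON metadata file with checksums.

    All names and metadata fields here are illustrative assumptions,
    not part of any formal metadata standard.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Open format 1: plain-text CSV with a header row.
    csv_path = out / f"{stem}.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # Open format 2: JSON, which keeps numeric types as numbers.
    json_path = out / f"{stem}.json"
    json_path.write_text(
        json.dumps(rows, indent=2, ensure_ascii=False), encoding="utf-8"
    )

    # Metadata file: descriptive fields plus a checksum for each data file.
    record = dict(metadata)
    record["files"] = [
        {"name": p.name, "sha256": sha256_of(p)} for p in (csv_path, json_path)
    ]
    meta_path = out / f"{stem}.metadata.json"
    meta_path.write_text(
        json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return csv_path, json_path, meta_path


if __name__ == "__main__":
    # Made-up example values, written to a temporary directory.
    example_rows = [
        {"site": "A", "temp_c": 12.5},
        {"site": "B", "temp_c": 14.1},
    ]
    example_metadata = {
        "title": "Example temperature readings",
        "creator": "Example Researcher",
        "created": "2014-01-01",
        "description": "Illustrative data only.",
    }
    with tempfile.TemporaryDirectory() as tmp:
        for path in export_dataset(
            example_rows, ["site", "temp_c"], example_metadata, tmp
        ):
            print(path.name)
```

Recording a checksum for each file is one simple way to confirm later that an archived copy has not changed. Keeping the metadata in a separate plain-text file means it stays readable even if the software used to create the data becomes obsolete.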

Decide What to Preserve
Researchers should consider all elements of the scientific process in deciding what to preserve. DataONE presents these best practices on deciding what to preserve.

Copyright, Privacy, and Confidentiality
The DMPTool provides guidance on assessing these issues and can help in making decisions about sharing research data.

How and why researchers share data (and why they don’t)
2014 Wiley survey on researcher views of data sharing.