Data Management Plans

Overview

Many funding agencies and sponsors, public and private, now require a Data Management Plan covering the entire research data lifecycle as part of grant proposals. These plans typically state what data will be created and how, specify who is responsible for those data, and outline the plans for preservation and sharing, noting what is appropriate to share given the nature of the data and any restrictions that may need to be applied.

Questions to ask yourself about the project:

Managing data before research begins and throughout their lifecycle helps ensure their usability, long-term preservation, and access. This checklist is intended to help you plan, evaluate data needs, and develop a strong Data Management Plan. Not all questions will apply to every project.

  • What types of data will be produced? Will they be reproducible? What would happen if they were lost or became unusable later?
  • How much data will there be and at what growth rate? How often will the data change?
  • Who will use your data now, and later?
  • Who in your research group controls the data (PI, student, lab, Mines, funder)?
  • How long will the data be active?
  • What directory and file naming convention will be used?
  • What project and data identifiers will be assigned?
  • What file formats are to be used? Are they long-lived?
  • What is your data storage and backup strategy?
  • When will you publish the data (research) and where?
  • Who might be interested in your data in the future? Who will you share it with?
  • Who in your research group will be responsible for data management and archiving?
  • Have you identified a repository or archive in which to deposit your data?
  • Is there an ontology or other community standard for data sharing/integration?
  • How will you prepare the data (if necessary) for archiving?
  • Is there good project and data documentation?
  • How long should the data be retained or archived (e.g. 3-5 years, 10-20 years, permanently)?
  • Are there tools or software needed to create/process/visualize the data?
  • Are there special privacy or security requirements (e.g. personal data, high-security data)?
  • Are there sharing requirements (e.g. funder data sharing policy)?
  • Are there other funder requirements (e.g. data management plan in proposal)?
  • Are there regulations, copyright, or other licensing concerns related to sharing the data?
  • Do the data need to be restricted or embargoed for intellectual property reasons?
  • Are there any reasons to not allow re-use?

Researcher Responsibilities

Overview

The researcher is the primary custodian and manager of research data. These duties are best described in a well-crafted data management plan.


Researcher Responsibilities

  • Develop a well-crafted Data Management Plan (DMP) that addresses data needs, data collection, management, integrity, confidentiality, retention, sharing, archiving and preservation
  • Use an orderly system of data organization and communicate the chosen system to all members of a research group and to the appropriate administrative personnel
  • Enact a robust backup plan to ensure data are not lost or destroyed
  • Provide metadata descriptions for discovery of the data
  • Deposit data to
    • an appropriate domain-specific subject repository
    • School repository
    • School computer servers
    • cloud services (if deemed appropriate and approved as part of the DMP)
    • another institution’s computer servers (if deemed appropriate and approved as part of the DMP)
  • Make the data available for access and re-use where appropriate and under appropriate safeguards based on parameters of the sponsor and Mines
  • Determine how long the data should be available
  • Do not hand over exclusive rights to reuse or publish research data to commercial publishers or agents without retaining the rights to make the data openly available for re-use, unless this is a condition of funding
  • See Table of Responsibilities Timeline

Institution Responsibilities

Overview

Both Mines and the researcher have responsibilities and rights concerning access to, use of, and maintenance of original research data. According to Steneck (2003), and based on regulation (OMB Circular A-110, Sec. 53; 42 CFR Part 50, Subpart A), research funding is awarded to research institutions and not to individual investigators. As recipients of funds, institutions have responsibilities for overseeing budgets, regulatory compliance, and the management of data.

Mines policy grants ownership of research data and materials to the school. Except when precluded by specific terms, Mines is responsible for retaining research data in sufficient detail and for an adequate period of time to enable appropriate responses to questions about accuracy, authenticity, primacy and compliance with applicable laws and regulations. Mines can be held accountable for the integrity of the data even after the researchers have left.


Institution Responsibilities

The responsibilities of Mines are distributed across CCIT, Library, and Research Administration and Technology Transfer. Mines’ responsibilities include:

  • Accountability for the proper maintenance and availability of primary research data created or collected by Mines personnel, including accountability for the integrity of the data even after the researchers have left.
  • Managing research data to the highest standards throughout the research data lifecycle as part of the institution’s commitment to research excellence.
  • Providing training, support, advice and, where appropriate, guidelines and templates for research data management and Data Management Plans.
  • Providing mechanisms and services for storage, backup, deposit, and retention of research data assets in support of current and future access, during and after the completion of research projects.
  • See Table of Responsibilities Timeline

Research Council Role

In coordination with Research Support Services, Research Council will:

  • Define research data
  • Define policies and best practices
  • Work through issues related to research data management, such as the development of metadata, identification of support and assistance needs, and the designation of responsibilities and roles

Steneck, N. H. (2003). ORI Introduction to the Responsible Conduct of Research. Department of Health and Human Services.

Sponsor Requirements

Overview

Different sponsors have different data management requirements, although most federal agencies now require Data Management Plans that address how the proposed project will comply with the funder’s data management, dissemination and sharing policies. It is the researcher’s responsibility to determine whether the sponsor has requirements and to understand them. One requirement common to most funding bodies is a Data Management Plan (DMP). A DMP is a brief document, usually two pages or less, that outlines how the researcher will collect, organize, manage, store, secure, back up, preserve, and share research data.

The following outline is meant to provide general guidance for data management planning and is not specific to any particular funder, discipline, or type of data. Prospective researchers should always review the requirements of the funder or contact Research Support Services for advice.


Description

Provide a general description of the data expected to be produced over the course of the project.

  • Begin with a brief and general description of the data (including code or software, if appropriate) your project will produce. This should be a non-technical description that provides a general idea of what data will be generated throughout the research project.
  • Indicate which data you will share and at what stage (raw, processed, reduced, or analyzed).
  • Describe why the data you will share will be of interest to a broader community (what the impact will be) and how your plan maximizes that potential impact.
  • List staff/organizational roles and their responsibilities for carrying out the data management plan

Content and Format

Describe data collection and processing plans, including data file and metadata formats or standards.

  • Describe the collection and processing for data collected over the course of the project.
  • Describe in general any descriptive or analytical statistics that will be run against the data for quality assurance, derivation, aggregation, etc.
  • Identify the formats of data files created over the course of the project.
  • Select file formats for sharing and archiving that maximize the potential for reuse and longevity, and describe the plans for conversion to those formats, if necessary.
  • Identify metadata (documentation) standards to be used. If no applicable standards exist, indicate this in the data management plan and describe what supplementary documentation you will make available to make publicly shared data understandable and usable by others.
  • Indicate who will create metadata, and at what stage of the research project metadata will be created.

Protection and Intellectual Property

Plans for ensuring the security of data and the protection of privacy, and policies related to intellectual property.

Protection

Researchers may have ethical or legal obligations to maintain confidentiality and to protect the privacy of research subjects, or may have other circumstances requiring secure data storage or restricted access to data, such as licensing restrictions that prohibit data sharing.

  • Describe how the data itself will be managed (e.g. measures taken to anonymize data, disposition of data including personally identifiable information, etc.) to protect privacy.
  • Describe how data will be stored, if secure storage and/or restricted access are required.
  • Some funding agencies (including the NSF) recognize that legal and ethical requirements may preclude sharing of some kinds of data. If this is the case, explain in your data management plan the circumstances that prevent you from sharing data.

Intellectual Property

Copyright protection does not necessarily extend to data (under US copyright law, data are considered “facts” and therefore not copyrightable), but some standard licensing options (Creative Commons, Open Data Commons) exist. Most metadata standards accommodate rights or usage statements where conditions for reuse may be expressed. Note that some funding agencies (including the NSF) recognize that commercialization potential may delay or preclude data sharing, and exempt trade secrets and commercial information from the data sharing requirement.

  • Describe any standard licenses that will be applied to data, as well as any additional terms of use.
  • If not using standard licenses, describe the terms under which data may be used, and policies for the creation and distribution of derivative works. Explain how those terms will be communicated (for example, they will be included in the metadata, or in a stand-alone document).

Access

Plans and infrastructure for storing and providing access to data.

  • Describe which data will be shared. Indicate whether there are any ethical or privacy issues related to data sharing.
  • Describe the plan and resources (e.g. hardware, campus services, commercial services, or disciplinary data centers) for storing and providing access to data.
  • Describe the mechanism and policies for access, including any potential restrictions, the rationale behind them, and applicable timeline.
  • Indicate how the strategy for providing access will maximize the value of the data to the audiences of interest (a particular research community, the general public, etc.).

Preservation and Transfer of Responsibility

Plans for preserving data in a usable form.

  • Identify the data that will be preserved beyond the end of the project, including selection rationale if only some data will be preserved.
  • Describe the transformations that will be necessary to prepare data for preservation and/or data sharing (e.g., data cleaning, normalization, or removing personally-identifying information where appropriate).
  • Indicate which archive, repository, central database, or data center you have identified as a place to deposit the data and describe plans for transfer of responsibility.
  • Indicate when the data will be made available and how long the data will be kept beyond the life of the project. Many grant funders suggest that the minimum data retention period for research data should be 3 years after conclusion of the grant award or 3 years after the data are released to the public (whichever is later).
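
The "whichever is later" retention rule in the last item can be worked out as simple date arithmetic. The following is a minimal Python sketch, not a funder-approved calculation: the function names, the example dates, and the handling of a February 29 start date are assumptions made for illustration, and the 3-year figure is only the commonly suggested minimum, so check the actual funder policy.

```python
from datetime import date
from typing import Optional


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Only raised for Feb 29 when the target year is not a leap year.
        return start.replace(year=start.year + years, day=28)


def minimum_retention_date(
    award_end: date,
    public_release: Optional[date] = None,
    years: int = 3,  # assumed default; the funder's policy takes precedence
) -> date:
    """Earliest date on which the suggested minimum retention period ends.

    The period runs `years` after the end of the award or `years` after the
    public release of the data, whichever is later.
    """
    candidates = [add_years(award_end, years)]
    if public_release is not None:
        candidates.append(add_years(public_release, years))
    return max(candidates)


# Hypothetical project: award ends 31 Aug 2024, data released 15 Feb 2025.
print(minimum_retention_date(date(2024, 8, 31), date(2025, 2, 15)))  # 2028-02-15
```

If the data are never released publicly, leave out the second argument and the date is counted from the end of the award only.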

Costs

Plans for covering data management costs.

  • Check with the funding agency to determine where in the proposal to include costs related to data management.
  • If appropriate, include any costs for data management services (data storage, etc.) from internal School or external sources.
  • If appropriate, include any costs for managing data during the course of the project as well as after the project is complete (staff time, etc.).

Ownership and Intellectual Property

Overview

The general philosophy of Mines is that research data should be made available for access and re-use where appropriate and under appropriate safeguards. Open access to research data from public funding should be easy, timely, and user friendly. However, not all data can or should have unrestricted access. Availability of certain data may need to be restricted due to confidentiality, contractual or other issues. Note that federal programs may differ in their definitions of restricted, confidential, sensitive, and classified data. Data ownership and control issues can also be complex in some situations.


Mines Policy

Mines policy grants ownership of research data and materials to the school. The researcher generally retains the rights and responsibilities over control and licensing of data and related materials. Distribution is also at the discretion of the principal investigator, based on the Data Management Plan, if it exists, and barring any limits imposed by confidentiality agreements or funding agency restrictions.


Sharing Data Appropriately

Preserving data in data centers or repositories that are managed by trusted entities for long-term access is a common and perhaps preferred way to share data. Other options are to share directly with colleagues via email or collaborative networks. There are a number of important issues to consider when planning for data sharing, such as:

• Does the research project have sufficient permissions necessary to disseminate the project data?
• Does the project need to provide access to all the data produced under a grant?
• Do the data include any private information, medical information, or other information with possible confidentiality concerns?
• Would the project like Attribution/Acknowledgment to be required or requested?
• Would the project like to receive information regarding the use of the project data by users?
• Would the project like to provide permission for users to redistribute project data under certain conditions?

Exclusive rights to reuse or publish research data should not be handed over to commercial publishers or agents without retaining the rights to make the data openly available for re-use, unless this is a condition of funding.


Legal, Technical and Social Aspects of Open Data

When applying a license to your own data, you are encouraged to make it as open as appropriate, to enable others to use and build on your data. See the Open Data Handbook for more information on the legal, social and technical aspects of open data.