Data management requirements vary by sponsor, although most Federal agencies now require a Data Management Plan (DMP) that addresses how the proposed project will comply with the funder’s data management, dissemination, and sharing policies. It is the researcher’s responsibility to determine whether the sponsor has requirements and to understand them. A DMP is a requirement common to most funding bodies. It is a brief document, usually two pages or fewer, that outlines how the researcher will collect, organize, manage, store, secure, back up, preserve, and share research data.
The following outline is meant to provide general guidance for data management planning and is not specific to any particular funder, discipline, or type of data. Prospective researchers should always review the requirements of the funder or contact Research Support Services for advice.
- For the latest updates related to open access, please reference the following: OSTP Public Access Policy Forum
- For more detailed requirements by agency, please reference the following: Implementation of Public Access Programs in Federal Agencies
Provide a general description of the data expected to be produced over the course of the project.
- Begin with a brief and general description of the data (including code or software, if appropriate) your project will produce. This should be a non-technical description that provides a general idea of what data will be generated throughout the research project.
- Indicate which data you will share and at what stage (raw, processed, reduced, or analyzed).
- Describe why the data you will share will be of interest to a broader community (what the impact will be) and how your plan maximizes that potential impact.
- List staff/organizational roles and their responsibilities for carrying out the data management plan.
Content and Format
Describe data collection and processing plans, including data file and metadata formats or standards.
- Describe how data will be collected and processed over the course of the project.
- Describe in general terms any descriptive or analytical statistics that will be run against the data for quality assurance, derivation, aggregation, etc.
- Identify the formats of data files created over the course of the project.
- Select file formats for sharing and archiving that maximize the potential for reuse and longevity, and describe the plans for conversion to those formats, if necessary.
- Identify metadata (documentation) standards to be used. If no applicable standards exist, indicate this in the data management plan and describe what supplementary documentation you will make available to make publicly shared data understandable and usable by others.
- Indicate who will create metadata, and at what stage of the research project metadata will be created.
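As one illustration of the metadata items above, the sketch below builds a small machine-readable record to accompany a data file. The keys follow Dublin Core element names (title, creator, description, date, format, rights), chosen here only as an example; the function name, project title, and every value are hypothetical placeholders, and the appropriate standard for a given project depends on the discipline and the funder.

```python
import json


def build_metadata_record(title, creator, description, date_created,
                          file_format, rights):
    """Return a dict keyed by Dublin Core element names.

    All arguments are plain strings; nothing is validated here.
    """
    return {
        "title": title,
        "creator": creator,
        "description": description,
        "date": date_created,      # ISO 8601 date string
        "format": file_format,     # media type of the data file
        "rights": rights,          # license or usage statement
    }


# Hypothetical example values, for illustration only.
record = build_metadata_record(
    title="Example survey responses",
    creator="Example Research Group",
    description="Processed, de-identified survey responses.",
    date_created="2024-01-15",
    file_format="text/csv",
    rights="CC BY 4.0",
)

# Serialize so the record can be stored next to the data file.
metadata_json = json.dumps(record, indent=2)
print(metadata_json)
```

Keeping the record in a plain, open format such as JSON is one way to make the documentation itself as reusable as the data it describes.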
Protection and Intellectual Property
Plans for ensuring the security of data and the protection of privacy, and policies related to intellectual property.
Researchers may have ethical or legal obligations to maintain confidentiality and to protect the privacy of research subjects, or may have other circumstances requiring secure data storage or restricted access to data, such as licensing restrictions that prohibit data sharing.
- Describe how the data itself will be managed (e.g. measures taken to anonymize data, disposition of data including personally identifiable information, etc.) to protect privacy.
- Describe how data will be stored, if secure storage and/or restricted access are required.
- Some funding agencies (including the NSF) recognize that legal and ethical requirements may preclude sharing of some kinds of data. If this is the case, explain in your data management plan the circumstances that prevent you from sharing data.
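The privacy measures above can be sketched in code. The example below is a minimal illustration, not a complete de-identification procedure: it drops direct identifiers and replaces a participant ID with a keyed hash (HMAC-SHA256) so that records can still be linked without exposing the original ID. The column names, the key, and the sample rows are all hypothetical. This is pseudonymization rather than full anonymization; remaining fields may still allow re-identification, so any real procedure should be reviewed against the applicable ethical and legal requirements.

```python
import hashlib
import hmac

# Hypothetical columns treated as direct identifiers and removed outright.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymize(records, secret_key, id_field="participant_id"):
    """Return copies of the records with identifiers removed.

    The ID field is replaced by a keyed hash so the same participant
    always maps to the same pseudonym under a given key.
    """
    cleaned = []
    for row in records:
        digest = hmac.new(
            secret_key,
            str(row[id_field]).encode("utf-8"),
            hashlib.sha256,
        ).hexdigest()
        kept = {
            key: value
            for key, value in row.items()
            if key not in DIRECT_IDENTIFIERS and key != id_field
        }
        kept["pseudonym"] = digest[:16]  # shortened for readability
        cleaned.append(kept)
    return cleaned


# Hypothetical sample data and key, for illustration only.
raw_records = [
    {"participant_id": 101, "name": "A. Person", "email": "a@example.org", "score": 7},
    {"participant_id": 102, "name": "B. Person", "email": "b@example.org", "score": 9},
    {"participant_id": 101, "name": "A. Person", "email": "a@example.org", "score": 8},
]
shareable = pseudonymize(raw_records, secret_key=b"replace-with-a-secret-key")
```

The key must be stored securely and separately from the shared data, since anyone holding it can recompute the pseudonyms from known IDs.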
Copyright protection does not necessarily extend to data (under US copyright law, data are considered "facts" and therefore not copyrightable), but some standard licensing options (Creative Commons, Open Data Commons) exist. Most metadata standards accommodate rights or usage statements where conditions for reuse may be expressed. Note that some funding agencies (including the NSF) recognize that commercialization potential may delay or preclude data sharing, and exempt trade secrets and commercial information from the data sharing requirement.
- If not using standard licenses, describe the terms under which data may be used, and policies for the creation and distribution of derivative works. Explain how those terms will be communicated (for example, they will be included in the metadata, or in a stand-alone document).
Plans and infrastructure for storing and providing access to data.
- Describe which data will be shared. Indicate if there are any issues, such as ethical or privacy concerns, related to data sharing.
- Describe the plan and resources (e.g., hardware, campus services, commercial services, or disciplinary data centers) for storing and providing access to data.
- Describe the mechanism and policies for access, including any potential restrictions, the rationale behind them, and the applicable timeline.
- Indicate how the strategy for providing access will maximize the value of the data to the audiences of interest (a particular research community, the general public, etc.).
Preservation and Transfer of Responsibility
Plans for preserving data in a usable form.
- Identify the data that will be preserved beyond the end of the project, including selection rationale if only some data will be preserved.
- Describe the transformations that will be necessary to prepare data for preservation and/or data sharing (e.g., data cleaning, normalization, or removing personally identifying information where appropriate).
- Indicate which archive, repository, central database, or data center you have identified as a place to deposit the data and describe plans for transfer of responsibility.
- Indicate when the data will be made available and how long the data will be kept beyond the life of the project. Many grant funders suggest that the minimum data retention period for research data should be 3 years after conclusion of the grant award or 3 years after the data is released to the public (whichever is later).
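The "whichever is later" retention rule described above amounts to simple date arithmetic, sketched below. The function names and the example dates are hypothetical, and the 3-year default reflects only the commonly suggested minimum mentioned above; the actual period should be taken from the funder's policy.

```python
from datetime import date


def add_years(d, years):
    """Return the same calendar day a number of years later.

    February 29 falls back to February 28 when the target year
    is not a leap year.
    """
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def earliest_disposal_date(award_end, public_release, retention_years=3):
    """Return the end of the minimum retention period.

    The period runs from the later of the award end date and the
    public release date.
    """
    return add_years(max(award_end, public_release), retention_years)


# Hypothetical dates, for illustration only.
award_end = date(2026, 8, 31)
public_release = date(2027, 2, 15)
retain_until = earliest_disposal_date(award_end, public_release)
print(retain_until.isoformat())  # prints 2030-02-15
```

Because the release date in this example is later than the award end date, the retention period is counted from the release date.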
Plans for covering data management costs.
- Check with the funding agency to determine where in the proposal to include costs related to data management.
- If appropriate, include any costs for data management services (data storage, etc.) from internal School or external sources.
- If appropriate, include any costs for managing data during the course of the project as well as after the project is complete (staff time, etc.).