What are the essential components of data management?
- Plan for data management when you start your research project
- Organize your data (preferably according to a schema using established data and metadata standards)
- Document your data so that it can be understood in context later
- Store data with reuse and security in mind — keep original data files, use version control, and back up data in multiple locations
- Secure your data by following all cybersecurity protocols, based on your data's risk
- Validate your data, and assess for data quality
- Share your data
- Cite your data
Learn more in this Research Data Management guide.
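The storage and validation items in the list above (keeping original files, backing up in multiple locations, and checking data quality) can be partly supported with checksums. The following is a minimal sketch, not a Yale-recommended tool: it assumes your original data and one backup copy are both reachable as local directories, and the function names `sha256_of`, `build_manifest`, and `compare_copies` are hypothetical names chosen for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def compare_copies(original_dir: Path, backup_dir: Path) -> list[str]:
    """List files from the original that are missing or changed in the backup."""
    original = build_manifest(original_dir)
    backup = build_manifest(backup_dir)
    return sorted(
        name for name, checksum in original.items()
        if backup.get(name) != checksum
    )
```

Calling `compare_copies(Path("raw_data"), Path("backup/raw_data"))` (both paths are placeholders) returns an empty list when every original file has an identical copy in the backup. Note that this only detects missing or altered files; it does not replace version control or a review of the data's contents.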
What is research data management?
Research data management is the care and maintenance of data produced during research. It starts when your project starts, continues through the end of the project, and sometimes extends beyond it. In summary, it involves planning, organizing, documenting, storing, securing, assessing, citing, and sharing your data alongside your research.
Why should you care about research data management?
Good research data management helps you:
- Find, analyze, and reuse your own data — even within your own team
- Communicate your data to others
- Stay publication-ready
- Share your data for reuse
- Contribute to the scientific record
- Stay compliant with institutional, funder, and publisher requirements
What are Yale's policies regarding data management?
Many of Yale's pertinent policies are summarized below:
Policy | Summary |
---|---|
Research Data and Materials Policy | From Yale's Office of the Vice Provost for Research, this policy applies to all research data and materials generated with Yale resources, and covers data ownership, retention, transfer, sharing, and access policies. Notable points include that (1) Yale owns the data and Yale researchers are responsible for managing it; (2) data and materials must be retained for at least three years after publication or final reporting; and (3) Yale researchers must make their data publicly available "to the extent feasible while minimizing harm." |
Data Classification Policy | From Yale's Information Technology Services (ITS), this policy explains data risk level definitions and how to choose secure data systems based on the data's risk level. For more assistance, read the policy guidelines and minimum security standards, and take the data classification questionnaire to determine your data's risk. |
Other Related Policies | Depending on the nature of your project, we also recommend you consult the following about relevant data policies: the Office of Sponsored Projects (OSP), the Human Research Protection Program (HRPP, which also covers IRB and HIPAA policies), the University Privacy Office, and your funder (see below). |
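As a small worked example of the three-year minimum retention period noted in the table above, the sketch below computes the earliest date on which that minimum would be met. It is only an illustration under stated assumptions: the function name `earliest_retention_end` is hypothetical, the event date is whichever of publication or final reporting applies to your project, and funders or other policies may require longer retention.

```python
from datetime import date


def earliest_retention_end(event: date, years: int = 3) -> date:
    """Return the date `years` after `event` (publication or final report)."""
    try:
        return event.replace(year=event.year + years)
    except ValueError:
        # The event fell on Feb 29 and the target year is not a leap year;
        # round forward to Mar 1 so the full period is still covered.
        return date(event.year + years, 3, 1)
```

For instance, `earliest_retention_end(date(2023, 6, 2))` gives `date(2026, 6, 2)`. Treat the result as a lower bound, not a disposal date.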
What are funder policies regarding data management?
The table below summarizes basic data management information for several major funders. Most government agencies require data management plans, and data sharing upon project completion. Though we make an effort to keep this information updated, please also consult your funder's own guidance before moving forward with an application.
Funding Organization | Data management plan required? | DMPTool template available? | Additional Information |
---|---|---|---|
U.S. National Institutes of Health (NIH) | Yes | Yes | The NIH Data Management and Sharing Policy was updated on January 25, 2023. Get more information about the 2023 policy from Yale's Office of Sponsored Projects (OSP). |
U.S. National Science Foundation | Yes | Yes | Requirements can vary depending on the scientific concentration. |
U.S. Department of Defense | Yes | Yes | |
U.S. Department of Energy | Yes | Yes | Requirements can vary across different offices, such as the Office of Science and Office of Energy. |
United Kingdom Research & Innovation (UKRI) Councils | Yes - for BBSRC. | No | Requirements differ across councils such as the Medical Research Council (MRC), Biotechnology and Biological Sciences Research Council (BBSRC), and Engineering and Physical Sciences Research Council (EPSRC). |
Find more information about research data sharing initiatives from a variety of public and private funders via SPARC.
U.S. Federal Agency Policy Changes Coming Soon
Additionally, you may want to review the 2022 memo from the White House Office of Science and Technology Policy (OSTP), "Ensuring Free, Immediate, and Equitable Access to Federally Funded Research." Find all U.S. government open science announcements on this page.
Publisher Policies
Increasingly, publishers are also creating data policies. Check with your publisher and journal of choice to learn more, and review data policies at Nature and PLOS ONE to understand current expectations.
How do I find a data repository to share my data in?
Increasingly, funders and publishers want researchers to share their data in a data repository. This is because data repositories offer features that usually surpass what a traditional or enterprise data storage service, such as OneDrive or Box, can offer. These features include data curation, creation of permanent identifiers (such as digital object identifiers, or DOIs) that allow for bibliographic indexing, security, use metrics, auditing, and controlled access. Read more about the desirable characteristics of data repositories on this NIH Scientific Data Sharing page.
To find a suitable repository for your data, consider one of the following tools or sites (and scroll down this page to see more resources and tools):
- Data Repository Finder | National Library of Medicine (NLM)
- DataWorks! Help Desk - Finding a Repository | Federation of American Societies for Experimental Biology (FASEB)
- Dataset Catalog | National Library of Medicine (NLM)
Additionally, consider browsing related library resources, including how to find data (which often leads to data repositories) and more about where to share data.
Get help writing a data management plan
Sign up for the DMP email course
More and more funders require you to submit a data management (and sharing) plan with your grant proposal. Get step-by-step guidance on how to compile one in our new email course, “How to Write a Data Management Plan.” Sign up now!
In this six-part email course, you will explore the main components of a data management plan. By the end, and through a series of three action items, you’ll complete a draft data management plan, ready to submit to a funder or to put into use within your research team.
After sign-up, you can expect your first email to arrive within a few days, and the six main parts of the course to follow over a period of about two weeks. In total, the course takes at most three weeks.
View past Research Data Management workshops
Research Data Management Tools & Topics - Held on 2023-02-16 for Love Data Week
Fulfilling New Data Management & Sharing Expectations: The New NIH Policy and Beyond - Held on 2023-06-02 at the Janeway Society First Friday Seminar
Additional Resources
- Data management made simple | Nature
- Ten simple rules for the care and feeding of scientific data | PLOS Computational Biology
- Ten simple rules for maximizing the recommendations of the NIH data management and sharing plan | PLOS Computational Biology
- Ten simple rules for creating a good data management plan | PLOS Computational Biology
- The FAIR guiding principles for scientific data management | Scientific Data
- Data organization in spreadsheets | American Statistician
- Selecting a data repository | National Institutes of Health
- Generalist repository comparison chart | Zenodo
- DataWorks! Help Desk Knowledge Base | Federation of American Societies for Experimental Biology (FASEB)
- RDMkit | ELIXIR
Popular Data Management Tools
- DMPTool — Free for Yale users, this data management plan (DMP) generator has templates for most major funders, including NIH and NSF. DMPTool guides you through plan completion (e.g., with policy information and sample language), then lets you download your plan in multiple formats. DMPTool also lists the plans that users have chosen to make public, which is useful if you're looking for sample plans to review.
- StorageFinder — This in-house Yale tool helps you find and compare data storage options at and across Yale.
- FAIRsharing.org — This website allows you to search for relevant data and metadata standards and policies across many subject areas.
- re3data.org — This registry of data repositories allows you to search for places to deposit data (and find data to reuse).
- Dryad — This digital repository enables finding and depositing of data. Yale is an institutional member of this service, which means you can deposit data in Dryad for free.
- LabArchives — Licensed by Yale and free for those with a Yale NetID, this cloud-based electronic lab notebook (ELN) allows users to store and manage data in one place.
- REDCap (for Yale medical campus in general | for Yale-New Haven Hospital) — A secure web application for building and managing online surveys and databases.
- YSM Grant Library — Based within the Office of Physician-Scientist and Scientist Development, the Yale School of Medicine Grant Library serves as a model of successful grantsmanship, and currently holds 100+ grants. Access to the library is restricted to Yale faculty, trainees, and students.