Research Data

Supporting data users across the Yale medical campus through instruction, collaboration, and consultation


Get in Touch

Kaitlin Throgmorton, MLIS
Data Librarian for the Health Sciences

Frequently Asked Questions

What are research data?

Definitions abound, but one way to define research data is as the raw outputs of research and inquiry, such as:

  • Experimental data
  • Survey data
  • Mined text
  • Images
  • Qualitative data, such as interview transcripts, diaries, field observations, etc.


  • Australian National Data Service. (2017). What is research data. https://ardc.edu.au/
  • Borgman, C.L. (2015). Big data, little data, no data: Scholarship in the networked world. MIT Press.

Where can I find data to reuse for an assignment, project, or research study?

Data can be found in many places, such as in library databases (e.g., PubMed), as supplementary materials in journal articles, and in data repositories. Use this guide to learn more.

What’s research data management?

Research data management is planning how you’ll collect, use, process, analyze, and disseminate data throughout the lifecycle of your research project. It starts as soon as your project starts and involves elements such as data documentation, data validation and quality assurance, data security and storage, and data ethics and reuse. Effective research data management results in the following:

  • Compliance with institutional and funder requirements
  • Improved project efficiency
  • Increased collaboration
  • Publication readiness
  • Adherence to best practice standards, such as FAIR
  • Contribution to the scientific record

To learn more about research data management, schedule a consultation with the data librarian for the health sciences to request an instruction session for your class, lab, or research group.
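
As one illustration of the data validation and quality assurance element mentioned above, here is a minimal sketch in Python. The column names, allowed values, and sample rows are all hypothetical, made up for this example; they are not drawn from any Yale policy or tool, and a real project would define its own rules.

```python
import csv
import io

# Hypothetical rules for an illustrative survey dataset; the column names and
# allowed ranges are assumptions, not taken from any Yale policy or tool.
RULES = {
    "participant_id": lambda v: v.strip() != "",
    "age": lambda v: v.isdigit() and 18 <= int(v) <= 120,
    "response": lambda v: v in {"yes", "no", "unsure"},
}


def validate_rows(csv_text):
    """Return a list of (row_number, column, value) for every failed check."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for column, is_valid in RULES.items():
            value = row.get(column) or ""
            if not is_valid(value):
                problems.append((row_number, column, value))
    return problems


# Made-up sample data: one missing age, one out-of-range age, one bad response.
SAMPLE = """participant_id,age,response
P001,34,yes
P002,,no
P003,17,maybe
"""

for problem in validate_rows(SAMPLE):
    print(problem)
```

Running checks like these each time new data arrive, and keeping the rules alongside the data documentation, is one way to catch problems early rather than at publication time.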

What are Yale’s expectations for how I manage my data?

Review Yale’s Research Data and Materials Policy (6001), and ensure you know your data’s classification status; you can take this questionnaire to find out. Whether your data are low, medium, or high risk determines where you can store them, who can use them (and how), and how you should manage them.

Additionally, if applicable, review other standards and requirements, such as HIPAA and IRB policies.

I want to learn skills to better work with my data. Where should I start?

Many people learn programming languages such as Python and R to work with data. Others use software such as Excel or other spreadsheet applications (e.g., LibreOffice Calc, Google Sheets), SPSS, SAS, and Tableau, among many other tools.
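
To give a sense of what working with data in Python can look like, here is a small sketch that uses only Python's standard library to compute a per-group average. The data, column names, and function name are invented for illustration; many people would use a dedicated data analysis library for the same task.

```python
import csv
import io
import statistics

# Made-up measurements for illustration only; in practice you would open a
# file with open("your_file.csv", newline="") instead of using io.StringIO.
SAMPLE = """group,score
control,4.0
control,6.0
treatment,7.0
treatment,9.0
"""


def mean_by_group(csv_text):
    """Group the 'score' column by 'group' and return the mean of each group."""
    scores = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        scores.setdefault(row["group"], []).append(float(row["score"]))
    return {group: statistics.mean(values) for group, values in scores.items()}


print(mean_by_group(SAMPLE))
```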

While many resources exist (including classes at the medical library, on LinkedIn Learning, and at other places on campus, such as the StatLab, Yale Center for Biomedical Data Science, and Yale Center for Research Computing), a few free starter resources of note include:

Where should I store my data?

Use this handy storage finder tool to determine the best Yale-provided data storage solution for your needs.
