Data Resources: Finding and Using Data

Information on finding, using, citing, and managing data

Where to look for data

The links for the Social Sciences, Sciences, and Humanities in the center of the page are just a few common starting points. To find research data in a specific field or subfield, search in the Registry of Research Data Repositories (re3data).

If you're looking more generally for data sets that you can jump in and start analyzing or visualizing, has a collection of data sets from various institutions and textbooks.

Using data ethically

When you plan to collect data from human (or some animal) subjects for a study or an experiment, you must follow the guidelines provided by your Institutional Review Board (IRB). Whitman's basic guidelines can be found here. These are meant to protect the welfare, rights, and privacy of the subjects. You may also need to check with an IRB to use secondary data that are publicly available.

When you use secondary data, you must comply with the terms of use of the secondary data providers. This includes citing the data properly, not attempting to find personally identifiable information in the dataset, and reporting any personally identifiable information that you do discover.

What does "curated data" mean?

Some repositories curate their data -- others do not. What's the difference? Curated data have generally been checked by the repository to make sure that they conform to certain standards for presenting and describing the data, and for preserving data for future use. This helps to ensure that new users can understand what the data represent, and makes these data easier to share and reuse in the long term.

There are different levels of curation -- some repositories require much more detailed metadata (information about the data) than others, and some (such as ICPSR) perform certain checks and write descriptions by hand rather than relying on automation.

Uncurated data are less uniformly described, and therefore may be more difficult to use in contexts beyond the work of the initial investigators. They may or may not be in sustainable file formats.

