When you plan to collect data from human (or some animal) subjects for a study or an experiment, you must follow the guidelines provided by your Institutional Review Board (IRB). Whitman's basic guidelines can be found here. These guidelines are meant to protect the welfare, rights, and privacy of the subjects. You may also need to check with an IRB before using secondary data that are publicly available.
Some repositories curate their data -- others do not. What's the difference? Curated data have generally been checked by the repository to make sure that they conform to certain standards for presenting and describing the data, and for preserving data for future use. This helps to ensure that new users can understand what the data represent, and makes these data easier to share and reuse in the long term.
There are different levels of curation -- some repositories require much more detailed metadata (information about the data) than others, and some (such as ICPSR) perform certain checks and write descriptions by hand rather than through automation.
Uncurated data are less uniformly described, and therefore may be more difficult to use in contexts beyond the work of the initial investigators. They may or may not be in sustainable file formats.
The links for the Social Sciences, Sciences, and Humanities in the center of the page are just a few common starting points. To find research data in a specific field or subfield, search in the Registry of Research Data Repositories (re3data).
If you're looking more generally for data sets that you can start analyzing or visualizing right away, StatSci.org has a collection of data sets from various institutions and textbooks.
© 2014 Whitman College Penrose Library