Drexel Library

Research Data Management Resources

What is Data Cleaning?

Data cleaning is the process of detecting, diagnosing, and editing faulty data. Data analysis is the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data. 
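
As a rough illustration of the detect, diagnose, and edit steps described above, here is a minimal Python sketch that uses only the standard library. The records, field names, and the 0 to 120 age range are hypothetical choices made for this example, not part of any tool or dataset listed in this guide.

```python
from statistics import mean

# Hypothetical survey records (assumed for illustration only).
raw_rows = [
    {"id": "1", "age": "34", "city": " Philadelphia"},
    {"id": "2", "age": "", "city": "philadelphia"},
    {"id": "3", "age": "290", "city": "Camden "},
    {"id": "3", "age": "290", "city": "Camden "},  # exact duplicate
]


def detect_problems(row):
    """Detect and diagnose: return a list of problems found in one row."""
    problems = []
    age = row["age"].strip()
    if not age:
        problems.append("missing age")
    elif not age.isdigit() or not 0 <= int(age) <= 120:
        # The 0-120 range is an assumed plausibility rule.
        problems.append("implausible age")
    return problems


def clean_rows(rows):
    """Edit: drop exact duplicates, tidy text, and blank out faulty ages."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # skip rows identical to one already kept
        seen.add(key)
        cleaned.append({
            "id": row["id"].strip(),
            "city": row["city"].strip().title(),
            "age": None if detect_problems(row) else int(row["age"]),
        })
    return cleaned


cleaned = clean_rows(raw_rows)

# A simple analysis step: average the ages that survived cleaning.
valid_ages = [r["age"] for r in cleaned if r["age"] is not None]
average_age = mean(valid_ages)
```

Faulty ages are set to None rather than guessed, so the analysis step can decide explicitly how to handle missing values.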

OpenRefine

OpenRefine is a tool that helps with messy data by allowing you to clean it in a variety of ways. A good introductory document was created by Owen Stephens on behalf of the British Library. The Louisiana State Library has a helpful presentation, and the University of Maryland CASCI also has a tutorial showing how to perform basic cleaning tasks. For more in-depth tutorials, OpenRefine curates a list of tutorials.

Tabula

Tabula is a tool that can extract data from PDFs and convert it to spreadsheet format. Media Hack has a series of video tutorials on using the tool, and Northeastern University has a written tutorial on extracting tables. 

R

R is a free, open-source programming language that can be useful for data analysis in a variety of ways, from statistical analysis to data visualization. DataCamp has a free introductory course on using the software, and Computerworld has a written beginner's introduction. For more in-depth help, or for answers about specific uses of R, try one of two open-access books: R for Data Science, published by O'Reilly Media, or R for Beginners by Emmanuel Paradis. You can also use Skillport for Skillsoft Books to find books and videos to learn more about R.

Python

Python is a general-purpose programming language that can be used for a wide range of tasks, including data cleaning and analysis. Drexel Libraries has a very thorough library guide with many resources. DataCamp offers a free "Intro to Python for Data Science" class. For more in-depth help, or for answers about specific uses of Python, try The Python Tutorial or one of the many tutorials listed on this Learning Python webpage. You can also use Skillport for Skillsoft Books to find books and videos to learn more about Python.

SPSS

SPSS is software used for statistical analysis and can be downloaded from Drexel IT. There is a full video course created by Research by Design that you may find helpful. Kent State University has a comprehensive guide to using SPSS, and Barnard College produced an introductory document that details how to perform a variety of specific functions in the software. The website SPSS Tutorials also has a variety of beginner tutorials, and more!