Importing data into R to start your analyses: it should be the easiest step. Unfortunately, this is almost never the case. Data can come in all sorts of formats, ranging from flat files and statistical software files to databases and web data. Knowing which approach to use is key to getting started with the actual analysis. In this course, you will learn all the basics on how to load data into R so you can get up and running in no time!
Lots of data comes in the form of flat files: simple tabular text files. Learn how to import all common formats of flat file data with base R functions and the dedicated readr and data.table packages.
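As a rough illustration of the base R route, the sketch below writes a tiny comma-separated file to a temporary path and reads it back. The file contents and the variable names (`csv_path`, `scores`, `scores_tab`) are made up for the example and are not course material.

```r
# Create a small flat file to read (hypothetical example data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "Ann,90", "Bob,85"), csv_path)

# read.csv() assumes a header row and comma-separated fields by default.
scores <- read.csv(csv_path, stringsAsFactors = FALSE)

# read.table() is the general-purpose function; the separator and header
# have to be stated explicitly.
scores_tab <- read.table(csv_path, header = TRUE, sep = ",",
                         stringsAsFactors = FALSE)

# The dedicated packages offer alternatives, if they are installed:
# readr::read_csv(csv_path)
# data.table::fread(csv_path)
str(scores)
```

Both calls should return a data frame with one character column and one integer column; the package alternatives are left commented out so that the sketch runs with base R alone.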
Excel is a very widely used data analysis tool. If you prefer to do your analyses in R, though, you'll need an understanding of importing Excel data into R. This chapter explains how to use readxl and gdata to do so. The XLConnect package, which takes all of this one step further, is also discussed.
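A minimal sketch of reading a workbook with readxl, assuming the package is installed. It uses the example workbook that ships with readxl, so no file of your own is needed; the variable names are illustrative only.

```r
library(readxl)

# Path to an example workbook bundled with the readxl package.
xlsx_path <- readxl_example("datasets.xlsx")

# List the sheet names before deciding which sheet to import.
sheet_names <- excel_sheets(xlsx_path)

# Import the first sheet; sheets can be selected by position or by name.
first_sheet <- read_excel(xlsx_path, sheet = 1)

print(sheet_names)
head(first_sheet)
```

Listing the sheets first is a convenient habit, because a workbook can hold several tables and `read_excel()` imports only one sheet per call.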
Besides R, there are other commonly used statistical software packages: SAS, Stata and SPSS. Each has its own file format. Learn how to use the haven and foreign packages to get these files into R with remarkable ease!
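The sketch below shows one way this could look with haven, assuming the package is installed. Since no real Stata file is at hand, it first writes a small made-up data frame to a temporary `.dta` file and then reads it back.

```r
library(haven)

# Write a small, made-up data set in Stata format so there is a file to import.
dta_path <- tempfile(fileext = ".dta")
write_dta(data.frame(id = 1:3, height = c(1.62, 1.75, 1.80)), dta_path)

# read_dta() imports Stata files; haven also provides read_sas() for SAS
# files and read_sav() for SPSS files.
people <- read_dta(dta_path)

# The foreign package offers read.dta() and read.spss() for similar tasks.
print(people)
```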
Many companies store their information in relational databases. The R community has also developed R packages to get data from these architectures. You'll learn how to connect to a database, how to retrieve data from it, and how to make things more efficient by performing a part of your computations on the database side.
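To make the connect, retrieve, and filter-on-the-database steps concrete, here is a minimal sketch using DBI with an in-memory SQLite database. It assumes the DBI and RSQLite packages are installed; the `employees` table and its contents are invented for the example, and a real company database would need its own driver and credentials.

```r
library(DBI)

# Connect to a throwaway in-memory SQLite database (stand-in for a real server).
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Create a small, made-up table so there is something to query.
dbWriteTable(con, "employees",
             data.frame(name = c("Ann", "Bob", "Cy"),
                        salary = c(50, 65, 80)))

# Import the whole table into R.
all_rows <- dbReadTable(con, "employees")

# Let the database do the filtering, so only the matching rows reach R.
high_paid <- dbGetQuery(
  con, "SELECT name FROM employees WHERE salary > 60 ORDER BY name")

# Always close the connection when done.
dbDisconnect(con)

print(high_paid)
```

Sending the `WHERE` clause to the database instead of filtering `all_rows` in R is the kind of database-side computation the chapter refers to: less data has to travel to the R session.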
More and more of the information that data scientists use resides on the web. Importing this data into R requires an understanding of the protocols and typical data formats used on the web. In this chapter, you'll get a crash course in HTTP, learn to perform your own HTTP requests from inside R, and get to know a popular web data format: JSON.
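As a small illustration of the JSON part, the sketch below parses a JSON string with the jsonlite package, assuming it is installed. The JSON content is made up, and the commented-out HTTP request uses a placeholder URL rather than a real endpoint.

```r
library(jsonlite)

# A made-up JSON document, as it might arrive in an HTTP response body.
json_text <- '{"title": "Importing Data", "chapters": 5, "tags": ["r", "import"]}'

# fromJSON() converts JSON objects to R lists and JSON arrays to vectors.
parsed <- fromJSON(json_text)

# Fetching JSON over HTTP could look like this with the httr package
# (hypothetical URL, not executed here):
# resp <- httr::GET("https://example.com/data.json")
# parsed <- fromJSON(httr::content(resp, as = "text"))

print(parsed$title)
print(parsed$tags)
```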
Data Science Instructor at DataCamp