Tables
R has built-in functions for importing and exporting many common file types: read.table() for .txt files, read.csv() for .csv files, and read.delim() for other delimited files (tab-delimited by default, or delimited by another character such as |, which can be set with the sep argument).
We can import data directly from GitHub with the read.table() function. If we pass it the URL where the dataset is stored, the function downloads the data and reads it into R.
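As a minimal sketch of that call, the snippet below reads a small whitespace-delimited file with read.table(). To stay runnable offline, it first writes a few made-up rows to a temporary file; the column names and values are hypothetical stand-ins for the students dataset. In practice, the string passed to read.table() would be the raw URL of the file on GitHub instead of a local path.

```r
# Hypothetical stand-in for the students data; names and values are made up.
tmp_txt <- tempfile(fileext = ".txt")
writeLines(c("name height_in eye_color",
             "Ana 65 Blue",
             "Ben 70 Brown",
             "Cy 62 Green"), tmp_txt)

# read.table() takes a local path or a URL string in the same argument.
# header = TRUE treats the first line as column names.
students_text <- read.table(tmp_txt, header = TRUE, stringsAsFactors = FALSE)

# Inspect the column types that were inferred on import.
str(students_text)
```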
It may be necessary to export a table out of R, perhaps to email it to a colleague or to re-upload it to GitHub. The opposite of read.table() is write.table(). Calling it on the students_text dataset we're using in R, with students_text_out as the file name, writes the data to a file of that name in the working directory on our machine.
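The export described above can be sketched as follows. The data frame here is a hypothetical stand-in for students_text, and the file is written to a temporary directory rather than the working directory so that the example leaves nothing behind; the row.names and quote settings are illustrative choices, not necessarily the ones used originally.

```r
# Hypothetical data frame standing in for students_text.
students_text <- data.frame(
  name = c("Ana", "Ben", "Cy"),
  height_in = c(65, 70, 62),
  stringsAsFactors = FALSE
)

# Assumption: write to tempdir() here instead of the working directory.
out_path <- file.path(tempdir(), "students_text_out")

# row.names = FALSE omits the row-number column;
# quote = FALSE writes strings without surrounding quotes.
write.table(students_text, out_path, row.names = FALSE, quote = FALSE)

# Read the file back to confirm the round trip.
check <- read.table(out_path, header = TRUE, stringsAsFactors = FALSE)
```

Leaving row.names at its default of TRUE would add an extra unnamed column of row numbers to the file, which is rarely wanted when sharing data.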
Most of the other data import and export functions in R work much like read.table() and write.table(). Let's import the students data from the .csv format, which can be done in three different ways: with read.table(), read.csv(), or read_csv() (the last of which comes from the readr package).
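A minimal sketch of the three approaches is shown below. It uses a small made-up CSV written to a temporary file (the column names and values are hypothetical), and it only calls read_csv() if the readr package happens to be installed.

```r
# Hypothetical CSV standing in for the students data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,height_in", "Ana,65", "Ben,70"), csv_path)

# 1. read.table(): the separator and header must be given explicitly.
students_csv1 <- read.table(csv_path, sep = ",", header = TRUE,
                            stringsAsFactors = FALSE)

# 2. read.csv(): a wrapper around read.table() whose defaults
#    already include sep = "," and header = TRUE.
students_csv2 <- read.csv(csv_path, stringsAsFactors = FALSE)

# 3. read_csv() from readr: returns a tibble rather than a plain data frame.
#    Skipped when readr is not installed.
if (requireNamespace("readr", quietly = TRUE)) {
  students_csv3 <- suppressMessages(readr::read_csv(csv_path))
}
```

The first two calls return ordinary data frames with the same contents; the readr version differs mainly in its return type and in how it reports the guessed column types.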
Excel Spreadsheets
One last common file type is the Microsoft Excel spreadsheet, usually saved with the file extension .xlsx. We can use the openxlsx R library, which contains read.xlsx(), to import the students.xlsx file from GitHub. There's only one sheet in this file, but if there were more than one, we could specify which sheet to read in with the sheet argument.
- Navigate to the .xlsx version of the students dataset on GitHub, at the following link: https://github.com/TrainingByPackt/R-Programming-Fundamentals/blob/master/lesson1/students.xlsx.
- Click View Raw and the file will download automatically to your computer.
- Import the spreadsheet:
library(openxlsx)
students_xlsx <- read.xlsx("students.xlsx")
- Create a new variable in students_xlsx, then export the data:
write.xlsx(students_xlsx, "students_xlsx_out.xlsx")
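The last step can be sketched end to end as follows. Because the contents of students.xlsx are not reproduced here, the data frame and the new height_cm variable are hypothetical, and the file is written to a temporary directory. The code runs only if openxlsx is installed.

```r
if (requireNamespace("openxlsx", quietly = TRUE)) {
  # Hypothetical data frame standing in for students_xlsx.
  students_xlsx <- data.frame(name = c("Ana", "Ben"),
                              height_in = c(65, 70))

  # Create a new variable (hypothetical: height converted to centimetres).
  students_xlsx$height_cm <- students_xlsx$height_in * 2.54

  # Assumption: export to tempdir() rather than the working directory.
  xlsx_path <- file.path(tempdir(), "students_xlsx_out.xlsx")
  openxlsx::write.xlsx(students_xlsx, xlsx_path)

  # Re-import to check the result; sheet = 1 selects the first sheet.
  check_xlsx <- openxlsx::read.xlsx(xlsx_path, sheet = 1)
}
```

With a workbook that has several sheets, the sheet argument of read.xlsx() can be given either a sheet index or a sheet name.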