R Basics

Introduction

To define a variable, we may use the assignment symbol, <-. There are two ways to see the value stored in a variable:

  1. type the variable name into the console and hit Return, or
  2. use the print() function by typing print(variable_name) and hitting Return.

Objects are things that are stored in named containers in R. They can be variables, functions, etc.

The ls() function shows the names of the objects saved in your workspace.
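As a brief illustration of the points above, the sketch below defines a variable, displays its value both ways, and lists the workspace. The variable name total is only an example.

```r
# define a variable with the assignment symbol
total <- 5 + 3

# two ways to see the stored value
total
print(total)

# list the names of the objects saved in the workspace
ls()
```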

Code: solving the quadratic equation \(x^2 + x - 1 = 0\)

# assigning values to variables
a <- 1
b <- 1
c <- -1

# solving the quadratic equation
(-b + sqrt(b^2 - 4*a*c))/(2*a)
(-b - sqrt(b^2 - 4*a*c))/(2*a)

Functions

In general, to evaluate a function we need to use parentheses. If we type a function name without parentheses, R shows us the code for the function. Most functions also require one or more arguments, that is, values written inside the parentheses. To access help files, we may use the help function, help(function_name), or write the question mark followed by the function name, ?function_name.

The help file shows you the arguments the function is expecting, some of which are required and some of which are optional. If an argument is optional, a default value is assigned with the equals sign. The args() function also shows the arguments a function needs. To specify arguments by name, we use the equals sign. If no argument name is used, R assumes you’re entering arguments in the order shown in the help file.
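The behaviour described above can be tried with the built-in log() function, chosen here only as a convenient example. The sketch shows args(), positional arguments, and named arguments; the help calls are commented out so the script also runs non-interactively.

```r
# show the arguments log() expects; base is optional and has a default value
args(log)

# arguments matched by position: x = 8, base = 2
by_position <- log(8, 2)

# arguments matched by name, so the order no longer matters
by_name <- log(base = 2, x = 8)

print(by_position)
print(by_name)

# typing the name without parentheses would print the function's code
# log

# open the help file
# help("log")
# ?log
```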

Creating and saving a script makes code much easier to execute. To make your code more readable, use intuitive variable names and include comments (using the “#” symbol) to remind yourself why you wrote a particular line of code.

Data Types

The function class() helps us determine the type of an object. Data frames can be thought of as tables with rows representing observations and columns representing different variables.

To access data from columns of a data frame, we use the dollar sign symbol, $, which is called the accessor. A vector is an object consisting of several entries and can be a numeric vector, a character vector, or a logical vector.

We use quotes to distinguish between variable names and character strings. Factors are useful for storing categorical data, and can be more memory efficient than storing the same values as characters.
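A minimal sketch of these ideas follows. The data frame fruit and its columns are made-up example data, not part of any real dataset.

```r
# a small data frame: rows are observations, columns are variables
fruit <- data.frame(
  name  = c("apple", "banana", "cherry"),
  price = c(0.5, 0.25, 3),
  stringsAsFactors = FALSE
)

class(fruit)        # "data.frame"

# the accessor $ extracts a column as a vector
class(fruit$price)  # "numeric"
class(fruit$name)   # "character"

# a logical vector built from a comparison
cheap <- fruit$price < 1
class(cheap)        # "logical"

# a factor stores categorical data as integer codes plus a set of levels
size <- factor(c("small", "large", "small"))
class(size)         # "factor"
levels(size)        # "large" "small"
```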

Creating a numeric object

x <- 12.7
y <- 950

Creating an integer object

z <- 8L

Creating a character object

a <- "apple"
b <- "7"

Creating a date object

e <- as.Date("2016-09-05")
f <- as.POSIXct("2018-04-05")
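To confirm what kind of object each of these assignments creates, class() can be applied to the same values. The variable names below are illustrative only.

```r
num_class  <- class(12.7)                         # "numeric"
int_class  <- class(8L)                           # "integer"
chr_class  <- class("7")                          # "character": quotes make it a string
date_class <- class(as.Date("2016-09-05"))        # "Date"
time_class <- class(as.POSIXct("2018-04-05"))     # "POSIXct" "POSIXt"

print(c(num_class, int_class, chr_class, date_class))
print(time_class)

# a character string that holds digits must be converted before doing arithmetic
seven_plus_one <- as.numeric("7") + 1
print(seven_plus_one)
```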

Data Structures

There are a few different data structures in R that are crucial to understand, as they directly pertain to the use of data. These include vectors, matrices, and data frames. We'll discuss how to tell the difference between all of these, along with how to create and manipulate them.
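As a short sketch of the three structures named above, the following creates one of each, checks which is which, and pulls out a few elements. All object names are arbitrary examples.

```r
# a numeric vector
v <- c(2, 4, 6)

# a matrix with 2 rows and 3 columns, filled column by column
m <- matrix(1:6, nrow = 2, ncol = 3)

# a data frame whose columns may have different types
df <- data.frame(id = 1:3, value = v)

# telling them apart
is.vector(v)
is.matrix(m)
is.data.frame(df)
dim(m)

# accessing and manipulating elements
second   <- v[2]      # second entry of the vector
corner   <- m[1, 3]   # row 1, column 3 of the matrix
scaled   <- v * 10    # arithmetic is applied to every entry
df$double <- df$value * 2   # add a new column to the data frame
print(df)
```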

R - Data Structures

Basic Flow Control

Flow control covers conditional statements such as if/else and loops such as for and while. While many of the concepts are very similar to flow control in other programming languages, the syntax may be different in R.
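A minimal sketch of the three constructs mentioned above, using made-up values:

```r
# if/else: choose between two branches
x <- 7
if (x %% 2 == 0) {
  parity <- "even"
} else {
  parity <- "odd"
}
print(parity)

# for: repeat once for each element of a vector
total <- 0
for (i in 1:5) {
  total <- total + i
}
print(total)

# while: repeat as long as a condition holds
n <- 1
while (n < 100) {
  n <- n * 2
}
print(n)
```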

R - Flow Control

Packages

To install a package

install.packages("tidyverse")

To load a package

library("ggplot2")
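The two steps above can be combined so that a package is installed only when it is missing. This is just one possible sketch: load_or_install is a hypothetical helper name, not a built-in function, and it is demonstrated with the stats package because that ships with R and needs no download.

```r
# hypothetical helper: install a package only if it is missing, then load it
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}

# "stats" is always available, so nothing is downloaded here
load_or_install("stats")

# show the folders R currently searches for installed packages
print(.libPaths())
```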

To change the folder for libraries

.libPaths(c("C:\\Burmistrov\\R", .libPaths()))

If you get an error while unpacking, the cause may be your antivirus software not having enough time to check the file before it is moved. Try telling R to pause a little longer during installation before it moves the file.

Run this code:

trace(utils:::unpackPkgZip, edit=TRUE)

Then go down to about line 140, where you should see the following:

ret <- unlink(instPath, recursive = TRUE, force = TRUE)
if (ret == 0) {
Sys.sleep(0.5)

Change the number in Sys.sleep from 0.5 to 2. For a huge package maybe use 3, but I've found almost everything works if you tell it to wait 2 seconds.

Note that this change is lost when you restart your R session, so you will need to make it again.

Data Import and Export

Data scientists often have to import data from external sources into R and export data out of R. Data can come and go in many different forms, and while we'll not cover them all here, we'll touch on some of the most common ones.

Data import and export are treated together because, most of the time, the functions are opposites of each other. For example, read.csv() takes the name of a .csv file as a character string and returns a data frame that you save in your environment, while write.csv() takes a data frame from your environment and the name of a file, as a character string, to write to.
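The round trip described above can be sketched as follows. The data frame scores is invented example data, and a temporary file is used so that nothing is written to your working folder.

```r
# example data to export
scores <- data.frame(
  name  = c("Ann", "Bob"),
  score = c(90.5, 85),
  stringsAsFactors = FALSE
)

# a temporary file path ending in .csv
path <- tempfile(fileext = ".csv")

# export: data frame first, then the file name as a character string
write.csv(scores, path, row.names = FALSE)

# import: file name in, data frame out
scores_back <- read.csv(path, stringsAsFactors = FALSE)
print(scores_back)
```

Setting row.names = FALSE stops write.csv() from adding an extra column of row numbers, which would otherwise come back as a column when the file is read again.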

R - Data Import and Export

Data Visualization

Data visualizations are very important in data science. They are used as a part of Exploratory Data Analysis (EDA), to familiarize yourself with data, to examine the distributions of variables, to identify outliers, and to help guide data cleaning and analysis. They are also used to communicate results to a variety of audiences, from other data scientists to customers.
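As a small sketch of exploratory plots, the code below draws a histogram and a set of boxplots with base R graphics, using the built-in mtcars dataset. Writing to a temporary PDF file is an assumption made so the example also runs outside an interactive session; in the console you can drop the pdf() and dev.off() lines and the plots will appear in the plot window.

```r
# open a PDF graphics device that writes to a temporary file
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)

# distribution of a single variable
hist(mtcars$mpg,
     main = "Fuel economy",
     xlab = "Miles per gallon")

# compare distributions across groups and spot possible outliers
boxplot(mpg ~ cyl, data = mtcars,
        main = "Fuel economy by number of cylinders",
        xlab = "Cylinders",
        ylab = "Miles per gallon")

# close the device so the file is written
dev.off()
```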

R - Data Visualization

Page last modified on July 28, 2021, at 03:24 PM