
Creating New Variables in R

Use the assignment operator <- to create new variables. A wide range of operators and functions can be used on the right-hand side of the assignment.

# Three examples for doing the same computations

mydata$sum <- mydata$x1 + mydata$x2
mydata$mean <- (mydata$x1 + mydata$x2)/2

attach(mydata)
mydata$sum <- x1 + x2
mydata$mean <- (x1 + x2)/2
detach(mydata)

mydata <- transform( mydata,
sum = x1 + x2,
mean = (x1 + x2)/2
)

Recoding variables

In order to recode data, you will probably use one or more of R's control structures.

# create 2 age categories

mydata$agecat <- ifelse(mydata$age > 70, "older", "younger")

# another example: create 3 age categories

attach(mydata)
mydata$agecat[age > 75] <- "Elder"
mydata$agecat[age > 45 & age <= 75] <- "Middle Aged"
mydata$agecat[age <= 45] <- "Young"
detach(mydata)

Renaming variables

You can rename variables programmatically or interactively.

# rename interactively
fix(mydata) # results are saved on close

# rename programmatically
library(reshape)
mydata <- rename(mydata, c(oldname="newname"))

# you can re-enter all the variable names in order,
# changing the ones you need to change. The limitation
# is that you need to enter all of them!
names(mydata) <- c("x1","age","y", "ses")

Variable types in R

R supports a diverse range of variable types, each tailored to handle specific data forms:

  • Numeric: These represent numbers and can be either whole numbers or decimals.
  • Character: This type is for textual data or strings.
  • Logical: These are binary and can take on values of TRUE or FALSE.
  • Factor: Ideal for categorical data, factors can help in representing distinct categories within a dataset.
  • Date: As the name suggests, this type is used for date values.

When creating new variables, it's essential to ensure they are of the appropriate type for your analysis. If unsure, you can use the class() function to check a variable's type.
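
As a rough illustration of the types listed above, the sketch below creates one variable of each type and checks it with class(). The variable names and values are made up for this example.

```r
# Hypothetical example values, one per variable type
height <- 1.75                      # numeric
name   <- "Ada"                     # character
passed <- TRUE                      # logical
group  <- factor(c("a", "b", "a"))  # factor with levels "a" and "b"
start  <- as.Date("2024-01-15")     # Date

class(height)  # "numeric"
class(name)    # "character"
class(passed)  # "logical"
class(group)   # "factor"
class(start)   # "Date"
```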

Checking and changing variable types

Ensuring your variables are of the correct type is crucial for accurate analysis:

  • Checking Variable Type: The class() function can help you determine the type of a variable.
  • Changing Variable Type: If you need to convert a variable from one type to another, R provides functions like as.numeric(), as.character(), and as.logical().
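
A minimal sketch of checking and converting types, using made-up vectors. It also shows two things worth knowing: values that cannot be converted become NA, and calling as.numeric() directly on a factor returns its internal level codes rather than the labels.

```r
x <- c("1", "2", "3")        # hypothetical character vector holding digits
class(x)                     # "character"

x_num <- as.numeric(x)       # convert to numeric
class(x_num)                 # "numeric"

x_chr <- as.character(x_num) # and back to character
x_lgl <- as.logical(c("TRUE", "FALSE"))

# Text that is not a number becomes NA (R also issues a warning)
bad <- suppressWarnings(as.numeric("abc"))

# For factors, convert to character first to recover the labels
f <- factor(c("10", "20"))
codes  <- as.numeric(f)                # level codes: 1 2
values <- as.numeric(as.character(f))  # labels as numbers: 10 20
```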

Variable scope

Understanding the scope of a variable is essential:

  • Global Variables: These are accessible throughout your entire script or session.
  • Local Variables: These are confined to the function or environment they are created in and can't be accessed outside of it.

When creating new variables, especially within functions, always be mindful of their scope to avoid unexpected behaviors.
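
The difference between global and local variables can be sketched as follows. The names total, add_one, and local_result are hypothetical and chosen only for this example.

```r
total <- 10   # global: created at the top level of the script

add_one <- function() {
  # local_result exists only while the function runs;
  # total is not defined here, so R finds it in the global environment
  local_result <- total + 1
  local_result
}

answer <- add_one()                      # 11
visible_outside <- exists("local_result") # FALSE: the local variable is gone
```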

Using variables with functions

Variables play a central role when working with functions:

  • Passing Variables: You can provide variables as arguments to functions, allowing for dynamic computations based on variable values.
  • Storing Function Outputs: Functions can return values, and you can assign these values to new or existing variables for further analysis.
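
Both points can be seen in a short sketch. The vector scores and the helper center() are made up for illustration; mean() is the base R function.

```r
scores <- c(80, 92, 75)   # hypothetical data

# Pass a variable to a built-in function and store the result
avg <- mean(scores)

# Pass it to a user-defined function (hypothetical helper)
center <- function(x) {
  x - mean(x)   # subtract the mean from every element
}
centered <- center(scores)

# The stored output can be used in later computations
spread <- max(centered) - min(centered)   # 92 - 75 = 17
```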

Variable operations

Depending on their type, you can perform various operations on variables:

  • Arithmetic Operations: For numeric variables, you can carry out standard mathematical operations like addition, subtraction, multiplication, and division.
  • String Operations: For character variables, operations like concatenation allow you to combine multiple strings into one.
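
A brief sketch of both kinds of operation, with made-up values. paste() joins strings with a space by default, while paste0() joins them with no separator.

```r
# Arithmetic on numeric variables
a <- 7
b <- 2
total      <- a + b   # 9
difference <- a - b   # 5
product    <- a * b   # 14
quotient   <- a / b   # 3.5

# Concatenation of character variables
first  <- "Data"
second <- "Frame"
spaced <- paste(first, second)    # "Data Frame"
joined <- paste0(first, second)   # "DataFrame"
```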

Recoding variables

Recoding involves changing the values of a variable based on certain conditions. For instance, you might want to group ages into categories like "young", "middle-aged", and "senior". R offers various control structures to facilitate this process. When recoding, always ensure that the new categories or values make logical sense and serve the purpose of your analysis.
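
One alternative to logical indexing is base R's cut() function, sketched below on a made-up data frame. The cut points mirror the three-category example earlier on this page (45 and 75); by default each interval includes its upper bound, so an age of exactly 45 falls in "Young". Note that cut() returns a factor rather than a character vector.

```r
mydata <- data.frame(age = c(30, 45, 46, 75, 76))   # hypothetical data

# Intervals are (-Inf, 45], (45, 75], (75, Inf]
mydata$agecat <- cut(mydata$age,
                     breaks = c(-Inf, 45, 75, Inf),
                     labels = c("Young", "Middle Aged", "Elder"))

table(mydata$agecat)   # counts per category
```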

Renaming variables

There might be instances where you'd want to rename variables for clarity or consistency. R provides two primary ways to rename variables:

  • Interactively: You can use the fix() function to open a data editor where you can rename variables directly.
  • Programmatically: There are various packages and functions in R that allow you to rename variables within your script.

When renaming, ensure that the new names are descriptive and adhere to R's variable naming conventions.
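
If you prefer not to load a package, one possible base R approach is to assign into names() for just the column you want to change. The data frame below is made up for illustration.

```r
mydata <- data.frame(x1 = 1:3, oldname = 4:6)   # hypothetical data

# Replace only the name that matches "oldname"; other names are untouched
names(mydata)[names(mydata) == "oldname"] <- "newname"

names(mydata)   # "x1" "newname"
```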

Frequently Asked Questions (FAQs) about Variables in R

Q: What's the difference between <- and = for assignment in R?

A: Both <- and = can be used for assignment in R. However, <- is the more traditional and preferred method, especially in scripts and functions. The = operator is often used within function calls to specify named arguments.
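
A small sketch of the distinction, using made-up values: at the top level both operators assign, but inside a function call = only names an argument and does not create or change a variable.

```r
x <- 5    # assignment with <-
y = 5     # assignment with =, equivalent at the top level

# Here x = ... names the argument of mean(); the variable x is not modified
m <- mean(x = c(1, 2, 3))

x   # still 5
m   # 2
```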

Q: How can I check the type of a variable in R?

A: You can use the class() function to determine the type or class of a variable. This function will return values like "numeric", "character", "factor", and so on, depending on the variable's type.

Q: I mistakenly assigned a character value to a numeric variable. How can I correct it?

A: R provides type conversion functions like as.numeric(), as.character(), and as.logical(). For example, as.numeric() converts a character vector of digits back to numbers; any value that cannot be interpreted as a number becomes NA, with a warning.

Q: What does "recoding variables" mean?

A: Recoding refers to the process of changing or transforming the values of a variable based on certain criteria. For instance, converting a continuous age variable into age categories (e.g., "young", "middle-aged", "senior") is an example of recoding.

Q: How can I rename a variable in my dataset?

A: R offers multiple ways to rename variables. You can do it interactively using the fix() function, which opens a data editor. Alternatively, there are various R packages and functions that allow for programmatic renaming of variables.

Q: Are variable names in R case-sensitive?

A: Yes, variable names in R are case-sensitive. This means that myVariable, MyVariable, and myvariable would be treated as three distinct variables.

Q: Can I use spaces in variable names?

A: Spaces are not valid in ordinary variable names in R; a name containing a space only works when it is wrapped in backticks, which is not recommended. Instead, you can use underscores (_) or periods (.) to separate words in variable names, like my_variable or my.variable.
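
The two naming answers above can be tried out with a short sketch. All names here are made up; make.names() is a base R function that turns arbitrary strings into syntactically valid names.

```r
# Case sensitivity: three distinct variables
myVariable <- 1
MyVariable <- 2
myvariable <- 3

# Separating words with an underscore or a period
my_variable <- "underscore"
my.variable <- "period"

# A name with a space works only inside backticks
`my variable` <- "needs backticks"

# make.names() replaces the space to produce a valid name
valid <- make.names("my variable")   # "my.variable"
```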

Q: How do I delete or remove a variable from my workspace?

A: Pass the variable name to the rm() function, for example rm(x), to remove it from your workspace. It's a good practice to clear unnecessary variables to free up memory.

Q: What's the difference between local and global variables?

A: Local variables are confined to the function or environment they are created in and can't be accessed outside of it. In contrast, global variables are accessible throughout your entire script or R session.

Q: How can I view all the variables currently in my workspace?

A: You can use the ls() function to list all the variables currently present in your workspace.
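
As a quick sketch, ls() and rm() can be used together to inspect and tidy the workspace. The variable tmp_value is made up for this example.

```r
tmp_value <- 42                       # hypothetical variable

before <- "tmp_value" %in% ls()       # TRUE: it is listed in the workspace

rm(tmp_value)                         # remove it

after <- "tmp_value" %in% ls()        # FALSE: it is no longer listed
```

To clear everything at once, rm(list = ls()) removes all objects that ls() reports, so use it with care.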