The different types of data in R and RStudio
How do I identify R data types?
There are several types of data structures in R, each with specific characteristics and uses. To get the most out of the R language and the RStudio software, it is essential to understand them and to know how to interpret and use them. In this article, we will explore the different types of data you work with in R and RStudio, focusing on R data structures, the class or data type of a variable, and functions for examining R data.
R data structures or variable types
The most common data structures in R are vectors, matrices, arrays, lists and data frames.
- Vectors are used to store data of the same type, such as numbers, strings or Booleans.
- Matrices are used to store data of the same type in a two-dimensional structure.
- Arrays are used to store data of the same type in a multi-dimensional structure.
- Lists are used to store elements of different types or different data structures.
- Data frames are used to store tabular data, such as a database table: each column holds values of a single type, but different columns can have different types.
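As an illustration, the five structures above could be created as follows. This is only a minimal sketch: the variable names and values are made up for the example.

```r
# A vector: every element has the same type
ages <- c(31, 45, 27)

# A matrix: same type, two dimensions (2 rows, 3 columns)
m <- matrix(1:6, nrow = 2, ncol = 3)

# An array: same type, here three dimensions (2 x 3 x 2)
a <- array(1:12, dim = c(2, 3, 2))

# A list: elements may differ in type and in structure
l <- list(name = "Ana", scores = c(12, 15), passed = TRUE)

# A data frame: a table whose columns can have different types
df <- data.frame(name = c("Ana", "Ben", "Cleo"), age = ages)

class(ages)  # "numeric"
dim(m)       # 2 3
dim(a)       # 2 3 2
class(l)     # "list"
class(df)    # "data.frame"
```

Here dim() is used for the matrix and the array because it reports their shape directly, which is usually the first thing you want to check for these two structures.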
Variables stored in these structures also have a "class" or data type, and this is generally what we mean when we talk about data types in R.
The class or data type of a variable
In practice, you will mostly work with five kinds of data in R:
- Numbers are used to store numerical data, such as integers or decimals.
- Strings are used to store text data.
- Booleans are used to store true/false data.
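- Factors are used to store categorical data. Internally, a factor is a vector of integer codes with a label (level) for each code.
- Dates are used to store date and time data: the Date class holds calendar dates, and the POSIXct class holds dates with times.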
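To make the list above concrete, here is a small sketch that creates one value of each kind and asks R for its class. The variable names and values are illustrative only.

```r
n <- 15.5                              # a number
s <- "New-York"                        # a string
b <- TRUE                              # a Boolean
f <- factor(c("low", "high", "low"))   # categorical data
d <- as.Date("2024-01-31")             # a date

class(n)  # "numeric"
class(s)  # "character"
class(b)  # "logical"
class(f)  # "factor"
class(d)  # "Date"

# Factors and dates are classes built on top of simpler storage types
levels(f)  # "high" "low"
typeof(f)  # "integer"
typeof(d)  # "double"
```

The last lines show why class() and typeof() can disagree: the class describes how R treats the object, while the type describes how it is stored.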
You will find below examples of atomic vectors of characters, numeric vectors, integer vectors, etc.
character: "a", "eat", "New-York"
numeric: 2, -185.3, 15.5
integer: 25L (the L tells R to store it as an integer)
logical: TRUE / FALSE
complex: 1+6i (a complex number, with a real part and an imaginary part)
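You can check these examples yourself with typeof(), which reports how a value is stored. The sketch below uses the same kinds of values; the variable name v is hypothetical.

```r
typeof("eat")    # "character"
typeof(-185.3)   # "double" (class() reports this as "numeric")
typeof(25L)      # "integer"
typeof(25)       # "double": without the L, a whole number is still a double
typeof(TRUE)     # "logical"
typeof(1+6i)     # "complex"

# Quotation marks turn a number into text
typeof("2")      # "character"

# An atomic vector holds a single type, so mixed values are coerced
v <- c(1, "a", TRUE)
typeof(v)        # "character"
```

The last example is worth remembering: a single stray string in a column of numbers is enough to turn the whole vector into text.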
Functions to inspect R data
R provides several useful functions for examining the characteristics of vectors, variables and other objects, and all of them can be run from the RStudio console. The following are among the most commonly used in analyses:
class() – determines the class or data type of a variable.
typeof() – determines the underlying data type of a variable.
length() – determines the number of elements in a variable.
head() – displays the first few elements or rows of a variable (six by default), while the tail() function displays the last few.
summary() – produces a statistical summary of the data in a variable. For numeric data, this is the minimum, first quartile, median, mean, third quartile and maximum.
unique() – allows you to determine the unique values in a variable.
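table() – creates a frequency table that counts how many times each distinct value occurs in a variable.
is.null() – determines whether a variable is NULL, that is, whether it does not exist as a value at all. An empty vector of length zero is not NULL.
cbind() and rbind() – allow you to combine variables by columns (side by side) or by rows (stacked) respectively.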
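The functions above can be tried together on a small data frame. This is a sketch under stated assumptions: the data frame df, its columns city and temp, and the extra row are all invented for the example.

```r
# A hypothetical data frame used only for this illustration
df <- data.frame(
  city = c("Paris", "Lyon", "Paris", "Nice"),
  temp = c(12.5, 14.0, 11.0, 17.5),
  stringsAsFactors = FALSE
)

class(df)         # "data.frame"
typeof(df)        # "list": a data frame is stored as a list of columns
length(df)        # 2: for a data frame, the number of columns
length(df$city)   # 4: for a vector, the number of elements

head(df, 2)       # the first two rows
tail(df, 1)       # the last row
summary(df$temp)  # minimum, quartiles, median, mean, maximum
unique(df$city)   # "Paris" "Lyon" "Nice"
table(df$city)    # how many times each city appears

is.null(df$country)    # TRUE: there is no column called country
is.null(character(0))  # FALSE: an empty vector is not NULL

# Add a row with rbind() and a column with cbind()
extra <- data.frame(city = "Lille", temp = 9.0, stringsAsFactors = FALSE)
more_rows <- rbind(df, extra)    # 5 rows, 2 columns
more_cols <- cbind(df, id = 1:4) # 4 rows, 3 columns
```

Note that rbind() expects the new row to have the same column names as the existing data frame, and cbind() expects the new column to have one value per row.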
Many other functions exist to examine data in R, but the ones above are among the most commonly used to explore and understand data. The right function depends on your specific needs and on the structure of your data, so it is worth checking the structure and class of an object first before going further.
We hope you find this information useful!