How to Split Data by Factors in R

How to Split Data by Factors in R?

Answer

To split data by factors in R, use the split() function. It divides a vector or data frame into groups defined by the levels of a factor and returns a named list with one element per level. This is useful when each subset needs to be analyzed independently.
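
As a minimal sketch of the idea, the following splits a plain numeric vector rather than a data frame. The names scores and group are hypothetical and used only for illustration.

```r
# Hypothetical data: six scores and the group each one belongs to
scores <- c(10, 20, 30, 40, 50, 60)
group <- factor(c("a", "b", "a", "b", "a", "b"))

# split() returns a named list with one element per factor level
parts <- split(scores, group)

print(names(parts))  # the level names: "a" "b"
print(parts$a)       # the scores in group "a": 10 30 50
```

Each element of parts is an ordinary vector, so it can be passed to any function that accepts one.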

Examples

1 Splitting a Data Frame by a Factor Representing Gender

In this example,

We create a data frame named data with two columns: height, which holds the heights of six individuals, and gender, which holds their gender ('Male' or 'Female').
We call split() with the data frame as the first argument and data$gender as the grouping variable. If gender is stored as a character vector rather than a factor, split() converts it to a factor internally.
The call returns a list with one element per level of gender. Each element is a data frame containing only the rows for that level. We assign this list to a variable named split_data.
We print split_data to the console to verify that the rows have been divided by gender.

R Program

# Heights and genders of six individuals
data <- data.frame(height = c(160, 170, 165, 155, 180, 175),
                   gender = c('Female', 'Male', 'Female', 'Female', 'Male', 'Male'))
# Split the rows into one data frame per gender
split_data <- split(data, data$gender)
print(split_data)

Output

$Female
  height gender
1    160 Female
3    165 Female
4    155 Female

$Male
  height gender
2    170   Male
5    180   Male
6    175   Male
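
Because the result is an ordinary named list, each group can be pulled out by name or processed with sapply(). The following sketch reuses the data from this example. The per-group mean is just one illustrative choice of summary, and the names female_data and mean_heights are not part of the original program.

```r
data <- data.frame(height = c(160, 170, 165, 155, 180, 175),
                   gender = c('Female', 'Male', 'Female', 'Female', 'Male', 'Male'))
split_data <- split(data, data$gender)

# Access a single group by its level name
female_data <- split_data$Female
print(nrow(female_data))  # 3 rows belong to the 'Female' group

# Apply a function to every group: here, the mean height per gender
mean_heights <- sapply(split_data, function(group) mean(group$height))
print(mean_heights)       # Female 160, Male 175
```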

2 Splitting a Data Frame by a Factor Representing Species

In this example,

We create a data frame named species_data with two columns: weight, which holds the weights of six animals, and species, which holds their species ('Cat', 'Dog', or 'Bird').
We call split() with species_data as the first argument and species_data$species as the grouping variable. If species is stored as a character vector rather than a factor, split() converts it to a factor internally.
The call returns a list with one element per level of species. Each element is a data frame containing only the rows for that level. We assign this list to a variable named split_species_data.
We print split_species_data to the console to verify that the rows have been divided by species.

R Program

# Weights and species of six animals
species_data <- data.frame(weight = c(4.5, 20.0, 2.3, 3.8, 25.0, 1.1),
                           species = c('Cat', 'Dog', 'Bird', 'Cat', 'Dog', 'Bird'))
# Split the rows into one data frame per species
split_species_data <- split(species_data, species_data$species)
print(split_species_data)

Output

$Bird
  weight species
3    2.3    Bird
6    1.1    Bird

$Cat
  weight species
1    4.5     Cat
4    3.8     Cat

$Dog
  weight species
2     20     Dog
5     25     Dog
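
In R 4.0.0 and later, data.frame() keeps strings as character vectors by default, so the species column in the program above may not be a factor until split() converts it. The following sketch, offered under that assumption, makes the conversion explicit and checks that the result is the same. The names species_factor, split_explicit, and split_implicit are introduced here for illustration.

```r
species_data <- data.frame(weight = c(4.5, 20.0, 2.3, 3.8, 25.0, 1.1),
                           species = c('Cat', 'Dog', 'Bird', 'Cat', 'Dog', 'Bird'))

# Convert the grouping column to a factor explicitly; levels are sorted by default
species_factor <- factor(species_data$species)
print(levels(species_factor))

# Splitting by the explicit factor gives the same list as splitting by the raw column
split_explicit <- split(species_data, species_factor)
split_implicit <- split(species_data, species_data$species)
print(identical(split_explicit, split_implicit))
```

Making the factor explicit is optional here, but it documents the grouping variable and makes its levels easy to inspect before splitting.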

3 Splitting a Data Frame by a Factor Representing Education Level

In this example,

We create a data frame named education_data with two columns: salary, which holds the salaries of six individuals, and education, which holds their education level ('High School', 'Bachelor', or 'Master').
We call split() with education_data as the first argument and education_data$education as the grouping variable. If education is stored as a character vector rather than a factor, split() converts it to a factor internally.
The call returns a list with one element per level of education. Each element is a data frame containing only the rows for that level. We assign this list to a variable named split_education_data.
We print split_education_data to the console to verify that the rows have been divided by education level.

R Program

# Salaries and education levels of six individuals
education_data <- data.frame(salary = c(50000, 60000, 70000, 80000, 55000, 75000),
                             education = c('High School', 'Bachelor', 'Master', 'Bachelor', 'High School', 'Master'))
# Split the rows into one data frame per education level
split_education_data <- split(education_data, education_data$education)
print(split_education_data)

Output

$Bachelor
  salary education
2  60000  Bachelor
4  80000  Bachelor

$`High School`
  salary   education
1  50000 High School
5  55000 High School

$Master
  salary education
3  70000    Master
6  75000    Master
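
The groups above appear in sorted (here alphabetical) order because that is the default order of factor levels. If a different order is more natural, one option is to set the levels explicitly before splitting. This is a sketch rather than part of the original program, and the names education_levels and split_ordered are hypothetical.

```r
education_data <- data.frame(salary = c(50000, 60000, 70000, 80000, 55000, 75000),
                             education = c('High School', 'Bachelor', 'Master', 'Bachelor', 'High School', 'Master'))

# Declare the levels in the desired order (an illustrative choice: lowest to highest)
education_levels <- c('High School', 'Bachelor', 'Master')
education_data$education <- factor(education_data$education, levels = education_levels)

# The list elements now follow the declared level order
split_ordered <- split(education_data, education_data$education)
print(names(split_ordered))
```

The rows inside each group are unchanged; only the order of the list elements differs from the default.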

Summary

In this tutorial, we learned how to split data by factors in R using the split() function, with examples that split data frames by gender, species, and education level.
