To split data by factors in R, you can use the split() function, which divides data into groups based on the levels of a factor. This is useful for analyzing subsets of data independently.
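As a minimal sketch of the general call form split(x, f), the snippet below splits a plain numeric vector by a factor. The names values, groups, and grouped are illustrative only and are not used elsewhere in this tutorial.

```r
# Illustrative data: four numbers and a grouping factor of the same length
values <- c(10, 20, 30, 40)
groups <- factor(c('a', 'b', 'a', 'b'))

# split() returns a named list with one element per factor level
grouped <- split(values, groups)

print(grouped$a)  # elements of values where groups == 'a'
print(grouped$b)  # elements of values where groups == 'b'
```

The same call form applies to data frames, in which case each list element is a data frame containing the rows for one level.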
In this example:
1. We create a data frame named data which contains two columns: height and gender. The height column represents the heights of individuals, and the gender column represents their gender (with values 'Male' and 'Female').
2. We use the split() function to split the data data frame by the gender factor. We pass the data data frame and the data$gender column to the split() function, which treats the column as a factor. This creates a list where each element contains the subset of the data corresponding to one level of the gender factor.
3. We assign the result of the split() function to a variable named split_data.
4. We print split_data to the console to see the data split by gender. This allows us to verify that the data has been correctly divided into subsets.

data <- data.frame(height = c(160, 170, 165, 155, 180, 175), gender = c('Female', 'Male', 'Female', 'Female', 'Male', 'Male'))
split_data <- split(data, data$gender)
print(split_data)
Output:

$Female
  height gender
1    160 Female
3    165 Female
4    155 Female

$Male
  height gender
2    170   Male
5    180   Male
6    175   Male

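Once the data is split, each group can be analyzed on its own. The sketch below rebuilds the same data and split_data objects, then pulls out individual groups and computes the mean height per gender. The names female_data, male_data, and mean_heights are assumptions introduced for illustration.

```r
# Same example data as above
data <- data.frame(
  height = c(160, 170, 165, 155, 180, 175),
  gender = c('Female', 'Male', 'Female', 'Female', 'Male', 'Male')
)
split_data <- split(data, data$gender)

# Each list element is an ordinary data frame, accessible by level name
female_data <- split_data$Female
male_data <- split_data[["Male"]]

# Apply a summary to every group; sapply() returns a named numeric vector
mean_heights <- sapply(split_data, function(group) mean(group$height))
print(mean_heights)  # Female: 160, Male: 175
```

Using sapply() over the list is one option among several; lapply() would return a list instead of a vector.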
In this example:
1. We create a data frame named species_data which contains two columns: weight and species. The weight column represents the weights of different animals, and the species column represents their species (with values 'Cat', 'Dog', and 'Bird').
2. We use the split() function to split the species_data data frame by the species factor. We pass the species_data data frame and the species_data$species column to the split() function, which treats the column as a factor. This creates a list where each element contains the subset of the data corresponding to one level of the species factor.
3. We assign the result of the split() function to a variable named split_species_data.
4. We print split_species_data to the console to see the data split by species. This allows us to verify that the data has been correctly divided into subsets.

species_data <- data.frame(weight = c(4.5, 20.0, 2.3, 3.8, 25.0, 1.1), species = c('Cat', 'Dog', 'Bird', 'Cat', 'Dog', 'Bird'))
split_species_data <- split(species_data, species_data$species)
print(split_species_data)
Output:

$Bird
  weight species
3    2.3    Bird
6    1.1    Bird

$Cat
  weight species
1    4.5     Cat
4    3.8     Cat

$Dog
  weight species
2     20     Dog
5     25     Dog

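The groups appear as Bird, Cat, Dog even though the data lists Cat first. A short sketch of why, assuming R 4.0.0 or later (where data.frame() keeps strings as character vectors by default): split() groups by as.factor() of the column, and the levels of a factor built from character data are sorted. The name species_levels is an assumption introduced for illustration.

```r
# Same example data as above
species_data <- data.frame(
  weight = c(4.5, 20.0, 2.3, 3.8, 25.0, 1.1),
  species = c('Cat', 'Dog', 'Bird', 'Cat', 'Dog', 'Bird')
)

# Typically "character" in R >= 4.0.0, "factor" in older versions
print(class(species_data$species))

# The levels split() will use: sorted, not in order of appearance
species_levels <- levels(as.factor(species_data$species))
print(species_levels)

# The names of the resulting list follow those levels
split_species_data <- split(species_data, species_data$species)
print(names(split_species_data))
```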
In this example:
1. We create a data frame named education_data which contains two columns: salary and education. The salary column represents the salaries of individuals, and the education column represents their education level (with values 'High School', 'Bachelor', and 'Master').
2. We use the split() function to split the education_data data frame by the education factor. We pass the education_data data frame and the education_data$education column to the split() function, which treats the column as a factor. This creates a list where each element contains the subset of the data corresponding to one level of the education factor.
3. We assign the result of the split() function to a variable named split_education_data.
4. We print split_education_data to the console to see the data split by education level. This allows us to verify that the data has been correctly divided into subsets.

education_data <- data.frame(salary = c(50000, 60000, 70000, 80000, 55000, 75000), education = c('High School', 'Bachelor', 'Master', 'Bachelor', 'High School', 'Master'))
split_education_data <- split(education_data, education_data$education)
print(split_education_data)
Output:

$Bachelor
  salary education
2  60000  Bachelor
4  80000  Bachelor

$`High School`
  salary   education
1  50000 High School
5  55000 High School

$Master
  salary education
3  70000    Master
6  75000    Master

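Education levels have a natural order that alphabetical sorting does not respect. The following sketch, under the assumption that the desired order is High School, Bachelor, Master, builds an explicit factor and splits by it. The names education_factor, split_by_level, and mean_salaries are hypothetical and introduced only for this illustration.

```r
# Same example data as above
education_data <- data.frame(
  salary = c(50000, 60000, 70000, 80000, 55000, 75000),
  education = c('High School', 'Bachelor', 'Master',
                'Bachelor', 'High School', 'Master')
)

# Assumed ordering: declare the levels explicitly instead of alphabetically
education_factor <- factor(
  education_data$education,
  levels = c('High School', 'Bachelor', 'Master')
)

# The list elements now follow the declared level order
split_by_level <- split(education_data, education_factor)
print(names(split_by_level))

# Names containing spaces need [[ ]] or backticks to access
high_school <- split_by_level[["High School"]]

# Mean salary per education level, in the declared order
mean_salaries <- sapply(split_by_level, function(group) mean(group$salary))
print(mean_salaries)  # High School: 52500, Bachelor: 70000, Master: 72500
```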
In this tutorial, we learned how to split data by factors in R using the split() function, with detailed examples.