Factors in R Language
In this tutorial, we are going to learn about factors in the R language, the characteristics of a factor, the creation of a factor, and more, with examples.
Submitted by Bhavya Sri Khandrika, on December 06, 2020
A factor is a data structure for fields that take only a limited set of predefined values, that is, categorical data. In R, factors are data objects that categorize the data and store it as levels.
Factors can be built from both integers and strings. They are useful for columns that have a limited number of unique values, and R sorts the levels in alphabetical order by default. Examples of such values are "Man" and "Woman", or TRUE and FALSE. Factors are mainly useful in data analysis and statistical modeling.
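To make the default alphabetical ordering of levels concrete, here is a minimal sketch. The vector `answers` and the variable names are made up for illustration; `factor()`, `levels()`, and `as.integer()` are base R functions.

```r
# Hypothetical example data: a small character vector of categories.
answers <- c("Woman", "Man", "Woman", "Man", "Man")
answers_factor <- factor(answers)

# The levels are the distinct values, sorted alphabetically by default.
level_names <- levels(answers_factor)

# Internally, each element is stored as an integer index into the levels.
codes <- as.integer(answers_factor)

print(level_names)
print(codes)
```

Here "Man" is level 1 and "Woman" is level 2, so the integer codes are 2 1 2 1 1.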
Characteristics of a factor in R language
The following are the arguments of the factor() function that describe a factor:
- x: The input vector that is to be converted into a factor.
- labels: A character vector of labels for the levels, given in the same order as the levels.
- levels: A vector of the unique values that x may take, which become the levels of the factor.
- ordered: A logical value that determines whether the levels are ordered or not.
- exclude: Specifies the values to be excluded when the levels are formed.
- nmax: Specifies an upper bound on the maximum number of levels.
Example:
# First, create an input vector.
data <- c("E","W","E","N","N","E","W","W","W","E","N")
print(data)
# Check whether the input is a factor.
print(is.factor(data))
# Apply the factor function to the vector created above.
# The factor function converts the vector into a factor.
factor_data <- factor(data)
print(factor_data)
# Check whether the result is a factor.
print(is.factor(factor_data))
Output:
[1] "E" "W" "E" "N" "N" "E" "W" "W" "W" "E" "N"
[1] FALSE
[1] E W E N N E W W W E N
Levels: E N W
[1] TRUE
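The example above uses only the x argument. The following sketch shows how the levels, labels, ordered, and exclude arguments might be used; the `sizes` vector and all variable names are assumptions made for illustration.

```r
# Hypothetical input: shirt sizes, which have a natural order.
sizes <- c("M", "S", "L", "M", "XL", "S")

# levels fixes the order of the levels; ordered = TRUE makes
# comparisons between elements meaningful.
size_factor <- factor(sizes, levels = c("S", "M", "L", "XL"), ordered = TRUE)
print(size_factor)
is_larger <- size_factor[1] > size_factor[2]   # compares "M" with "S"

# labels renames the levels, in the same order as levels.
size_labelled <- factor(sizes, levels = c("S", "M", "L", "XL"),
                        labels = c("small", "medium", "large", "x-large"))
print(size_labelled)

# exclude removes a value from the levels; matching elements become NA.
size_excluded <- factor(sizes, exclude = "XL")
print(size_excluded)
```

With an ordered factor, "M" compares as greater than "S" because of the position of each value in levels, not because of alphabetical order.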
Creation of a factor in R Language
In the R language, it is fairly simple to create a factor. The process is completed in two steps:
- Create a vector to use as input.
- Convert the vector into a factor.
R provides the factor() function to convert a vector into a factor.
Syntax:
factor_data <- factor(vector)
Let us look at an example to understand how the factor() function is used:
Example 1:
# Create a vector to use as input.
data <- c("hot", "cold", "wet", "cold", "hot", "dry", "cold", "hot", "dry", "wet", "dry")
print(data)
print(is.factor(data))
# Apply the factor function.
factor_data <- factor(data)
print(factor_data)
print(is.factor(factor_data))
Output:
[1] "hot" "cold" "wet" "cold" "hot" "dry" "cold" "hot" "dry" "wet"
[11] "dry"
[1] FALSE
[1] hot cold wet cold hot dry cold hot dry wet dry
Levels: cold dry hot wet
[1] TRUE
Example 2:
# Create a character vector to use as input.
data <- c("bat", "hat", "bat", "cat", "cat", "bat", "hat", "hat", "hat", "bat", "hat")
print(data)
print(is.factor(data))
# Apply the factor function.
factordata <- factor(data)
print(factordata)
print(is.factor(factordata))
Output:
[1] "bat" "hat" "bat" "cat" "cat" "bat" "hat" "hat" "hat" "bat" "hat"
[1] FALSE
[1] bat hat bat cat cat bat hat hat hat bat hat
Levels: bat cat hat
[1] TRUE
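Since factors are mainly used in data analysis, a common next step is to count how often each level occurs. The sketch below reuses the vector from Example 2 with the base R functions `nlevels()` and `table()`; the names `n_levels` and `counts` are chosen here for illustration.

```r
# Same input as in Example 2 above.
data <- c("bat", "hat", "bat", "cat", "cat", "bat", "hat", "hat", "hat", "bat", "hat")
factordata <- factor(data)

# Number of distinct levels in the factor.
n_levels <- nlevels(factordata)
print(n_levels)

# Frequency of each level.
counts <- table(factordata)
print(counts)
```

The counts are 4 for bat, 2 for cat, and 5 for hat, which add up to the 11 elements of the input vector.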
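Factors in a Data Frame
When a data frame is created with a column of text data, R can treat the text column as categorical data and create a factor from it. Since R 4.0.0, this happens only when stringsAsFactors = TRUE is passed to data.frame(); in earlier versions it was the default behavior.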
Example 1:
# Create three vectors of equal length.
height <- c(134, 154, 164, 133, 163, 145, 125)
weight <- c(49, 48, 67, 54, 68, 51, 41)
gender <- c("female", "female", "male", "male", "female", "male", "female")
# Create the data frame. Since R 4.0.0, text columns become
# factors only when stringsAsFactors = TRUE is set.
input_data <- data.frame(height, weight, gender, stringsAsFactors = TRUE)
print(input_data)
# Check whether the gender column is a factor.
print(is.factor(input_data$gender))
# Print the gender column to see the levels.
print(input_data$gender)
Output:
height weight gender
1 134 49 female
2 154 48 female
3 164 67 male
4 133 54 male
5 163 68 female
6 145 51 male
7 125 41 female
[1] TRUE
[1] female female male male female male female
Levels: female male
Example 2:
# Create three vectors of equal length.
marks <- c(133, 155, 163, 134, 143, 155, 145)
rollno <- c(47, 46, 65, 58, 68, 53, 49)
student <- c("classA", "classB", "classC", "classA", "classB", "classA", "classC")
# Create the data frame. Since R 4.0.0, text columns become
# factors only when stringsAsFactors = TRUE is set.
inputdata <- data.frame(marks, rollno, student, stringsAsFactors = TRUE)
print(inputdata)
# Check whether the student column is a factor.
print(is.factor(inputdata$student))
# Print the student column to see the levels.
print(inputdata$student)
Output:
marks rollno student
1 133 47 classA
2 155 46 classB
3 163 65 classC
4 134 58 classA
5 143 68 classB
6 155 53 classA
7 145 49 classC
[1] TRUE
[1] classA classB classC classA classB classA classC
Levels: classA classB classC
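Whether a text column becomes a factor depends on the stringsAsFactors argument of `data.frame()`, whose default changed to FALSE in R 4.0.0. The sketch below sets the argument explicitly both ways and also converts a single column afterwards with `factor()`; the small data set and the variable names are illustrative assumptions.

```r
# Hypothetical data: two short vectors of equal length.
height <- c(134, 154, 164)
gender <- c("female", "female", "male")

# stringsAsFactors = FALSE (the default from R 4.0.0 onward)
# keeps the text column as a character vector.
df_chr <- data.frame(height, gender, stringsAsFactors = FALSE)
was_factor_before <- is.factor(df_chr$gender)

# stringsAsFactors = TRUE converts every text column to a factor.
df_factor <- data.frame(height, gender, stringsAsFactors = TRUE)

# A single column can also be converted after the data frame exists.
df_chr$gender <- factor(df_chr$gender)

print(was_factor_before)
print(is.factor(df_factor$gender))
print(levels(df_chr$gender))
```

Converting individual columns with `factor()` gives finer control than stringsAsFactors = TRUE, which applies to all text columns at once.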
Modifying a factor in R Language
Just like data frames, R allows us to modify a factor. We can change the value of an element simply by assigning to it again. However, we cannot assign a value that lies outside the predefined levels; that is, we cannot insert a value whose level does not exist. For this purpose, we first have to create a level for that value, and then we can add the value to our factor.
Let us look at an example to understand how factors are modified:
Example 1:
# Create a vector to use as input.
data <- c("ball", "dad", "cat", "dad", "ball")
# Apply the factor function.
factordata <- factor(data)
# Print all elements of the factor.
print(factordata)
# Change the 4th element to "cat", which is an existing level.
factordata[4] <- "cat"
print(factordata)
# Print the 4th element.
factordata[4]
# The levels are unchanged; values outside them cannot be assigned.
print(factordata)
# Add a new level, "mom".
levels(factordata) <- c(levels(factordata), "mom")
# Now the new value can be assigned.
factordata[4] <- "mom"
print(factordata)
Output:
[1] ball dad cat dad ball
Levels: ball cat dad
[1] ball dad cat cat ball
Levels: ball cat dad
[1] cat
Levels: ball cat dad
[1] ball dad cat cat ball
Levels: ball cat dad
[1] ball dad cat mom ball
Levels: ball cat dad mom
Example 2:
# Create a vector to use as input.
data <- c("hot", "bot", "pot", "bot", "hot")
# Apply the factor function.
factordata <- factor(data)
# Print all elements of the factor.
print(factordata)
# Change the 4th element to "pot", which is an existing level.
factordata[4] <- "pot"
print(factordata)
# Print the 4th element.
factordata[4]
# The levels are unchanged; values outside them cannot be assigned.
print(factordata)
# Add a new level, "map".
levels(factordata) <- c(levels(factordata), "map")
# Now the new value can be assigned.
factordata[4] <- "map"
print(factordata)
Output:
[1] hot bot pot bot hot
Levels: bot hot pot
[1] hot bot pot pot hot
Levels: bot hot pot
[1] pot
Levels: bot hot pot
[1] hot bot pot pot hot
Levels: bot hot pot
[1] hot bot pot map hot
Levels: bot hot pot map
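The examples above add the level before assigning the new value. The sketch below shows what happens when that step is skipped, and how unused levels can be removed with the base R function `droplevels()`. The variable names `attempt`, `became_na`, and `trimmed` are chosen here for illustration.

```r
factordata <- factor(c("hot", "bot", "pot", "bot", "hot"))

# Assigning a value that is not an existing level stores NA.
# Without suppressWarnings(), R reports an "invalid factor level" warning.
attempt <- factordata
suppressWarnings(attempt[4] <- "map")
print(attempt)
became_na <- is.na(attempt[4])

# After replacing the only "pot", the level "pot" is unused but still listed.
factordata[3] <- "bot"
print(levels(factordata))

# droplevels() removes levels that no longer occur in the data.
trimmed <- droplevels(factordata)
print(levels(trimmed))
```

The failed assignment leaves the levels unchanged and puts NA in the 4th position, which is why the level has to be created first.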
The Generation of Factor Levels in R
We can generate factor levels by using the gl() function in R. This function takes three parameters: n, k, and labels. Here, n and k are integers that specify how many levels we want and how many times each level is repeated.
Syntax:
gl(n, k, labels)
Parameters:
- n (int): specifies the number of levels.
- k (int): specifies the number of replications.
- labels (string): a vector of labels for the resulting factor levels.
Example 1:
gen_factor <- gl(3, 5, labels = c("cat", "net", "pen"))
gen_factor
Output:
[1] cat cat cat cat cat net net net net net pen pen pen pen pen
Levels: cat net pen
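Besides n, k, and labels, gl() also accepts a length argument that sets the total length of the result. The following sketch contrasts repeating levels in blocks with alternating them; the labels and variable names are made up for illustration.

```r
# Each of the two levels is repeated three times in a block.
blocks <- gl(2, 3, labels = c("low", "high"))
print(blocks)

# With k = 1 and length = 6, the two levels alternate until
# six values have been generated.
alternating <- gl(2, 1, length = 6, labels = c("low", "high"))
print(alternating)
```

The first call yields low low low high high high, while the second yields low high low high low high.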