Data Reshaping in R Programming Language

In this tutorial, we will learn about data reshaping in the R programming language, with examples.
Submitted by Bhavya Sri Khandrika, on May 02, 2020

Data reshaping is about changing how data is organized, usually by rearranging its rows and columns. In R, most data processing starts by getting the input into a data frame, because data frames are easy to modify and make it simple to extract individual rows and columns. The difficulty begins when the data arrives in some other form than a single data frame.

For such cases, R provides functions for merging, splitting, and changing the arrangement of rows and columns, so that the input can eventually be brought into a well-structured data frame. This tutorial covers four of them: t(), cbind(), rbind(), and merge().

Transpose of a Matrix in R

R can compute the transpose of a matrix, which swaps its rows and columns. This is done with the t() function.

Syntax:

The syntax for finding the transpose of a matrix:

    t(x)   # x is a matrix or a data frame

The t() function accepts a data frame as well as a matrix, so a data frame can also be transposed. Note that the result is a matrix in both cases.

Example:

# create a 6 x 2 matrix from the values 2.4, 2.3, ..., 1.3, filled row by row
B <- matrix(seq.int(2.4, 1.3, -0.1), nrow = 6, byrow = TRUE)
print(B)

# transpose the matrix and print the result
print("Matrix after transposing")
d <- t(B)
print(d)

Output

     [,1] [,2]
[1,]  2.4  2.3
[2,]  2.2  2.1
[3,]  2.0  1.9
[4,]  1.8  1.7
[5,]  1.6  1.5
[6,]  1.4  1.3
[1] "Matrix after transposing"
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]  2.4  2.2  2.0  1.8  1.6  1.4
[2,]  2.3  2.1  1.9  1.7  1.5  1.3
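
Since t() also accepts a data frame, the following minimal sketch shows what comes back in that case. The scores data frame and its values are made up for illustration and are not part of the example above. The point to notice is that the transposed result is a matrix, so as.data.frame() is needed if a data frame is wanted again.

```r
# Hypothetical example data (not from the tutorial's own example)
scores <- data.frame(
  math    = c(90, 75, 60),
  physics = c(85, 80, 70),
  row.names = c("Siva", "Ram", "Swathi")
)

# t() converts the data frame to a matrix and transposes it
transposed <- t(scores)
print(is.matrix(transposed))   # the result is a matrix, not a data frame
print(dim(transposed))         # subjects are now rows, students are columns
print(transposed)

# Convert back explicitly when a data frame is required
transposed_df <- as.data.frame(transposed)
print(is.data.frame(transposed_df))
```

If the data frame mixes numeric and character columns, the conversion to a matrix turns every value into a character string, so transposing is most useful when all columns have the same type.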

Joining rows and columns in a data frame

R lets you join several vectors into a single table. Two functions do this: cbind(), which binds its arguments side by side as columns, and rbind(), which stacks them as rows. When every argument is a plain vector, cbind() returns a matrix; when at least one argument is a data frame, the result is a data frame.

rbind() can also append the rows of one data frame to another. This is often required when records with the same columns are stored in separate data frames and need to be accessed together.

Syntax:

The syntax of cbind() and rbind():

    cbind(vector1, vector2, vector3, ..., vectorN)
    rbind(dataframe1, dataframe2, dataframe3, ..., dataframeN)

The code below joins three vectors as the columns of a single table.

Student_Name <- c("Siva", "Bhargav", "Ram", "Swathi")
roll_no <- c("D18EI002", "D18EC052", "D18EE036", "D18CE016")
Grades <- c('A', 'S', 'C', 'B')

# Combining the vectors column-wise (the result is a character matrix)
info <- cbind(Student_Name, roll_no, Grades)

# Printing the combined result
print(info)

Output

     Student_Name roll_no    Grades
[1,] "Siva"       "D18EI002" "A"   
[2,] "Bhargav"    "D18EC052" "S"   
[3,] "Ram"        "D18EE036" "C"   
[4,] "Swathi"     "D18CE016" "B"  
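
The quoted values and the [1,] style row labels in this output indicate that cbind() on plain vectors produced a character matrix. As a minimal sketch, the same vectors can be turned into a real data frame with data.frame(), and cbind() then keeps the data frame class when a column is added. The Year column and its values are made up for illustration.

```r
Student_Name <- c("Siva", "Bhargav", "Ram", "Swathi")
roll_no <- c("D18EI002", "D18EC052", "D18EE036", "D18CE016")
Grades <- c("A", "S", "C", "B")

# cbind() on plain vectors gives a matrix
as_matrix <- cbind(Student_Name, roll_no, Grades)
print(is.matrix(as_matrix))

# data.frame() builds a data frame from the same vectors
as_df <- data.frame(Student_Name, roll_no, Grades, stringsAsFactors = FALSE)
print(is.data.frame(as_df))

# With a data frame as one argument, cbind() returns a data frame.
# Year is a hypothetical extra column.
as_df2 <- cbind(as_df, Year = c(2, 2, 3, 1))
str(as_df2)
```

A data frame is usually the better choice here because each column keeps its own type, whereas a matrix forces every value to a single type.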

As part of data reshaping, you can create a second table and append its rows to the first. Let us now create a data frame with the same columns, which will later be appended to the result shown above.

# Creating another data frame with the same column names
new.stu_data <- data.frame(
    Student_Name = c("Lasya","Sowmya"),
    roll_no = c("D18CE052","D18EI016"),
    Grades = c('S','A'),
    stringsAsFactors=FALSE
)

# Print a header, then the contents of the second data frame
cat("****___ The Second DF___ ****\n")
print(new.stu_data)

Output

****___ The Second DF___ ****
  Student_Name  roll_no Grades
1        Lasya D18CE052      S
2       Sowmya D18EI016      A

The complete program below builds both tables and then combines their rows with rbind(). Because one of the arguments is a data frame, the combined result is a data frame.

# The first table, built from three vectors
Student_Name <- c("Siva","Bhargav","Ram","Swathi")
roll_no <- c("D18EI002","D18EC052","D18EE036","D18CE016")
Grades <- c('A','S','C','B')

# Combining the vectors column-wise (the result is a character matrix)
info <- cbind(Student_Name, roll_no, Grades)
print(info)

# The second table: a data frame with the same column names
new.stu_data <- data.frame(
    Student_Name = c("Lasya","Sowmya"),
    roll_no = c("D18CE052","D18EI016"),
    Grades = c('S','A'),
    stringsAsFactors=FALSE
)

# Print a header, then the contents of the second data frame
cat("****___ The Second DF___ ****\n")
print(new.stu_data)

# Combining the rows of both tables
all.info <- rbind(info, new.stu_data)

# Print a header, then the combined result
cat("# # # The combined data frame\n")
print(all.info)

Output

     Student_Name roll_no    Grades
[1,] "Siva"       "D18EI002" "A"   
[2,] "Bhargav"    "D18EC052" "S"   
[3,] "Ram"        "D18EE036" "C"   
[4,] "Swathi"     "D18CE016" "B"   
****___ The Second DF___ ****
  Student_Name  roll_no Grades
1        Lasya D18CE052      S
2       Sowmya D18EI016      A
# # # The combined data frame
  Student_Name  roll_no Grades
1         Siva D18EI002      A
2      Bhargav D18EC052      S
3          Ram D18EE036      C
4       Swathi D18CE016      B
5        Lasya D18CE052      S
6       Sowmya D18EI016      A
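
When both arguments are data frames, rbind() matches columns by name, so a misspelled column name in one of them stops the operation. The sketch below illustrates this with two small made-up data frames (first and typo are hypothetical names); tryCatch() is used only so that the script keeps running after the error.

```r
first <- data.frame(Student_Name = c("Siva", "Ram"),
                    Grades = c("A", "C"),
                    stringsAsFactors = FALSE)

# Hypothetical data frame with a misspelled column name
typo <- data.frame(Stydent_Name = "Lasya",
                   Grades = "S",
                   stringsAsFactors = FALSE)

# rbind() fails because the column names differ; capture the message
result <- tryCatch(
  rbind(first, typo),
  error = function(e) conditionMessage(e)
)
print(result)

# After the names are made identical, the rows can be combined
names(typo) <- names(first)
combined <- rbind(first, typo)
print(combined)
```

Checking names() on both data frames before calling rbind() is a cheap way to catch this kind of mismatch early.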

Merging data frames

The merge() function joins two data frames on one or more key columns. By default it uses the columns whose names appear in both data frames; when the key columns are named differently, they are given explicitly through the by.x and by.y arguments. Only rows whose key values match in both data frames are kept, unless the all, all.x, or all.y arguments say otherwise.
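
As a minimal sketch of how merge() pairs rows, the example below joins two small made-up data frames whose key columns have different names (roll_no in one, roll in the other), using the by.x and by.y arguments. The data frames students and grades are hypothetical and exist only for this illustration.

```r
# Hypothetical data frames for illustration
students <- data.frame(
  roll_no = c("D18EI002", "D18EC052", "D18EE036"),
  name    = c("Siva", "Bhargav", "Ram"),
  stringsAsFactors = FALSE
)
grades <- data.frame(
  roll  = c("D18EC052", "D18EE036", "D18CE016"),
  grade = c("S", "C", "B"),
  stringsAsFactors = FALSE
)

# Join on roll_no (from students) and roll (from grades).
# Only roll numbers present in both data frames are kept.
joined <- merge(students, grades, by.x = "roll_no", by.y = "roll")
print(joined)
```

Two roll numbers occur in both data frames, so the result has two rows, and the key column takes its name from the first data frame.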

Example:

Consider the diabetes data for Pima Indian women that ships with the MASS package as two data frames, Pima.te and Pima.tr. Our task is to merge the two data frames on blood pressure (bp) and body mass index (bmi). Rows that agree on both columns are paired up, and the remaining columns of each data frame appear in the result with the suffixes .x and .y.

The following code loads the MASS package and performs the merge:

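library(MASS)

# Merge the two Pima data frames on the bp and bmi columns
merging_pima <- merge(x = Pima.te, y = Pima.tr,
    by.x = c("bp", "bmi"),
    by.y = c("bp", "bmi")
)

# Print the merged data frame and its number of rows
print(merging_pima)
nrow(merging_pima)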

Output

   bp  bmi npreg.x glu.x skin.x ped.x age.x type.x npreg.y glu.y skin.y ped.y
1  60 33.8       1   117     23 0.466    27     No       2   125     20 0.088
2  64 29.7       2    75     24 0.370    33     No       2   100     23 0.368
3  64 31.2       5   189     33 0.583    29    Yes       3   158     13 0.295
4  64 33.2       4   117     27 0.230    24     No       1    96     27 0.289
5  66 38.1       3   115     39 0.150    28     No       1   114     36 0.289
6  68 38.5       2   100     25 0.324    26     No       7   129     49 0.439
7  70 27.4       1   116     28 0.204    21     No       0   124     20 0.254
8  70 33.1       4    91     32 0.446    22     No       9   123     44 0.374
9  70 35.4       9   124     33 0.282    34     No       6   134     23 0.542
10 72 25.6       1   157     21 0.123    24     No       4    99     17 0.294
11 72 37.7       5    95     33 0.370    27     No       6   103     32 0.324
12 74 25.9       9   134     33 0.460    81     No       8   126     38 0.162
13 74 25.9       1    95     21 0.673    36     No       8   126     38 0.162
14 78 27.6       5    88     30 0.258    37     No       6   125     31 0.565
15 78 27.6      10   122     31 0.512    45     No       6   125     31 0.565
16 78 39.4       2   112     50 0.175    24     No       4   112     40 0.236
17 88 34.5       1   117     24 0.403    40    Yes       4   127     11 0.598
   age.y type.y
1     31     No
2     21     No
3     24     No
4     21     No
5     21     No
6     43    Yes
7     36    Yes
8     40     No
9     29    Yes
10    28     No
11    55     No
12    39     No
13    39     No
14    49    Yes
15    49    Yes
16    38     No
17    28     No
[1] 17

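The example above relies only on merge() from base R and the MASS package, which is included with standard R installations. The result has 17 rows, one for each pair of records that share the same bp and bmi values.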
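
The .x and .y suffixes and the dropped rows seen in the output are default behaviours of merge() that can be changed. The sketch below uses two tiny made-up data frames (left and right are hypothetical names) to show the all.x, all, and suffixes arguments.

```r
# Hypothetical data frames sharing an id key and a score column
left  <- data.frame(id = c(1, 2, 3), score = c(10, 20, 30))
right <- data.frame(id = c(2, 3, 4), score = c(200, 300, 400))

# Default: keep only ids present in both data frames
inner <- merge(left, right, by = "id")

# Keep every row of the first data frame; unmatched values become NA
left_all <- merge(left, right, by = "id", all.x = TRUE)

# Keep every row of both data frames
full <- merge(left, right, by = "id", all = TRUE)

# Replace the default .x / .y suffixes with custom ones
custom <- merge(left, right, by = "id", suffixes = c(".te", ".tr"))

print(inner)
print(left_all)
print(full)
print(names(custom))
```

Custom suffixes such as the ones used here can make a merged result easier to read when the two inputs have many columns in common.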


