Why should we make a copy of a DataFrame in Pandas?

Learn, why should we make a copy of a DataFrame in Pandas?
Submitted by Pranit Sharma, on May 19, 2022

Pandas is a special tool that allows us to perform complex manipulations of data effectively and efficiently. Inside pandas, we mostly deal with a dataset in the form of DataFrame. DataFrames are 2-dimensional data structures in pandas. DataFrames consists of rows, columns, and the data.

During the analysis of data, we perform numerous operations on DataFrame, which can result in a drastic change in the original DataFrame. It is known that a copy is made by using the assignment operator in python but it is not true actually, using the assignment operator leads to sharing of references from one object to other. If we perform certain operations on these shared objects, it will modify the original object as it holds the same reference or it will raise an error. To overcome this problem, we should always make a copy of a DataFrame in pandas.

To work with pandas, we need to import pandas package first, below is the syntax:

import pandas as pd

Let us understand with the help of an example.

# Importing pandas package
import pandas as pd

# Create DataFrame
df = pd.DataFrame({'Girl':['Monica'],'Boy':['Chandler']})
print("Original DataFrame:\n",df,"\n")


Example 1: Make a copy of a DataFrame

Now let us perform some operations without making a copy of DataFrame and we will observe that it will modify the original DataFrame.

# Assign df to df_copy
df_copy = df

# Display df_copy
print("Copied DataFrame:\n",df_copy,"\n")

# Update the value of group 1 and group 2 
# in original DataFrame
df_copy['Girl']= 'Rachel'
df_copy['Boy'] = 'Ross'

print("Modified Original DataFrame\n",df)


Example 2: Make a copy of a DataFrame

Now we will make a deep copy of DataFrame and perform some operations and will observe that it will not modify the original DataFrame.

# Using df.copy to make a copy
df_2 = df.copy(deep=True)

# Display df_2
print("A deep copy of original DataFrame:\n",df_2,"\n")

# Update values of deep copy
df_2['Girl'] = 'Phoebe'
df_2['Boy'] = 'Joey'

# Display updated deep copy and also original DataFrame 
# which is not modified this time
print("Modified deep copy:\n",df_2,"\n")
print("Unmodified original DataFrame:\n",df)


Example 3: Make a copy of a DataFrame

Python Pandas Programs »


Comments and Discussions!


Languages: » C » C++ » C++ STL » Java » Data Structure » C#.Net » Android » Kotlin » SQL
Web Technologies: » PHP » Python » JavaScript » CSS » Ajax » Node.js » Web programming/HTML
Solved programs: » C » C++ » DS » Java » C#
Aptitude que. & ans.: » C » C++ » Java » DBMS
Interview que. & ans.: » C » Embedded C » Java » SEO » HR
CS Subjects: » CS Basics » O.S. » Networks » DBMS » Embedded Systems » Cloud Computing
» Machine learning » CS Organizations » Linux » DOS
More: » Articles » Puzzles » News/Updates

© some rights reserved.