Concatenate strings from several rows using pandas groupby
Given a Pandas DataFrame, we have to concatenate strings from several rows using pandas groupby.
Submitted by Pranit Sharma, on June 17, 2022
Pandas is a Python library that allows us to perform complex manipulations of data effectively and efficiently. In pandas, we mostly work with a dataset in the form of a DataFrame. A DataFrame is a 2-dimensional data structure that consists of rows, columns, and the data they hold.
A string can also be converted into a pandas DataFrame with the help of the StringIO class from Python's io module, which wraps the string in a file-like object that pd.read_csv() can read.
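As a minimal sketch of that idea, the snippet below wraps a small piece of text in StringIO and hands it to pd.read_csv(). The sample text, the `products` name, and the `;` delimiter are illustrative assumptions, not part of the original example.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample text; the ';' delimiter is an arbitrary choice.
text = "Product;Units\nFrooti;10\nMaaza;4\n"

# StringIO wraps the string in a file-like object that read_csv can consume.
products = pd.read_csv(StringIO(text), sep=";")

print(products)
```

The first line of the text becomes the column headers, and each later line becomes one row.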
Here, we are going to learn how to concatenate strings from several rows using pandas groupby.
The pandas groupby() method groups the rows that share the same value in one or more columns. Suppose our DataFrame contains a product named 'Frooti' together with its sales figures. The product may appear several times in the DataFrame, so groupby() collects those rows into a single group, which can then be reduced to a single value with an aggregation.
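The 'Frooti' scenario can be sketched as follows. The sales figures, the second product, and the names `sales` and `totals` are made up for illustration, and summing is only one possible aggregation.

```python
import pandas as pd

# Hypothetical sales data: 'Frooti' appears in two separate rows.
sales = pd.DataFrame({
    "Product": ["Frooti", "Maaza", "Frooti"],
    "Sales": [10, 4, 5],
})

# Group the rows by product name and add up the sales within each group.
totals = sales.groupby("Product")["Sales"].sum()

print(totals["Frooti"])  # 15
```

The two 'Frooti' rows end up in one group, so their sales collapse into a single total.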
To work with pandas, we first need to import the pandas package. Below is the syntax:
import pandas as pd
Let us understand with the help of an example.
Python code to concatenate strings from several rows using pandas groupby
# Importing pandas package
import pandas as pd
# Importing the StringIO class from the io module
from io import StringIO
# Creating an in-memory text buffer that holds semicolon-separated data
string = StringIO("""
Name;Age;Messege
Harry;20;OHH!!
Tom;23;JUST DO
Harry;20;YEAH
Nancy;20;WOW!
Tom;23;IT
""")
# Reading the buffer as a semicolon-separated CSV file
df = pd.read_csv(string, sep=";")
# Printing the DataFrame
print("String into DataFrame:\n\n",df,"\n\n")
# Grouping by Name and Age, then joining each group's messages with commas
result = df.groupby(['Name','Age'])['Messege'].apply(lambda x: ','.join(x)).reset_index()
# Display result
print("Concatenated strings where column values are same:\n\n",result)
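Output:
The first print shows the five-row DataFrame read from the string. The second shows one row for each (Name, Age) pair, with that pair's messages joined by commas: Harry, 20, OHH!!,YEAH; Nancy, 20, WOW!; Tom, 23, JUST DO,IT.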
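As a cross-check on the example above, the same grouping can be sketched with agg() instead of apply(). This variant is an alternative formulation, not the article's own code. It builds the DataFrame directly from a dictionary rather than from StringIO, and keeps the original column name `Messege`.

```python
import pandas as pd

# Same data as the example above, built from a dictionary for brevity.
df = pd.DataFrame({
    "Name": ["Harry", "Tom", "Harry", "Nancy", "Tom"],
    "Age": [20, 23, 20, 20, 23],
    "Messege": ["OHH!!", "JUST DO", "YEAH", "WOW!", "IT"],
})

# ",".join is called once per group and receives that group's messages.
result = df.groupby(["Name", "Age"])["Messege"].agg(",".join).reset_index()

print(result)
```

groupby() sorts the group keys by default, so the rows come out in the order Harry, Nancy, Tom, while the messages inside each group keep their original row order.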