Label encoding across multiple columns in scikit-learn
Python Pandas | Label Encoding: Learn how to apply label encoding across multiple columns using scikit-learn.
Submitted by Pranit Sharma, on May 23, 2022
Label encoding is the process of converting categorical labels (such as 'Male' and 'Female') into integers. Most machine learning algorithms operate on numeric input, so text labels must be encoded before they can be used for training.
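As a minimal sketch of the idea, the snippet below encodes a short list of labels with scikit-learn's LabelEncoder. The list `genders` is made up for illustration. LabelEncoder numbers the distinct labels according to their sorted order, so 'Female' becomes 0 and 'Male' becomes 1 here.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample labels, used only for illustration
genders = ['Male', 'Female', 'Female', 'Male']

encoder = LabelEncoder()
codes = encoder.fit_transform(genders)

# classes_ holds the distinct labels in sorted order;
# each label is replaced by its position in that array
print(encoder.classes_.tolist())  # ['Female', 'Male']
print(codes.tolist())             # [1, 0, 0, 1]
```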
Here, we will use the sklearn (scikit-learn) library, which provides tools for applying machine learning, including preprocessing data and training and testing models.
For this purpose, we will import the preprocessing module from the sklearn library and use its LabelEncoder class. Passing the encoder's fit_transform method to DataFrame.apply() fits and encodes each column separately.
Let us understand with the help of an example,
# Importing pandas package
import pandas as pd
# Importing the preprocessing module from sklearn
from sklearn import preprocessing
# Creating a dictionary
d={
"Name":['Hari','Mohan','Neeti','Shaily','Ram','Umesh','Shirish','Rashmi','Pradeep','Neelam','Jitendra','Manoj','Rishi'],
"Age":[25,36,26,21,30,33,35,40,39,45,42,39,48],
"Gender":['Male','Male','Female','Female','Male','Male','Male','Female','Male','Female','Male','Male','Male'],
"Profession":['Doctor','Teacher','Singer','Student','Engineer','CA','Cricketer','Teacher','Teacher','Politician',
'Doctor','Manager','Clerk'],
"Title":['Mr','Mr','Ms','Ms','Mr','Mr','Mr','Ms','Mr','Ms','Mr','Mr','Mr'],
"Salary":[200000,50000,500000,0,100000,75000,10000000,50000,50000,200000,200000,150000,15000],
"Location":['Amritsar','Indore','Mumbai','Bhopal','Gurugram','Pune','Banglore','Ranchi','Surat','Chennai','Shimla','Kolkata','Raipur'],
"Marriage Status":[0,1,1,0,1,0,0,1,1,1,0,1,0]
}
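# Now we will create DataFrame
df=pd.DataFrame(d)
# Viewing the created DataFrame
print("Created DataFrame:\n")
print(df,"\n\n")
# Encoding all the columns; apply() returns a new DataFrame,
# so the result must be stored (df itself is not modified)
result=df.apply(preprocessing.LabelEncoder().fit_transform)
# Viewing the encoded DataFrame
print("Encoded DataFrame:\n")
print(result)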
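The apply() call above encodes every column, including numeric ones such as Age and Salary, and it does not keep the fitted encoders. As one possible variation (a sketch, not the only way to do this), the code below encodes only the text columns and stores one LabelEncoder per column so that the codes can be mapped back with inverse_transform. The small DataFrame and the names `people`, `text_columns`, `encoders`, and `encoded` are assumptions made for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Small hypothetical DataFrame, used only for illustration
people = pd.DataFrame({
    "Name": ['Hari', 'Mohan', 'Neeti'],
    "Age": [25, 36, 26],
    "Gender": ['Male', 'Male', 'Female'],
})

# Columns to encode; the numeric Age column is left untouched
text_columns = ["Name", "Gender"]

encoders = {}
encoded = people.copy()
for column in text_columns:
    encoder = LabelEncoder()
    encoded[column] = encoder.fit_transform(people[column])
    encoders[column] = encoder  # keep the fitted encoder for later decoding

print(encoded)

# Map the integer codes back to the original labels
decoded_gender = encoders["Gender"].inverse_transform(encoded["Gender"])
print(decoded_gender.tolist())  # ['Male', 'Male', 'Female']
```

Keeping a separate encoder for each column matters because each one learns its own label-to-integer mapping, and that mapping is needed to decode predictions or to encode new data consistently.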