Python program to calculate Jaccard similarity

Jaccard similarity in Python: In this tutorial, we will learn what Jaccard similarity is, how to calculate it, and how to write a Python program that computes it. By IncludeHelp Last updated: August 13, 2023

Jaccard similarity

The Jaccard similarity measures how similar two data sets are by comparing the members they share with all of their distinct members. It is calculated by dividing the size of the intersection of the two sets by the size of their union. The result ranges from 0 (no shared members) to 1 (identical sets). Learn more about Jaccard similarity at learndatasci.

Jaccard similarity formula

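The formula for the Jaccard similarity of two sets A and B is:

J(A, B) = |A ∩ B| / |A ∪ B|

Here |A ∩ B| is the number of members common to both sets, and |A ∪ B| is the number of distinct members across both sets.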
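
As a quick check of the formula, the sketch below evaluates it with Python's built-in set operators `&` (intersection) and `|` (union). The sets `a` and `b` are illustrative values chosen for this sketch, not data from the tutorial.

```python
# Illustrative sets (assumed values, for demonstration only)
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

shared = a & b      # intersection: members present in both sets
combined = a | b    # union: all distinct members of both sets

# Size of the intersection divided by size of the union: 2 / 6
jaccard = len(shared) / len(combined)
print(jaccard)
```

The two sets share 2 of their 6 distinct members, so the result is 2/6, roughly 0.33.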

Example 1: Python program to calculate Jaccard similarity

In this program, we are finding the Jaccard similarity of the given data sets.

set1 = {10, 20, 30, 40, 50}
set2 = {10, 20, 30, 80, 90}

# Finding the intersection of sets
intersection_result = set1.intersection(set2)

# Finding the union of sets
union_result = set1.union(set2)

# Printing the values
print("AnB = ", intersection_result)
print("AUB = ", union_result)

print(
    "Jaccard similarity: J(set1, set2) = ",
    len(intersection_result) / len(union_result),
)

Output

AnB =  {10, 20, 30}
AUB =  {40, 10, 80, 50, 20, 90, 30}
Jaccard similarity: J(set1, set2) =  0.42857142857142855
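
The size of the union does not have to come from building the union itself. By the inclusion-exclusion principle, it equals the sum of the two set sizes minus the size of the intersection. The following is a minimal sketch of that alternative using the same sets as above; the variable names `intersection_size` and `union_size` are chosen here for illustration.

```python
set1 = {10, 20, 30, 40, 50}
set2 = {10, 20, 30, 80, 90}

# Size of the intersection, using the & operator
intersection_size = len(set1 & set2)

# Inclusion-exclusion: |A U B| = |A| + |B| - |A n B|
union_size = len(set1) + len(set2) - intersection_size

# Confirm that this matches the size of the actual union
print(union_size == len(set1 | set2))

print(intersection_size / union_size)
```

Here the sizes are 5 + 5 - 3 = 7, so the similarity is 3/7, the same value printed in the output above.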

Example 2: Python program to calculate Jaccard similarity

In this program, we write a user-defined function that takes two sets and returns their Jaccard similarity.

# Function to calculate Jaccard similarity
def calculate_Jaccard_similarity(set1, set2):
    # size of the intersection of the sets
    intersection = len(set1.intersection(set2))
    # size of the union of the sets
    union = len(set1.union(set2))

    # Calculating Jaccard similarity
    Jaccard_similarity = intersection / union

    # Returning the result
    return Jaccard_similarity

# Main code: call the function here
set1 = {10, 20, 30, 40, 50}
set2 = {10, 20, 30, 80, 90}

result = calculate_Jaccard_similarity(set1, set2)
print("Jaccard similarity is:", result)

Output

Jaccard similarity is: 0.42857142857142855
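
Note that the function above divides by the size of the union, so it raises a ZeroDivisionError when both sets are empty. The sketch below is one possible way to guard against that. The name `jaccard_similarity_safe` is hypothetical, and returning 1.0 for two empty inputs is a convention assumed here (two empty collections are treated as identical); other conventions, such as returning 0.0, are equally defensible. It also converts its arguments with `set()`, so lists containing duplicates can be passed in.

```python
def jaccard_similarity_safe(items1, items2):
    """Hypothetical helper: Jaccard similarity of two iterables.

    Assumption: two empty collections are treated as identical (1.0).
    """
    # Convert to sets so duplicates are ignored and lists are accepted
    set1, set2 = set(items1), set(items2)

    union_size = len(set1 | set2)
    if union_size == 0:
        # Both inputs are empty; avoid dividing by zero
        return 1.0

    return len(set1 & set2) / union_size


print(jaccard_similarity_safe([10, 20, 30, 40, 50], [10, 20, 30, 80, 90]))
print(jaccard_similarity_safe([], []))
```

For the tutorial's sets this gives the same 3/7 result as before, while the empty case returns 1.0 instead of raising an error.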

Copyright © 2024 www.includehelp.com. All rights reserved.