
Clustering (Introduction, Types and Advantages) in Machine Learning

Machine Learning | Clustering: Here, we will learn about the introduction, types, and advantages of clustering in Machine Learning.
Submitted by Akashdeep Singh, on November 01, 2019

Introduction to clustering

As we have studied before, unsupervised learning is divided into two parts: association and clustering, of which clustering is the more popular. Clustering is the process of dividing objects into groups consisting of similar data points. Take the example of items arranged in a mall: similar items are grouped together, so one item is not found mixed in with unrelated items (e.g., an onion will not be present in the fruits category).

The next question that arises is: where is clustering used? One example is Amazon's recommendation system, which suggests products based on past purchases. Another is Netflix, which recommends movies and shows based on watch history.

Clustering is also used in business for tasks such as image segmentation, grouping web pages, and information retrieval. For example, in a retail business, clustering helps analyze customer shopping behavior, sales campaigns, and customer retention.

Types of Clustering

There are three types of clustering:

  1. Exclusive clustering: In exclusive clustering, each data point belongs to exactly one cluster only. This means the data points of one cluster are dissimilar from the data points of any other cluster. K-means clustering is an example of exclusive clustering.
  2. Overlapping clustering: In overlapping clustering, a data point can belong to multiple clusters. This means there can be some similarity between the data points of one cluster and those of another. Fuzzy C-means clustering is an example of overlapping clustering.
  3. Hierarchical clustering: In hierarchical clustering, distinct clusters are present, but they are nested inside larger clusters whose data points share similarity, forming a tree of clusters. Hierarchical clustering is further divided into two approaches: Agglomerative and Divisive.
    • In Agglomerative clustering, each data point is initially considered an individual cluster. At each step, the two most similar clusters are merged, and this process continues until a single cluster is formed. This clustering follows a bottom-up approach.
    • Divisive clustering is the opposite of Agglomerative clustering. Here, all the data points start in one group and are repeatedly split until each data point becomes an individual cluster. The Divisive process follows a top-down approach.
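The bottom-up Agglomerative process described above can be sketched in plain Python. This is a minimal illustration rather than a library implementation: the function name `agglomerative`, the single-linkage (closest-members) merge rule, and the sample 1-D points are all invented for the example.

```python
def agglomerative(points, target_k):
    """Bottom-up sketch: start with one cluster per point, then repeatedly
    merge the two closest clusters until only target_k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > target_k:
        # Find the pair of clusters with the smallest gap between any two members
        # (single linkage) -- an assumption; other linkage rules exist.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge cluster j into cluster i

    return clusters

# 1, 2 and 9, 10 are close together; 25 is far from both groups.
groups = agglomerative([1, 2, 9, 10, 25], target_k=3)
```

Running the merge loop all the way down to `target_k=1` would reproduce the "single cluster" endpoint described above; stopping earlier gives a usable partition.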

Cool! But we are still left with the most important part of clustering, and that is K-means. K-means is a clustering algorithm whose main goal is to group similar elements or data points into clusters; here, K represents the number of clusters formed. The "cluster center" is the arithmetic mean of all the points belonging to the cluster, and each point is closer to its own cluster center than to any other cluster center.
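The K-means loop just described (assign each point to its nearest center, then move each center to the mean of its points) can be sketched from scratch in plain Python. The function name, the fixed iteration count, and the sample 2-D data are assumptions made for illustration only.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal K-means sketch; points is a list of (x, y) tuples."""
    random.seed(seed)
    centers = random.sample(points, k)  # naive init: k random data points
    for _ in range(iters):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda i: (p[0] - centers[i][0]) ** 2
                                + (p[1] - centers[i][1]) ** 2)
            clusters[i].append(p)
        # Update step: each center moves to the arithmetic mean of its cluster.
        for i, c in enumerate(clusters):
            if c:  # keep the old center if a cluster ends up empty
                centers[i] = (sum(p[0] for p in c) / len(c),
                              sum(p[1] for p in c) / len(c))
    return centers, clusters

# Two well-separated blobs; K = 2 should recover one center per blob.
data = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centers, clusters = kmeans(data, 2)
```

After convergence, each returned center is the mean of its blob's points, matching the "cluster center" definition above.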

Advantages of K-means clustering

  • Easy to implement
  • Relatively fast and efficient
  • Has only one parameter to tune (K), and you can easily see the direct impact of adjusting its value

