Various clustering techniques are
used in the literature. These techniques fall into five
groups:
Partitioning Clustering
Initially, all elements are treated
as a single cluster; the elements are then iteratively regrouped
into smaller clusters. In other words, it is a clustering technique that divides
a data set consisting of n elements into k partitions. Partitioning clustering is
usually guided by an objective function. The most popular partitioning
clustering techniques are k-Means (Lloyd, 1982), k-Median, k-Medoids, PAM
(Rousseeuw and Kaufman, 1990), CLARA (Rousseeuw and Kaufman, 1990) and CLARANS
(Ng and Han, 2002).
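As a rough illustration of how a partitioning method splits n elements into k groups, the following is a minimal sketch of Lloyd-style k-Means in plain Python. The function name kmeans, the choice of the first k points as initial centroids, and the iteration cap are assumptions made for this example only; practical implementations usually choose the starting centroids more carefully. The implicit objective function here is the sum of squared distances between each point and its centroid.

```python
import math

def kmeans(points, k, max_iterations=100):
    """Lloyd-style k-Means on a list of equal-length numeric tuples.

    Assumption: k <= len(points), and the first k points are used as the
    initial centroids so that the result is deterministic.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, new_labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids
```

Calling kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], 2) separates the two points near the origin from the two points near (10, 10).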
Hierarchical Clustering
In hierarchical clustering, data objects are grouped by building
tree-like structures. There are two different
approaches to hierarchical clustering: (i) agglomerative and (ii) divisive. In the
agglomerative method, each object initially forms its own cluster, and the closest
clusters are merged step by step according to the distance between
them. In the divisive method, all data initially form a single cluster, which
is then iteratively split into smaller partitions. The most popular hierarchical
clustering techniques are BIRCH (Zhang et al., 1996), CURE (Guha et al., 1998),
ROCK (Guha et al., 2000), Chameleon (Karypis et al., 1999) and CACTUS (Ganti et
al., 1999).
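The agglomerative (bottom-up) approach can be sketched in a few lines. This is a naive illustration rather than any of the algorithms cited above: the function name agglomerative, the use of single linkage (distance between the two closest members) and the stopping rule based on a target number of clusters are all assumptions made for the example.

```python
import math

def agglomerative(points, num_clusters):
    """Naive bottom-up clustering; returns clusters as lists of point indices.

    Assumption: single linkage is used as the distance between two clusters.
    """
    # Every point starts in its own cluster.
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        # Merge cluster j into cluster i.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]
```

Recording the order of the merges instead of stopping at a fixed number of clusters would yield the tree-like structure (dendrogram) described above.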
Density Based Clustering
Data objects are categorized
as core points, boundary points and noise points. Elements in the dense
neighborhood of a core point are placed in the same cluster as that point. The
most popular density based clustering techniques are DBSCAN (Ester et al.,
1996), OPTICS (Ankerst et al., 1999), DBCLASD (Xu et al., 1998), DENCLUE (Hinneburg
et al., 1998) and SUBCLU (Kailing et al., 2004).
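The distinction between core, boundary and noise points can be made concrete with a small sketch in the spirit of DBSCAN. The function name classify_points and the parameter names eps (neighborhood radius) and min_pts (density threshold) are chosen for this example, and the neighborhood of a point is assumed to include the point itself.

```python
import math

def classify_points(points, eps, min_pts):
    """Label each point as 'core', 'boundary' or 'noise'.

    Assumption: a point counts as its own neighbor, so min_pts includes it.
    """
    # Indices of all points within distance eps of each point.
    neighbors = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    # Core points have at least min_pts points in their eps-neighborhood.
    core = {i for i, near in enumerate(neighbors) if len(near) >= min_pts}
    kinds = []
    for i, near in enumerate(neighbors):
        if i in core:
            kinds.append("core")
        elif any(j in core for j in near):
            # Not dense itself, but within eps of a core point.
            kinds.append("boundary")
        else:
            kinds.append("noise")
    return kinds
```

A full density-based algorithm would then connect core points that lie within eps of each other into clusters and attach each boundary point to a neighboring core point's cluster.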
Grid Based Clustering
The data set is divided into a
certain number of cells to form a grid structure and all clustering operations
are performed over this grid structure. The most popular grid based clustering
techniques are STING (Wang et al., 1997), CLIQUE (Agrawal et al., 1998),
WaveCluster (Sheikholeslami et al., 1998), BANG (Schikuta and Erhart, 1997) and
OptiGrid (Hinneburg and Keim, 1999).
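The general idea of working on a grid instead of on individual points can be sketched as follows. This is not an implementation of STING, CLIQUE or any other cited algorithm; the function name grid_cluster, the fixed square cell size, the density threshold min_points and the rule that adjacent dense cells form one cluster are assumptions made for this two-dimensional example.

```python
from collections import defaultdict, deque

def grid_cluster(points, cell_size, min_points):
    """Cluster 2-D points by merging adjacent dense grid cells.

    Returns clusters as sorted lists of point indices; points that fall in
    sparse cells are left unclustered.
    """
    # Step 1: map every point to the grid cell that contains it.
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(int(x // cell_size), int(y // cell_size))].append(idx)

    # Step 2: keep only cells holding at least min_points points.
    dense = {cell for cell, members in cells.items() if len(members) >= min_points}

    # Step 3: connect neighboring dense cells with a breadth-first search.
    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        while queue:
            cx, cy = queue.popleft()
            members.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbor = (cx + dx, cy + dy)
                    if neighbor in dense and neighbor not in seen:
                        seen.add(neighbor)
                        queue.append(neighbor)
        clusters.append(sorted(members))
    return clusters
```

After the first step the work depends on the number of occupied cells rather than on the number of points, which is the main attraction of grid based methods.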
Model Based Clustering
Data elements are grouped using
statistical and conceptual methods, with the aim of optimizing the fit between
the data and a mathematical model. There are two different
approaches in model-based clustering: the statistical approach and the artificial
neural network approach. The most popular model-based clustering techniques are EM
(Dempster et al., 1977), COBWEB (Fisher, 1987), CLASSIT (Gennari et al.,
1989), SOM (Kohonen, 1997) and SLINK (Han et al., 2011).
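To illustrate the statistical approach, the sketch below fits a mixture of two one-dimensional Gaussians with expectation-maximization (EM). The function names, the initialization of the means at the minimum and maximum of the data, the fixed number of iterations and the small variance floor are all assumptions made for this example; it is a simplified illustration, not the general formulation of Dempster et al.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Assumption: means start at min(data) and max(data), variances at 1.0,
    and mixing weights at 0.5 each.
    """
    means = [min(data), max(data)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate the parameters from the weighted data.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # floor guards against a collapsed component
            weights[k] = nk / len(data)
    # Hard assignment: each point goes to its most responsible component.
    labels = [max(range(2), key=lambda k: r[k]) for r in resp]
    return means, variances, weights, labels
```

Unlike k-Means, the responsibilities are soft: each point belongs to every component with some probability, and a hard cluster label is derived only at the end.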
All of the aforementioned clustering
algorithms perform batch processing: they operate on data stored on disk and
therefore have access to the entire data set. They can make multiple passes
over the data and access any element at random at any point in the algorithm.
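What multiple passes and random access mean in practice can be shown with a deliberately simple computation that is not itself a clustering algorithm. The function name two_pass_variance is hypothetical, and the data is assumed to fit in a Python list; the first pass computes the mean and the second pass reuses it, which is only possible because the whole data set remains available.

```python
def two_pass_variance(data):
    """Population mean and variance computed in two passes over stored data."""
    n = len(data)
    # Pass 1: scan the whole data set to obtain the mean.
    mean = sum(data) / n
    # Pass 2: scan the same data again, now using the mean from pass 1.
    variance = sum((x - mean) ** 2 for x in data) / n
    return mean, variance

# Random access: any element can be read directly at any time.
stored = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
middle_element = stored[len(stored) // 2]
```

The clustering sketches for batch algorithms rely on the same property, for example when k-Means rescans all points in every iteration.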