In computer science, clustering is an important problem that belongs to two
fields: data mining, because it extracts meaningful patterns from data, and
machine learning, because it is a learning method (unsupervised learning). For
this reason, a great deal of research has been done on clustering. In machine
learning, classification is known as supervised learning, whereas clustering is
known as an unsupervised learning technique: group labels are known when
classifying, but they are not known in clustering, and finding the class labels
is the task of the clustering algorithm. Clustering is therefore generally
considered more difficult than classification. There are many definitions of
cluster and clustering in the literature (Everitt, 1980), for example:
· A cluster is a collection of elements in which the elements in the same group
are similar and the elements in different groups are not similar.
· Clusters are groups in which the distance between two different elements in
the same group is smaller than the distance between two elements in two
different groups.
· A cluster is a region of high-density points separated from lower-density
points in a d-dimensional attribute space.
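
The second, distance-based definition can be checked directly on a small labeled data set. The sketch below is only an illustration: the function name `satisfies_distance_definition`, the use of Euclidean distance, and the sample points are assumptions made for this example, not part of the cited definitions.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available since Python 3.8


def satisfies_distance_definition(points, labels):
    """Return True if every distance between two elements of the same group
    is smaller than every distance between elements of different groups."""
    within, between = [], []
    for (p, label_p), (q, label_q) in combinations(zip(points, labels), 2):
        (within if label_p == label_q else between).append(dist(p, q))
    if not within or not between:
        return True  # nothing to compare, so the condition holds trivially
    return max(within) < min(between)


# Hypothetical 2-dimensional sample: three points near the origin, two far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
good_labels = [0, 0, 0, 1, 1]  # groups follow the visible structure
bad_labels = [0, 0, 1, 1, 1]   # one nearby point is put in the far group

print(satisfies_distance_definition(points, good_labels))
print(satisfies_distance_definition(points, bad_labels))
```

With the first labeling the condition holds, and with the second it fails, because the point (1.0, 0.0) is then closer to the other group than to its own.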
The purpose of clustering is to divide a finite, unlabeled data set into a
finite number of labeled natural groups (Baraldi and Alpaydin, 2002; Vladimir S
et al., 2007).
As mentioned earlier, clustering is a learning method, and today it is used in
many areas, ranging from manufacturing to artificial intelligence and from
network security to surveillance systems. Suppose you have a computer that
works as a server on the internet. Many nodes connect to this server. Most of
these nodes are innocent, so they have normal TCP or UDP connections to the
server. However, there may be some malicious nodes that want to attack and
corrupt the server, for example with DDoS or man-in-the-middle attacks.
Clustering is one way to learn which nodes are innocent and which are
malicious.
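
The server example can be sketched with a small k-means run. Note that the text above does not name any algorithm. k-means is used here only as a familiar stand-in, and the two features (connections per minute, mean payload size in KB), the node values, and the starting centroids are all hypothetical.

```python
from math import dist


def kmeans(points, centroids, iterations=20):
    """Plain k-means: alternate between assigning points to the nearest
    centroid and moving each centroid to the mean of its members."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical per-node features: (connections per minute, mean payload in KB).
nodes = [
    (12, 40), (15, 38), (10, 45), (14, 42), (11, 39),  # ordinary-looking traffic
    (950, 2), (1020, 1), (990, 3),                     # very many tiny requests
]

# Assumption: start from the first and the last node to keep the run deterministic.
labels, centroids = kmeans(nodes, centroids=[nodes[0], nodes[-1]])
print(labels)
```

The algorithm only separates the nodes into two groups. Deciding that the high-rate group is the malicious one is still an interpretation step left to the administrator, and in practice the features would usually be scaled first so that no single feature dominates the distance.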