Yonghui Chen
Alan Sprague
Kevin D. Reilly
MABAC: Matrix Based Clustering Algorithm
Proc. Int'l Conf. on Artificial Intelligence (IC-AI'04), 439-443.
Abstract
Clustering is a prominent method in the data mining field. It is a discovery
process that groups data so that intra-cluster similarity is maximized and
inter-cluster similarity is minimized. Clustering has been widely used in
a variety of areas and many clustering algorithms have been developed in
response. Almost every report emphasizes differences and ignores similarities
among algorithms. This is true in general and specifically for the algorithms
of central concern in this paper: agglomerative hierarchical ones. The
principal view adopted here is that improved clustering quality can be achieved
by exploiting commonalities among methods, in particular how clusters are
merged and by what criterion: single-link merging (SLINK, OPTICS); edge-cut
merging (CHAMELEON, ROCK); and criteria based on the square of the
adjacency matrix (OPTICS, ROCK). MABAC (matrix based
clustering), a proposed algorithm, introduces a goodness function based on
notions of link and inner link that in turn involve direct and indirect
similarity measures.
It provides good clustering quality for data of differing shapes and
densities, and it can be modified for applications such as web mining,
microarray data analysis, and sequence alignment analysis.
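
As a rough illustration of the "square of the adjacency matrix" idea mentioned in the abstract, the sketch below squares a small 0/1 neighbor matrix so that each off-diagonal entry counts the neighbors two points share (the ROCK-style notion of a link, an indirect similarity). It is not MABAC's goodness function, which the abstract does not specify; the five-point graph and the names `square_adjacency`, `adjacency`, and `links` are assumptions made purely for illustration.

```python
def square_adjacency(adj):
    """Return the matrix product adj * adj for a square matrix (list of lists)."""
    n = len(adj)
    return [
        [sum(adj[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical neighbor graph: 1 means the two points are direct neighbors.
# Points 0, 1, 2 form a triangle, points 3 and 4 form a pair,
# and the edge between 2 and 3 bridges the two groups.
adjacency = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

# links[i][j] for i != j is the number of neighbors shared by points i and j;
# links[i][i] is simply the number of neighbors of point i.
links = square_adjacency(adjacency)

for row in links:
    print(row)
```

In this toy graph, points 2 and 3 are direct neighbors yet share no common neighbor, so their link count is zero, while points 0 and 1 are both direct neighbors and share point 2. This is the sense in which a direct measure (the adjacency matrix) and an indirect one (its square) can disagree about which pairs belong together.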