Book chapter
Pairwise Data Clustering and Applications
Computing and Combinatorics, pp.455-466
Lecture Notes in Computer Science, Springer Berlin Heidelberg
06/24/2003
DOI: 10.1007/3-540-45071-8_46
Abstract
Data clustering is an important theoretical topic and a sharp tool for various applications. Its main objective is to partition a given data set into clusters such that the data within the same cluster are “more” similar to each other with respect to certain measures. In this paper, we study the pairwise data clustering problem with pairwise similarity/ dissimilarity measures that need not satisfy the triangle inequality. By using a criterion, called the minimum normalized cut, we model the pairwise data clustering problem as a graph partition problem. The graph partition problem based on minimizing the normalized cut is known to be NP-hard. We present a ((4 + o(1)) ln n)-approximation polynomial time algorithm for the minimum normalized cut problem. We also give a more efficient algorithm for this problem by sacrificing the approximation ratio slightly. Further, our scheme achieves a ((2 + o(1)) ln n)-approximation polynomial time algorithm for computing the sparsest cuts in edge-weighted and vertex-weighted undirected graphs, improving the previously best known approximation ratio by a constant factor.
Details
- Title: Subtitle
- Pairwise Data Clustering and Applications
- Creators
- Xiaodong WuDanny Z ChenJames J MasonSteven R Schmid
- Resource Type
- Book chapter
- Publication Details
- Computing and Combinatorics, pp.455-466
- Publisher
- Springer Berlin Heidelberg; Berlin, Heidelberg
- Series
- Lecture Notes in Computer Science
- DOI
- 10.1007/3-540-45071-8_46
- ISSN
- 0302-9743
- Language
- English
- Date published
- 06/24/2003
- Academic Unit
- Electrical and Computer Engineering; Radiation Oncology
- Record Identifier
- 9984047737402771
Metrics
13 Record Views