Theory of Computing
Title : A Constant-Factor Approximation Algorithm for Co-clustering
Authors : Aris Anagnostopoulos, Anirban Dasgupta, and Ravi Kumar
Volume : 8
Number : 26
Pages : 597-622
URL : http://www.theoryofcomputing.org/articles/v008a026
Abstract
Co-clustering is the simultaneous partitioning of the rows and columns
of a matrix such that the blocks induced by the row/column partitions
are good clusters. Motivated by several applications in text mining,
market-basket analysis, and bioinformatics, this problem has attracted
considerable attention in the past few years. Unfortunately, to date, most
of the algorithmic work on this problem has been heuristic in nature.
In this work we obtain the first approximation algorithms for the
co-clustering problem. Our algorithms are simple and provide constant-
factor approximations to the optimum. We also show that co-clustering
is NP-hard, thereby complementing our algorithmic result.
A preliminary version of this paper appeared in
Proc. 27th ACM Symp. on Principles of Database Systems (PODS 2008).