Computing the partition function for cliques in a graph

We present a deterministic algorithm which, given a graph G with n vertices and an integer 1 < m ≤ n, computes in n^{O(ln m)} time the sum of weights w(S) over all m-subsets S of the set of vertices of G, where w(S) = exp{γtm + O(1/m)} provided exactly t{m choose 2} pairs of vertices of S span an edge of G, for some 0 ≤ t ≤ 1. Here γ > 0 is an absolute constant: we can choose γ = 0.06, and if n ≥ 4m and m ≥ 10, we can choose γ = 0.18. This allows us to tell apart the graphs that do not have m-subsets of high density from the graphs that have sufficiently many m-subsets of high density, even when the probability to hit such a subset at random is exponentially small in m.


Introduction and main results
(1.1) Density of a subset in a graph. Let G = (V, E) be an undirected graph with set V of vertices and set E of edges, without loops or multiple edges. Let S ⊂ V be a subset of the set of vertices of G. We define the density σ(S) of S as the ratio of the number of the edges of G with both endpoints in S to the maximum possible number of edges spanned by vertices in S:

    σ(S) = (number of edges of G with both endpoints in S) / {|S| choose 2}.

Hence 0 ≤ σ(S) ≤ 1 for all S ⊂ V.
In particular, σ(S) = 0 if and only if S is an independent set, that is, no two vertices of S span an edge of G and σ(S) = 1 if and only if S is a clique, that is, every two vertices of S span an edge of G.
In this paper, we suggest a new approach to testing the existence of subsets S of a given size m = |S| with high density σ(S). Namely, we present a deterministic algorithm which, given a graph G with n vertices and an integer 1 < m ≤ n, computes in n^{O(ln m)} time (within a relative error of 0.1, say) the sum

    (1.1.1)   Density_m(G) = Σ_{S ⊂ V, |S| = m} w(S), where w(S) = exp{γmσ(S) + O(1/m)}

and γ > 0 is an absolute constant: we can choose γ = 0.06, and if n ≥ 4m and m ≥ 10 then we can choose γ = 0.18. All implicit constants in the "O" notation are absolute. Let us fix two non-negative real numbers σ and ε such that σ + ε ≤ 1. If there are no m-subsets S with density σ or higher then

    Density_m(G) ≤ {n choose m} exp{γmσ + O(1/m)}.

If, however, there are sufficiently many m-subsets S with density σ + ε or higher, that is, if the probability to hit such a subset at random is at least 2 exp{−γεm}, then

    Density_m(G) ≥ 2 {n choose m} exp{γmσ + O(1/m)},

and we can distinguish between these two cases in n^{O(ln m)} time. Note that "many subsets" still allows the probability to hit such a subset at random to be exponentially small in m.
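As an illustration of the separation test above, the functional Density_m(G) can be evaluated by brute force at toy sizes. The sketch below uses the simplified weight exp{γmσ(S)} (dropping the O(1/m) correction) and checks the trivial bounds {n choose m} ≤ Density_m(G) ≤ {n choose m} exp{γmσ_max}, where σ_max is the largest density of an m-subset; the particular toy graph is an assumption of the example.

```python
from itertools import combinations
from math import comb, exp

def sigma(edges, S):
    """Density of S: fraction of the C(|S|,2) pairs of S that span an edge."""
    return sum(1 for p in combinations(sorted(S), 2) if p in edges) / comb(len(S), 2)

def density_functional(n, edges, m, gamma):
    """Simplified Density_m(G): sum of exp(gamma * m * sigma(S)) over m-subsets S."""
    return sum(exp(gamma * m * sigma(edges, S))
               for S in combinations(range(n), m))

# toy graph (assumption): a 6-cycle, with gamma = 0.06 as in the text
n, m, gamma = 6, 3, 0.06
edges = {tuple(sorted((i, (i + 1) % n))) for i in range(n)}
val = density_functional(n, edges, m, gamma)
```

Each m-subset contributes a weight between 1 and exp{γmσ_max}, which gives the two trivial bounds that the sharper estimates above refine.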
It turns out that we can compute in n^{O(ln m)} time an m-subset S which is almost as dense as the average under the exponential weighting of (1.1.1), that is, an m-subset S satisfying the bound (1.1.2); cf. Remark 3.6. We compute (1.1.1) using partition functions.

(1.2) The partition function of cliques. Let W = (w_ij) be an n × n symmetric matrix, interpreted as a matrix of weights on the edges of the complete graph on n vertices. For an integer 1 < m ≤ n, we define a polynomial

    P_m(W) = Σ_{S ⊂ {1, …, n}, |S| = m} Π_{{i,j} ⊂ S} w_ij,

which we call the partition function of cliques (note that it is different from what is known as the partition function of independent sets, see [SS05] and [BG08]). Thus if W is the adjacency matrix of a graph G with set V = {1, …, n} of vertices, so that

    w_ij = 1 if {i, j} ∈ E and w_ij = 0 otherwise,

then P_m(W) is the number of cliques in G of size m. Hence computing P_m(W) is at least as hard as counting cliques. Let us choose an 0 < α < 1 and modify the weight W as follows:

    w_ij = 1 if {i, j} ∈ E and w_ij = α otherwise.

Then P_m(W) counts all cliques of size m in the complete graph on n vertices, only that a clique containing exactly k non-edges of G is weighted down by a factor of α^k. In other words, an m-subset S ⊂ V is counted with weight α^k, where k is the number of pairs of vertices of S that do not span an edge of G. In this paper, we present a deterministic algorithm which, given an n × n symmetric matrix W = (w_ij) such that

    (1.2.1)   |1 − w_ij| ≤ γ/m for all i, j,

computes the value of P_m(W) within relative error ε in n^{O(ln m − ln ε)} time. Here γ > 0 is an absolute constant; we can choose γ = 0.06, and if n ≥ 4m and m ≥ 10, we can choose γ = 0.18.

(1.3) The idea of the algorithm. Let J denote the n × n matrix filled with 1s. Given an n × n symmetric matrix W, we consider the univariate function

    (1.3.1)   f(t) = ln P_m(J + t(W − J)) for 0 ≤ t ≤ 1.

Clearly,

    f(0) = ln {n choose m} and f(1) = ln P_m(W).

Hence our goal is to approximate f(1) and we do it by using the Taylor polynomial expansion of f at t = 0:

    (1.3.2)   f(1) ≈ f(0) + Σ_{k=1}^{l} (1/k!) f^{(k)}(0).
It turns out that the right hand side of (1.3.2) can be computed in n^{O(l)} time.
We present the algorithm in Section 2. The quality of the approximation (1.3.2) depends on the location of the complex zeros of P_m.

(1.4) Lemma. Suppose that there exists a real β > 1 such that

    P_m(J + z(W − J)) ≠ 0 for all z ∈ C with |z| ≤ β.

Then the right hand side of (1.3.2) approximates f(1) within an additive error of

    m(m − 1) / (2(l + 1)β^l(β − 1)).
In particular, for a fixed β > 1, to ensure an additive error of 0 < ε < 1, we can choose l = O(ln m − ln ε), which results in an algorithm for approximating P_m(W) within relative error ε in n^{O(ln m − ln ε)} time. We prove Lemma 1.4 in Section 2.
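The additive error bound of Lemma 1.4, m(m − 1)/(2(l + 1)β^l(β − 1)), is easy to evaluate numerically, and the smallest sufficient truncation degree l can be found by direct search; the growth l = O(ln m − ln ε) for fixed β is then visible. A minimal sketch (the sample values of m, ε, β are assumptions of the example):

```python
def lemma_error(m, l, beta):
    """Additive error of the degree-l Taylor approximation, per Lemma 1.4."""
    return m * (m - 1) / (2 * (l + 1) * beta**l * (beta - 1))

def choose_l(m, eps, beta):
    """Smallest l whose Lemma-1.4 error bound is at most eps."""
    l = 1
    while lemma_error(m, l, beta) > eps:
        l += 1
    return l
```

For β close to 1 the required l is larger but still logarithmic in m and 1/ε, since β^l dominates the bound.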
Thus we have to identify a class of matrices W for which the number β > 1 of Lemma 1.4 exists. We prove the following result.
(1.5) Theorem. There is an absolute constant ω > 0 (one can choose ω = 0.061, and if m ≥ 10 and n ≥ 4m, one can choose ω = 0.181) such that for any positive integers 1 < m ≤ n and for any n × n symmetric complex matrix Z = (z_ij) satisfying

    |1 − z_ij| ≤ ω/m for all i, j,

we have P_m(Z) ≠ 0.

We prove Theorem 1.5 in Section 3. Theorem 1.5 implies that if (1.2.1) holds, where 0 < γ < ω is an absolute constant, we can choose β = ω/γ in Lemma 1.4 and obtain an algorithm which computes P_m(W) within a given relative error ε in n^{O(ln m − ln ε)} time. Thus we can choose γ = 0.06 and β = 61/60, and if m ≥ 10 and n ≥ 4m, we can choose γ = 0.18 and β = 181/180.
(1.6) Weighted enumeration of subsets. Given a graph G with set of vertices V = {1, …, n} and set E of edges, we define a weight W = (w_ij) whose entries depend only on whether {i, j} is an edge of G and which satisfies (1.2.1), where γ > 0 is the absolute constant in (1.2.1). Thus we can choose γ = 0.06, and if m ≥ 10 and n ≥ 4m, we can choose γ = 0.18. We define the functional (1.1.1) by

    Density_m(G) = P_m(W).

Then we have

    w(S) = Π_{{i,j} ⊂ S} w_ij = exp{γtm + O(1/m)}

provided exactly t{m choose 2} pairs of vertices of S span an edge of G, that is, provided σ(S) = t. Thus we are in the situation described in Section 1.1.
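The definitions of Sections 1.2 and 1.6 are easy to check by brute force at toy sizes. The sketch below (the 4-vertex graph and the value of α are assumptions of the example) evaluates P_m(W) directly; with 0/1 weights it counts m-cliques, and with the α-modified weights each m-subset with k non-edges contributes α^k.

```python
from itertools import combinations

def prod_pairs(W, S):
    """Product of w_ij over all pairs {i, j} of the subset S."""
    p = 1.0
    for i, j in combinations(S, 2):
        p *= W[i][j]
    return p

def partition_function(W, m):
    """P_m(W): sum over all m-subsets S of the product over pairs of S."""
    n = len(W)
    return sum(prod_pairs(W, S) for S in combinations(range(n), m))

def weights(n, edges, edge_w, nonedge_w):
    """Symmetric weight matrix: edge_w on the edges of G, nonedge_w elsewhere."""
    return [[edge_w if tuple(sorted((i, j))) in edges else nonedge_w
             for j in range(n)] for i in range(n)]

# toy graph (assumption): triangle {0,1,2} plus the edge {2,3}
edges = {(0, 1), (0, 2), (1, 2), (2, 3)}
A = weights(4, edges, 1.0, 0.0)    # adjacency weights: P_3 counts triangles
Wa = weights(4, edges, 1.0, 0.5)   # alpha = 0.5: non-edges down-weighted
```

For this graph the four 3-subsets carry weights α^0, α^2, α^1, α^1, so P_3(W) = 1 + α^2 + 2α, which at α = 1/2 equals 2.25.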
(1.7) Comparison with results in the literature. The topic of this paper touches upon some very active research areas, such as finding dense subgraphs in a given graph (see, for example, [Bh12] and references therein), computational complexity of partition functions (see, for example, [BG05] and references therein) and complex zeros of partition functions (see, for example, [SS05] and references therein). Detecting dense subsets is notoriously hard. It is an NP-hard problem to approximate the size |S| of the largest clique S within a factor of |V|^{1−ε} for any fixed 0 < ε ≤ 1 [Hå99], [Zu07]. In [F+01], a polynomial time algorithm for approximating the highest density of an m-subset within a factor of O(n^{1/3−ε}) for some small ε > 0 was constructed. The paper [B+10] presents an algorithm of n^{O(1/ε)} complexity which approximates the highest density of an m-subset within a factor of n^{1/4+ε}, and an algorithm of n^{O(ln n)} complexity which approximates the density within a factor of O(n^{1/4}), the current record (note that [F+01], [Bh12] and [B+10] normalize density slightly differently than we do). It has also been established that, modulo some plausible complexity assumptions, it is hard to approximate the highest density of an m-subset within a constant factor, fixed in advance, see [Bh12].
We note that while searching for a dense m-subset in a graph on n vertices, the most interesting case to consider is that of a growing m such that m = o(n), since for any fixed ε > 0 one can compute in polynomial time the maximum number of edges of G spanned by an m-subset S of vertices of G within an additive error of εn² [FK99] (the complexity of the algorithm is exponential in 1/ε). Of course, for any constant m, fixed in advance, all subsets of size m can be found by exhaustive search in polynomial time.
The appearance of partition functions in combinatorics can be traced back to the work of Kasteleyn on the statistics of dimers (perfect matchings), see, for example, Section 8.7 of [LP09]. A randomized polynomial time approximation algorithm (based on the Markov Chain Monte Carlo approach) for computing the partition function of the classical Ising model was constructed by Jerrum and Sinclair in [JS93]. A lot of work has recently been done on understanding the computational complexity of the partition function of graph homomorphisms [BG05], where a certain "dichotomy" of instances has been established: it is either #P-hard to compute exactly (in most interesting cases) or it is computable in polynomial time exactly (in few cases). Recently, an approach inspired by the "correlation decay" phenomenon in statistical physics was used to construct deterministic approximation algorithms for partition functions in combinatorial problems [BG08], [B+07]. To the best of the author's knowledge, the partition function P_m(W) introduced in Section 1.2 of this paper has not been studied before and the idea of our algorithm sketched in Section 1.3 is also new. On the other hand, one can view our method as inspired by the intuition from statistical physics. Essentially, the algorithm computes a partition function similar to the one used in the "simulated annealing" method, see [Az88], at a temperature that is provably higher than the phase transition threshold (the role of the "temperature" is played by 1/γm, where γ is the constant in (1.1.1)).
The algorithm uses the fact that the partition function has nice analytic properties at temperatures above the phase transition temperature. The corollary asserting that graphs without dense m-subsets can be efficiently separated from graphs with many dense m-subsets is vaguely similar in spirit to the property testing approach [G+98].
The result requiring most work is Theorem 1.5 bounding the complex roots of the partition function away from the matrix of all 1s. Studying complex roots of partition functions was initiated by Lee and Yang [LY52] and it is a classical topic by now, because of its importance to locating the phase transition, see, for example, [SS05] and Section 8.5 of [LP09].
(1.8) Open question. Is it true that for any γ > 0, fixed in advance, the density functional (1.1.1) can be computed (within a relative error of 0.1, say) in n^{O(ln m)} time?

The algorithm
(2.1) The algorithm for approximating the partition function. Given an n × n symmetric matrix W = (w_ij), we present an algorithm that computes the right hand side of (1.3.2) for the function f(t) defined by (1.3.1). Let

    (2.1.1)   g(t) = P_m(J + t(W − J)), so that f(t) = ln g(t) and g(t) = e^{f(t)}.

Therefore, for k ≥ 1 we have

    (2.1.2)   g^{(k)}(t) = Σ_{j=0}^{k−1} {k−1 choose j} g^{(j)}(t) f^{(k−j)}(t)

(we agree that the 0-th derivative of g is g). We note that g(0) = {n choose m}. If we compute the values of

    (2.1.3)   g^{(k)}(0) for k = 1, …, l,

then the formulas (2.1.2) for k = 1, …, l provide a non-degenerate triangular system of linear equations that allows us to compute f^{(k)}(0) for k = 1, …, l.
Hence our goal is to compute the values (2.1.3).
We have

    g^{(k)}(0) = Σ_{S ⊂ {1, …, n}, |S| = m} Σ_I Π_{r=1}^{k} (w_{i_r j_r} − 1),

where the inner sum is taken over all ordered sets I of k pairs {i_1, j_1}, …, {i_k, j_k} of points from S. Interchanging the order of summation, we obtain

    g^{(k)}(0) = Σ_I ( Π_{r=1}^{k} (w_{i_r j_r} − 1) ) {n − v(I) choose m − v(I)},

where v(I) is the number of distinct points among the pairs of I and the sum is taken over all ordered sets I of k pairs {i_1, j_1}, …, {i_k, j_k} of points from the set {1, …, n} (only sets I with v(I) ≤ m contribute). The algorithm consists in computing the latter sum. Since the sum contains not more than n^{2k} = n^{O(l)} terms, the complexity of the algorithm is indeed n^{O(l)}, as claimed.

(2.2) Proof of Lemma 1.4. The function g(z) defined by (2.1.1) is a polynomial in z of degree d ≤ {m choose 2} with g(0) = {n choose m} ≠ 0, so we factor

    g(z) = g(0) Π_{i=1}^{d} (1 − z/α_i),

where α_1, …, α_d are the roots of g(z). By the condition of Lemma 1.4, we have |α_i| ≥ β > 1 for i = 1, …, d. Therefore,

    (2.2.1)   f(z) = ln g(z) = ln g(0) + Σ_{i=1}^{d} ln(1 − z/α_i),

where we choose the branch of ln g(z) that is real at z = 0. Using the standard Taylor expansion

    ln(1 − z/α) = −Σ_{k=1}^{∞} z^k/(kα^k),

we obtain

    f^{(k)}(0)/k! = −Σ_{i=1}^{d} 1/(kα_i^k) for k ≥ 1.

Therefore, from (2.2.1) we obtain

    |f(1) − f(0) − Σ_{k=1}^{l} f^{(k)}(0)/k!| ≤ Σ_{i=1}^{d} Σ_{k=l+1}^{∞} 1/(k|α_i|^k) ≤ d Σ_{k=l+1}^{∞} 1/(kβ^k) ≤ d/((l + 1)β^l(β − 1)).
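Sections 2.1 and 2.2 can be put together in a short sketch at toy sizes (brute force is used only to validate the answer; the test matrix and truncation degree l are assumptions of the example). It computes g^{(k)}(0) by summing products of (w_{i_r j_r} − 1) over ordered sets of k distinct pairs, each weighted by the number of m-subsets containing all their points; recovers f^{(k)}(0) from the triangular system arising from g = e^f, namely g^{(k)} = Σ_{j=0}^{k−1} C(k−1, j) g^{(j)} f^{(k−j)}; and forms the Taylor estimate (1.3.2) for ln P_m(W).

```python
from itertools import combinations, permutations
from math import comb, factorial, log

def partition_function(W, m):
    """Brute-force P_m(W), for validation only."""
    n = len(W)
    total = 0.0
    for S in combinations(range(n), m):
        p = 1.0
        for i, j in combinations(S, 2):
            p *= W[i][j]
        total += p
    return total

def g_derivatives(W, m, l):
    """g^{(k)}(0) for k = 0..l, where g(t) = P_m(J + t(W - J))."""
    n = len(W)
    pairs = list(combinations(range(n), 2))
    g = [float(comb(n, m))] + [0.0] * l
    for k in range(1, l + 1):
        s = 0.0
        for I in permutations(pairs, k):   # ordered sets of k distinct pairs
            pts = set()
            p = 1.0
            for i, j in I:
                p *= W[i][j] - 1.0
                pts.update((i, j))
            if len(pts) <= m:              # number of m-subsets containing all points
                s += p * comb(n - len(pts), m - len(pts))
        g[k] = s
    return g

def taylor_log_pm(W, m, l):
    """Estimate ln P_m(W) by the degree-l Taylor polynomial of f = ln g at 0."""
    g = g_derivatives(W, m, l)
    f = [log(g[0])] + [0.0] * l
    for k in range(1, l + 1):              # triangular system from g = e^f
        s = sum(comb(k - 1, j) * g[j] * f[k - j] for j in range(1, k))
        f[k] = (g[k] - s) / g[0]
    return sum(f[k] / factorial(k) for k in range(l + 1))
```

For entries within γ/m of 1 the Taylor tail decays rapidly, so even a small l gives many accurate digits on a toy instance.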
It remains to notice that d ≤ {m choose 2} = m(m − 1)/2.

Proof of Theorem 1.5
(3.1) Partition functions P_Ω. For a subset Ω ⊂ {1, …, n} with |Ω| ≤ m, we define

    P_Ω(Z) = Σ_{S: Ω ⊂ S ⊂ {1, …, n}, |S| = m} Π_{{i,j} ⊂ S} z_ij.

In words: P_Ω(Z) is the restriction of the sum defining the clique partition function to the subsets S that contain a given set Ω. In particular,

    P_Ω(Z) = P_m(Z) if Ω = ∅,

and if |Ω| < m then

    (3.1.1)   P_Ω(Z) = (1/(m − |Ω|)) Σ_{i ∉ Ω} P_{Ω ∪ {i}}(Z).

We note that the natural action of the symmetric group S_n on {1, …, n} induces the natural action of S_n on the space of polynomials in Z which permutes the polynomials P_Ω(Z).
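The polynomials P_Ω are straightforward to evaluate by brute force at toy sizes, and the averaging identity P_Ω(Z) = (1/(m − |Ω|)) Σ_{i ∉ Ω} P_{Ω∪{i}}(Z) for |Ω| < m follows directly from the definition, since each S ⊇ Ω is counted once for every i ∈ S ∖ Ω. A sketch (the complex test matrix is an assumption of the example):

```python
from itertools import combinations

def p_omega(Z, m, omega):
    """P_Omega(Z): sum over m-subsets S containing omega of the product over pairs of S."""
    n = len(Z)
    omega = frozenset(omega)
    rest = [i for i in range(n) if i not in omega]
    total = 0.0 + 0.0j
    for extra in combinations(rest, m - len(omega)):
        S = sorted(omega | set(extra))
        p = 1.0 + 0.0j
        for i, j in combinations(S, 2):
            p *= Z[i][j]
        total += p
    return total

# symmetric complex matrix with entries close to 1 (toy assumption)
n, m = 6, 3
Z = [[1.0 + 0.0j] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        z = 1.0 + ((-1) ** (i + j)) * 0.01j
        Z[i][j] = Z[j][i] = z
```

The identity is what drives the reverse induction on |Ω|: it expresses P_Ω as an average of the polynomials one level down.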
First, we establish a simple geometric lemma.
(3.2) Lemma. Let u_1, …, u_n ∈ R^d be non-zero vectors such that for some 0 ≤ α < π/2 the angle between any two vectors u_i and u_j does not exceed α. Let u = u_1 + … + u_n. Then

    ‖u‖ ≥ √(cos α) (‖u_1‖ + … + ‖u_n‖).

Proof. We have

    ‖u‖² = Σ_{i,j} ⟨u_i, u_j⟩ ≥ cos α Σ_{i,j} ‖u_i‖ ‖u_j‖ = cos α (‖u_1‖ + … + ‖u_n‖)².

We prove Theorem 1.5 by the reverse induction on |Ω|, using Lemma 3.2 and the following two lemmas.
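The inequality behind Lemma 3.2, ‖u‖² = Σ_{i,j} ⟨u_i, u_j⟩ ≥ cos α (Σ_i ‖u_i‖)² whenever all pairwise angles are at most α < π/2, can be sanity-checked numerically; the particular family of vectors below is an assumption of the sketch.

```python
from math import cos, sin, pi, hypot, sqrt

alpha = pi / 3  # any 0 <= alpha < pi/2 works
# unit vectors whose directions are spread over an arc of length alpha,
# so every pairwise angle is at most alpha
thetas = [k * alpha / 4 for k in range(5)]
vecs = [(cos(t), sin(t)) for t in thetas]
u = (sum(v[0] for v in vecs), sum(v[1] for v in vecs))
norm_u = hypot(*u)
lower = sqrt(cos(alpha)) * sum(hypot(*v) for v in vecs)
```

The point of the lemma is quantitative non-cancellation: vectors confined to a narrow cone cannot sum to something much shorter than the sum of their lengths.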
Then P_{Ω_2}(A) = P_{Ω_1}(B). Since |a_ij − b_ij| ≤ 2δ for all A, B ∈ U(δ), by (3.3.1) we conclude that the angle between the complex numbers P_{Ω_1}(A) and P_{Ω_2}(A) does not exceed θ and that the ratio of their absolute values does not exceed λ.
From (3.4.4), by (3.1.1) and Lemma 3.2, and comparing with (3.4.5), we conclude as above. It remains to compare |P_{Ω_i}(Z)| and |P_Ω(Z)|. Since the ratio of the absolute values of any two of P_{Ω_{j_1}}(Z), P_{Ω_{j_2}}(Z) does not exceed λ, from (3.4.3) we conclude the required bound. The proof now follows by (3.4.6).
(3.6) Remark. As follows from (3.5.1), we have P_Ω(Z) ≠ 0 for any symmetric complex matrix Z = (z_ij) satisfying the conditions of Theorem 1.5. The algorithm of Section 2 extends in a straightforward way to computing P_Ω(W) for every matrix W = (w_ij) satisfying (1.2.1). This allows us to compute in n^{O(ln m)} time an m-subset S which satisfies (1.1.2), by conditioning successively on i ∈ S for i = 1, …, n.
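With an exact oracle for P_Ω, the conditioning strategy of the remark can be sketched as a greedy procedure: starting from Ω = ∅, repeatedly add the index i maximizing P_{Ω∪{i}}(W). The maximum over the n − |Ω| candidates is at least their average, which equals ((m − |Ω|)/(n − |Ω|)) P_Ω(W) since each S ⊇ Ω is counted once per element of S ∖ Ω; iterating, the final subset S has weight w(S) ≥ P_m(W)/{n choose m}, the average weight. The actual algorithm replaces the brute-force oracle below with the n^{O(ln m)}-time approximation, whence "almost" as dense; the planted weight matrix is an assumption of the sketch.

```python
from itertools import combinations
from math import comb

def p_omega(W, m, omega):
    """Brute-force P_Omega(W): stands in for the approximation algorithm."""
    n = len(W)
    rest = [i for i in range(n) if i not in omega]
    total = 0.0
    for extra in combinations(rest, m - len(omega)):
        S = sorted(set(omega) | set(extra))
        p = 1.0
        for i, j in combinations(S, 2):
            p *= W[i][j]
        total += p
    return total

def greedy_subset(W, m):
    """Condition successively on i in S, greedily maximizing P_{Omega+{i}}(W)."""
    n = len(W)
    omega = []
    while len(omega) < m:
        best = max((i for i in range(n) if i not in omega),
                   key=lambda i: p_omega(W, m, omega + [i]))
        omega.append(best)
    return sorted(omega)
```

On a toy instance with one planted heavy triple, the greedy conditioning recovers it and its weight is at least the average weight of an m-subset.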