Grothendieck inequalities for semidefinite programs with rank constraint

Grothendieck inequalities are fundamental inequalities which are frequently used in many areas of mathematics and computer science. They can be interpreted as upper bounds for the integrality gap between two optimization problems: a difficult semidefinite program with rank-1 constraint and its easy semidefinite relaxation where the rank constraint is dropped. For instance, the integrality gap of the Goemans-Williamson approximation algorithm for MAX CUT can be seen as a Grothendieck inequality. In this paper we consider Grothendieck inequalities for ranks greater than 1 and we give two applications: approximating ground states in the n-vector model in statistical mechanics and XOR games in quantum information theory.


Introduction
Let $G = (V, E)$ be a graph with finite vertex set $V$ and edge set $E \subseteq \binom{V}{2}$. Let $A : V \times V \to \mathbb{R}$ be a symmetric matrix whose rows and columns are indexed by the vertex set of $G$, and let $r$ be a positive integer. The graphical Grothendieck problem with rank-$r$ constraint is the following optimization problem:
$$\mathrm{SDP}_r(G, A) = \max\Bigl\{ \sum_{\{u,v\} \in E} A(u, v)\, f(u) \cdot f(v) \;:\; f : V \to S^{r-1} \Bigr\},$$
where $S^{r-1} = \{ x \in \mathbb{R}^r : x \cdot x = 1 \}$ is the $(r-1)$-dimensional unit sphere. The rank-$r$ Grothendieck constant of the graph $G$ is the smallest constant $K(r, G)$ so that for all symmetric matrices $A : V \times V \to \mathbb{R}$ the following inequality holds:
$$\mathrm{SDP}_\infty(G, A) \le K(r, G)\, \mathrm{SDP}_r(G, A). \tag{1}$$
Here $S^\infty$ denotes the unit sphere of the Hilbert space $\ell^2(\mathbb{R})$ of square summable sequences, which contains $\mathbb{R}^n$ as the subspace of the first $n$ components. It is easy to see that $K(r, G) \ge 1$. In this paper, we prove new upper bounds for $K(r, G)$.
1.1. Some history. Inequality (1) is called a Grothendieck inequality because it first appeared in the work [22] of Grothendieck on the metric theory of tensor products. More precisely, Grothendieck considered the case r = 1 for 2-chromatic (bipartite) graphs, although in quite a different language. (A k-chromatic graph is a graph whose chromatic number is k, i.e., one can color its vertices with k colors so that adjacent vertices get different colors, but k − 1 colors do not suffice for this.) Grothendieck proved that in this case K(1, G) is upper bounded by a constant that is independent of the size of G.
Later, Lindenstrauss and Pełczyński [33] reformulated Grothendieck's inequality for bipartite graphs in a way that is very close to the formulation we gave above. The graphical Grothendieck problem with rank-1 constraint was introduced by Alon, Makarychev, Makarychev, and Naor [2]. Haagerup [23] considered the complex case of Grothendieck's inequality; his upper bound is also valid for the real case r = 2. The higher rank case for bipartite graphs was introduced by Briët, Buhrman, and Toner [11].
1.2. Computational perspective. There has been a recent surge of interest in Grothendieck inequalities in the computer science community. The problem $\mathrm{SDP}_r(G, A)$ is a semidefinite maximization problem with rank-$r$ constraint:
$$\mathrm{SDP}_r(G, A) = \max\Bigl\{ \sum_{\{u,v\} \in E} A(u, v)\, X(u, v) \;:\; X \in \mathbb{R}^{V \times V}_{\succeq 0},\ X(u, u) = 1 \text{ for all } u \in V,\ \operatorname{rank} X \le r \Bigr\},$$
where $\mathbb{R}^{V \times V}_{\succeq 0}$ is the set of matrices $X : V \times V \to \mathbb{R}$ that are positive semidefinite. On the one hand, $\mathrm{SDP}_r(G, A)$ is generally a difficult computational problem. For instance, if $r = 1$ and $G$ is the complete bipartite graph $K_{n,n}$ on $2n$ nodes, and if $A$ is the Laplacian matrix of a graph $G'$ on $n$ nodes, then computing $\mathrm{SDP}_1(K_{n,n}, A)$ is equivalent to computing the weight of a maximum cut of $G'$. The maximum cut problem (MAX CUT) is one of Karp's 21 NP-complete problems. On the other hand, if we relax the rank-$r$ constraint, then we deal with $\mathrm{SDP}_\infty(G, A)$, which is an easy computational problem: one has $\mathrm{SDP}_\infty(G, A) = \mathrm{SDP}_{|V|}(G, A)$, and computing $\mathrm{SDP}_{|V|}(G, A)$ amounts to solving a semidefinite programming problem (see e.g. Vandenberghe, Boyd [47]). Therefore one may approximate it to any fixed precision in polynomial time by using the ellipsoid method or interior point algorithms.
In many cases the optimal constant $K(r, G)$ is not known and so one is interested in finding upper bounds for $K(r, G)$. Usually, proving an upper bound amounts to giving a randomized polynomial-time approximation algorithm for $\mathrm{SDP}_r(G, A)$. In the case of the MAX CUT problem, Goemans and Williamson [21] pioneered an approach based on randomized rounding: One rounds an optimal solution of $\mathrm{SDP}_\infty(G, A)$ to a feasible solution of $\mathrm{SDP}_r(G, A)$. The expected value of the rounded solution is then related to that of the original solution, and this gives an upper bound for $K(r, G)$. Using this basic idea, Goemans and Williamson [21] showed that for all symmetric matrices $A : V \times V \to \mathbb{R}$ which have the properties $A(u, v) \le 0$ for $u$ distinct from $v$ and $\sum_{u \in V} A(u, v) = 0$ for all $v \in V$, we have $\mathrm{SDP}_\infty(K_{n,n}, A) \le (0.878\ldots)^{-1}\, \mathrm{SDP}_1(K_{n,n}, A)$.

1.3. Applications and references. Grothendieck's inequality is a fundamental inequality in the theory of Banach spaces. Many books on the geometry of Banach spaces contain a substantial treatment of the result. We refer for instance to the books by Pisier [40], Jameson [24], and Garling [20].
In recent years, especially after Alon and Naor [3] pointed out the connection between the inequality and approximation algorithms using semidefinite programs, Grothendieck's inequality has also become a unifying and fundamental tool outside of functional analysis.
Before we present our results we consider the application to statistical mechanics: The n-vector model, introduced by Stanley [45], describes the interaction of particles in a spin glass with ferromagnetic and antiferromagnetic interactions. The case n = 1 corresponds to the Ising model, the case n = 2 to the XY model, the case n = 3 to the Heisenberg model, and the case n = ∞ to the Berlin-Kac spherical model.
Let $G = (V, E)$ be the interaction graph where the vertices are particles and where edges indicate which particles interact. The potential function $A : V \times V \to \mathbb{R}$ satisfies $A(u, v) = 0$ if $u$ and $v$ are not adjacent, it is positive if there is ferromagnetic interaction between $u$ and $v$, and it is negative if there is antiferromagnetic interaction. The particles possess a vector-valued spin $f : V \to S^{n-1}$. In the absence of an external field, the total energy of the system is given by the Hamiltonian
$$H(f) = -\sum_{\{u,v\} \in E} A(u, v)\, f(u) \cdot f(v).$$
The ground state of this model is a configuration of spins $f : V \to S^{n-1}$ which minimizes the total energy. Finding the ground state is the same as solving $\mathrm{SDP}_n(G, A)$. Typically, the interaction graph has small chromatic number, e.g. the most common case is when $G$ is a finite subgraph of the integer lattice $\mathbb{Z}^n$ where the vertices are the lattice points and where two vertices are connected if their Euclidean distance is one. These graphs are bipartite since they can be partitioned into even and odd vertices, corresponding to the parity of the sum of the coordinates.
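The energy of a spin configuration can be evaluated directly. The sketch below assumes the sign convention H(f) = −Σ A(u, v) f(u) · f(v), summed over edges, so that aligned spins lower the energy on ferromagnetic (positive) edges; the function names and the dictionary encoding of A and f are hypothetical choices made only for this illustration.

```python
def dot(x, y):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def energy(couplings, spins):
    """Energy of a configuration under the assumed convention
    H(f) = -sum over edges {u, v} of A(u, v) * f(u) . f(v).

    couplings: dict mapping an edge (u, v) to the interaction A(u, v)
    spins:     dict mapping a vertex u to its unit spin vector f(u)
    """
    return -sum(a * dot(spins[u], spins[v]) for (u, v), a in couplings.items())


# Antiferromagnetic 4-cycle (a bipartite piece of the square lattice).
couplings = {(0, 1): -1.0, (1, 2): -1.0, (2, 3): -1.0, (3, 0): -1.0}
up, down = (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)
aligned = {u: up for u in range(4)}
alternating = {u: (up if u % 2 == 0 else down) for u in range(4)}
```

On this toy instance the alternating configuration has lower energy than the aligned one, as expected for antiferromagnetic couplings on a bipartite graph.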
We briefly describe the relation to quantum information theory. In an influential paper, Einstein, Podolsky, and Rosen [18] pointed out an anomaly of quantum mechanics that allows spatially separated parties to establish peculiar correlations by each performing measurements on a private quantum system: entanglement. Later, Bell [8] proved that local measurements on a pair of spatially separated, entangled quantum systems, can give rise to joint probability distributions of measurement outcomes that violate certain inequalities (now called Bell inequalities), satisfied by any classical distribution. Experimental results of Aspect, Grangier, and Roger [6] give strong evidence that nature indeed allows distant physical systems to be correlated in such non-classical ways.
XOR games, first formalized by Cleve, Høyer, Toner, and Watrous [15], constitute the simplest model in which entanglement can be studied quantitatively. In an XOR game, two players, Alice and Bob, receive questions u and v (resp.) that are picked by a referee according to some probability distribution π(u, v) known to everybody in advance. Without sharing their questions, the players have to answer the referee with bits a and b (resp.), and win the game if and only if the exclusive-OR of their answers a ⊕ b equals the value of a Boolean function g(u, v); the function g is also known in advance to all three parties.
In a quantum-mechanical setting, the players determine their answers by performing measurements on their shares of a pair of entangled quantum systems. A state of a pair of $d$-dimensional quantum systems is a trace-1 positive semidefinite operator $\rho \in \mathbb{C}^{d^2 \times d^2}_{\succeq 0}$. The systems are entangled if $\rho$ is not a convex combination of tensor products of $d$-by-$d$ positive semidefinite matrices. For each question $u$, Alice has a two-outcome measurement defined by a pair of $d$-by-$d$ positive semidefinite matrices $\{A_u^0, A_u^1\}$ that satisfy $A_u^0 + A_u^1 = I$, where $I$ is the $d$-by-$d$ identity matrix; Bob has a similar pair $\{B_v^0, B_v^1\}$ for each question $v$. When the players perform their measurements, the probability that they obtain bits $a$ and $b$ is given by $\operatorname{Tr}(A_u^a \otimes B_v^b\, \rho)$. The case $d = 1$ corresponds to a classical setting. In this case, the maximum winning probability equals $(1 + \mathrm{SDP}_1(G, A))/2$, where $G$ is the complete bipartite graph with Alice and Bob's questions on opposite sides of the partition, and $A(u, v) = (-1)^{g(u,v)} \pi(u, v)/2$ for pairs $\{u, v\} \in E$ and $A(u, v) = 0$ everywhere else.
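For small games the classical (d = 1) winning probability can be found by brute force, since deterministic strategies suffice classically. The sketch below enumerates all answer assignments; the function name `classical_value` and the CHSH instance (uniform π on {0,1} × {0,1}, g(u, v) = u AND v) are illustrative assumptions and do not appear in the text above.

```python
from itertools import product


def classical_value(pi, g):
    """Maximum winning probability of an XOR game over deterministic strategies.

    pi: dict mapping a question pair (u, v) to its probability
    g:  dict mapping a question pair (u, v) to the target bit g(u, v)
    """
    us = sorted({u for u, _ in pi})
    vs = sorted({v for _, v in pi})
    best = 0.0
    for a_bits in product((0, 1), repeat=len(us)):
        a = dict(zip(us, a_bits))
        for b_bits in product((0, 1), repeat=len(vs)):
            b = dict(zip(vs, b_bits))
            # The players win on (u, v) when a(u) XOR b(v) equals g(u, v).
            win = sum(p for (u, v), p in pi.items() if a[u] ^ b[v] == g[(u, v)])
            best = max(best, win)
    return best


# CHSH game: uniform questions, players must satisfy a XOR b = u AND v.
chsh_pi = {(u, v): 0.25 for u in (0, 1) for v in (0, 1)}
chsh_g = {(u, v): u & v for u in (0, 1) for v in (0, 1)}
```

The enumeration is exponential in the number of questions, so it only serves as a sanity check on tiny instances.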
Tsirel'son [46] related the maximum winning probability $\omega^*_d(\pi, g)$ of the game $(\pi, g)$, when the players are restricted to measurements on $d$-dimensional quantum systems, to the quantity $\mathrm{SDP}_r(G, A)$. In particular, he proved lower and upper bounds on $\omega^*_d(\pi, g)$ of the form $(1 + \mathrm{SDP}_r(G, A))/2$, with $r$ depending on $d$. The quantity $\mathrm{SDP}_r(G, A)$ thus gives bounds on the maximum winning probability of XOR games when players are limited in the amount of entanglement they are allowed to use. The rank-$r$ Grothendieck constant $K(r, G)$ of the bipartite graph $G$ described above gives a quantitative bound on the advantage that unbounded entanglement gives over finite entanglement in XOR games.

1.4. Our results and methods. The purpose of this paper is to prove explicit upper bounds for $K(r, G)$. We are especially interested in the case of small $r$ and graphs with small chromatic number, although our methods are not restricted to this. The proof of the following theorem gives a randomized polynomial-time approximation algorithm for approximating ground states in the Heisenberg model in the lattice $\mathbb{Z}^3$ with approximation ratio $0.78\ldots = (1.28\ldots)^{-1}$. This result can be regarded as one of the main contributions of this paper. The bound for the original Grothendieck constant $K(1, G)$ for bipartite $G$ is due to Krivine [31]. For more than thirty years this was the best known upper bound, and it was conjectured by many to be optimal. However, shortly after our work appeared in preprint form, Braverman, Makarychev, Makarychev and Naor [10] showed that Krivine's bound can be slightly improved. The best known lower bound is $1.676956\ldots$ due to Davie [16] and Reeds [42] (see also Khot and O'Donnell [28]). The bound for $K(2, G)$ is due to Haagerup [23].
When the graph G has large chromatic number, then the result of Alon, Makarychev, Makarychev, and Naor [2] gives the best known bounds for K(1, G): They prove a logarithmic dependence on the chromatic number of the graph (actually on the theta number of the complement of G, cf. Section 4) whereas our methods only give a linear dependence. Although our main focus is on small chromatic numbers, for completeness we extend the results of [2] for large chromatic numbers to r ≥ 2 in Section 6. In a previous paper [13] we proved that K(r, K n,n ) = 1 + Θ(1/r).
For the proof of Theorem 1.1 we use the framework of Krivine and Haagerup which we explain below. Our main technical contributions are a matrix version of Grothendieck's identity (Lemma 2.1) and a method to construct new unit vectors which can also deal with nonbipartite graphs (Lemma 4.1).
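The strategy of Haagerup and Krivine is based on the following embedding lemma:
Lemma 1.2. Let $G = (V, E)$ be a graph and choose $Z = (Z_{ij}) \in \mathbb{R}^{r \times |V|}$ at random so that each entry is distributed independently according to the normal distribution with mean 0 and variance 1, that is, $Z_{ij} \sim N(0, 1)$. Given $f : V \to S^{|V|-1}$, there is a function $g : V \to S^{|V|-1}$ such that whenever $u$ and $v$ are adjacent in $G$, then
$$\mathbb{E}\left[ \frac{Zg(u)}{\|Zg(u)\|} \cdot \frac{Zg(v)}{\|Zg(v)\|} \right] = \beta(r, G)\, f(u) \cdot f(v)$$
for some constant $\beta(r, G)$ depending only on $r$ and $G$.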
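In the statement above we are vague regarding the constant $\beta(r, G)$. We will give the precise statement of the lemma in Section 4 (cf. Lemma 4.1 there); right now this precise statement is not relevant to our discussion. Now, the strategy of Haagerup and Krivine amounts to analyzing the following four-step procedure that yields a randomized polynomial-time approximation algorithm for $\mathrm{SDP}_r(G, A)$:
Algorithm A. Takes as input a finite graph $G = (V, E)$ with at least one edge and a symmetric matrix $A : V \times V \to \mathbb{R}$, and returns a feasible solution $h : V \to S^{r-1}$ of $\mathrm{SDP}_r(G, A)$.
(1) Solve $\mathrm{SDP}_\infty(G, A)$, obtaining an optimal solution $f : V \to S^{|V|-1}$.
(2) Use $f$ to construct $g : V \to S^{|V|-1}$ according to Lemma 1.2.
(3) Choose $Z = (Z_{ij}) \in \mathbb{R}^{r \times |V|}$ at random so that every matrix entry $Z_{ij}$ is distributed independently according to the standard normal distribution with mean 0 and variance 1, that is, $Z_{ij} \sim N(0, 1)$.
(4) Set $h(u) = Zg(u)/\|Zg(u)\|$ for every $u \in V$.
To analyze this procedure, we compute the expected value of the feasible solution $h$. Using Lemma 1.2 we obtain
$$\mathrm{SDP}_r(G, A) \ge \mathbb{E}\Bigl[ \sum_{\{u,v\} \in E} A(u, v)\, h(u) \cdot h(v) \Bigr] = \beta(r, G) \sum_{\{u,v\} \in E} A(u, v)\, f(u) \cdot f(v) = \beta(r, G)\, \mathrm{SDP}_\infty(G, A),$$
and so we have $K(r, G) \le \beta(r, G)^{-1}$.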
If we were to skip step (2) and apply step (4) to f directly, then the expectation E[h(u) · h(v)] would be a non-linear function of f (u) · f (v), which would make it difficult to assess the quality of the feasible solution h. The purpose of step (2) is to linearize this expectation, which allows us to estimate the quality of h in terms of a linear function of SDP r (G, A).
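The randomized part of the procedure, steps (3) and (4), can be sketched in a few lines. This is a minimal illustration under assumptions: step (4) is read as the normalization h(u) = Zg(u)/‖Zg(u)‖, the unit vectors g(u) are taken as given (the construction in step (2) is not shown), and the name `round_to_rank_r` and the dictionary encoding are hypothetical.

```python
import math
import random


def round_to_rank_r(g, r, rng):
    """Map unit vectors g(u) in R^n to unit vectors h(u) in R^r.

    g:   dict mapping a vertex u to a unit vector (sequence of n floats)
    r:   target rank
    rng: a random.Random instance, so that runs are reproducible
    """
    n = len(next(iter(g.values())))
    # Step (3): r-by-n matrix with independent N(0, 1) entries.
    Z = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(r)]
    h = {}
    for u, vec in g.items():
        # Step (4): project with Z, then rescale to unit length.
        zu = [sum(row[j] * vec[j] for j in range(n)) for row in Z]
        norm = math.sqrt(sum(c * c for c in zu))  # zero with probability 0
        h[u] = tuple(c / norm for c in zu)
    return h


# Three unit vectors in R^3 rounded to the circle S^1 (r = 2).
g_example = {"a": (1.0, 0.0, 0.0), "b": (0.0, 1.0, 0.0), "c": (0.6, 0.8, 0.0)}
h_example = round_to_rank_r(g_example, 2, random.Random(0))
```

Applying the same function to f instead of g corresponds to skipping step (2), which is exactly the case where the expectation is no longer linear in f(u) · f(v).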
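The constant $\beta(r, G)$ in Lemma 1.2 is defined in terms of the Taylor expansion of the inverse of the function $E_r : [-1, 1] \to [-1, 1]$ given by
$$E_r(x \cdot y) = \mathbb{E}\left[ \frac{Zx}{\|Zx\|} \cdot \frac{Zy}{\|Zy\|} \right],$$
where $x, y \in S^\infty$ and $Z = (Z_{ij}) \in \mathbb{R}^{r \times \infty}$ is chosen so that its entries are independently distributed according to the normal distribution with mean 0 and variance 1.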
The function $E_r$ is well-defined since the expectation above is invariant under orthogonal transformations. The Taylor expansion of $E_r$ is computed in Section 2. The Taylor expansion of $E_r^{-1}$ is treated in Section 3, where we basically follow Haagerup [23]. A precise version of Lemma 1.2 is stated and proved in Section 4, following Krivine [31].
Finally, in Section 5 we show that one can refine this analysis and can (strictly) improve the upper bound if one takes the dimension of the matrix A : V × V → R into account. In particular, we compare the problems SDP q and SDP r for q ≥ r. Earlier, Avidor and Zwick [7] considered the problem of bounding the ratio SDP q (G, A)/ SDP 1 (G, A) for q = 2, 3 and A the Laplacian matrix of a graph.

A matrix version of Grothendieck's identity
In the analysis of many approximation algorithms that use semidefinite programming the following identity plays a central role: Let $u, v$ be unit (column) vectors in $\mathbb{R}^n$ and let $Z \in \mathbb{R}^{1 \times n}$ be a random (row) vector whose entries are distributed independently according to the standard normal distribution with mean 0 and variance 1. Then,
$$\mathbb{E}[\operatorname{sign}(Zu) \operatorname{sign}(Zv)] = \frac{2}{\pi} \arcsin(u \cdot v).$$
For instance, the celebrated algorithm of Goemans and Williamson [21] for approximating the MAX CUT problem is based on this. The identity is called Grothendieck's identity since it appeared for the first time in Grothendieck's work on the metric theory of tensor products [22, Proposition 4, p. 63] (see also Diestel, Fourie, and Swart [17]).
In this section we extend Grothendieck's identity from vectors to matrices by replacing the arcsine function by a hypergeometric function.
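Lemma 2.1. Let $u, v$ be unit vectors in $\mathbb{R}^n$ and let $Z \in \mathbb{R}^{r \times n}$ be a random matrix whose entries are distributed independently according to the standard normal distribution with mean 0 and variance 1. Then,
$$\mathbb{E}\left[ \frac{Zu}{\|Zu\|} \cdot \frac{Zv}{\|Zv\|} \right] = \frac{2}{r} \left( \frac{\Gamma((r+1)/2)}{\Gamma(r/2)} \right)^{2} (u \cdot v)\; {}_2F_1\!\left( \tfrac{1}{2}, \tfrac{1}{2}; \tfrac{r}{2} + 1; (u \cdot v)^2 \right).$$
Here,
$${}_2F_1(a, b; c; x) = \sum_{k=0}^{\infty} \frac{(a)_k (b)_k}{(c)_k} \frac{x^k}{k!}$$
is the hypergeometric function, where $(a)_k = a(a+1) \cdots (a+k-1)$ denotes the rising factorial.
Before proving the lemma we review special cases known in the literature. If $r = 1$, then we get the original Grothendieck's identity:
$$\mathbb{E}[\operatorname{sign}(Zu) \operatorname{sign}(Zv)] = \frac{2}{\pi} (u \cdot v)\; {}_2F_1\!\left( \tfrac{1}{2}, \tfrac{1}{2}; \tfrac{3}{2}; (u \cdot v)^2 \right) = \frac{2}{\pi} \arcsin(u \cdot v).$$
The case $r = 2$ is due to Haagerup [23]:
$$\mathbb{E}\left[ \frac{Zu}{\|Zu\|} \cdot \frac{Zv}{\|Zv\|} \right] = \frac{1}{u \cdot v} \Bigl( E(u \cdot v) - \bigl(1 - (u \cdot v)^2\bigr) K(u \cdot v) \Bigr),$$
where $K$ and $E$ are the complete elliptic integrals of the first and second kind. Note that on page 201 of Haagerup [23] π/2 has to be π/4. Briët, Oliveira, and Vallentin [12] computed the first coefficient $\frac{2}{r} \bigl( \Gamma((r+1)/2)/\Gamma(r/2) \bigr)^2$ of the Taylor series of the expectation for every $r$.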
The following elegant proof of Grothendieck's identity has become a classic: We have sign(Zu) sign(Zv) = 1 if and only if the vectors u and v lie on the same side of the hyperplane orthogonal to the vector Z ∈ R^{1×n}. Now we project this n-dimensional situation to the plane spanned by u and v. Then the projected random hyperplane becomes a random line. This random line is distributed according to the uniform probability measure on the unit circle because Z is normally distributed. Now one obtains the final result by measuring intervals on the unit circle: The probability that u and v lie on the same side of the line is 1 − arccos(u · v)/π. However, we do not have such a picture proof for our matrix version. Our proof is based on the rotational invariance of the normal distribution and integration with respect to spherical coordinates together with some identities for hypergeometric functions. A similar calculation was done by König and Tomczak-Jaegermann [30]. It would be interesting to find a more geometrical proof of the lemma.
For computing the first coefficient of the Taylor series in [12] we took a slightly different route: We integrated using the Wishart distribution of 2 × 2-matrices.
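The matrix version of the identity can be checked numerically. The sketch below assumes the closed form E_r(t) = (2/r) (Γ((r+1)/2)/Γ(r/2))² t ₂F₁(1/2, 1/2; r/2 + 1; t²) with t = u · v, whose leading coefficient agrees with the one quoted above, and compares it with a Monte Carlo estimate of the expectation. The function names, sample sizes, and seed are hypothetical choices; only the Python standard library is used.

```python
import math
import random


def E_r_series(t, r, terms=200):
    """Assumed closed form of E_r(t), summed as a truncated 2F1 series.

    The truncation is accurate for |t| well below 1; convergence slows
    down as |t| approaches 1.
    """
    c = (2.0 / r) * (math.gamma((r + 1) / 2) / math.gamma(r / 2)) ** 2
    x = t * t
    term = 1.0   # current term of 2F1(1/2, 1/2; r/2 + 1; x), starting at k = 0
    total = 1.0
    for k in range(terms):
        term *= (0.5 + k) ** 2 / ((r / 2 + 1 + k) * (k + 1)) * x
        total += term
    return c * t * total


def E_r_monte_carlo(t, r, samples, seed=0):
    """Estimate E[(Zu/|Zu|) . (Zv/|Zv|)] for unit vectors with u . v = t.

    By rotational invariance we may take u = e_1 and v = (t, sqrt(1 - t^2)),
    so Zu is the first column of Z and Zv mixes the first two columns.
    """
    rng = random.Random(seed)
    s = math.sqrt(1.0 - t * t)
    acc = 0.0
    for _ in range(samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(r)]
        w = [rng.gauss(0.0, 1.0) for _ in range(r)]
        y = [t * xi + s * wi for xi, wi in zip(x, w)]
        nx = math.sqrt(sum(a * a for a in x))
        ny = math.sqrt(sum(b * b for b in y))
        acc += sum(a * b for a, b in zip(x, y)) / (nx * ny)
    return acc / samples
```

For r = 1 the series reduces to (2/π) arcsin(t), so `E_r_series(0.5, 1)` should agree with 1/3 up to rounding; for larger r the Monte Carlo estimate only agrees up to sampling error.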
Proof of Lemma 2.1. Let $Z_i \in \mathbb{R}^n$ be the $i$-th row of the matrix $Z$, with $i = 1, \ldots, r$. We define vectors
$$x = (Z_1 \cdot u, \ldots, Z_r \cdot u)^{\mathsf T} \quad \text{and} \quad y = (Z_1 \cdot v, \ldots, Z_r \cdot v)^{\mathsf T}$$
so that we have $x \cdot y = (Zu) \cdot (Zv)$. Since the probability distribution of the vectors $Z_i$ is invariant under orthogonal transformations we may assume that $u = (1, 0, \ldots, 0)$ and $v = (t, \sqrt{1 - t^2}, 0, \ldots, 0)$ and so the pair $(x, y) \in \mathbb{R}^r \times \mathbb{R}^r$ is distributed according to the probability density function (see e.g. Muirhead [37, p. 10, eq. (7)])
$$\bigl(2\pi \sqrt{1 - t^2}\bigr)^{-r} \exp\left( -\frac{x \cdot x - 2t\, x \cdot y + y \cdot y}{2(1 - t^2)} \right).$$
Hence,
$$\mathbb{E}\left[ \frac{x}{\|x\|} \cdot \frac{y}{\|y\|} \right] = \bigl(2\pi \sqrt{1 - t^2}\bigr)^{-r} \int_{\mathbb{R}^r} \int_{\mathbb{R}^r} \frac{x}{\|x\|} \cdot \frac{y}{\|y\|} \exp\left( -\frac{x \cdot x - 2t\, x \cdot y + y \cdot y}{2(1 - t^2)} \right) dx\, dy.$$
By using spherical coordinates x = αξ, y = βη, where α, β ∈ [0, ∞) and ξ, η ∈ S^{r−1}, we rewrite the above integral as If r = 1, we get for the inner double integral Now we consider the case when r ≥ 2. Since the inner double integral over the spheres only depends on the inner product p = ξ · η it can be rewritten as Integration by parts yields The last integral can be rewritten using the modified Bessel function of the first kind (cf. Andrews, Askey, Roy [4, p. 235, Exercise 9]) One can write I r/2 as a hypergeometric function (cf. Andrews, Askey, and Roy [4, (4.12. 2)]) Putting things together, we get Notice that the last formula also holds for r = 1. So we can continue without case distinction. Now we evaluate the outer double integral

Convergence radius
To construct the new vectors in step (2) of Algorithm A, which are used to linearize the expectation, we will make use of the Taylor series expansion of the inverse of $E_r$. Locally around zero we can expand the function $E_r^{-1}$ as
$$E_r^{-1}(t) = \sum_{k=0}^{\infty} b_{2k+1} t^{2k+1},$$
but in the proof of Lemma 4.1 it will be essential that this expansion be valid for all $t \in [-1, 1]$. In the case $r = 1$ we have $E_1^{-1}(t) = \sin(\pi t/2)$ and here the convergence radius is even infinity. The case $r = 2$ was treated by Haagerup and it requires quite some technical work which we sketch very briefly now. He shows that $|b_k| \le C/k^2$ for some constant $C$, independent of $k$, using tools from complex analysis. Using Cauchy's integral formula and after doing some simplifications [23, p. 208] one can express $b_k$ as where $C'_\alpha$ is the quarter circle $\{ \alpha e^{i\theta} : \theta \in [0, \pi/2] \}$. For an appropriate choice of $\alpha$ the first integral is in absolute value bounded above by $C/k$ and the second integral is in absolute value exponentially small in $k$. We refer to the original paper for the details. One key point in the arguments is the following integral representation of $E_2$ giving an analytic continuation of $E_2$ on the complex plane slit along the half line $(1, \infty)$: Here, the term $\arcsin(z \sin \theta)$ gives the main contribution in the estimates. Now we derive a similar representation of $E_r$ and using it in Haagerup's analysis with obvious changes shows that also for $r > 2$ we have $|b_k| \le C/k^2$ for some constant $C$, independent of $k$.  gives the result.

Constructing new vectors
In this section we use the Taylor expansion of the inverse of the function E r to give a precise statement and proof of Lemma 1.2; this is done in Lemma 4.1. For this we follow Krivine [31], who proved the statement of the lemma in the case of bipartite graphs. We comment on how his ideas are related to our construction, which can also deal with nonbipartite graphs, after we prove the lemma.
For the nonbipartite case we need to use the theta number, which is a graph parameter introduced by Lovász [34]. Let $G = (V, E)$ be a graph. The theta number of the complement of $G$, denoted by $\vartheta(G)$, is the optimal value of the following semidefinite program:
$$\vartheta(G) = \min\bigl\{ \lambda \;:\; Z \in \mathbb{R}^{V \times V}_{\succeq 0},\ Z(u, u) = \lambda - 1 \text{ for } u \in V,\ Z(u, v) = -1 \text{ for } \{u, v\} \in E \bigr\}. \tag{3}$$
It is known that the theta number of the complement of $G$ provides a lower bound for the chromatic number of $G$. This can be easily seen as follows. Any proper $k$-coloring of $G$ defines a mapping of $V$ to the vertices of a $(k-1)$-dimensional regular simplex whose vertices lie in a sphere of radius $\sqrt{k - 1}$: Vertices in the graph having the same color are sent to the same vertex in the regular simplex and vertices of different colors are sent to different vertices in the regular simplex. The Gram matrix of these vectors gives a feasible solution of (3).
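The simplex argument can be made concrete. The sketch below assumes that program (3) asks for a positive semidefinite Z with Z(u, u) = λ − 1 on the diagonal and Z(u, v) = −1 on edges, which is what the simplex of radius √(k − 1) suggests; it builds the Gram matrix from a proper k-coloring, so that λ = k is feasible. The function name and the explicit simplex coordinates are hypothetical choices for this illustration.

```python
import math


def simplex_gram_from_coloring(coloring, k):
    """Gram matrix of regular-simplex vectors assigned by a k-coloring.

    coloring: dict mapping a vertex to a color in range(k)
    Returns a dict Z with Z[(u, v)] = w(u) . w(v), where color c is sent to
    w_c = sqrt(k) * e_c - (1 / sqrt(k)) * (1, ..., 1) in R^k.
    These vectors satisfy w_c . w_c = k - 1 and w_c . w_d = -1 for c != d.
    """
    def simplex_vertex(c):
        return [math.sqrt(k) * (1.0 if i == c else 0.0) - 1.0 / math.sqrt(k)
                for i in range(k)]

    vec = {u: simplex_vertex(c) for u, c in coloring.items()}
    return {(u, v): sum(a * b for a, b in zip(vec[u], vec[v]))
            for u in vec for v in vec}


# The 5-cycle with a proper 3-coloring.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
cycle_coloring = {0: 0, 1: 1, 2: 0, 3: 1, 4: 2}
Z_cycle = simplex_gram_from_coloring(cycle_coloring, 3)
```

Since Z is a Gram matrix it is positive semidefinite by construction, so only the linear constraints need to be checked.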

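The constant $\beta(r, G)$ is defined as the solution of the equation
$$\sum_{k=0}^{\infty} |b_{2k+1}|\, \beta(r, G)^{2k+1} = \frac{1}{\vartheta(G) - 1},$$
where the $b_{2k+1}$ are the Taylor coefficients of $E_r^{-1}$.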
With this lemma, we can give a proof of Theorem 1.1.
Proof of Theorem 1.1. We combine Lemma 4.1 with the analysis of Algorithm A from Section 1.4. To compute the table in the theorem, we use the formula where a i are the Taylor coefficients of E r (cf. Morse and Feshbach [36, (4.5.13)]).

Now we give a proof of the lemma.
Proof of Lemma 4.1. We construct the vectors g(u) ∈ S |V |−1 by constructing vectors R(u) in an infinite-dimensional Hilbert space whose inner product matrix coincides with the one of the g(u). We do this in three steps.
In the first step, set H = R |V | and consider the Hilbert space For a unit vector x ∈ H, consider the vectors S(x), T (x) ∈ H given componentwise by Then for vectors x, y ∈ S |V |−1 we have S(x) · T (y) = E −1 r (β(r, G)x · y) and moreover In the second step, let λ = ϑ(G), and Z be an optimal solution of (3). We have λ ≥ 2 since G has at least one edge. Set and consider the matrix By applying a Hadamard transformation 1 √ 2 one sees that U is positive semidefinite, since both A + B and A − B are positive semidefinite. Define s : V → R 2|V | and t : V → R 2|V | so that The matrix U is the Gram matrix of the vectors s(u) u∈V and t(v) v∈V . It follows that these maps have the following properties: In the third step we combine the previous two. We define the vectors For adjacent vertices u, v ∈ V we have and moreover the R(u) are unit vectors. Hence, one can use the Cholesky decomposition of (R(u) · R(v)) ∈ R V ×V 0 to define the desired function g : V → S |V |−1 .
We conclude this section with a few remarks on the lemma and its proof: (1) To approximate the Gram matrix (R(u) · R(v)) it is enough to compute the series expansion of E −1 r and the matrix U to the desired precision. The latter is found by solving a semidefinite program.
Hence, β(1, G) = 2 arcsinh(1)/π = 2 ln(1 + √ 2)/π. (3) In the second step one can also work with any feasible solution of the semidefinite program (3). For instance one can replace ϑ(G) in the lemma by the chromatic number χ(G) albeit getting a potentially weaker bound. (4) Alon, Makarychev, Makarychev, and Naor [2] also gave an upper bound for K(1, G) using the theta number of the complement of G. They prove that which is much better than our result in the case of large ϑ(G). However, our bound is favourable when ϑ(G) is small. In Section 6 we generalize the methods of Alon, Makarychev, Makarychev, and Naor [2] to obtain better upper bounds on K(r, G) for r ≥ 2 and large ϑ(G). (5) Finally, notice that in the first step it was essential that the Taylor expansion of E −1 r has convergence radius of at least one.
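For bipartite graphs the constant β(r, G) can be approximated numerically from the Taylor coefficients. The sketch below rests on two assumptions that are not spelled out in the surrounding text: that E_r(t) = (2/r) (Γ((r+1)/2)/Γ(r/2))² t ₂F₁(1/2, 1/2; r/2 + 1; t²), and that for bipartite G the constant β solves Σ |b_{2k+1}| β^{2k+1} = 1, where b_{2k+1} are the Taylor coefficients of the inverse of E_r. These choices reproduce β(1, G) = 2 arcsinh(1)/π ≈ 0.5611 and the ratio 0.78… quoted for r = 3. All function names and the truncation degree are hypothetical.

```python
import math


def taylor_E(r, degree):
    """Taylor coefficients a[0..degree] of the assumed closed form of E_r."""
    c = (2.0 / r) * (math.gamma((r + 1) / 2) / math.gamma(r / 2)) ** 2
    a = [0.0] * (degree + 1)
    term, k = 1.0, 0
    while 2 * k + 1 <= degree:
        a[2 * k + 1] = c * term
        term *= (0.5 + k) ** 2 / ((r / 2 + 1 + k) * (k + 1))
        k += 1
    return a


def mul(p, q, m):
    """Product of two power series, truncated after degree m."""
    out = [0.0] * (m + 1)
    for i in range(m + 1):
        if p[i] != 0.0:
            for j in range(m + 1 - i):
                out[i + j] += p[i] * q[j]
    return out


def reciprocal(p, m):
    """Power series 1/p up to degree m, assuming p[0] != 0."""
    out = [0.0] * (m + 1)
    out[0] = 1.0 / p[0]
    for k in range(1, m + 1):
        out[k] = -sum(p[j] * out[k - j] for j in range(1, k + 1)) / p[0]
    return out


def inverse_series(f, n):
    """Compositional inverse of f (f[0] = 0, f[1] != 0) via Lagrange inversion:
    the coefficient of t^k in the inverse is (1/k) [w^(k-1)] (w / f(w))^k."""
    m = n - 1
    h = reciprocal(f[1:], m)          # h(w) = w / f(w)
    g = [0.0] * (n + 1)
    power = [1.0] + [0.0] * m         # h^0
    for k in range(1, n + 1):
        power = mul(power, h, m)      # h^k
        g[k] = power[k - 1] / k
    return g


def beta_bipartite(r, degree=41):
    """Solve sum |b_j| beta^j = 1 by bisection (assumed bipartite definition)."""
    b = inverse_series(taylor_E(r, degree), degree)

    def s(beta):
        return sum(abs(b[j]) * beta ** j for j in range(1, degree + 1, 2))

    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if s(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

The reciprocal `1 / beta_bipartite(r)` is then the resulting upper bound on K(r, G) under these assumptions; the truncation degree trades accuracy against running time.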

A refined, dimension-dependent analysis
So far we only compared the two problems SDP ∞ and SDP r . One can perform a refined, dimension-dependent analysis by comparing SDP q and SDP r when q ≥ r.
Let $K(q \to r, G)$, where $q \ge r$, be the least number such that
$$\mathrm{SDP}_q(G, A) \le K(q \to r, G)\, \mathrm{SDP}_r(G, A)$$
for all symmetric matrices $A : V \times V \to \mathbb{R}$. In this section we give an upper bound for $K(q \to r, G)$ that depends on $q$ and $r$. For fixed $r$, this upper bound will become smaller as $q$ comes closer to $r$. Krivine [31] gave such a refined, dimension-dependent analysis in the bipartite case; he showed that $K(2 \to 1, K_{n,n}) = \sqrt{2}$, $K(3 \to 1, K_{n,n}) \le 1.517$, and $K(4 \to 1, K_{n,n}) \le \pi/2$.
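Avidor and Zwick [7] analyzed the cases $r = 1$ and $q \in \{2, 3\}$ for bipartite $G = K_{n,n}$ and $A$ the Laplacian of a graph $G'$ on $n$ nodes. Our upper bound comes from the following lemma:
Lemma 5.1. Let $G = (V, E)$ be a graph with at least one edge, let $q \ge r \ge 1$ be integers, and let $Z \in \mathbb{R}^{r \times |V|}$ be a random matrix whose entries are independent standard normal random variables. Given $f : V \to S^{q-1}$, there is a function $g : V \to S^{|V|-1}$ such that whenever $u$ and $v$ are adjacent, then
$$\mathbb{E}\left[ \frac{Zg(u)}{\|Zg(u)\|} \cdot \frac{Zg(v)}{\|Zg(v)\|} \right] = \beta(q \to r, G)\, f(u) \cdot f(v),$$
where $0 < \beta(q \to r, G) \le 1$ is such that $\beta(q \to r, G) > \beta(q + 1 \to r, G)$ and $\beta(q \to r, G) > \beta(r, G)$ for all $q \ge 2$.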
The proof of the lemma will also give a procedure to compute β(q → r, G) explicitly. So we have the theorem: Theorem 5.2. Let G = (V, E) be a graph with at least one edge and let q ≥ r ≥ 1 be integers. Then K(q → r, G) ≤ β(q → r, G) −1 .
Proof. Combine Lemma 5.1 with Algorithm A from Section 1.4.
The proof of the lemma uses some basic facts from harmonic analysis, which we now summarize. For measurable functions $f, g : [-1, 1] \to \mathbb{R}$ we consider the inner product
$$(f, g)_n = \int_{-1}^{1} f(t)\, g(t)\, (1 - t^2)^{(n-3)/2}\, dt. \tag{5}$$
We say that a continuous function $f : [-1, 1] \to \mathbb{R}$ is of positive type for $S^{n-1}$ if for any choice $x_1, \ldots, x_N$ of points in $S^{n-1}$ we have that the matrix $\bigl( f(x_i \cdot x_j) \bigr)_{i,j=1}^{N}$ is positive semidefinite. If two continuous functions $f, g : [-1, 1] \to \mathbb{R}$ are of positive type for $S^{n-1}$, then $(f, g)_n \ge 0$.
Schoenberg [44] characterized the continuous functions of positive type in terms of Gegenbauer polynomials. We denote by P n k the Gegenbauer polynomial of degree k and parameter (n − 2)/2 which is normalized so that P n k (1) = 1. Notice that this normalization is not the one commonly found in the literature.
The Gegenbauer polynomials P n 0 , P n 1 , P n 2 , . . . are pairwise orthogonal with respect to the inner product (5), and they form a complete orthogonal system for the space L 2 ([−1, 1]), equipped with the inner product (5).
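The normalized Gegenbauer polynomials and their orthogonality can be checked numerically. The sketch below assumes the standard weight (1 − t²)^((n−3)/2) for the inner product and uses the three-term recurrence (k + n − 3) P_k(t) = (2k + n − 4) t P_{k−1}(t) − (k − 1) P_{k−2}(t), which follows from the usual Gegenbauer recurrence after rescaling so that P_k(1) = 1. Function names and the midpoint quadrature are hypothetical choices for illustration.

```python
def gegenbauer(n, k, t):
    """P_k^n(t): Gegenbauer polynomial of degree k and parameter (n - 2)/2,
    normalized so that P_k^n(1) = 1 (valid for n >= 2)."""
    if k == 0:
        return 1.0
    p_prev, p = 1.0, t
    for j in range(2, k + 1):
        p_prev, p = p, ((2 * j + n - 4) * t * p - (j - 1) * p_prev) / (j + n - 3)
    return p


def inner(n, f, g, steps=20000):
    """Midpoint-rule approximation of the integral of
    f(t) g(t) (1 - t^2)^((n - 3)/2) over [-1, 1] (assumed weight)."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * h
        total += f(t) * g(t) * (1.0 - t * t) ** ((n - 3) / 2)
    return total * h
```

For n = 3 the recurrence reduces to the one for Legendre polynomials, and for n = 2 to the one for Chebyshev polynomials of the first kind, which gives a quick sanity check of the normalization.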
Schoenberg's characterization of the functions of positive type is as follows: A function $f : [-1, 1] \to \mathbb{R}$ is continuous and of positive type for $S^{n-1}$ if and only if
$$f(t) = \sum_{k=0}^{\infty} a_k P_k^n(t) \tag{6}$$
for some nonnegative numbers $a_0, a_1, a_2, \ldots$ such that $\sum_{k=0}^{\infty} a_k$ converges, in which case the series in (6) converges absolutely and uniformly in $[-1, 1]$.
A continuous function $f : [-1, 1] \to \mathbb{R}$ can also be of positive type for spheres of every dimension. Schoenberg [44] also characterized these functions. They are the ones that can be decomposed as
$$f(t) = \sum_{k=0}^{\infty} a_k t^k$$
for some nonnegative numbers $a_0, a_1, a_2, \ldots$ such that $\sum_{k=0}^{\infty} a_k$ converges. A polynomial in $\mathbb{R}[x_1, \ldots, x_n]$ is harmonic if it is homogeneous and vanishes under the Laplace operator $\Delta = \partial^2/\partial x_1^2 + \cdots + \partial^2/\partial x_n^2$. Harmonic polynomials restricted to the unit sphere $S^{n-1}$ are related to Gegenbauer polynomials by the addition formula (see e.g. Andrews, Askey, and Roy [4, Theorem 9.6.3]): Let $H_k$ be the space of degree $k$ harmonic polynomials on $n$ variables. Any orthonormal basis of $H_k$ can be scaled so as to give a basis $e_{k,1}, \ldots, e_{k,h_k}$ of $H_k$ for which the following holds: For every $u, v \in S^{n-1}$ we have that
$$P_k^n(u \cdot v) = \sum_{i=1}^{h_k} e_{k,i}(u)\, e_{k,i}(v).$$
With this we have all that we need to prove the lemma. We only consider the bipartite case in the proof in order to simplify the notation and to make the argument more transparent. One can handle the nonbipartite case exactly in the same way as in the proof of Lemma 4.1.
Proof of Lemma 5.1. As before, we construct the function $g : V \to S^{|V|-1}$ from functions $S$ and $T$ that satisfy $S(x) \cdot T(y) = E_r^{-1}(\beta\, x \cdot y)$ for some real number $\beta$. Fix $0 < \beta \le 1$ and consider the expansion
$$E_r^{-1}(\beta t) = \sum_{k=0}^{\infty} g_k^q(\beta)\, P_k^q(t),$$
which converges in the $L^2$ sense.
for nonnegative numbers g q k such that ∞ k=0 g q k converges. Now notice that where P q k q = P q k , P q k 1/2 q . Above, since t l is a function of positive type for every sphere, we have that t l , P q k q ≥ 0. But we also have that and we see that |g q k (β)| ≤ g q k for all k ≥ 0 and 0 < β ≤ 1. This proves the claim.
From the proof of the claim, it is also clear that
$$\sum_{k=0}^{\infty} |g_k^q(\beta)| \tag{7}$$
is a continuous function of $\beta$. Now, let $\beta(q \to r, G)$ be the maximum number $\beta \in (0, 1]$ such that $\sum_{k=0}^{\infty} |g_k^q(\beta)| = 1$.
By the intermediate value theorem, such a number exists because (7) is continuous as a function of β, being equal to 0 when β = 0 and at least E −1 r (1) = 1 when β = 1. Consider the Hilbert space equipped with the Euclidean inner product and where h k is the dimension of H k , the space of harmonic polynomials of degree k on q variables. For a vector x ∈ S q−1 , consider the vectors S(x) and T (x) ∈ H given componentwise by S(x) k = |g q k (β(q → r, G))|(e k,1 (x), . . . , e k,h k (x)) and T (x) k = sign(g q k (β(q → r, G))) |g q k (β(q → r, G))|(e k,1 (x), . . . , e k,h k (x)). By the addition formula we have that ). Moreover, we also have that and so from the Gram matrix of the vectors S(f (u)) and T (f (v)) we may obtain the function g : V → S |V |−1 as we wanted. Now we show that β(q → r, G) > β(q + 1 → r, G) for all q ≥ 2. To this end, consider the function Since the series ∞ k=0 |g q+1 k (β)| converges, from Schoenberg's theorem we see that F β is a continuous function of positive type for the sphere S q . Notice moreover that, by definition, β(q + 1 → r, G) is the maximum number β ∈ (0, 1] such that F β (1) = 1.
Since F β is of positive type for S q , it is also of positive type for S q−1 , and then we may write and we have ∞ k=0 a k (β) = 1 as for all k, P q k (1) = 1. We also have the expression Notice that, since both P q+1 l and P q k are of positive type for S q−1 , we have that P q+1 l , P q k q ≥ 0 for all l and k. Now, from the expansion we see that The function E −1 r is not of positive type because the coefficient b 3 of its Taylor expansion is always negative (this can easily be checked using Eq. (4)), and so some of the g q+1 l (β) must be negative. This, together with (8) and (9), implies that |g q k (β)| < a k (β) for all 0 < β ≤ 1. So we must have that ∞ k=0 |g q k (β(q + 1 → r, G))| < ∞ k=0 a k (β(q + 1 → r, G)) = 1, and we see that β(q → r, G) > β(q + 1 → r, G). In a similar way, one may show that β(q → r, G) > β(r, G).

Better bounds for large chromatic numbers
For graphs with large chromatic number, or more precisely with large ϑ(G), our bounds on K(r, G) proved above can be improved using the techniques of Alon, Makarychev, Makarychev, and Naor [2]. In this section, we show how their bounds on K(1, G) generalize to higher values of r. Theorem 6.1. Given a graph G = (V, E) and integer 1 ≤ r ≤ log ϑ(G), we have Proof. It suffices to show that for any matrix A : V × V → R, we have Let λ = ϑ(G), and Z : V × V → R be an optimal solution of (3). Let J be the 2|V | × 2|V | all-ones matrix and I the 2 × 2 identity matrix. Since the matrix (I ⊗ Z + J)/λ is positive semidefinite, we obtain from its Gram decomposition functions s, t : V → R 2|V | that satisfy (1) s(u) · s(u) = t(u) · t(u) = 1 for all u ∈ V .
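Let $H$ be the Hilbert space of vector-valued functions $h : \mathbb{R}^{r \times |V|} \to \mathbb{R}^r$ with inner product
$$(g, h) = \mathbb{E}[g(Z) \cdot h(Z)],$$
where the expectation is taken over random $r \times |V|$ matrices $Z$ whose entries are i.i.d. $N(0, 1/r)$ random variables. Let $R \ge 2$ be some real number to be set later. Define for every $u \in V$ the function $g_u \in H$ by
$$g_u(Z) = \begin{cases} Zf(u)/R & \text{if } \|Zf(u)\| \le R, \\ Zf(u)/\|Zf(u)\| & \text{otherwise.} \end{cases}$$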
Notice that for every matrix Z ∈ R r×|V | , the vector g u (Z) ∈ R r has Euclidean norm at most 1. It follows by linearity of expectation that SDP r (G, A) ≥ E
We proceed by lower bounding the right-hand side of the above inequality. Based on the definition of $g_u$ we define two functions $h_u^0, h_u^1 \in H$ by
$$h_u^0(Z) = \frac{Zf(u)}{R} + g_u(Z) \quad \text{and} \quad h_u^1(Z) = \frac{Zf(u)}{R} - g_u(Z).$$
For every u ∈ V , define the function H u ∈ R 2|V | ⊗ H by We expand the inner products (g u , g v ) in terms of f (u) · f (v) and H u , H v .
Claim 2. For every {u, v} ∈ E we have Proof: Simply expanding the inner product H u , H v gives It follows from property 3 of s and t that the above terms involving s(u) · s(v) and t(u) · t(v) vanish. By property 2, the remaining terms reduce to Expanding the first expectation gives and expanding the second gives Adding these two gives that the last two terms cancel. Since E[Z T Z] = I, what remains equals 1 R 2 f (u) · f (v) − (g u , g v ), which proves the claim.
From the above claim it follows that where H u 2 = H u , H u . By the triangle inequality, we have for every u ∈ V , By the definition of g u , the vectors Zf (u) and g u are parallel. Moreover, they are equal if Zf (u) ≤ R. Since f (u) is a unit vector, the r entries of the random vector Zf (u) are i.i.d. N (0, 1/r) random variables. Hence, whereω r is the unique rotationally invariant measure on S r−1 , normalized such thatω r (S r−1 ) = r r/2 /Γ(r/2). Using a substitution of variables, we get