www.theoryofcomputing.org The Communication Complexity of Gap Hamming Distance

In the gap Hamming distance problem, two parties must determine whether their respective strings $x, y \in \{0,1\}^n$ are at Hamming distance less than $n/2 - \sqrt{n}$ or greater than $n/2 + \sqrt{n}$. In a recent tour de force, Chakrabarti and Regev (2010) proved the long-conjectured $\Omega(n)$ lower bound on the randomized communication complexity of this problem. In follow-up work, Vidick (2010) discovered a simpler proof. We contribute a new proof, which is simpler yet and a page-and-a-half long.


Introduction
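The gap Hamming distance problem features two communicating parties, the first of which receives a vector $x \in \{-1,+1\}^n$ and the second a vector $y \in \{-1,+1\}^n$. The two vectors are chosen such that the Hamming distance between them is either noticeably smaller than $n/2$ or noticeably larger than $n/2$. The objective is to reliably determine which is the case by exchanging as few bits of communication as possible. Throughout this paper, communication is assumed to be randomized, and the communication protocol is to produce the correct answer with probability at least 2/3. Formally, gap Hamming distance is the partial Boolean function $\mathrm{GHD}_n$ on $\{-1,+1\}^n \times \{-1,+1\}^n$ that distinguishes pairs with $\langle x, y\rangle \le -\sqrt{n}$ from pairs with $\langle x, y\rangle \ge \sqrt{n}$; this is definition (1.1). Theorem 1.1, the subject of this paper, states that the randomized communication complexity of gap Hamming distance is $\Omega(n)$.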
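The correspondence between Hamming distance and inner product on ±1 vectors, $\langle x, y\rangle = n - 2d(x, y)$, can be checked with a short sketch. The function names and the `gap` parameter are illustrative assumptions rather than notation from the paper, and the default gap of $\sqrt{n}$ is simply one choice of a threshold at that scale.

```python
import math

def hamming_distance(x, y):
    """Number of coordinates in which two equal-length sequences differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

def inner_product(x, y):
    """Standard inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def gap_side(x, y, gap=None):
    """Classify a pair of +/-1 vectors by their Hamming distance.

    Returns "close" if the distance is below n/2 - gap, "far" if it is above
    n/2 + gap, and None if the pair violates the promise (falls in the gap).
    The default gap sqrt(n) is an assumption made for illustration.
    """
    n = len(x)
    if gap is None:
        gap = math.sqrt(n)
    d = hamming_distance(x, y)
    if d < n / 2 - gap:
        return "close"
    if d > n / 2 + gap:
        return "far"
    return None

x = [1] * 16
y = [1] * 8 + [-1] * 8          # differs from x in exactly 8 coordinates
print(gap_side(x, [1] * 16))    # identical vectors
print(gap_side(x, [-1] * 16))   # antipodal vectors
print(gap_side(x, y))           # distance n/2: outside the promise
```

Because the inner product is an affine function of the Hamming distance, a promise about one is a promise about the other, which is why the problem can be stated in either language.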

Some Terminology
Let f : X × Y → {−1, +1} be a given communication problem. A common starting point in proving lower bounds on randomized communication complexity is Yao's minimax theorem [20]: to rule out a randomized protocol for f with cost c and error probability at most ε, one defines a probability distribution µ on X ×Y and argues that with respect to µ, every deterministic protocol with cost c errs on more than an ε fraction of the inputs. This approach is complete in that one can always prove a tight lower bound on randomized communication in this manner.
The challenge in Yao's program is establishing the hardness of a given distribution µ for deterministic communication protocols. By far the most common solution since the 1980s is the corruption method, pioneered by Yao himself [20]. In more detail, a deterministic communication protocol with cost c gives a partition $X \times Y = \bigcup_{i=1}^{2^c} R_i$, where each set $R_i$ is the Cartesian product of a subset of X and a subset of Y. The sets $R_i$ are called rectangles, and the output of the deterministic protocol is constant on each rectangle. To prove a lower bound on communication, one defines a probability measure µ on $X \times Y$ and argues that every rectangle R with nontrivial measure is ε-corrupted by elements of $f^{-1}(+1)$ for some constant ε > 0, in the sense that such elements account for a nonnegligible, ε-dependent share of the measure of R; this condition is referred to as (1.2). Provided that $f^{-1}(-1)$ has reasonable measure, (1.2) bounds from below the total number of rectangles in any partition of $X \times Y$. By symmetry, the roles of +1 and −1 can be interchanged throughout this argument. Furthermore, the argument applies unchanged to partial functions f, whose domain is a proper subset of $X \times Y$.
Over the years, many approaches have been used to prove (1.2). For the uniform distribution, a particularly general method was discovered by Babai, Frankl, and Simon [2] almost thirty years ago. It plays a key role in several subsequent papers and this work. In detail, let µ be the uniform measure on $X \times Y$. For the sake of contradiction, suppose that $R = A \times B$ is a large rectangle that is not ε-corrupt, where $A \subseteq X$ and $B \subseteq Y$. We may assume that no row of R is 2ε-corrupt because any offending rows can be discarded without affecting the size of R much. The proof is completed in two steps.
STEP 1: IDENTIFYING A HARD CORE.
Using the hypothesis that A is large, one identifies elements $x_1, x_2, \ldots, x_k \in A$ that are "very dissimilar" and collectively "representative" of X. Naturally, what those words mean depends on the context, but one gets the right idea by thinking about k random elements of X. Typically k is tiny, exponentially smaller than |A|. We will call $\{x_1, x_2, \ldots, x_k\}$ a hard core of A because at an intuitive level, these few elements capture the full complexity of A.
STEP 2: CORRUPTION.
Using the hypothesis that B is large and $x_1, x_2, \ldots, x_k$ are representative, one shows that the rectangle $\{x_1, x_2, \ldots, x_k\} \times B$ is heavily corrupted, contradicting the assumption that no row of R is 2ε-corrupt.
This program is successful in practice because it is much easier to analyze the corruption of a rectangle $\{x_1, x_2, \ldots, x_k\} \times B$ for a small and highly structured collection of elements $x_1, x_2, \ldots, x_k$. Babai, Frankl, and Simon [2] used this approach to establish, with an exceedingly elegant and short proof, an $\Omega(\sqrt{n})$ lower bound on the communication complexity of set disjointness. In that work, X and Y both referred to the family of subsets of $\{1, 2, \ldots, n\}$ of cardinality $\sqrt{n}$, and the hard core used in Step 1 was a collection of $k = \varepsilon\sqrt{n}$ subsets that are mostly disjoint.

Previous Work
We are now in a position to outline the proofs of Theorem 1.1 due to Chakrabarti and Regev [7] and Vidick [17]. Both works study a continuous version of gap Hamming distance, in which the parties receive inputs $x, y \in \mathbb{R}^n$ drawn according to Gaussian measure and need to determine whether their inner product is less than $-\sqrt{n}$ or greater than $\sqrt{n}$. It was shown earlier [5] that the discrete and continuous versions of gap Hamming distance are equivalent from the standpoint of communication complexity. The proof in [7] has two steps. Let $R = A \times B$ be a rectangle of nonnegligible Gaussian measure. For reasons of measure, we may assume that the vectors in A, B have Euclidean norm $\sqrt{n}$, up to a factor $1 \pm \varepsilon$.

STEP 1: IDENTIFYING A HARD CORE.
Using the hypothesis that A has nontrivial measure, one identifies $\Omega(n)$ vectors $x_1, x_2, \ldots, x_i, \ldots \in A$ that are almost orthogonal. More precisely, the Euclidean norm of the projection of $x_i$ onto $\operatorname{span}\{x_1, \ldots, x_{i-1}\}$ is at most a small constant fraction of the norm of $x_i$.
In retrospect, a system of near-orthogonal vectors is a natural choice for a hard core because GHD is defined in terms of inner products. That such a system of vectors can always be chosen from A was proven by Raz [13], who used this fact to obtain a lower bound for another linear-algebraic communication problem (deciding subspace membership). In the light of the program of Babai, Frankl, and Simon [2], it is tempting to proceed to Step 2 and argue that the rectangle $\{x_1, x_2, \ldots, x_i, \ldots\} \times B$ is heavily corrupted. Unfortunately, gap Hamming distance does have rectangles that are large and almost uncorrupted, and one cannot apply the corruption method directly. Instead, Chakrabarti and Regev [7] prove the following.
The two steps above immediately give the following statement: for any sets $A, B \subseteq \mathbb{R}^n$ of nonnegligible measure, random vectors $x \in A$, $y \in B$ obey $|\langle x, y\rangle| = \Omega(\sqrt{n})$ with constant probability. This anticoncentration result is the technical centerpiece of Chakrabarti and Regev's proof. The authors actually derive a much stronger statement, giving a detailed characterization of the distribution of $\langle x, y\rangle$. To complete the proof, they use a criterion for high communication complexity due to Jain and Klauck [9], known as the smooth rectangle bound. Specifically, Chakrabarti and Regev use their anticoncentration result to argue that in any partition of $\mathbb{R}^n \times \mathbb{R}^n$, only a small constant measure of inputs can be covered by large uncorrupted rectangles. Settling this claim requires the introduction of a second measure, call it λ, to account for covering by large rectangles. The smooth rectangle bound [9] was discovered very recently and overcomes limitations of Yao's corruption bound, at the expense of being more challenging to use.
In follow-up work, Vidick [17] discovered a simpler proof of the anticoncentration property for $\langle x, y\rangle$, by taking a matrix-analytic view of the problem as opposed to the purely measure- and information-theoretic treatment in [7]. Vidick first shows that for any $A \subseteq \mathbb{R}^n$ of nonnegligible Gaussian measure, the matrix $M = \mathbb{E}_{x\in A}[xx^{\mathsf T}]$ has a relatively spread-out spectrum, with a constant fraction of singular values on the order of $\Omega(1)$. Since $\mathbb{E}_{x\in A,\, y\in B}[\langle x, y\rangle^2] = \mathbb{E}_{y\in B}[y^{\mathsf T} M y]$, the author of [17] is able to use this spectral property of M to prove anticoncentration for $\langle x, y\rangle$. Vidick's proof ingeniously exploits the rotation invariance of Gaussian measure and requires just the Bernstein inequality and the Berry-Esseen theorem for independent Gaussian variables. With anticoncentration established, Vidick uses the Jain-Klauck criterion to prove the lower bound on communication complexity.
Figure 1: Reduction from gap orthogonality to gap Hamming distance (T = "true," F = "false").

Our Proof
This paper contributes a new proof of Theorem 1.1. Our approach departs from previous work on two counts. First, we use Yao's original corruption method for proving communication lower bounds, rather than the recent and more involved criterion of Jain and Klauck. Second, the authors of [7,17] work with an extension of the problem to Gaussian space, whereas we are able to give a direct argument for the hypercube. As we show, the discrete setting allows for a treatment that is much simpler both in formalism and in substance; contrast the proof of Lemma 4.4 in [13] with that of Lemma 3.1 in this paper to get an idea.
Our main technical tool is Talagrand's concentration inequality [15,1,16]. It states that for any given subset $S \subset \{-1,+1\}^n$ of constant measure, nearly all the points of the hypercube lie at a short Euclidean distance from the convex hull of S. Talagrand's concentration inequality has yielded results whose range and depth are out of proportion to the inequality's easy proof [1,16]. We use the following well-known consequence of Talagrand's inequality: the projection of a random vector $x \in \{-1,+1\}^n$ onto a given linear subspace $V \subseteq \mathbb{R}^n$ has Euclidean norm $\sqrt{\dim V} \pm O(1)$ almost surely.
We now give a more detailed description of the proof. What we actually obtain is an $\Omega(n)$ lower bound on the communication complexity of gap orthogonality, a problem in which the two parties receive vectors $x, y \in \{-1,+1\}^n$ and need to reliably tell whether they are nearly orthogonal or far from orthogonal. Formally, gap orthogonality is the partial Boolean function $\mathrm{ORT}_n$ on $\{-1,+1\}^n \times \{-1,+1\}^n$ that separates pairs with small $|\langle x, y\rangle|$ from pairs with large $|\langle x, y\rangle|$, on the scale of $\sqrt{n}$; this is definition (1.3). Gap orthogonality readily reduces to gap Hamming distance, as suggested pictorially in Figure 1. Hence, it suffices to prove an $\Omega(n)$ lower bound for gap orthogonality.
It seems at first that nothing of substance is gained by switching from gap Hamming distance to gap orthogonality. In actuality, the latter is preferable in that it allows the use of Yao's corruption method. Indeed, the corruption property for $\mathrm{ORT}_n$ is equivalent to the anticoncentration of $|\langle x, y\rangle|$. Thus, we just need to establish the anticoncentration: for some absolute constant ε > 0 and any sets $A, B \subseteq \{-1,+1\}^n$ of uniform measure at least $2^{-\varepsilon n}$, the inner product $\langle x, y\rangle$ for random $x \in A$, $y \in B$ cannot be too concentrated around zero. We give a short proof of this result, which combines selected ideas of [7] and [17] with some new elements.

STEP 1: IDENTIFYING A HARD CORE.
One can select a family of $\Omega(n)$ near-orthogonal vectors $x_1, x_2, \ldots, x_i, \ldots \in A$. Formally, the projection of $x_i$ onto $\operatorname{span}\{x_1, x_2, \ldots, x_{i-1}\}$ has Euclidean norm no greater than a third of the norm of $x_i$.
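The near-orthogonality condition can be made concrete with a small sketch. The greedy scan below is only an illustration of the condition itself; it is not the paper's argument, which derives the existence of such a family from Talagrand's inequality, and the helper names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def projection_norm(x, basis):
    """Norm of the projection of x onto the span of an orthonormal basis."""
    return math.sqrt(sum(dot(x, b) ** 2 for b in basis))

def greedy_near_orthogonal(candidates, ratio=1 / 3):
    """Scan nonzero candidates in order and keep x whenever the projection of x
    onto the span of the vectors kept so far has norm at most ratio * ||x||."""
    chosen, basis = [], []
    for x in candidates:
        norm_x = math.sqrt(dot(x, x))
        if projection_norm(x, basis) <= ratio * norm_x:
            chosen.append(x)
            # Gram-Schmidt step: add the normalized residual of x to the basis.
            residual = list(x)
            for b in basis:
                c = dot(x, b)
                residual = [r - c * bi for r, bi in zip(residual, b)]
            res_norm = math.sqrt(dot(residual, residual))
            basis.append([r / res_norm for r in residual])
    return chosen

candidates = [
    [1, 1, 1, 1],
    [1, 1, 1, -1],   # projection onto span of the first vector has norm 1 > 2/3
    [1, 1, -1, -1],
    [1, -1, 1, -1],
]
print(greedy_near_orthogonal(candidates))
```

The residual is never zero for a kept vector, since its projection accounts for at most a third of its norm, so the normalization is safe.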

STEP 2: ANTICONCENTRATION & CORRUPTION.
Fix the vectors so constructed. Then with probability exponentially close to 1, a random $y \in \{-1,+1\}^n$ will have nonnegligible inner product (absolute value at least $\sqrt{n}/4$) with one or more of the vectors $x_i$.
Step 1 is a trivial consequence of Talagrand's concentration inequality; earlier works by Raz [13] and Chakrabarti and Regev [7] used an analogue of this claim for the sphere $S^{n-1}$ with the uniform measure, whose proof was more involved. To prove Step 2, we switch to the matrix-analytic view of Vidick [17] but give a simpler and more direct argument. Specifically, we consider the matrix M with rows $x_1, x_2, \ldots, x_i, \ldots$, which by construction is close in norm to an orthogonal matrix. It follows that a constant fraction of the singular values of M are large, on the order of $\Omega(\sqrt{n})$. Applying Talagrand a second time, we get that a random vector $y \in \{-1,+1\}^n$ will have a constant fraction of its Euclidean norm in the linear subspace corresponding to the large singular values of M, except with probability exponentially small. This completes Step 2. The desired anticoncentration property falls out as a corollary, for purely combinatorial reasons. This proves corruption for gap orthogonality.
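In the idealized case where the family is exactly orthogonal and has n members, the conclusion of Step 2 holds for every y rather than for most y. A sketch under that assumption follows. The Sylvester-Hadamard rows are a stand-in chosen for illustration and are not the hard core constructed in the paper. Since the rows scaled by $1/\sqrt{n}$ form an orthonormal basis, $\sum_i \langle y, x_i\rangle^2 = n\|y\|^2 = n^2$, so some $|\langle y, x_i\rangle|$ is at least $\sqrt{n}$.

```python
import itertools
import math

def sylvester_hadamard(m):
    """Rows of the 2^m x 2^m Sylvester Hadamard matrix (pairwise orthogonal +/-1 rows)."""
    rows = [[1]]
    for _ in range(m):
        rows = [r + r for r in rows] + [r + [-v for v in r] for r in rows]
    return rows

def max_abs_inner_product(y, family):
    """Largest |<y, x>| over the vectors x in the family."""
    return max(abs(sum(a * b for a, b in zip(y, x))) for x in family)

n = 8
family = sylvester_hadamard(3)

# Exhaustive check over all 2^n sign vectors y: the smallest value of max_i |<y, x_i>|.
worst = min(max_abs_inner_product(y, family)
            for y in itertools.product((-1, 1), repeat=n))
print(worst >= math.sqrt(n))   # consistent with the averaging argument above
```

The paper's setting is harder on two counts: the family is only near-orthogonal and has a constant fraction of n members, which is why the matrix M and a second application of Talagrand's inequality are needed.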

Notation
The symbol [k] stands for the set $\{1, 2, \ldots, k\}$. The inner product of $x, y \in \mathbb{R}^n$ is denoted $\langle x, y\rangle = \sum x_i y_i$. Likewise, we let $\langle A, B\rangle = \sum A_{ij} B_{ij}$ for matrices $A = [A_{ij}]$ and $B = [B_{ij}]$. The Boolean values "true" and "false" are represented in this paper by −1 and +1, respectively. In particular, Boolean functions take on values ±1. A partial function f on X is a function whose domain of definition, denoted $\operatorname{dom} f$, is a proper subset of X. For a Boolean string x, the symbol $x^k$ stands for the concatenation $xx\ldots x$ (k times).

Linear algebra
The Frobenius norm of a real matrix $M = [M_{ij}]$ is given by $\|M\|_F = (\sum M_{ij}^2)^{1/2}$. We denote the singular values of M by $\sigma_1(M) \ge \sigma_2(M) \ge \cdots \ge 0$. It is straightforward to bound the inner product of matrices A, B in terms of their singular values. Specifically, fixing a singular value decomposition $A = \sum \sigma_i(A) u_i v_i^{\mathsf T}$ for some unit vectors $u_i, v_i$, one arrives at the following well-known bound: $\langle A, B\rangle = \sum \sigma_i(A)\, u_i^{\mathsf T} B v_i \le \sigma_1(B) \sum \sigma_i(A)$. (2.1) Very precise methods are known for estimating the rth singular value of a matrix, including the Hoffman-Wielandt inequality. For us, the following crude bound is all that is needed.
The Euclidean norm of a vector $x \in \mathbb{R}^n$ is denoted $\|x\| = (\sum x_i^2)^{1/2}$. The dimension of a linear subspace V is denoted $\dim V$. For a linear subspace $V \subseteq \mathbb{R}^n$ and a vector $x \in \mathbb{R}^n$, we let $\operatorname{proj}_V x$ denote the orthogonal projection of x to V. The following fact is immediate from Talagrand's concentration inequality [1, Thm. 7.6.1].
Fact 2.2 (Talagrand). For every linear subspace $V \subseteq \mathbb{R}^n$ and every t > 0, one has $\mathbf{P}_{x \in \{-1,+1\}^n}\bigl[\,\bigl|\|\operatorname{proj}_V x\| - \sqrt{\dim V}\bigr| > t\,\bigr] < 4e^{-ct^2}$, where c > 0 is an absolute constant.
Fact 2.2 is entirely classical. The interested reader will find its short derivation in [16] and in Section 5 of this paper.
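Why $\sqrt{\dim V}$ is the natural center can be seen from a first-moment computation: for uniform $x \in \{-1,+1\}^n$ the coordinates are uncorrelated, so the average of $\|\operatorname{proj}_V x\|^2$ equals $\dim V$ exactly. The sketch below verifies this by exhaustive enumeration for a small, arbitrarily chosen subspace (the spanning vectors are hypothetical). It checks only the mean of the squared norm, not the concentration that Fact 2.2 asserts.

```python
import itertools
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: an orthonormal basis of the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:            # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis

def projection_norm(x, basis):
    """Norm of the orthogonal projection of x onto the span of an orthonormal basis."""
    return math.sqrt(sum(dot(x, b) ** 2 for b in basis))

n = 8
basis = orthonormal_basis([
    [1, 2, 0, 0, 3, 0, 1, 0],
    [0, 1, 1, 0, 0, 2, 0, 1],
    [2, 0, 0, 1, 1, 0, 0, 3],
])
cube = list(itertools.product((-1, 1), repeat=n))
norms = [projection_norm(x, basis) for x in cube]
mean_square = sum(t * t for t in norms) / len(cube)
print(len(basis), mean_square)   # dimension of V and the average squared norm
```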

Communication complexity
Fix finite sets X, Y and let f be a (possibly partial) Boolean function on $X \times Y$. A randomized communication protocol is said to compute f with error ε if for all $(x, y) \in \operatorname{dom} f$, the output of the protocol on (x, y) is f(x, y) with probability at least 1 − ε. The least communication cost of such a protocol is known as the ε-error communication complexity of f, denoted $R_\varepsilon(f)$. For all constants ε ∈ (0, 1/2), one has $R_\varepsilon(f) = \Theta(R_{1/3}(f))$.
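The relation between $R_\varepsilon(f)$ and $R_{1/3}(f)$ is usually justified by running a protocol a constant number of times independently and taking the majority answer. The text does not spell this out, so the sketch below is the standard argument rather than a quotation; `majority_error` is a hypothetical helper that computes the exact error of a k-fold majority vote.

```python
from fractions import Fraction
from math import comb

def majority_error(eps, k):
    """Exact probability that the majority of k independent runs is wrong,
    when each run errs independently with probability eps (k must be odd)."""
    if k % 2 == 0:
        raise ValueError("k must be odd so that the majority is well defined")
    eps = Fraction(eps)
    # The vote is wrong exactly when more than half of the runs err.
    return sum(comb(k, j) * eps**j * (1 - eps)**(k - j)
               for j in range(k // 2 + 1, k + 1))

third = Fraction(1, 3)
for k in (1, 3, 15):
    print(k, float(majority_error(third, k)))
```

Each repetition multiplies the communication cost by a constant, so any constant error below 1/2 can be traded for any other at the price of a constant factor.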
A rectangle of X ×Y is any set of the form A × B, where A ⊆ X, B ⊆ Y . One of the earliest and best known criteria for high randomized communication complexity is Yao's corruption bound [20,2,12].

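Theorem 2.3 (Corruption bound). Let f be a (possibly partial) Boolean function on $X \times Y$. Given ε, δ ∈ (0, 1), suppose that there is a distribution µ on $X \times Y$ such that every rectangle $R \subseteq X \times Y$ with µ(R) > δ is ε-corrupted by elements of $f^{-1}(+1)$ in the sense of (1.2). Then the randomized communication complexity of f is bounded from below in terms of δ, ε, and the measure $\mu(f^{-1}(-1))$.
We gave an informal proof of Theorem 2.3 in the Introduction. See [3, Lem. 3.5] for a rigorous treatment.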
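The objects in the corruption bound are easy to compute for toy examples. The sketch below takes µ to be the uniform measure and uses a tiny equality-style function, both chosen purely for illustration. It reports the measure that a rectangle places on $f^{-1}(+1)$ and on $f^{-1}(-1)$ and leaves the comparison between the two to the reader, since the exact inequality of the theorem is not reproduced here. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def is_rectangle(cells):
    """A set of pairs is a rectangle iff it equals (its rows) x (its columns)."""
    cells = set(cells)
    rows = {x for x, _ in cells}
    cols = {y for _, y in cells}
    return cells == set(product(rows, cols))

def rectangle_masses(f, X, Y, A, B):
    """Uniform measure of R ∩ f^{-1}(+1) and of R ∩ f^{-1}(-1) for R = A x B.
    Pairs on which f returns None are treated as lying outside dom f."""
    total = len(X) * len(Y)
    plus = sum(1 for x, y in product(A, B) if f(x, y) == +1)
    minus = sum(1 for x, y in product(A, B) if f(x, y) == -1)
    return Fraction(plus, total), Fraction(minus, total)

def toy_equality(x, y):
    """Toy function: -1 ("true") when the inputs agree, +1 ("false") otherwise."""
    return -1 if x == y else +1

X = Y = range(4)
print(is_rectangle({(0, 0), (0, 1), (1, 0), (1, 1)}))   # a genuine rectangle
print(is_rectangle({(0, 0), (1, 1)}))                    # a diagonal is not one
print(rectangle_masses(toy_equality, X, Y, [0, 1], [0, 1]))
```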

Corruption of Gap Orthogonality
We start by showing that any subset of $\{-1,+1\}^n$ of nontrivial size contains n/10 near-orthogonal vectors.
Next, we show that given any family of near-orthogonal vectors in $\{-1,+1\}^n$, a random vector in $\{-1,+1\}^n$ almost surely has a substantial inner product with some vector from the family. The proof uses the lower bound in Talagrand's theorem, as opposed to Lemma 3.1 in which we used the upper bound.
The main result of this section is immediate from the previous two lemmas for basic combinatorial reasons; cf. the well-known proof of an $\Omega(\sqrt{n})$ lower bound on the communication complexity of disjointness due to Babai, Frankl, and Simon [2].
Proof. Assume that $|A| > 2 \cdot 2^{(1-\alpha)n}$, where α > 0 is the constant from Lemma 3.1. We will show that a random $y \in \{-1,+1\}^n$ occurs in B with probability at most $\exp(-\Omega(n))$. The argument is a combinatorial accounting for what kinds of elements arise in B and is closely analogous to Theorem 8.3 in [2].
3 |B|. Then $2^{-n}|B'|$ is a lower bound on the probability that a random $y \in \{-1,+1\}^n$ has $|\langle y, x_i\rangle| \le \sqrt{n}/4$ for at least $(1 - 3\varepsilon)k$ indices i. By Lemma 3.2 and the union bound, this probability cannot exceed $\binom{k}{3\varepsilon k} e^{-\Omega(k)} \le e^{-\Omega(n)}$.
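The last step of the proof multiplies a count of index sets of size 3εk by an exponentially small probability, and the product stays exponentially small only when ε is a small enough constant. The sketch below makes this quantitative using the standard estimate $\ln\binom{k}{m} \le k\,H(m/k)$, with H the entropy in nats. The constant `c` stands for the unspecified constant hidden in $e^{-\Omega(k)}$; its value here is hypothetical.

```python
import math

def log_binomial(k, m):
    """Natural logarithm of the binomial coefficient C(k, m)."""
    return math.lgamma(k + 1) - math.lgamma(m + 1) - math.lgamma(k - m + 1)

def entropy_nats(p):
    """Binary entropy of p, measured in nats."""
    if p in (0, 1):
        return 0.0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def union_bound_exponent(k, eps, c):
    """Natural log of C(k, floor(3*eps*k)) * exp(-c*k).
    A negative value means the union bound is still exponentially small."""
    m = math.floor(3 * eps * k)
    return log_binomial(k, m) - c * k

k, eps, c = 1000, 0.01, 0.5      # c is a placeholder, not a constant from the paper
print(log_binomial(k, 30) <= k * entropy_nats(0.03))   # the entropy estimate
print(union_bound_exponent(k, eps, c) < 0)             # the product stays small
```

Since H(3ε) tends to 0 as ε tends to 0, the binomial factor can always be absorbed into the exponential by taking ε small relative to the hidden constant.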

Main Result
In this final section, we prove the sought Ω(n) lower bound on the communication complexity of gap Hamming distance and gap orthogonality, defined in (1.1) and (1.3).
Main Theorem.
Proof. Immediate from the reduction in Figure 1. Formally, for n a square,

More on Talagrand's Concentration Inequality
In this concluding section, we provide additional background on Talagrand's concentration inequality for the interested reader and include the well-known derivation of Fact 2.2. For a nonempty subset $S \subseteq \mathbb{R}^n$ and a point $x \in \mathbb{R}^n$, let $\rho(x, S) = \inf_{y \in S} \|x - y\|$ be the Euclidean distance from x to S. Talagrand's inequality [15,1] states that for any reasonably large subset S of the hypercube, almost all the points of the hypercube lie at a short distance from the convex hull of S. In more detail:
Theorem 5.1 (Talagrand). For a fixed convex set $S \subseteq \mathbb{R}^n$ and a random $x \in \{-1,+1\}^n$, $\mathbf{P}[x \in S]\; \mathbf{P}[\rho(x, S) > t] \le e^{-t^2/16}$.
We now explain how Fact 2.2 is a consequence of Talagrand's concentration inequality. For any linear subspace $V \subseteq \mathbb{R}^n$ and vector $x \in \mathbb{R}^n$, elementary linear algebra gives $\|\operatorname{proj}_V x\| = \rho(x, V^\perp)$, where $V^\perp$ is the orthogonal complement of V with $\dim V^\perp = n - \dim V$. This allows one to reformulate Fact 2.2 in the language of Euclidean distances: for a fixed linear subspace $V \subseteq \mathbb{R}^n$ and a random $x \in \{-1,+1\}^n$, $\mathbf{P}\bigl[\,\bigl|\rho(x, V) - \sqrt{n - \dim V}\bigr| > t\,\bigr] < 4e^{-ct^2}$. The advantage of this reformulation is that it can be easily proved using Talagrand's concentration inequality. We present here the proof from a recent expository article by Tao [16].