Subsets of Cayley graphs that induce many edges

Let $G$ be a regular graph of degree $d$ and let $A\subset V(G)$. Say that $A$ is $\eta$-closed if the average degree of the subgraph induced by $A$ is at least $\eta d$. This says that if we choose a random vertex $x\in A$ and a random neighbour $y$ of $x$, then the probability that $y\in A$ is at least $\eta$. The work of this paper was motivated by an attempt to obtain a qualitative description of closed subsets of the Cayley graph $\Gamma$ whose vertex set is $\mathbb F_2^{n_1}\otimes \dots \otimes \mathbb F_2^{n_d}$ with two vertices joined by an edge if their difference is of the form $u_1\otimes \cdots \otimes u_d$. For the matrix case (that is, when $d=2$), such a description was obtained by Khot, Minzer and Safra, a breakthrough that completed the proof of the 2-to-2 conjecture. In this paper, we formulate a conjecture for higher dimensions, and prove it in an important special case. Also, we identify a statement about $\eta$-closed sets in Cayley graphs on arbitrary finite Abelian groups that implies the conjecture and can be regarded as a "highly asymmetric Balog-Szemer\'edi-Gowers theorem" when it holds. We conclude the paper by showing that this statement is not true for an arbitrary Cayley graph. It remains to decide whether the statement can be proved for the Cayley graph $\Gamma$.


1. Introduction
The Unique Games Conjecture, formulated by Khot [5] in 2002, is a central conjecture in theoretical computer science. If true, it implies that for a wide class of natural problems it is NP-hard to find even a very crude approximate solution in polynomial time. Recently, a weakening of the conjecture known as the 2-to-2 Games Conjecture, in which the approximation is required to be less crude (so that hardness is easier to prove), was proved by Khot, Minzer and Safra [6], a result that is considered a major step towards the Unique Games Conjecture itself. More precisely, after work by various authors, the problem had been reduced to a statement about a certain Cayley graph, and Khot, Minzer and Safra proved that statement.
The Cayley graph $\Gamma$ in question has as its vertex set the set of all $m\times n$ matrices over $\mathbb F_2$, with two vertices joined by an edge if their difference has rank 1. Let us say that a subset $\mathcal A\subset M_{m,n}(\mathbb F_2)$ is $\eta$-closed if the probability that $A+B\in\mathcal A$, when $A$ is chosen uniformly from $\mathcal A$ and $B$ is chosen uniformly from all rank-1 matrices, is at least $\eta$. In graph terms, this is the probability that a random neighbour of a random point in $\mathcal A$ is itself in $\mathcal A$.
A simple example of an $\eta$-closed set is the set $\mathcal A=\{A\in M_{m,n}(\mathbb F_2) : Ax=y\}$, for some pair of vectors $x\in\mathbb F_2^n$, $y\in\mathbb F_2^m$. Indeed, if $Ax=y$ and $B$ is a random matrix of rank 1, then $x\in\ker B$ with probability roughly $1/2$. But if $x\in\ker B$, then $(A+B)x=y$, so $A+B\in\mathcal A$ as well. A very similar, but distinct, example is the set $\{A\in M_{m,n}(\mathbb F_2) : A^Tx=y\}$. Let us call sets of one of these two kinds basic sets.
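The heuristic that a basic set is roughly $1/2$-closed can be checked by brute force in a tiny case. The following Python sketch is an illustration only: the dimensions $m=n=3$, the particular vectors `x` and `y`, and all function names are assumptions made for the example and do not come from the paper. It enumerates the basic set and all rank-1 matrices $u\otimes v$ with $u,v\neq 0$, and computes the exact proportion of pairs $(A,B)$ with $A+B$ back in the set.

```python
from fractions import Fraction
from itertools import product

def rank_one_matrices(m, n):
    """All matrices u (x) v over F_2 with u, v non-zero, as tuples of rows."""
    mats = set()
    for u in product((0, 1), repeat=m):
        for v in product((0, 1), repeat=n):
            if any(u) and any(v):
                mats.add(tuple(tuple(ui & vj for vj in v) for ui in u))
    return sorted(mats)

def apply(A, x):
    """Matrix-vector product over F_2."""
    return tuple(sum(a & b for a, b in zip(row, x)) % 2 for row in A)

def add(A, B):
    """Entry-wise sum of two matrices over F_2."""
    return tuple(tuple(a ^ b for a, b in zip(ra, rb)) for ra, rb in zip(A, B))

def closedness(S, gens):
    """Exact proportion of pairs (A, B) in S x gens with A + B in S."""
    S = set(S)
    hits = sum(add(A, B) in S for A in S for B in gens)
    return Fraction(hits, len(S) * len(gens))

m, n = 3, 3                      # assumed toy dimensions
x, y = (1, 0, 0), (0, 1, 1)      # assumed choice of the pair (x, y)
all_mats = [tuple(tuple(bits[i * n:(i + 1) * n]) for i in range(m))
            for bits in product((0, 1), repeat=m * n)]
basic = [A for A in all_mats if apply(A, x) == y]
eta = closedness(basic, rank_one_matrices(m, n))
print(eta)  # 3/7
```

In this toy case the value is $3/7$ rather than exactly $1/2$, because only $2^{n-1}-1$ of the $2^n-1$ non-zero vectors $v$ are orthogonal to $x$; the ratio tends to $1/2$ as $n$ grows.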
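We can now state the theorem of Khot, Minzer and Safra.
Theorem 1.1. For every $\eta>0$ there exist $\delta>0$ and a positive integer $k$ such that if $\mathcal A\subset M_{m,n}(\mathbb F_2)$ is $\eta$-closed, then there is an intersection $\mathcal C$ of at most $k$ basic sets such that $|\mathcal A\cap\mathcal C|\ge\delta|\mathcal C|$.
In other words, every closed set is dense inside some intersection of a small number of basic sets.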
It is well known and not hard to see that this in fact leads to a characterization (at least qualitatively) of closed sets. Indeed, observe first that if $\mathcal A$ is $\eta$-closed, then the subgraph induced by $\mathcal A$ has average degree at least $\eta|\mathcal B|$, where $\mathcal B$ is the set of rank-1 matrices, and maximal degree at most $|\mathcal B|$. Therefore, any subset of $\mathcal A$ of size at least $(1-\eta/4)|\mathcal A|$ has average degree at least $\eta|\mathcal B|/2$. It follows from this observation and Theorem 1.1 that we can find disjoint subsets $\mathcal A_1,\dots,\mathcal A_r$ of $\mathcal A$, subsets $\mathcal C_1,\dots,\mathcal C_r$ of $M_{m,n}(\mathbb F_2)$, a positive real number $\delta=\delta(\eta)$ and a positive integer $k=k(\eta)$ with the following properties.
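(1) The sets $\mathcal A_i$ are disjoint.
(2) Each $\mathcal C_i$ is an intersection of at most $k$ basic sets.
(3) For each $i$, $\mathcal A_i\subset\mathcal C_i$ and $|\mathcal A_i|\ge\delta|\mathcal C_i|$.
(4) $|\bigcup_i\mathcal A_i|\ge\eta|\mathcal A|/4$.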
Conversely, if such sets exist, then the probability that a random matrix $A\in\mathcal A$ belongs to some $\mathcal A_i$ is at least $\eta/4$. If it belongs to $\mathcal A_i$, then we can use the following lemma. We write $u\otimes v$ for the rank-1 matrix $M$ with $M_{ij}=u_iv_j$, which sends a vector $x$ to the vector $\langle x,v\rangle u$. Note also that $(u\otimes v)^T$ sends $x$ to $\langle x,u\rangle v$.
Lemma 1.2. Let $\mathcal C$ be an intersection of at most $k$ basic sets and let $\mathcal A\subset\mathcal C$ be a subset of relative density at least $\delta$. Then $\mathcal A$ is $2^{-k}(\delta-2^{-(m-k)})$-closed.
Proof. Let us set $C(x,y)=\{A\in M_{m,n}(\mathbb F_2) : Ax=y\}$ and $C'(x,y)=\{A\in M_{m,n}(\mathbb F_2) : A^Tx=y\}$. Let $x_1,\dots,x_k,y_1,\dots,y_k$ be non-zero vectors such that $\mathcal C=\bigcap_{i=1}^r C(x_i,y_i)\cap\bigcap_{i=r+1}^k C'(x_i,y_i)$. Let $A\in\mathcal C$ and let $u\otimes v$ be a rank-1 matrix. If there exists $i\le r$ such that $\langle x_i,v\rangle\ne 0$, then $(A+u\otimes v)(x_i)=y_i+u$, so $A+u\otimes v\notin C(x_i,y_i)$ and hence $A+u\otimes v\notin\mathcal C$. Similarly, if there exists $i>r$ such that $\langle x_i,u\rangle\ne 0$, then $(A+u\otimes v)^T(x_i)=y_i+v$ and again $A+u\otimes v\notin\mathcal C$.
We shall now bound from below the probability that $A+u\otimes v\in\mathcal A$ given that $A\in\mathcal A$ and that $\langle x_i,v\rangle=0$ for every $i\le r$ and $\langle x_i,u\rangle=0$ for every $r<i\le k$, noting that the condition on $u\otimes v$ states that $(u,v)\in U\times V$ for a pair of subspaces $U$ and $V$ with codimensions that add up to at most $k$, a condition that occurs with probability $2^{-k}$.
Let us now condition further on the choice of $v\in V$. That means that we fix $v$, choose a random $u\in U$, and add $u\otimes v$ to $A$. If we allow $u$ to take the value 0, then the resulting matrix is uniformly distributed in the affine subspace $A+U\otimes v$, so the probability that it is in $\mathcal A$ is equal to the relative density of $\mathcal A$ inside this affine subspace.
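The translates of $U\otimes v$ by matrices in $\mathcal C$ partition $\mathcal C$. Let us write them as $W_1,\dots,W_s$, and let the relative density of $\mathcal A$ inside $W_i$ be $\delta_i$. A random $A\in\mathcal A$ lies in $W_i$ with probability $\delta_i/\sum_j\delta_j$, and the $W_i$ all have the same size, so the average of the $\delta_i$ is the relative density of $\mathcal A$ in $\mathcal C$. Then, still fixing $v$, we have by the Cauchy-Schwarz inequality that
$$\mathbb P[A+u\otimes v\in\mathcal A]=\frac{\sum_i\delta_i^2}{\sum_i\delta_i}\ge\frac1s\sum_i\delta_i\ge\delta.$$
This statement is true regardless of $v$, so we deduce that the probability that $A+u\otimes v\in\mathcal A$ given that $A\in\mathcal A$ and $(u,v)\in U\times(V\setminus\{0\})$ is at least $\delta$. If we now insist that $u\ne 0$, we reduce this probability by at most $2^{-(m-k)}$, so the result is proved.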
Let $B\in\mathcal B$ be chosen uniformly at random. Given the lemma above, applied to the sets $\mathcal A_i$ and $\mathcal C_i$, we deduce that the conditional probability that $A+B\in\mathcal A_i$ given that $A\in\mathcal A_i$ is at least $c(\delta,k)$, and from that it follows that $\mathcal A$ is $c(\delta,k)\eta/4$-closed.
Thus, a set $\mathcal A$ is $\eta$-closed for some not too small $\eta$ if and only if an appreciable fraction of $\mathcal A$ is efficiently covered by disjoint intersections of few basic sets. Barak, Kothari and Steurer suggest in [1] that establishing a higher-dimensional analogue of Theorem 1.1 may be a useful step in obtaining a proof of the full Unique Games Conjecture, though they do not actually provide a formal reduction. The main purpose of this paper is to formulate a suitable conjecture and prove some partial results towards it. We say that $\mathcal A\subset\mathbb F_2^{n_1}\otimes\dots\otimes\mathbb F_2^{n_d}$ is $\eta$-closed if with probability at least $\eta$ we have $A+u_1\otimes\dots\otimes u_d\in\mathcal A$, when $A\in\mathcal A$ and vectors $u_i\in\mathbb F_2^{n_i}\setminus\{0\}$ are chosen independently and uniformly at random.
To see that this is indeed a generalization of the problem about matrices considered above, we identify M m,n (F 2 ) with F m 2 ⊗ F n 2 in the usual way, which leads to a slight reformulation of Theorem 1.1 in terms of tensor products. Note first that under this identification, the set {M ∈ M m,n (F 2 ) : It follows that an intersection of at most k basic sets is either empty or a translate of H ⊗ K for some pair of subspaces H ⊂ F m 2 , K ⊂ F n 2 with codim(H) + codim(K) ≤ k. In the higher-dimensional case, there is a richer class of sets A ⊂ F n 1 2 ⊗ · · · ⊗ F n d 2 that are η-closed. To describe them, we introduce the following piece of notation, which we shall use repeatedly in the rest of the paper. Given a non-empty subset I ⊂ [d], write F I 2 for i∈I F n i 2 , so that we naturally have F n 1 , which is a subspace of F n 1 2 ⊗ · · · ⊗ F n d 2 . It is not hard to see that this subspace contains at least a proportion c(d, k) > 0 of all rank-1 tensors u 1 ⊗ · · · ⊗ u d (provided that n 1 , . . . , n d are sufficiently large), so it is c(d, k)-closed. It follows that any translate of it is c(d, k)-closed too.
We now make the following conjecture. The main result of this paper, stated later in this section, is a proof of Conjecture 1.4 in an important special case.

1.1. What can be said about more general Cayley graphs? It is tempting to try to prove Conjecture 1.4 by identifying and proving a statement that applies to a much wider class of Cayley graphs, of which Conjecture 1.4 would be a special case. We would begin with an Abelian (or even non-Abelian) group $G$ and a pair of subsets $A,B\subset G$, where we think of $B$ as the set of generators, satisfying the hypothesis that $|\{(a,b)\in A\times B : a+b\in A\}|\ge\eta|A||B|$. We shall say in this situation that $A$ is $(B,\eta)$-closed (in $G$).
Another way of writing the condition is
$$\langle\mathbb 1_A*\mu_B,\mathbb 1_A\rangle\ge\eta\alpha,$$
where $\alpha$ is the density of $A$, $\mu_B$ is the characteristic measure of $B$ (that is, the function that takes the value $|G|/|B|$ on $B$ and 0 elsewhere) and we define $f*g(x)$ to be $\mathbb E_{y+z=x}f(y)g(z)$. By the Cauchy-Schwarz inequality the left-hand side is at most
$$\|\mathbb 1_A*\mu_B\|_2\|\mathbb 1_A\|_2=\alpha^{1/2}\|\mathbb 1_A*\mu_B\|_2,$$
where inner products and $L_p$ norms are defined using expectations, so our hypothesis implies that
$$\|\mathbb 1_A*\mu_B\|_2^2\ge\eta^2\alpha.$$
It is easy to see that this "mixed energy" $\|\mathbb 1_A*\mu_B\|_2^2$ can be at most $\alpha$.
At this point let us recall the so-called asymmetric Balog-Szemerédi-Gowers theorem, which can be found in [8] as Theorem 2.35. (For a useful alternative presentation of the theorem, see also [3].) The main assumption of the theorem is that $A,B$ are two finite subsets of an Abelian group, with densities $\alpha$ and $\beta$, such that $\|\mathbb 1_A*\mathbb 1_B\|_2^2\ge\eta\alpha\beta^2$ (which is equivalent to saying that $\|\mathbb 1_A*\mu_B\|_2^2\ge\eta\alpha$), but there is also an assumption that $A$ is not too much bigger than $B$. The precise statement is as follows.
Theorem 1.5. For every ǫ > 0 there exists a constant C = C(ǫ) with the following property. Let G be a finite Abelian group, let L ≥ 1, let 0 < η ≤ 1 and let A and B be finite subsets of G with densities α and β, such that α ≤ Lβ and ½ A * ½ B 2 2 ≥ 2ηαβ 2 . Then there exist a subset H ⊂ G such that |H + H| ≤ Cη −C L ǫ |H|, a subset X ⊂ G of size at most Cη −C L ǫ |A|/|H| such that More qualitatively speaking, if A is not too much larger than B and ½ A * ½ B 2 2 is within a constant of its largest possible value, then there is a set H of small doubling such that a small number of translates of H cover a substantial proportion of A, and some translate of H covers a substantial proportion of B. It is not hard to see that the converse holds as well.
This theorem cannot be used to prove Conjecture 1.4 because of the condition that α ≤ Lβ, which does not apply here since the set A in Conjecture 1.4 can be much bigger than the set B. That raises the following question, which generalizes Problem 1.3. Question 1.6. Let G be a finite Abelian group, let η > 0, and let A, B ⊂ G be subsets such that A is (B, η)-closed in G. What can be said about A, B and the relationship between them?
A similar question can of course be asked with the slightly weaker hypothesis that $\|\mathbb 1_A*\mathbb 1_B\|_2^2\ge\eta^2\alpha\beta^2$, but we shall concentrate on the question as stated, since it is more closely related to Conjecture 1.4.
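The relationship between the counting form of $(B,\eta)$-closedness and the convolution form can be verified numerically in a small group. The sketch below is only an illustration under assumed choices: we take $G=\mathbb F_2^5$ (vectors stored as bitmask integers), $A$ the two middle layers and $B$ the standard basis, and the variable names are ours. It computes $\langle\mathbb 1_A*\mu_B,\mathbb 1_A\rangle$ and the mixed energy $\|\mathbb 1_A*\mu_B\|_2^2$ exactly, using the normalizations for $\mu_B$ and for convolution given above.

```python
from fractions import Fraction

n = 5                                    # assumed toy dimension
N = 1 << n                               # |G| for G = F_2^n
A = {x for x in range(N) if bin(x).count("1") in (2, 3)}
B = [1 << j for j in range(n)]           # standard basis vectors

def conv_with_mu_B(x):
    """(1_A * mu_B)(x) = E_{y+z=x} 1_A(y) mu_B(z) = #{b in B : x - b in A} / |B|."""
    return Fraction(sum((x ^ b) in A for b in B), len(B))

alpha = Fraction(len(A), N)              # density of A
# eta: exact proportion of pairs (a, b) in A x B with a + b in A
eta = Fraction(sum((a ^ b) in A for a in A for b in B), len(A) * len(B))
# <1_A * mu_B, 1_A>, with the inner product defined as an expectation over G
inner = sum(conv_with_mu_B(x) for x in A) / N
# ||1_A * mu_B||_2^2, the "mixed energy"
energy = sum(conv_with_mu_B(x) ** 2 for x in range(N)) / N

print(inner == eta * alpha)              # True
print(eta ** 2 * alpha <= energy <= alpha)  # True
```

The first check confirms that the inner product equals $\eta\alpha$ when $\eta$ is taken to be the exact proportion of good pairs; the second confirms that the mixed energy lies between the Cauchy-Schwarz lower bound $\eta^2\alpha$ and the trivial upper bound $\alpha$.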
An immediate observation is that we cannot hope to say anything interesting about the structure of B, even if η = 1. For example, η = 1 if A = G and B is an arbitrary subset of G. For a more general example, one can let A be an arbitrary union of cosets of some subgroup H and let B be an arbitrary subset of H. For a slightly different example, let G = F n 2 , let B be the set {e 1 , . . . , e n } of standard basis vectors, and let A be a union of n/3-dimensional affine subspaces V i , such that each V i is a random translate of the subspace generated by n/3 randomly chosen e j . Then if x ∈ V i and b ∈ B, the probability that Any general statement will have to be weak enough to allow for examples like these. The last example shows that we cannot hope to find a single set H of small doubling and cover a large portion of A efficiently with translates of H, unless H is of constant size, in which case the conclusion becomes trivial. To sketch briefly why not, observe first that by Freiman's theorem we can assume that H is a subspace. Next, note that for each vector x, the probability that it belongs to the span of a random n/3 standard basis vectors is exponentially small in the size of the support of x. We can also use the following simple lemma. Proof. Let u 1 , . . . , u d be a basis for d. By Gaussian elimination, we can convert u 1 , . . . , u d into a basis v 1 , . . . , v d and find coordinates t 1 , . . . , t d such that v i (t j ) = δ i j . Then the support size of i λ i v i is at least the number of non-zero λ i , which proves the result.
When d is large, it follows that the proportion of vectors in H of small support is very small. Combining these observations, one can show that for every η there exists d such that if H is a d-dimensional subspace, then the probability that a random subspace V of dimension n/3 is (H, η/2)-closed is at most η/2. This in turn can be used to prove that with high probability the set A described above (for a suitable number of V i ) is not (H, η)-closed for any H of dimension d or above.
However, these examples do not rule out a weakening along the following lines. An argument similar to the one we mentioned just after the statement of Theorem 1.1 shows that if the answer is yes, then we can find a collection of disjoint subsets A 1 , . . . , A m that cover a substantial proportion of A, each one with small doubling and each one (B, η ′ )-closed (with a slightly smaller η ′ ). Thus, we would be able to obtain a conclusion similar to that of Theorem 1.5 but without the requirement that the structured sets are all translates of one another. A positive answer would also imply Conjecture 1.4. Indeed, by Freiman's theorem A i is contained in a subspace V i not much larger than A i . This reduces the conjecture to the case where A is a subspace. In that case, a very simple corollary of our main result, Corollary 1.12 (stated later) proves the conjecture.
However, the answer to Question 1.8 is easily seen to be negative (which implies that it is also negative if we assume the weaker mixed-energy hypothesis instead). The example we are about to give was communicated to us privately by Boaz Barak as a counterexample to a related but slightly different statement.
For convenience let $n$ be odd, let $A\subset\mathbb F_2^n$ be the set of all vectors with $(n\pm 1)/2$ coordinates equal to 1, and let $B$ be the set of standard basis vectors. Then it is easy to see that $A$ is $(B,\eta)$-closed for $\eta=(n+1)/2n\approx 1/2$. Suppose now that we could find a subset $A'\subset A$ such that $|A'+A'|\le C|A'|$ and $A'$ is $(B,\eta')$-closed. By Freiman's theorem, $A'$ is contained in a subspace $V$ that is not much bigger than $A'$, which implies that $V$ is $(B,c)$-closed for some positive constant $c=c(\eta)$. That implies that at least $cn$ of the standard basis vectors belong to $V$. Let $W$ be the subspace spanned by these basis vectors. The maximum number of elements of $A$ that can belong to a translate $x+W$ of $W$ is $2(cn)^{-1/2}|W|$, and therefore $|A'|\le 2(cn)^{-1/2}|V|$. This contradicts the fact that $V$ is not much bigger than $A'$.
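The value $\eta=(n+1)/2n$ for the two middle layers is easy to confirm by direct enumeration. The following sketch assumes $n=7$ and represents vectors of $\mathbb F_2^n$ as bitmask integers; the function name `closedness` is our own.

```python
from fractions import Fraction

def closedness(A, B):
    """Exact proportion of pairs (a, b) in A x B with a + b in A (addition in F_2^n is XOR)."""
    A = set(A)
    hits = sum((a ^ b) in A for a in A for b in B)
    return Fraction(hits, len(A) * len(B))

n = 7                                            # assumed small odd dimension
basis = [1 << j for j in range(n)]               # standard basis vectors e_1, ..., e_n
two_layers = [x for x in range(1 << n)
              if bin(x).count("1") in ((n - 1) // 2, (n + 1) // 2)]
eta = closedness(two_layers, basis)
print(eta)  # 4/7
```

A vector in the lower layer stays in the set exactly when a zero coordinate is flipped, and a vector in the upper layer exactly when a one is flipped; in both cases this happens for $(n+1)/2$ of the $n$ basis vectors.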
In this paper we formulate a yet weaker conjecture and prove that it still implies Conjecture 1.4. Unfortunately, we also give a counterexample to the weaker conjecture. The counterexample does not make the implication vacuous, however, because the implication depends on a nontrivial theorem that is true and of some interest: it is just that for a general Cayley graph (on a finite Abelian group) one cannot deduce the hypotheses of the theorem from the assumption that a set is η-closed. It is conceivable that one might be able to prove Conjecture 1.4 (and thereby also give a different proof of the theorem of Khot, Minzer and Safra) by using additional properties of the particular Cayley graph that that conjecture is about.
How, then, might one try to find a conjecture that would not be contradicted by the "two-layers" example just discussed? One observation that suggests a possible way forward is the following. Suppose that we extend the set by adding a few more layers. If, say, we take not just the middle two layers but the middle $\epsilon^{-1}$ layers (or thereabouts), then we obtain a new set inside which the first set has relative density approximately $2\epsilon$, and this new set is $(1-2\epsilon)$-closed, since a random element of the set will be in one of the interior layers with probability approximately (and in fact slightly bigger than) $1-2\epsilon$, and adding an arbitrary basis vector to such an element will give another element of the set.
So perhaps we could hope that if $A$ is $(B,\eta)$-closed, then there is a set $C$ that is $(B,1-\epsilon)$-closed such that $|A\cap C|\ge\delta|C|$ for some $\delta$ that depends on $\eta$ and $\epsilon$ only.
However, simple modifications of the example show that this is too much to ask. For instance, we can take as our set $A$ the set of all $x\in\mathbb F_2^n$ such that $m$ or $m+1$ coordinates are equal to 1 and all but the first $2m$ coordinates are zero. If $m$ is around $n/4$, say, then the resulting set is $(B,1/4)$-closed, but there is no prospect of $A$ living densely in a set that is almost perfectly closed, because of the need to add basis vectors corresponding to coordinates beyond $2m$.
A further example to consider is the set of all $x\in\mathbb F_2^n$ such that at most $n/3$ coordinates are equal to 1. This set is at least $(B,1/3)$-closed (in fact it is more like $(B,2/3)$-closed, because the probability that a random element of the set has exactly $\lfloor n/3\rfloor$ coordinates equal to 1 is approximately $1/2$), but for similar reasons to the previous example, one cannot find an almost perfectly closed set with a significant proportion of its elements in the set.
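Both of these examples can be checked exactly for small parameters. The sketch below is illustrative only: the sizes ($m=2$, $n=8$ for the truncated two-layer set and $n=9$ for the low-weight set) and all names are assumptions made for the example, and vectors are again stored as bitmask integers.

```python
from fractions import Fraction

def weight(x):
    """Number of coordinates of x equal to 1."""
    return bin(x).count("1")

def closedness(A, B):
    """Exact proportion of pairs (a, b) in A x B with a + b in A."""
    A = set(A)
    hits = sum((a ^ b) in A for a in A for b in B)
    return Fraction(hits, len(A) * len(B))

# Example 1: m or m+1 ones, all supported on the first 2m coordinates, with n = 4m.
m, n1 = 2, 8
A1 = [x for x in range(1 << (2 * m)) if weight(x) in (m, m + 1)]
eta1 = closedness(A1, [1 << j for j in range(n1)])

# Example 2: at most n/3 ones.
n2 = 9
A2 = [x for x in range(1 << n2) if weight(x) <= n2 // 3]
eta2 = closedness(A2, [1 << j for j in range(n2)])

print(eta1, eta2)  # 3/10 37/65
```

For these small sizes the first set is $3/10$-closed, just above the stated $1/4$, and the second is $37/65$-closed, which already lies between the guaranteed $1/3$ and the asymptotic value of roughly $2/3$.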
However, the picture changes if we ask for slightly less. Let us informally call a set $C$ good if there is a proportional-sized subset $B'\subset B$ such that $C$ is $(B',1-\epsilon)$-closed for some small constant $\epsilon$. Thus, now we ask only that $C$ should be almost closed for a large subset of $B$ rather than for the whole set.
It is not immediately clear how to use this definition, because the statement that |A ∩ C| ≥ δ|C| for a good set C can be true for uninteresting reasons. For example, we could take C to be the union of a subspace V generated by n/5 basis vectors together with an arbitrary subset of A of cardinality 2δ|V|. To remedy this, we insist that C is "related to A" in the graph in a different sense from that of A being dense in C.
Here, then, is a question that replaces Question 1.8. Question 1.9. Is it true that for every η, ǫ > 0 there exist c > 0, δ > 0 and positive integer l with the following property? Let G be a finite Abelian group and let A, B ⊂ G be subsets such that A is (B, η)-closed. Then there is a subset B ′ ⊂ B and a non-empty subset C ⊂ G with the following properties.
in the convolution is l.
Condition (3) is saying that for any x ∈ C, the probability that x−b 1 −· · ·−b l +b l+1 +· · · +b 2l ∈ A, when the b i are chosen uniformly and independently at random from B, is at least c. When the group G is F n 2 for some n, we can and will simplify it, since B = −B. To see that this question improves on Question 1.8, let us consider the two problematic examples for that question. If m is odd and A ⊂ F n 2 consists of all sequences with (m ± 1)/2 1s and with no 1s after the mth coordinate, then let C be the set of all sequences with no 1s after the mth coordinate that have between (m − 1)/2 − ǫ −1 and (m + 1)/2 + ǫ −1 1s. If l = ǫ −1 , then for any x ∈ C, the probability that . . , e m }, then for every b ∈ B ′ and every c ∈ C that is not on the boundary (in the obvious sense), Now let us look at the example where A is the set of all sequences with at most n/3 1s. This time let C be the set of all sequences that are 0 after the first 2n/3 coordinates and have at most n/3+ǫ −1 1s, and let B ′ = {e 1 , . . . , e 2n/3 }. Then for any x ∈ C, the probability that again because adding an element of B ′ to a non-boundary element of C gives an element of C.

1.2. Our main result.
Let us now see why a positive answer to Question 1.9 would imply Conjecture 1.4. The deduction will be easy once we have established the following theorem, which is the main result of this paper. In the statement of the theorem, and in the rest of this paper, $G$ denotes $\mathbb F_2^{n_1}\otimes\dots\otimes\mathbb F_2^{n_d}$ and $B$ denotes the multiset $\{u_1\otimes\dots\otimes u_d : u_i\in\mathbb F_2^{n_i}\text{ for all }i\}$ (which is a multiset only because some of the $u_i$ can be zero). Note that the notion of $(B,\eta)$-closedness can be generalized in an obvious way to multisets.
It is convenient to state the following corollary separately, which follows from Theorem 1.10 by taking θ = 1/2. Corollary 1.11. There exists ǫ = ǫ(d) > 0 such that for any δ > 0, there exists a positive integer k = k(d, δ) with the following property. For any B ′ ⊂ B with |B ′ | ≥ δ|B| and any A ⊂ G which is (B ′ , 1 − ǫ)-closed, there exists a k-simple set D ⊂ G which has |D ∩ A| ≥ 1 2 |D|. Let us see why Conjecture 1.4 follows from Corollary 1.11 and a positive answer to Question 1.9 in the case of the group G and the subset B ⊂ G of rank-1 tensors. Let η > 0. Pick ǫ = ǫ(d) so that the conclusion of Corollary 1.11 holds. If the answer to Question 1.9 is positive for G and B, then we can choose c > 0, δ > 0, and a positive integer l such that the conclusion of the question is true. Now let A ⊂ G be η-closed. This is saying that A is (B, η)-closed. By the conclusion of Question 1.9, there exist a set B ′ ⊂ B with |B ′ | ≥ δ|B| , and a non-empty subset C ⊂ G such in the convolution is l. Define B ′ to be the multiset that consists of the set B ′ together with the multiset of all u 1 ⊗ · · · ⊗ u d with u i ∈ F n i 2 for each i and with at least one u i equal to 0. Note that |B ′ | ≥ δ|B| and C is (B ′ , 1 − ǫ)-closed. By Corollary 1.11, there exists a k-simple set D ⊂ G, for some k = k(d, δ), which has |D ∩ C| ≥ 1 2 |D|. Now pick x ∈ D and b 1 , . . . , b l ∈ B uniformly and independently at random. The probability that x − b 1 − · · · − b l ∈ A is at least c/2. Therefore, there exists some y ∈ G such that when x ∈ D is randomly chosen, the probability that Another simple corollary of Theorem 1.10 is the following result, which is Conjecture 1.4 in the case where A is a subspace.
Proof. Since V is a vector space, the condition that V is η-closed says that u 1 ⊗ · · · ⊗ u d ∈ V for at least a proportion of η of all rank-1 tensors u 1 ⊗ · · · ⊗ u d . Thus, there exists some B ′ ⊂ B with |B ′ | ≥ η|B| such that V is (B ′ , 1)-closed. Taking θ sufficiently close to 0 in Theorem 1.10, it follows that V ⊃ D for a k-simple set D, where k depends only on d and η. Then D is a translate In the next section, we shall prove Theorem 1.10. In the last section, we show that the answer to Question 1.9 is negative.
2. The proof of Theorem 1.10
Note that $G=\mathbb F_2^{n_1}\otimes\dots\otimes\mathbb F_2^{n_d}$ can be viewed as the set of $d$-dimensional $(n_1,\dots,n_d)$-arrays over $\mathbb F_2$, which in turn can be viewed as $\mathbb F_2^{n_1n_2\cdots n_d}$, equipped with the entry-wise dot product. The proof of Theorem 1.10 will be reasonably simple once we have established the following result. In the statement of this lemma, and in the rest of this section, we write $kB$ to mean the set of elements of $G$ that can be written as a sum of at most $k$ elements of $B$, where $B$ is some fixed (multi)subset of $G$.
Then there exists a multiset Q whose elements are chosen from f 1 (d)B ′ (but with arbitrary multiplicity) with the following property. The set of arrays r ∈ G with r.q = 0 for at least ( In order to deduce Theorem 1.10 from this lemma, we shall use Fourier analysis. Recall that if A is a subset of G of density α, then by Parseval's identity we have α = r | ½ A (r)| 2 . Also, if B is a multiset in G, then by Parseval's identity and the convolution law, . Thus, the condition that A is (B, η)-closed can be rewritten as the inequality Another fact we shall use later is that if W is a subspace of G, then µ W (r) equals E w∈W (−1) r.w , which is 1 if r belongs to the orthogonal complement of W and 0 otherwise.
Then |A bad | ≤ η 2 |A|, by hypothesis. So when a ∈ A is chosen randomly, we have that The result follows.
We are now in a position to deduce Theorem 1.10 from Lemma 2.1. In the proof, and in the rest of this section, whenever a new function g i appears, we mean that there exists a function g i with the claimed property.
Clearly, |B ′′ | ≥ 1 2 |B ′ |. Using Lemma 2.1, we can find a multiset Q with elements chosen from g 1 (d)B ′′ such that the set of arrays r ∈ G with r.q = 0 for at least ( By Markov's inequality, it follows that Choosing ǫ = ǫ(d, θ) > 0 to be at most θg 2 (d)/g 1 (d), we therefore have which in physical space is the inequality where α is the density of A. Equivalently, which tells us that if a random element of A is added to a random element of R, then the sum belongs to A with probability at least 1 − θ. The number of triples (a 1 , a 2 , r) ∈ A × A × R with a 1 + a 2 = r is therefore at least (1 − θ)|A||R|, and therefore, by averaging, there exists a ∈ A such that | It remains to prove Lemma 2.1.
2.1. The proof of Lemma 2.1 in the matrix case. For the reader's convenience, in this subsection we give the proof of Lemma 2.1 in the matrix case: that is, the case when $d=2$. Accordingly, in this subsection, $G$ will be the group $\mathbb F_2^{n_1}\otimes\mathbb F_2^{n_2}$ and $B$ will be the multiset that consists of all rank-1 matrices $u_1\otimes u_2$ with $u_1\in\mathbb F_2^{n_1}$ and $u_2\in\mathbb F_2^{n_2}$, with multiplicity. (As already remarked, the non-trivial multiplicity comes from when $u_1$ or $u_2$ is zero.)
Lemma 2.3. Let $B'\subset B$ be such that $|B'|\ge\delta|B|$. Then there exist $k$ depending only on $\delta$, a subspace $U\subset\mathbb F_2^{n_1}$, and a subspace $V_u\subset\mathbb F_2^{n_2}$ for each $u\in U$, such that all these subspaces have codimension at most $k$, and such that every $u\otimes v$ with $u\in U$, $v\in V_u$ belongs to $16B'$. Moreover, we can insist that all $V_u$ have the same codimension.
Proof. For each u ∈ F n 1 2 , let B ′ u = {v ∈ F n 2 2 : u ⊗ v ∈ B ′ } and let T = {u ∈ F n 1 2 : |B ′ u | ≥ δ 2 2 n 2 }. By averaging, we have that |T | ≥ δ 2 2 n 1 . If u ∈ T , then B ′ u has density at least δ 2 in F n 2 2 , so by Bogolyubov's lemma (see for example Proposition 4.39 of [8] For the last assertion, note that we may replace V u with any subspace of it, and we still have We shall now prove that we may take Q = u∈U (u ⊗ V u ) in Lemma 2.1.
Lemma 2.4. There exists an absolute constant $\epsilon>0$ with the following property. Let $U$ be a subspace of $\mathbb F_2^{n_1}$ and for each $u\in U$ let $V_u$ be a subspace of $\mathbb F_2^{n_2}$ such that all these subspaces have codimension at most $k$ and all the $V_u$ have the same codimension. Let $Q$ be the multiset $\bigcup_{u\in U}(u\otimes V_u)$. Then the set of arrays $r\in G$ with $r.q=0$ for at least $(1-\epsilon)|Q|$ choices $q\in Q$ is contained in $W_{12}+W_1\otimes\mathbb F_2^{n_2}+\mathbb F_2^{n_1}\otimes W_2$, where $W_{12}\subset\mathbb F_2^{n_1}\otimes\mathbb F_2^{n_2}$, $W_1\subset\mathbb F_2^{n_1}$ and $W_2\subset\mathbb F_2^{n_2}$ are subspaces of dimension at most $f(k)$.
Before we prove this lemma, we need to establish the following result. In the statement, and in the rest of this section, we use the following convention for the multiplication of arrays $r\in\mathbb F_2^{n_1}\otimes\dots\otimes\mathbb F_2^{n_{a+b}}$ and $s\in\mathbb F_2^{n_1}\otimes\dots\otimes\mathbb F_2^{n_a}$. Define $rs\in\mathbb F_2^{n_{a+1}}\otimes\dots\otimes\mathbb F_2^{n_{a+b}}$ to be the array with
$$(rs)_{i_{a+1},\dots,i_{a+b}}=\sum_{j_1,\dots,j_a}r_{j_1,\dots,j_a,i_{a+1},\dots,i_{a+b}}\,s_{j_1,\dots,j_a}.$$
Note that in the case when $r$ is a matrix and $s$ is a vector, this is not quite the same as the standard convention, since we sum over the first coordinate of the matrix instead of the second. If $r$ and $s$ are arrays of the same size, then we use the notation $r.s$ for the product of $r$ and $s$, since then it coincides with the obvious notion of dot product, and it is useful to think of it that way.
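To make the convention concrete, here is a small Python sketch of the array product over $\mathbb F_2$. It is an illustration under our own assumptions: arrays are stored as dictionaries from index tuples to bits, the shape $(2,3,4)$ is arbitrary, and the function names are hypothetical. It checks on random inputs the identity $r.(t\otimes u)=(rt).u$, which is used repeatedly in this section, and shows that a matrix times a vector sums over the first index of the matrix.

```python
import random
from itertools import product

def random_array(shape, rng):
    """Random array over F_2, stored as a dict from index tuples to bits."""
    return {idx: rng.randrange(2) for idx in product(*map(range, shape))}

def multiply(r, s):
    """The array rs: sum r against s over the leading coordinates of r (mod 2)."""
    a = len(next(iter(s)))               # number of coordinates of s
    out = {}
    for idx, bit in r.items():
        head, tail = idx[:a], idx[a:]
        out[tail] = out.get(tail, 0) ^ (bit & s[head])
    return out

def tensor(t, u):
    """The array t (x) u, indexed by the concatenated index tuples."""
    return {i + j: t[i] & u[j] for i in t for j in u}

def dot(r, s):
    """r.s for arrays of the same size: the product has the empty index tuple as its only key."""
    return multiply(r, s)[()]

rng = random.Random(0)
shape = (2, 3, 4)                        # assumed toy shape (n_1, n_2, n_3)
checks = []
for _ in range(20):
    r = random_array(shape, rng)
    t = random_array(shape[:2], rng)
    u = random_array(shape[2:], rng)
    checks.append(dot(r, tensor(t, u)) == dot(multiply(r, t), u))
identity_holds = all(checks)

# Matrix times vector under this convention: the sum is over the FIRST index.
matrix = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (1, 1): 0}
vector = {(0,): 1, (1,): 0}
product_vector = multiply(matrix, vector)
print(identity_holds, product_vector)    # True {(0,): 1, (1,): 1}
```

Under the standard row-times-column convention the same matrix and vector would give $(1,0)$, so the example also illustrates the remark that the two conventions differ by a transpose.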
Lemma 2.5. Let U be a subspace of F n 1 2 and for each u ∈ U let V u be a subspace of F n 2 2 such that all these subspaces have codimension at most k and all the V u have the same codimension. Let Q = u∈U (u ⊗ V u ) and for m = 2 k+3 , let r 1 , . . . , r m ∈ G be such that for every i ≤ m there are at least 3 4 |Q| choices of q ∈ Q such that r i .q = 0. Then there exist i j such that (r i − r j )w = 0 for at least ρ(k)2 n 1 choices w ∈ F n 1 2 , for some positive function ρ(k). Thus, there exist i j such that r i − r j has rank at most l(k). Lemma 2.6. There exists an absolute constant ǫ ′ > 0 with the following property. Let U be a subspace of F n 1 2 and for each u ∈ U let V u be a subspace of F n 2 2 such that all these subspaces have codimension at most k and all the V u have the same codimension. Let Q be the multiset u∈U (u⊗V u ). Then the set of arrays r ∈ G of rank at most l such that r.q = 0 for at least ( Indeed, take ǫ ′ as given by Lemma 2.6 and let ǫ = min(ǫ ′ /2, 1/4). We claim that ǫ is suitable for Lemma 2.4. By Lemma 2.5, we can find x 1 , . . . , x m ∈ G , with m ≤ 2 k+3 , as follows. For every i, x i .q = 0 for at least (1 − ǫ)|Q| choices q ∈ Q, and if r.q = 0 for at least (1 − ǫ)|Q| choices q ∈ Q, then r − x i has rank at most g 1 (k) for some i ≤ m. But then (r − x i ).q = 0 for at least (1 − ǫ ′ )|Q| choices q ∈ Q, and by Lemma 2.6, r − x i is contained in W 1 ⊗ F n 2 2 + F n 1 2 ⊗ W 2 where W 1 , W 2 have codimension at most g 2 (k) and do not depend on r. So we may take W 12 to be the span of all the x i .
Proof of Lemma 2.6. We shall prove that we can take $\epsilon'$ to be $1/8$, $W_1$ to be $U^\perp$ and $W_2$ to be a subspace that we define as follows. Let $X=\{x\in\mathbb F_2^{n_2} : x\in V_u^\perp\text{ for at least }\frac{|U|}{10\cdot 2^l}\text{ choices }u\in U\}$. Since $|V_u^\perp|\le 2^k$ for every $u\in U$, we have $|X|\le\frac{|U|\cdot 2^k}{|U|/(10\cdot 2^l)}=10\cdot 2^{k+l}$. Let $W_2=\mathrm{span}(X)$. Now let $r$ be a matrix of rank at most $l$ such that $r.q=0$ for at least $7|Q|/8$ choices of $q\in Q$. Then $ru\in V_u^\perp$ for at least $3|U|/4$ choices of $u\in U$. Since $r$ has rank at most $l$, we have $r\in\mathbb F_2^{n_1}\otimes H_2(r)$ for some subspace $H_2(r)\subset\mathbb F_2^{n_2}$ with $\dim(H_2(r))\le l$. By definition, for any $v\notin W_2$, the number of $u\in U$ with $v\in V_u^\perp$ is at most $\frac{|U|}{10\cdot 2^l}$. Since $|H_2(r)|\le 2^l$ and $ru\in H_2(r)$ for every $u\in U$, the number of $u\in U$ with $ru\in V_u^\perp\setminus W_2$ is at most $\frac{|U|}{10}$. It follows that for more than $|U|/2$ choices $u\in U$ we have $ru\in W_2$. Since the set of $u\in U$ with $ru\in W_2$ is a subspace of $U$, this in fact implies that $ru\in W_2$ for all $u\in U$, which implies that $r\in U^\perp\otimes\mathbb F_2^{n_2}+\mathbb F_2^{n_1}\otimes W_2$.
2.2. The proof of Lemma 2.1 in the general case. It is convenient to introduce a few definitions.
Definition 2.7. Let $k$ be a positive integer and let $\alpha>0$. Let $Q$ be a multiset with elements chosen from $G$ (with arbitrary multiplicity). We say that $Q$ is $(k,\alpha)$-forcing if the set of all arrays $r\in G$ with $r.q=0$ for at least $\alpha|Q|$ choices $q\in Q$ is contained in a set of the form $\sum_{I\subset[d],I\ne\emptyset}V_I\otimes\mathbb F_2^{I^c}$ for some choice of subspaces $V_I\subset\mathbb F_2^I$ of dimension at most $k$.
Hypothesis (H) is the one that really interests us, since if it holds for every $d$ then Lemma 2.1 is proved. However, in order to get an induction to work we shall need a slight strengthening that says that we can ask for the elements of $Q$ to belong to a large subspace of a suitable form. In what follows, we shall prove that if (H) holds for $d$, then so does (H'), and that if (H') holds for all $d'<d$, then (H) holds for $d$. This completes the proof since (H) holds for $d=1$. Indeed, if $B'\subset\mathbb F_2^{n_1}$ has $|B'|\ge\delta\cdot 2^{n_1}$, then by Bogolyubov's lemma, $4B'=2B'-2B'$ contains a subspace $U\subset\mathbb F_2^{n_1}$ of codimension at most $g(\delta)$. If $x.u=0$ for over half the elements of $U$, then the set of $u\in U$ with $x.u=0$ is not contained in a proper subspace, so $x\in U^\perp$, which has dimension at most $g(\delta)$. This implies that $U$ is $(g(\delta),3/4)$-forcing.
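The dichotomy behind the base case $d=1$ is that, for a subspace $U$, a vector $x$ is orthogonal either to all of $U$ or to exactly half of it. The sketch below checks this on a small example; the dimension $n=6$, the generators of $U$ and the helper names are assumptions chosen for illustration, with vectors stored as bitmask integers.

```python
def span(generators):
    """All F_2-linear combinations of the given bitmask vectors."""
    vectors = {0}
    for g in generators:
        vectors |= {v ^ g for v in vectors}
    return vectors

def dot(x, u):
    """Dot product over F_2 of two bitmask vectors."""
    return bin(x & u).count("1") % 2

n = 6                                                # assumed toy dimension
U = span([0b000011, 0b001100, 0b110000, 0b010101])   # a 4-dimensional subspace
# For each x, the proportion of u in U with x.u = 0.
orthogonal_fraction = {x: sum(dot(x, u) == 0 for u in U) / len(U)
                       for x in range(1 << n)}
U_perp = [x for x, f in orthogonal_fraction.items() if f == 1.0]

print(sorted(set(orthogonal_fraction.values())))     # [0.5, 1.0]
print(len(U), len(U_perp))                           # 16 4
```

So any $x$ with $x.u=0$ for more than half of the elements of $U$ must lie in $U^\perp$, whose dimension equals the codimension of $U$ (here $6-4=2$, giving four elements).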
The next few results are needed for technical reasons. The set introduced in the next definition behaves well under certain algebraic operations, such as intersecting with a dense subspace. It is a generalization of the set $\bigcup_{u\in U}(u\otimes V_u)$ used in the previous subsection.
Definition 2.10. Suppose that we build a collection of subspaces as follows. We begin with a subspace $U\subset\mathbb F_2^{n_1}$. Then for each $u_1\in U$ we take a subspace $U_{u_1}\subset\mathbb F_2^{n_2}$, for each $u_1\in U$ and $u_2\in U_{u_1}$ we take a subspace $U_{u_1,u_2}\subset\mathbb F_2^{n_3}$, and so on up to subspaces $U_{u_1,\dots,u_{d-1}}\subset\mathbb F_2^{n_d}$. Now let $Q$ be the multiset that consists of all tensors $u_1\otimes\dots\otimes u_d$ such that $u_1\in U$, $u_2\in U_{u_1}$, $\dots$, $u_d\in U_{u_1,\dots,u_{d-1}}$. If all the subspaces in the collection have codimension at most $l$, then we say that $Q$ is an $l$-system.
Lemma 2.11. Let $Q$ be an $l$-system and let $Q'$ be an $l'$-system. Then $Q\cap Q'$ contains an $(l+l')$-system.
Proof. Let $Q$ have spaces as in Definition 2.10 and let $Q'$ have the corresponding spaces $U', U'_{u_1}, \dots, U'_{u_1,\dots,u_{d-1}}$. We define an $(l + l')$-system $P$ contained in $Q \cap Q'$ as follows.
The next result generalizes Lemma 2.5. To state it, we need to find the equivalent of low-rank matrices in the higher-dimensional case. The definition we use is essentially the same as that of the partition rank of a tensor (see for example [7] for a discussion of this notion). Indeed, if a tensor has partition rank at most $k$, then it is $k$-degenerate (in the sense of the next definition), and conversely if a tensor is $k$-degenerate, then its partition rank is at most $2^{d-1}k$. The second author has shown that the partition rank is also related to the analytic rank of a tensor [4], which we do not define here (but again see [7]), with a tower-type dependence, improving on the previously known Ackermann dependence. In this section we use a very similar argument, but since we do not care about bounds, we present it in qualitative form for ease of reading and for the sake of completeness.

(i) Let $D' \subset D$, and for each $t \in D'$, let $U_t$ be a subspace of $\mathbb{F}_2^{n_d}$ of codimension at most $k$ with the codimensions of the $U_t$ all equal. Let $Q$ be the multiset $\bigcup_{t \in D'} (t \otimes U_t)$. If $m = 2^{k+3}$, and $r_1, \dots, r_m \in G$ have the property that for every $i$, $r_i.q = 0$ holds for at least $\frac{3}{4}|Q|$ choices of $q \in Q$, then there exist $i \neq j$ such that

Proof.
(i) By averaging, for every $i \le m$ there are at least $|D'|/2$ choices of $t \in D'$ such that $r_i.(t \otimes u) = 0$ for more than half the $u \in U_t$. But $r_i.(t \otimes u) = (r_i t).u$, so for these $t$ we have that $r_i t \in U_t^\perp$. By further averaging, it follows that for at least $|D'|/4$ choices of $t \in D'$ we have that $r_i t \in U_t^\perp$ for at least $m/4$ choices of $i \le m$. Since $m/4 > |U_t^\perp|$, for every such $t$ there exist $i \neq j$ with $r_i t = r_j t$. Thus, for some $i \neq j$, there are at least $\frac{|D'|}{4m^2}$ choices of $t \in D'$ with $r_i t = r_j t$, which implies the statement we wish to prove.

(ii) Write $r = \sum_i s_i \otimes w_i$ where $s_i \in \mathbb{F}_2^{n_1} \otimes \cdots \otimes \mathbb{F}_2^{n_{d-1}}$ and $\{w_i\}$ is a basis for $\mathbb{F}_2^{n_d}$. Note that $r(v_1 \otimes \cdots \otimes v_{d-1}) = 0$ if and only if $s_i.(v_1 \otimes \cdots \otimes v_{d-1}) = 0$ for every $i$. Let $D'$ be the multiset consisting of all $v_1 \otimes \cdots \otimes v_{d-1}$ with $s_i.(v_1 \otimes \cdots \otimes v_{d-1}) = 0$ for every $i$. By assumption, $|D'| \ge c|D|$. Since (H) holds for $d - 1$, there exists a multiset $Q$, with elements chosen from $g_1(d)D'$, which is $(g_2(d, c), 1 - g_3(d))$-forcing. Since $s_i.q = 0$ for every $q \in Q$, it follows that there exist subspaces

Moreover, by Bogolyubov's lemma, for every $t \in D'$, there exists a subspace $U_t \subset \mathbb{F}_2^{n_d}$ of codimension at most $g_4(\delta)$ such that $t \otimes U_t \subset 4B'$. After passing to suitable subspaces, we may assume that all $U_t$ have the same codimension. Now let $Q = \bigcup_{t \in D'} (t \otimes U_t)$. By (i), if $f_3(\delta) \ge 2^{g_4(\delta)+3}$, then there exist $i \neq j$ such that $(r_i - r_j)t = 0$ for at least $g_5(\delta)|D'|$ choices $t \in D'$ (where $g_5(\delta) > 0$) and therefore also for at least $g_6(\delta)|D|$ choices $t \in D$ (where $g_6(\delta) > 0$). By (ii), it follows that $r_i - r_j$ is $g_7(d, \delta)$-degenerate.

(iv) Choose $Q$ as in (iii) and let $r_1, \dots, r_m$ be a maximal set such that for all $i$ we have $r_i.q = 0$ for at least $\frac{3}{4}|Q|$ choices $q \in Q$, and for all $i \neq j$, $r_i - r_j$ is not $f_4(d, \delta)$-degenerate. Then, by (iii), we have $m < f_3(\delta)$. Let $V_{[d]}$ be the span of all the $r_i$. The result follows by the maximality of $\{r_1, \dots, r_m\}$.
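The identity $r.(t \otimes u) = (rt).u$, which is used repeatedly in the proof above, is easy to test numerically in the matrix case $d = 2$. The sketch below is an illustration under assumptions of our own: we represent $r$ as an $n_1 \times n_2$ matrix over $\mathbb{F}_2$ and take $rt$ to be the vector with $(rt)_j = \sum_i r_{ij} t_i$; the function names and the random test sizes are hypothetical.

```python
import random

def contract(r, t):
    """Return rt in F_2^{n2}, where (rt)_j = sum_i r[i][j] * t[i] mod 2."""
    n2 = len(r[0])
    return [sum(r[i][j] & t[i] for i in range(len(t))) % 2 for j in range(n2)]

def pair_with_tensor(r, t, u):
    """Return r.(t (x) u) = sum_{i,j} r[i][j] * t[i] * u[j] mod 2."""
    return sum(r[i][j] & t[i] & u[j]
               for i in range(len(t)) for j in range(len(u))) % 2

def dot(x, y):
    """Dot product over F_2."""
    return sum(a & b for a, b in zip(x, y)) % 2

def identity_holds(n1, n2, trials, seed=0):
    """Check r.(t (x) u) == (rt).u on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        r = [[rng.randint(0, 1) for _ in range(n2)] for _ in range(n1)]
        t = [rng.randint(0, 1) for _ in range(n1)]
        u = [rng.randint(0, 1) for _ in range(n2)]
        if pair_with_tensor(r, t, u) != dot(contract(r, t), u):
            return False
    return True
```

With this identity, the condition that $r.(t \otimes u) = 0$ for more than half of the $u \in U_t$ becomes a statement about the single vector $rt$, which is what allows the pigeonhole step in (i).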
The next lemma is the key step to complete the proof of Lemma 2.1. In order to state it, we need to introduce a total ordering $\prec$ of the non-empty subsets of $[d-1]$. It does not matter exactly what the ordering is, but we require it to have the property that if $J \subsetneq I$ then $J \prec I$.
To understand the point of the next lemma, observe that we take an array $r$ that belongs to the sum of a certain set of subspaces, some of which depend on $r$, and we show, using the hypothesis that $r.q = 0$ for almost all $q \in Q$, that it is contained in a similar sum, but with the subspace corresponding to $I^c$ no longer depending on $r$. By applying this lemma repeatedly, we shall remove all dependence on $r$ from the right-hand side.

Lemma 2.17. Suppose that (H') holds for every $d' < d$. Let $\delta > 0$ and let $B' \subset B$ be such that
2 be a subspace of dimension at most $k$. Then there exists a multiset $Q$ with elements chosen from $f_1(d)B'$ with the following property. Any array with $\dim(H_{J^c}(r)) \le k$ and the property that $r.q = 0$ for at least $(1 - f_2(d))|Q|$ choices $q \in Q$ is contained in a subspace for some $U_J \subset \mathbb{F}_2^{J}$, $U_{J^c} \subset \mathbb{F}_2^{J^c}$ not depending on $r$ and some $K_{J^c}(r) \subset \mathbb{F}_2^{J^c}$ possibly depending on $r$, all of dimension at most $f_3(d, \delta, k)$.
Proof. After reordering if necessary, we may assume that I = [a] for some 1 ≤ a ≤ d − 1. The next claim gives us a multiset Q with certain properties. Once we have it, we shall use those properties to show that Q satisfies the conclusion of the lemma.
Claim. There exist a multiset $Q'$ with elements chosen from $\bigcap_{J \subset I,\, J \neq I,\, J \neq \emptyset} (W_J^\perp \otimes \mathbb{F}_2^{I \setminus J})$ which is $(h_1(d, \delta, k), 1 - h_2(d))$-forcing (with $h_2(d) > 0$), and for each $s \in Q'$, a multiset $Q_s$ with elements chosen from $\mathbb{F}_2^{I^c}$ which is $(h_3(d, \delta), 1 - h_4(d))$-forcing (with $h_4(d) > 0$) such that $\max_{s \in Q'} |Q_s| \le 2 \min_{s \in Q'} |Q_s|$ and, setting $Q$ to be the multiset $\{s \otimes t : s \in Q', t \in Q_s\} = \bigcup_{s \in Q'} (s \otimes Q_s)$, the elements of $Q$ belong to $h_5(d)B'$.

Proof of Claim. Let $C$ be the multiset $\{u_1 \otimes \cdots \otimes u_a : u_i \in \mathbb{F}_2^{n_i}\}$ and let $D$ be the multiset $\{u_{a+1} \otimes \cdots \otimes u_d : u_i \in \mathbb{F}_2^{n_i}\}$. For each $s \in C$, let $D_s = \{t \in D : s \otimes t \in B'\}$. Also, let $C' = \{s \in C : |D_s| \ge \frac{\delta}{2}|D|\}$. Clearly, $|C'| \ge \frac{\delta}{2}|C|$. By Lemma 2.12, for every $s \in C'$ there exists a $g_1(d, \delta)$-system $R_s$ in $g_2(d)D_s$. By (H') applied to $a$, there exists a multiset $Q'$ with elements chosen from $(g_3(d)C') \cap \bigcap_{J \subset I,\, J \neq I,\, J \neq \emptyset} (W_J^\perp \otimes \mathbb{F}_2^{I \setminus J})$ which is $(g_4(d, \delta, k), 1 - g_5(d))$-forcing for some $g_5(d) > 0$. For every $s \in Q'$, choose $s_1, \dots, s_l \in C'$ with $l \le g_3(d)$ such that $s = s_1 + \cdots + s_l$, and let $P_s = \bigcap_{i \le l} R_{s_i}$. By Lemma 2.11, $P_s$ contains a $g_6(d, \delta)$-system, therefore $|P_s| \ge g_7(d, \delta)|D|$ for some $g_7(d, \delta) > 0$. By (H) for $d - a$, for every $s \in Q'$ there exists a multiset $Q_s$ with elements chosen from $g_8(d)P_s$ which is $(g_9(d, \delta), 1 - g_{10}(d))$-forcing for some $g_{10}(d) > 0$. Notice that if we repeat every element of $Q_s$ the same number of times, then the multiset we obtain is still $(g_9(d, \delta), 1 - g_{10}(d))$-forcing, so we may assume that $\max_{s \in Q'} |Q_s| \le 2 \min_{s \in Q'} |Q_s|$. Define $Q = \{s \otimes t : s \in Q', t \in Q_s\} = \bigcup_{s \in Q'} (s \otimes Q_s)$. Note that as $R_s \subset g_2(d)D_s$ for all $s \in C'$, we have $s \otimes R_s \subset g_2(d)B'$ for all $s \in C'$. But the elements of $Q'$ are chosen from $g_3(d)C'$, so $s \otimes P_s \subset g_3(d)g_2(d)B'$ for all $s \in Q'$. Finally, the elements of $Q_s$ are chosen from $g_8(d)P_s$, so the elements of $s \otimes Q_s$ are chosen from $g_8(d)g_3(d)g_2(d)B'$ for every $s \in Q'$.
This completes the proof of the claim.

Now let us show that $Q$ does indeed satisfy the conclusion of the lemma. Since for every $s \in Q'$, $Q_s$ is $(h_3(d, \delta), 1 - h_4(d))$-forcing, there exist subspaces $V_J(s) \subset \mathbb{F}_2^{J}$ for every $J \subset I^c$, $J \neq \emptyset$, with dimension at most $h_3(d, \delta)$ such that the set of arrays $t \in \mathbb{F}_2^{I^c}$ with $t.q = 0$ for at least $(1 - h_4(d))|Q_s|$ choices $q \in Q_s$ is contained in $\sum_{J \subset I^c,\, J \neq \emptyset} V_J(s) \otimes \mathbb{F}_2^{I^c \setminus J}$. Now if $r \in G$ has $r.q = 0$ for at least $(1 - f_2(d))|Q|$ choices $q \in Q$, then by averaging, for at least $(1 - \frac{1}{2}h_2(d))|Q'|$ choices $s \in Q'$, we have $r.(s \otimes t) = 0$ for at least $(1 - h_4(d))|Q_s|$ choices $t \in Q_s$. Therefore (noting that $r.(s \otimes t) = (rs).t$), $rs \in \sum_{J \subset I^c,\, J \neq \emptyset} V_J(s) \otimes \mathbb{F}_2^{I^c \setminus J}$ for at least $(1 - \frac{1}{2}h_2(d))|Q'|$ choices $s \in Q'$.

Now let $g_{11}(d, k) = \frac{h_2(d)}{2 \cdot 2^k}$ and $g_{12}(d, \delta, k) = h_3(d, \delta) + 2^{d+1}k$. Let $X$ be the subset of $\mathbb{F}_2^{I^c}$ consisting of those arrays $x$ such that for at least $g_{11}(d, k)|Q'|$ choices $s \in Q'$, there exists some $t(s) \in V_{I^c}(s) + \sum_{J \subset I,\, J \neq I} ((\mathbb{F}_2^{J} \otimes W_{J^c})s)$ such that $x - t(s)$ is $g_{12}(d, \delta, k)$-degenerate. (Here and below, for a subspace $L \subset G$ and an array $s \in \mathbb{F}_2^{I}$, we write $Ls$ for the subspace $\{rs : r \in L\} \subset \mathbb{F}_2^{I^c}$.) Choose a maximal subset $\{x_1, \dots, x_m\} \subset X$ such that no $x_i - x_j$ with $i \neq j$ is $2g_{12}(d, \delta, k)$-degenerate. Then there do not exist $i \neq j$, $s \in Q'$ and $t \in V_{I^c}(s) + \sum_{J \subset I,\, J \neq I} ((\mathbb{F}_2^{J} \otimes W_{J^c})s)$ with $x_i - t$ and $x_j - t$ both $g_{12}(d, \delta, k)$-degenerate. It follows, by the definition of $X$ and the fact that the dimension of $V_{I^c}(s) + \sum_{J \subset I,\, J \neq I} ((\mathbb{F}_2^{J} \otimes W_{J^c})s)$ is at most $h_3(d, \delta) + 2^d k$, that $m\, g_{11}(d, k)|Q'| \le |Q'| \cdot 2^{h_3(d,\delta) + 2^d k}$, and therefore that $m \le g_{13}(d, \delta, k)$. Thus, there exists a set $Y \subset X$ of size at most $g_{13}(d, \delta, k)$ such that for any $x \in X$, there exists some $y \in Y$ such that $x - y$ is $2g_{12}(d, \delta, k)$-degenerate. Let $Z = \mathrm{span}(Y)$. Then $\dim(Z) \le g_{13}(d, \delta, k)$, and for every $x \in X$ there is some $z \in Z$ such that $x - z$ is $2g_{12}(d, \delta, k)$-degenerate. Let ) $\le k$ and the property that $r.q = 0$ for at least $(1 - f_2(d))|Q|$ choices $q \in Q$.
Let $Q'(r)$ be the submultiset of $Q'$ consisting of those $s \in Q'$ for which $rs \in \sum_{J \subset I^c,\, J \neq \emptyset} V_J(s) \otimes \mathbb{F}_2^{I^c \setminus J}$. As we have seen two paragraphs above, $|Q'(r)| \ge (1 - \frac{1}{2}h_2(d))|Q'|$. Note that we can write $r = r_1 + r_2 + r_3 + r_4$ where $r_1 \in$ J⊂I, J≠I, In particular, if $t \notin X$, then the number of choices $s \in Q'(r)$ for which $r_4 s = t$ is at most $g_{11}(d, k)|Q'|$. On the other hand, notice that $r_4 s \in H_{I^c}(r)$ for every $s \in Q'$. Since $|H_{I^c}(r)| \le 2^k$, it follows that $r_4 s \in X$ for at least $(1 - h_2(d))|Q'|$ choices $s \in Q'$.

Let $X(r) = X \cap H_{I^c}(r)$. Let $t_1, \dots, t_\alpha$ be a maximal linearly independent subset of $X(r)$ and extend it to a basis $t_1, \dots, t_\alpha, t'_1, \dots, t'_\beta$ for $H_{I^c}(r)$. Now if a linear combination of $t_1, \dots, t_\alpha, t'_1, \dots, t'_\beta$ is in $X$, then the coefficients of $t'_1, \dots, t'_\beta$ are all zero. Write $r_4 = \sum_{i \le \alpha} s_i \otimes t_i + \sum_{j \le \beta} s'_j \otimes t'_j$ with $s_i, s'_j \in \mathbb{F}_2^{I}$. Since $r_4 q \in X$ for at least $(1 - h_2(d))|Q'|$ choices $q \in Q'$, we have, for all $j$, that $s'_j.q = 0$ for at least $(1 - h_2(d))|Q'|$ choices $q \in Q'$. Thus, as $Q'$ is $(h_1(d, \delta, k), 1 - h_2(d))$-forcing, there exist subspaces $L_J \subset \mathbb{F}_2^{J}$ ($J \subset I$, $J \neq \emptyset$), not depending on $r$ and of dimension at most $h_1(d, \delta, k)$, such that $s'_j \in \sum_{J \subset I,\, J \neq \emptyset} L_J \otimes \mathbb{F}_2^{I \setminus J}$ for all $j$. Thus, $r_4 \in \sum_{i \le \alpha} s_i \otimes t_i + \sum_{J \subset I,\, J \neq \emptyset} L_J \otimes \mathbb{F}_2^{J^c}$. Moreover, for every $i \le \alpha$, we have $t_i \in X$, so there exist $z_i \in Z$ such that $t_i - z_i$ is $2g_{12}(d, \delta, k)$-degenerate. It follows that δ, k). This finishes the proof of the lemma.
We are now in a position to complete the proof of Lemma 2.1.
Proof of Lemma 2.1. As remarked earlier, it suffices to prove that (H) holds for all $d$. But (H) implies (H'), and (H) holds for $d = 1$, therefore it suffices to prove, assuming that (H') holds for all $d' < d$, that (H) holds for $d$.
Let $B'$ be given as in the definition of (H). By Lemma 2.16 (iv), there exist $Q_\emptyset \subset 4B'$ and a subspace $V_{[d]} \subset G$ of dimension at most $g_1(\delta)$ such that if $r.q = 0$ for at least $\frac{3}{4}|Q_\emptyset|$ choices $q \in Q_\emptyset$, then $r$ can be written as $r = x + y$, where $x \in V_{[d]}$ and $y$ is $g_2(d, \delta)$-degenerate. By repeated application of Lemma 2.17, for all $I \subset [d-1]$ with $I \neq \emptyset$, in the order given by $\prec$, we obtain for each such $I$ a multiset $Q_I$ with elements chosen from $g_3(d)B'$, subspaces $V_I \subset \mathbb{F}_2^{I}$, $V_{I^c} \subset \mathbb{F}_2^{I^c}$ of dimension at most $g_4(d, \delta)$, and we also obtain a constant $g_5(d) > 0$ such that the following statement holds. If $r \in G$ is such that for each $I \subset [d-1]$ we have that $r.q = 0$ for at least (1 − g 5 . Moreover, after taking several copies of each $Q_I$, we may assume that $\max_I |Q_I| \le 2 \min_I |Q_I|$. Now let $g_6(d) = \frac{g_5(d)}{2 \cdot 2^{d-1}}$, let $Q = \bigcup_I Q_I$, and suppose that $r.q = 0$ for at least $(1 - g_6(d))|Q|$ choices $q \in Q$. Then, by averaging, for every $I \subset [d-1]$ we have $r.q = 0$ for at least $(1 - g_5(d))|Q_I|$ choices of $q \in Q_I$. Thus, $Q$ is $(g_7(d, \delta), 1 - g_6(d))$-forcing, where $g_7(d, \delta) = \max(g_1(\delta), g_4(d, \delta))$.
3. The counterexample to Question 1.9

We shall now present an example that gives a negative answer to Question 1.9. The example is easy to define, but it takes a little work to prove that it has the properties we require. In what follows, let $G = \mathbb{F}_2^n$. For a vector $v \in G$ write $|v|$ for the number of entries equal to 1 in $v$. Then our set $A$ will be $\{v \in \mathbb{F}_2^n : |v| \le n/2 - 10^{20} n^{3/4}\}$, and our set $B$ will be $\{v \in \mathbb{F}_2^n : |v| = n^{1/2}\}$. Note first that $A$ is $\eta$-closed with respect to $B$ where $\eta > 0$ is some absolute constant. Indeed, by the central limit theorem, when $n$ is sufficiently large, the probability that a random element $u \in A$ has $|u| \le n/2 - 10^{20} n^{3/4} - n^{1/4}$ is at least some absolute constant $\eta_1$, and conditional on this, the probability that $u + v \in A$ for a random element $v \in B$ is at least some other absolute constant $\eta_2$, so we may take $\eta = \eta_1 \eta_2$. What we shall prove is that for this $\eta$, with $\epsilon = 0.99$, say, there do not exist $c$, $\delta$ and $l$ with the properties described in Question 1.9. In fact, we shall prove the slightly stronger statement that for any $\delta > 0$ and positive integer $l$, if $n$ is sufficiently large then there do not exist $C \subset A + lB$ and $B' \subset B$ with $|B'| \ge \delta|B|$ such that $C$ is $(B', 0.99)$-closed. Since for sufficiently large $n$ we have $A + lB \subset A' = \{v \in \mathbb{F}_2^n : |v| \le n/2 - 10^{15} n^{3/4}\}$, it suffices to prove the same statement but with $A + lB$ replaced by $A'$. From now on, we always assume that $n$ is sufficiently large.
The proof relies on two lemmas and a definition.

Let us see why these two lemmas are sufficient. Suppose that $C \subset A'$ is $(B', 0.99)$-closed for some $B' \subset B$ with $|B'| \ge \delta|B|$. Let $w \in B'$ be chosen at random. Then the expected number of $u \in C$ such that $u + w \in C$ is at least $0.99|C|$, so by considering all such pairs $\{u, u + w\}$ and noting that $(u + w) + w = u$, we see that there are on average at least $\frac{0.99}{2}|C|$ choices of $u \in C$ such that $|u + w| \le |u|$. Therefore, if $u \in C$ is chosen at random, the average number of $w \in B'$ such that $|u + w| \le |u|$ is at least $\frac{0.99}{2}|B'|$. It follows that for at least $|C|/10$ elements of $C$ the number of such $w$ is at least $|B'|/3$, so at least $|C|/10$ elements of $A'$ are $B'$-compatible. Lemma 3.3 then implies that $C$ has density at most $10 \exp(-n^{3/4})$ in $G$. Let us write $\gamma$ for this density.
On the other hand, since $C$ is $(B', 0.99)$-closed, we have the inequality which implies that Using Lemma 3.1, together with the observations that $\hat{\mu}_{B'}(u) \le 1$ and $|\hat{C}(u)| \le \gamma$ for every $u \in G$ and that $\sum_{u \in G} |\hat{C}(u)|^2 = \gamma$, we deduce that $\exp(n^{2/3})\gamma^2 \ge 0.01\gamma$, so $\gamma \ge 0.01 \exp(-n^{2/3})$. For sufficiently large $n$, this contradicts the upper bound for $\gamma$ that we obtained a few lines above.
It remains to prove the two lemmas. The next two results are preparation for the proof of Lemma 3.1.

Proof. We use induction on $d$. The case $d = 1$ easily follows from the assumption on $V$. Let $V'$ have dimension $d + 1$ and suppose that for a $d$-dimensional subspace $V \subset V'$, $v_1, \dots, v_d$ and $I_1, \dots, I_d$ have been chosen satisfying the requirements. Choose some $v \in V' \setminus V$. Replacing $v$ by $v - v_1$ if necessary, we may assume that $v(k) = 0$ for at least $|I_1|/2$ choices $k \in I_1$. Similarly, we may assume that $v(k) = 0$ for at least $|I_i|/2$ choices $k \in I_i$ for every $i \le d$. Thus, there exist subsets $J_1, \dots, J_d$ of $\{1, \dots, n\}$ of size at least $n^{8/15}/2^d$ each such that for every $i \le d$ and every $k \in J_i$ we have $v_i(k) = 1$ but $v_j(k) = 0$ for all $j$ with $j \neq i$, $j \le d$, and $v(k) = 0$. Let $J = \{k \le n : v(k) = 1\}$. By the assumption on $V'$, we have $|J| \ge n^{8/15}$. Now it is easy to see that we can define $v'_1$ to be $v_1$ or $v_1 - v$ and achieve that $v'_1(k) = 0$ for at least $|J|/2$ choices of $k \in J$. Similarly, we can define for every $k \in J_j$, and it follows that for any $i \le d$ and $k \in J_i$, we have $v'_i(k) = 1$ but $v'_j(k) = 0$ for all $j \neq i$, and $v(k) = 0$. Thus, the set $\{v'_1, \dots, v'_d, v\}$ is suitable, so the lemma is proved.
Corollary 3.5. Let $t$ be a positive integer not depending on $n$ and let $V$ be a subspace of $\mathbb{F}_2^n$ of dimension $t$ such that every $v \in V$ has $|v| \ge n^{8/15}$. Then the density of those $w \in B$ with $w \cdot v = 0$ for all $v \in V$ is less than $(1.9)^{-(t-1)}$.
Proof. We shall be slightly sketchy about some of the details when they are very standard. As always, we assume that $n$ is sufficiently large. Let $v_1, \dots, v_t$ be a basis given by Lemma 3.4 with $d = t$. Let $w$ be a random vector in $B$, let $i < t$, and let us consider the probability that $w.v_i = 0$ given that $w.v_j = 0$ for every $j < i$.
The expected number of non-zero coordinates of $w$ in the union of the two intervals $I_i$ and $I_{i+1}$ is at least $n^{1/30}/2^{t-1}$, which tends to infinity, and the probability that it is at least half this number tends to 1 (very rapidly). If we condition further on this number, and if it is indeed at least $n^{1/30}/2^{t}$, then the probability that the number of non-zero coordinates of $w$ in $I_i$ is even is almost exactly $1/2$. Therefore, the probability that $w.v_i = 0$ given that $w.v_j = 0$ for every $j < i$ is less than $1/(1.9)$.
Since this is true for every i ≤ t − 1, we obtain the result.
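The parity claim in the proof above can be illustrated with an exact computation. The sketch below works in a simplified model of our own: `m` ones are placed uniformly at random among `a + b` positions, where `a` and `b` stand for the sizes of $I_i$ and $I_{i+1}$, and we compute the probability that an even number of them land in the first `a` positions. The function name and the parameter values are hypothetical.

```python
from math import comb
from fractions import Fraction

def prob_even(a, b, m):
    """Exact probability that, when m ones are placed uniformly at random
    among a + b positions, an even number of them fall in the first a."""
    total = comb(a + b, m)
    even = sum(comb(a, j) * comb(b, m - j) for j in range(0, m + 1, 2))
    return Fraction(even, total)

# Equal interval sizes, odd number of ones: the probability is exactly 1/2.
p_odd_count = prob_even(50, 50, 9)
# Equal interval sizes, even number of ones: within a tiny error of 1/2.
p_even_count = prob_even(50, 50, 10)
```

In this model the probability is already very close to $1/2$ for moderate values of `m`, and in particular below $1/(1.9)$, which is the only property the proof uses.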
Proof of Lemma 3.1. Suppose that the result is not true. Let $r$ be a positive integer to be specified later and pick $R = \{u_1, \dots, u_r\}$ such that $\hat{\mu}_{B'}(u_i) \ge 0.98$ for $i = 1, 2, \dots, r$. Then for each $i$, we have $u_i \cdot w = 0$ for at least 99% of all $w \in B'$. Therefore there is a subset $B'' \subset B'$ with $|B''| \ge |B'|/2$ such that each $w \in B''$ has $u_i \cdot w = 0$ for at least 98% of the $u_i$. The number of subsets of $R$ of size $\frac{49}{50}r$ is $\binom{r}{49r/50} = \binom{r}{r/50} \le (50e)^{r/50} \le (1.8)^{49r/50}$. Let $t = 49r/50$. Then there exists a subset $T$ of $R$ of size $t$ such that the number of $w \in B$ with $w \cdot u = 0$ for all $u \in T$ is at least $\frac{|B''|}{(1.8)^t} \ge \frac{\delta|B|}{2 \cdot (1.8)^t}$. Choose the smallest positive integer $t$ with $\frac{\delta}{2 \cdot (1.8)^t} \ge (1.9)^{-(t-1)}$ (and with $r = 50t/49$ an integer). Then the density of those $w \in B$ with $w \cdot u = 0$ for all $u \in T$ is at least $(1.9)^{-(t-1)}$.

Now let $Q$ be the set of all $u \in \mathbb{F}_2^n$ with $\hat{\mu}_{B'}(u) \ge 0.98$ and assume that $|Q| \ge \exp(n^{2/3})$. Let $t$ and $r$ be as above and choose $u_1, \dots, u_r \in Q$ such that for every $j$, the (Hamming) distance of $u_j$ from $\mathrm{span}(u_1, \dots, u_{j-1})$ is at least $n^{8/15}$. (This is possible because the number of $u \in \mathbb{F}_2^n$ with Hamming distance at most $n^{8/15}$ from an $r$-dimensional vector space is at most $2^r \exp(O(n^{8/15} \log n)) < \exp(n^{2/3})$.) Applying Corollary 3.5 to $V = \mathrm{span}(T)$, where $T$ is a subset of $\{u_1, \dots, u_r\}$ of size $t$, we find that the density of those $w \in B$ with $w \cdot u = 0$ for all $u \in T$ is less than $(1.9)^{-(t-1)}$, which is a contradiction.
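The numerical inequalities in the proof of Lemma 3.1 are easy to confirm by direct computation. The following Python sketch checks the chain $\binom{r}{r/50} \le (50e)^{r/50} \le (1.8)^{49r/50}$ for a few values of $r$ and finds a value of $t$ satisfying the stated condition. As a simplifying assumption of ours, the search runs only over multiples of 49, so that $r = 50t/49$ is automatically an integer; the function names and the sample value $\delta = 0.1$ are hypothetical.

```python
from math import comb, e

def subset_count_bounds(r):
    """Return (binom(r, 49r/50), (50e)^(r/50), 1.8^(49r/50)) for r divisible by 50."""
    assert r % 50 == 0
    exact = comb(r, 49 * r // 50)
    middle = (50 * e) ** (r / 50)
    upper = 1.8 ** (49 * r / 50)
    return exact, middle, upper

def choose_t(delta):
    """Smallest multiple t of 49 with delta / (2 * 1.8^t) >= 1.9^(-(t-1)).

    The condition is rewritten as delta * (1.9/1.8)^t >= 2 * 1.9
    to avoid floating-point overflow for large t.
    """
    t = 49
    while delta * (1.9 / 1.8) ** t < 2 * 1.9:
        t += 49
    return t

t_example = choose_t(0.1)
r_example = 50 * t_example // 49
```

The point of the check is only that $t$, and hence $r$, depends on $\delta$ alone and not on $n$, as the proof requires.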
Proof of Lemma 3.3. In this proof, unless specified otherwise, we will view $\mathbb{F}_2^n$ as a subset of $\mathbb{R}^n$ and accordingly, the dot product is defined as $u \cdot w = \sum_i u(i)w(i)$ where the summation is in $\mathbb{R}$. Then $|u + w| \le |u|$ is equivalent to $u \cdot w \ge |w|/2$. Hence $u$ is $B'$-compatible if $u \cdot w \ge |w|/2$ for at least $|B'|/3$ vectors $w \in B'$.
Let $t$ be a fixed positive integer, not depending on $n$, to be specified later. For a multiset $T = \{u_1, \dots, u_t\} \subset A'$ write $s_T = \sum_{i=1}^{t} u_i - \frac{t}{2}q$ where $q$ is the vector in $\mathbb{F}_2^n$ consisting of ones. Let $a_k = s_T(k)$ and $\sigma_T^2 = \sum_{k=1}^{n} a_k^2$. We say that $T$ is bad if $\sigma_T^2 \ge 1000tn$.
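The equivalence between $|u + w| \le |u|$ and $u \cdot w \ge |w|/2$ comes from the identity $|u + w| = |u| + |w| - 2\,u \cdot w$, where the sum $u + w$ is taken in $\mathbb{F}_2^n$ and the dot product in $\mathbb{R}$. A brute-force check over all pairs in a small dimension is sketched below; the function names and the choice $n = 6$ in the usage line are ours.

```python
from itertools import product

def weight(v):
    """Number of entries equal to 1."""
    return sum(v)

def add_mod2(u, w):
    """Coordinatewise sum in F_2^n."""
    return tuple(a ^ b for a, b in zip(u, w))

def real_dot(u, w):
    """Dot product with the summation taken in R."""
    return sum(a * b for a, b in zip(u, w))

def equivalence_holds(n):
    """Check |u+w| = |u| + |w| - 2 u.w, and hence
    |u+w| <= |u|  <=>  u.w >= |w|/2, for all u, w in {0,1}^n."""
    for u in product((0, 1), repeat=n):
        for w in product((0, 1), repeat=n):
            s = add_mod2(u, w)
            if weight(s) != weight(u) + weight(w) - 2 * real_dot(u, w):
                return False
            if (weight(s) <= weight(u)) != (2 * real_dot(u, w) >= weight(w)):
                return False
    return True

all_pairs_ok = equivalence_holds(6)
```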

Claim 1. If $T$ is not bad, then the number of $w \in B$ with $u_i \cdot w \ge |w|/2$ for all $i$ is at most $|B|/100^t$.

Proof of Claim 1. If $u_i \cdot w \ge |w|/2$ for all $i$, then $s_T \cdot w \ge 0$. Note that $s_T \cdot w = \sum_{k \le n} a_k w(k)$. We shall view $w$ as a random variable, chosen uniformly from all elements of $B$. What we need to prove is that $\mathbb{P}[\sum_{k \le n} a_k w(k) \ge 0] \le \frac{1}{100^t}$. Let $m = n^{1/2}$ and let $w_1, \dots, w_m$ be standard basis vectors of $\mathbb{F}_2^n$, chosen independently and uniformly at random. Note that the expected number of $i \neq j$ such that $w_i = w_j$ is at most 1, so almost surely this number is at most $\log n$. In particular, almost surely we have $n^{1/2} - 2\log n \le |w_1 + \cdots + w_m| \le n^{1/2}$. Choose uniformly randomly an element $w \in B$ with minimal Hamming distance from $w_1 + \cdots + w_m \in \mathbb{F}_2^n$. This algorithm defines a uniformly random element $w \in B$ such that almost surely we have $\sum_{k \le n} |\sum_{i \le m} w_i(k) - w(k)| \le \sum_{k \le n} |\sum_{i \le m} w_i(k) - (\sum_{i \le m} w_i)(k)| + \sum_{k \le n} |(\sum_{i \le m} w_i)(k) - w(k)| \le 2\log n + 2\log n = 4\log n$, where all the summations are taken in $\mathbb{R}$, except $\sum_{i \le m} w_i$, which is taken in $\mathbb{F}_2^n$. At this point, we apply the following version of Chernoff's inequality, which appears as Theorem 3.4 in [2].
Let $X_i$ ($1 \le i \le m$) be independent random variables satisfying $X_i \le \mathbb{E}[X_i] + M$, for $1 \le i \le m$. We consider the sum $X = \sum_i X_i$ with expectation $\mathbb{E}[X] = \sum_i \mathbb{E}[X_i]$ and variance $\mathrm{Var}(X) = \sum_i \mathrm{Var}(X_i)$. Then, we have $\mathbb{P}(X \ge \mathbb{E}[X] + \lambda) \le \exp\left(-\frac{\lambda^2}{2(\mathrm{Var}(X) + M\lambda/3)}\right)$.

We now take $X_i = \sum_{k \le n} a_k w_i(k)$ for $1 \le i \le m$. Since $|a_k| \le t$, the conditions of the theorem hold with $M = 2t$. As $u_i \in A'$ for all $i$, we have $\sum_{k \le n} a_k \le t(n/2 - 10^{15} n^{3/4}) - tn/2 = -10^{15} t n^{3/4}$. Then

Proof of Claim 2. Recall that $T$ is bad if and only if $\sum_{k \le n} (\sum_{i \le t} u_i(k) - t/2)^2 \ge 1000tn$. The vectors $u_1, \dots, u_t$ are randomly chosen from $A'$, but with probability $1 - o(\exp(-n^{7/8}))$ all of them have $|u_i| \ge n/2 - n^{99/100}$, so we may assume that $u_1, \dots, u_t$ are randomly chosen from the set $A'' = \{v \in \mathbb{F}_2^n : n/2 - n^{99/100} \le |v| \le n/2 - 10^{15} n^{3/4}\}$. It is not hard to see that we can write $u_i = x_i + y_i$ where $x_i$ and $y_i$ are random variables taking values in $\mathbb{F}_2^n$ and having the property that the $x_i(k)$ are independent Bernoulli with parameter $1/2$ and $|y_i| \le 2n^{99/100}$ with probability $1 - o(\exp(-n^{7/8}))$. Then it suffices to prove that

We are now in a position to complete the proof of the lemma. Let $t$ be the smallest positive integer with $\frac{1}{100^t} < \frac{\delta}{100(6e)^t}$. Let the density of $B'$-compatible elements in $A'$ be $\alpha$. Pick $v_1, \dots, v_{6t}$ independently and uniformly randomly from $A'$. Then with probability $\alpha^{6t}$, every $v_i$ is $B'$-compatible. If that is the case, then for every $i$, there are at least $|B'|/3$ vectors $w \in B'$ with $v_i \cdot w \ge |w|/2$. It follows that there is some $B'' \subset B'$ with $|B''| \ge |B'|/100$ such that for every $w \in B''$ we have $v_i \cdot w \ge |w|/2$ for at least $t$ choices of $i$. The number of $t$-sets in $\{v_1, \dots, v_{6t}\}$ is at most $(6e)^t$, so there must exist a $t$-set $T = \{u_1, \dots, u_t\} \subset \{v_1, \dots, v_{6t}\}$ (multisets are allowed) such that the number of $w \in B$ with $u_i \cdot w \ge |w|/2$ for each $i$ is at least $|B''|/(6e)^t \ge \frac{\delta|B|}{100(6e)^t} > \frac{|B|}{100^t}$.
By Claim 1, it follows that $T$ is bad. Thus, the probability that $T = \{u_1, \dots, u_t\}$ is bad when $u_1, \dots, u_t$ are independently and uniformly randomly chosen from $A'$ is at least $\alpha^{6t} / \binom{6t}{t}$. Hence, by Claim 2, we have $\alpha^{6t} / \binom{6t}{t} = o(\exp(-n^{7/8}))$, and we get $\alpha = o(\exp(-n^{3/4}))$.
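The choice of $t$ in the final part of the proof, and the bound $\binom{6t}{t} \le (6e)^t$ on the number of $t$-sets, can both be checked numerically. In the sketch below the condition $\frac{1}{100^t} < \frac{\delta}{100(6e)^t}$ is rewritten as $(6e/100)^t < \delta/100$; the function names and the sample value $\delta = 0.1$ are our own illustrative choices.

```python
from math import comb, e

def smallest_t(delta):
    """Smallest positive integer t with 1/100^t < delta / (100 * (6e)^t),
    i.e. with (6e/100)^t < delta/100."""
    t = 1
    while not (6 * e / 100) ** t < delta / 100:
        t += 1
    return t

def binomial_bound_holds(t):
    """Check the standard estimate binom(6t, t) <= (6e)^t."""
    return comb(6 * t, t) <= (6 * e) ** t

t_for_delta = smallest_t(0.1)
```

Since $6e/100 < 1$, the loop terminates for every $\delta > 0$, and the resulting $t$ depends only on $\delta$, so the exponent $6t$ in $\alpha^{6t}$ is a constant independent of $n$.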