Superquadratic Lower Bound for 3-Query Locally Correctable Codes over the Reals

We prove that 3-query linear locally correctable codes of dimension d over the reals require block length n > d^{2+α} for some fixed, positive α > 0. Geometrically, this means that if n vectors in R^d are such that each vector is spanned by a linear number of disjoint triples of others, then it must be that n > d^{2+α}. This improves the known quadratic lower bounds (e.g., Kerenidis–de Wolf (2004), Woodruff (2007)). While the improvement is modest, we expect that the new techniques introduced in this article will be useful for further progress on lower bounds for locally correctable and decodable codes with more than 2 queries, possibly over other fields as well. Several of the new ideas in the proof work over every field. At a high level, our proof has two parts, clustering and random restriction. The clustering step uses a powerful theorem of Barthe from convex geometry. It can be used (after preprocessing our LCC to be balanced) to apply a basis change (and rescaling) to the vectors, so that the resulting unit vectors become nearly isotropic. This, together with the fact that any LCC must have many "correlated" pairs of points, lets us deduce that the

An extended abstract of this paper appeared in the Proceedings of the Forty-sixth Annual ACM Symposium on Theory of Computing, 2014 [14].
∗Supported by NSF CAREER award DMS-1451191 and NSF grant CCF-1523816. †Supported by NSF grant CCF-1350572. ‡Supported by NSF grant CCF-1412958.
ACM Classification: E.4. AMS Classification: 94B65, 52C35


Introduction
Locally correctable codes (sometimes appearing under different names, such as program self-correctors or random self-reductions), abbreviated LCCs, have the property that each symbol of a corrupted codeword can be recovered, with high probability, by randomly accessing only a few other symbols. LCCs have played a key role in important developments within several (impressively) diverse areas of theoretical computer science, which we briefly summarize below.
Blum and Kannan [9] introduced the idea of probabilistic, local correction for the purpose of program checking. With the follow-up papers [10] on linearity testing and [27] on low-degree testing, this sequence inaugurated the field of Property Testing and Sublinear Algorithms. The realization of [25, 7] that Reed–Muller codes (namely, low-degree multivariate polynomials) are locally correctable gave the first random self-reducibility examples of very hard functions like the Permanent, and this average-case to worst-case complexity reduction was useful for pseudo-random generators [4]. It further led (with many more ideas) to the celebrated sequence of characterizations of the power of probabilistic proofs: IP = PSPACE by [26, 28], MIP = NEXP by [3] and PCP = NP by [2, 1]. Close cousins of LCCs, Locally Decodable Codes (LDCs), formally introduced in [19] but having their origins in these earlier works, were key to Private Information Retrieval and other models of secure delegation of computation (see, e.g., [11]). Dvir [12] has shown that sufficiently strong lower bounds on LCCs would yield explicit rigid matrices, which are related, via the work of Valiant [29], to circuit complexity. While this has not materialized yet, it motivated the invention of multiplicity codes by [23], which are new LCCs of high rate and turn out to yield optimal list-decodable codes as well [22]. Finally, since the work of Dvir and Shpilka [16], LDCs and LCCs have played a role in understanding basic problems in Polynomial Identity Testing and established its connection to problems in Incidence Geometry, e.g., [20, 5, 15].
The most important parameters of LCCs are the number of queries, q, made by the correcting algorithm, and the block length n as a function of the message length (or dimension, for linear codes) d, where we fix the fraction of corruptions to some small constant, say 1%. For upper bounds, the best constructions we have are still based on Reed–Muller codes, which exist only over finite fields. For q queries these require block length about exp(d^{1/(q−1)}). Indeed, most applications require the block length n to be polynomial in d, and hence using these codes forces the number of queries to be at least logarithmic.
Finding better codes, and in particular constant-query, polynomial block-length LCCs, has been a major challenge, and this challenge naturally turns attention to the limits of constant query LCCs and LDCs.
On the lower bound front, relatively little is known to rule out the feasibility of the challenge above. We shall restrict ourselves to linear codes over a field F, namely, when the set of codewords is a subspace of F^n of dimension d. We will denote by q-LCC such a linear locally correctable code with q queries. It is easy to see that 1-LCCs do not exist over any field. The first set of interesting results came for 2-LCCs, and here strong lower bounds are known through a variety of techniques. An exponential n > 2^{Ω(d)} lower bound via isoperimetric/entropy methods for 2-LCCs over F_2 follows from the same methods as for the (weaker) LDCs [18, 21, 16] and is matched by the Hadamard code, whose generating matrix is composed of all binary vectors over F_2. Strangely, while these vectors provide an LDC over every field, they fail to be an LCC except over F_2. This gap was first explained in [5, 15], where the authors showed that over the real numbers (and indeed even over large enough finite fields), 2-LCCs simply do not exist! For every error rate δ, the dimension d for which such codes exist is finite, and cannot exceed poly(1/δ). The proofs here use a combination of geometric, analytic and linear-algebraic techniques, and give quantitative form to known qualitative point-line incidence theorems. Tighter bounds of n > p^{Ω(d)} over finite fields of prime order p were proved in [8] using methods from arithmetic combinatorics, matching the trivial construction of taking all vectors in (F_p)^d.
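As a concrete illustration of 2-query local correction (a toy example of ours, not taken from this paper), the Hadamard code over F_2 lists ⟨m, x⟩ mod 2 for every x ∈ F_2^d, and position x can be recovered from the two random positions y and x ⊕ y, since c[y] ⊕ c[x ⊕ y] = ⟨m, x⟩ whenever both queried positions are uncorrupted.

```python
import itertools
import random

def hadamard_encode(m):
    """Codeword indexed by all x in F_2^d: c[x] = <m, x> mod 2."""
    d = len(m)
    return {x: sum(mi * xi for mi, xi in zip(m, x)) % 2
            for x in itertools.product((0, 1), repeat=d)}

def correct(c, x, d, trials=31):
    """2-query correction of position x: c[y] xor c[x xor y] equals
    <m, x> whenever both queried positions are uncorrupted; take a
    majority vote over independent random choices of y."""
    votes = 0
    for _ in range(trials):
        y = tuple(random.randrange(2) for _ in range(d))
        x_xor_y = tuple(a ^ b for a, b in zip(x, y))
        votes += c[y] ^ c[x_xor_y]
    return int(votes > trials // 2)

random.seed(0)
d = 6
m = (1, 0, 1, 1, 0, 1)
c = hadamard_encode(m)
corrupted = dict(c)
for pos in random.sample(sorted(c), 3):  # corrupt ~5% of the 64 positions
    corrupted[pos] ^= 1
x0 = (1, 1, 0, 0, 1, 0)
assert correct(corrupted, x0, d) == c[x0]
```

Each trial succeeds unless one of the two (uniformly distributed) queried positions is corrupted, so with corruption rate δ a single trial errs with probability at most 2δ and the majority vote amplifies this to the 3/4 success guarantee of the definitions below.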
For q ≥ 3 the known lower bounds are far weaker, and practically only one lower-bound technique is known: random restrictions of the given code, which reduce the number of queries q to 2 or 1, appealing to the lower bounds above. This technique was introduced for LDCs by Katz and Trevisan [19]. The resulting lower bounds trivially hold for (the stronger) LCCs as well. The best bounds known are due to [21, 30], which show that linear q-LDCs, over any field F, must satisfy n = Ω(d^{1+1/(⌈q/2⌉−1)}), up to polylogarithmic factors, for every q ≥ 3. So, in particular, the best lower bound for 3-LDCs (or LCCs) is quadratic, n = Ω(d^2) up to logarithmic factors. (For linear codes the polylogarithmic factors were removed in [31].) Our main result is breaking this quadratic barrier for 3-LCCs over the real numbers. Over the reals there are no known constructions of constant-query LCCs (of any rate!). We prove that for some fixed constant α > 0, every linear 3-LCC over the reals must satisfy n = Ω(d^{2+α}), even when the error parameter δ is allowed to be polynomially small in n. To this end, we introduce several new ideas and techniques, which we hope will lead to further progress. Some of our ideas are general enough to work over any field, while others are specifically tailored for the reals. We briefly discuss now the main sources for our improvement over the known quadratic lower bound. A more detailed overview of the proof is given after the formal statement of the theorem in the next section.

Clustering and restrictions
A linear 3-LCC of dimension d and block length n over F may be viewed as a set V ⊂ F^d of n vectors (which form the generating matrix of the code), together with n collections M_v, one for each v ∈ V. Each M_v is a matching of δn disjoint triples from V, and each of the triples in M_v spans v. This structure is easy to deduce for linear codes from the more traditional definition using a randomized decoder (cf. Definition 2.1).
We now informally describe a way to obtain a possible quadratic lower bound on n, which uses random restriction to reduce the dimension of the code. Pick a set A ⊂ V of size about √n at random. Then, take a linear projection whose kernel is exactly the span of the vectors in A and apply it to the elements of V. Notice that in expectation, for every v ∈ V, a pair of points in A will be contained in some triple in M_v. Thus, after the projection, the third point in that triple will become the same as v (up to scaling). As this happens to every point, we expect V to shrink by a factor of 2! Notice that in such a projection, the dimension of V can drop by at most |A| ≈ √n. Repeating this process logarithmically many times will shrink V completely, revealing that its original dimension could not have been larger than √n log n, giving a near-quadratic relation n ≥ d^2/log d. We note that the proofs appearing in the literature are somewhat different from the one we just described. Indeed, there are several possible ways of using a random restriction argument to get a quadratic bound (up to polylogarithmic factors) for linear 3-LCCs. The argument above is new to this paper, and is indeed a simplified variant of our actual proof, which improves its analysis over the reals.
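The counting behind this sketch can be made explicit. The following back-of-the-envelope calculation (ours, with constants suppressed) shows why |A| ≈ √n is the right scale.

```latex
% A is a uniformly random subset of V of size a, so for a fixed pair {i,j}:
\Pr\big[\{i,j\}\subseteq A\big] \approx (a/n)^2 .
% Each matching M_v has \delta n triples, each containing 3 pairs, hence
\mathbb{E}\big[\#\{\text{triples of } M_v \text{ with a pair inside } A\}\big]
   \approx 3\,\delta n \cdot (a/n)^2 \;=\; 3\,\delta a^2/n ,
% which is a positive constant once a = \Theta(\sqrt{n/\delta}). Each round
% therefore shrinks V by a constant factor while the dimension drops by at
% most a, so after O(\log n) rounds
d \;\le\; O\big(\sqrt{n}\,\log n\big)
   \quad\Longleftrightarrow\quad n \;\ge\; \widetilde{\Omega}(d^{\,2}).
```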
It is not hard to see that if the collection of triples in all of the matchings M_v were chosen at random, the analysis above could not be improved. But a random collection is far from being an LCC. Indeed, in contrast to standard codes, which exist in abundance and where a random subspace is one with high probability, locally correctable (or decodable, or testable) codes are extremely rare and structured. This raises the question of what other structural properties are imposed on the matchings M_v in an LCC. In this paper we reveal a new such property, clustering, at least when the underlying field is the reals. We conclude with a simplified description of this clustering property, how it is obtained, and how it enables a better analysis of the random restriction process.
A collection {M_v} of matchings of triples is said to be clustered if there are about √n subsets S_1, ..., S_{√n} of V, each of size about √n, such that every triple in every matching M_v has a pair in one of these sets. Note that such a configuration is extremely far from random. Indeed, as these sets have at most n^{3/2} pairs between them, many of the triples (of different matchings) share pairs (a typical pair appears in about √n triples!). Note that this cluster structure is described completely combinatorially. Why should the triples in a 3-LCC admit such a clustering? The main observation is that, over the reals, a small linearly dependent subset, such as a 4-tuple composed of v and a triple from M_v, must contain a pair which is significantly correlated (say, with inner product at least 1/4 in this example). Thus, a 3-LCC must contain many correlated pairs. On the other hand, a powerful result of Barthe from convex geometry allows us to deduce that, after a carefully chosen change of basis, the vectors of our code are almost isotropic, meaning that they point roughly equally in all directions in space. This implies that most pairs are hardly correlated at all. These two seemingly contradictory structures can coexist only if the points in V are geometrically clustered. A delicate analysis shows that they can be partitioned into roughly √n small balls. The correlations then must arise from triples containing a pair in one of the (geometric) clusters.
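The claim that a linearly dependent 4-tuple of unit vectors must contain a correlated pair admits a short Gram-matrix justification (this sketch is ours; the paper's formal argument may differ).

```latex
% Let u_1,\dots,u_4 be unit vectors with |\langle u_s, u_t\rangle| < 1/4 for
% all s \ne t, and let G be their Gram matrix, G_{st} = \langle u_s, u_t\rangle.
% Each row of G is strictly diagonally dominant:
G_{ss} = 1 \;>\; 3 \cdot \tfrac{1}{4} \;\ge\; \sum_{t \ne s} |G_{st}| ,
% so G is positive definite and u_1,\dots,u_4 are linearly independent.
% Contrapositive: any linearly dependent 4-tuple of unit vectors contains a
% pair with
|\langle u_s, u_t\rangle| \;\ge\; \tfrac{1}{4} .
```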
Why does clustering help? Let us return to the random restriction and projection argument above, but let us now pick the set A as follows. First pick one of the clusters S_i uniformly at random, and inside it pick A at random of size about n^{1/4}. The clustering ensures that this much smaller set has, in expectation, a pair intersecting each of the matchings M_v (due to the fact that a typical pair in a typical cluster participates in √n matchings). So a much smaller set A suffices to create the same effect after projection, namely a shrinking of the set V by a factor of 2. Again, a logarithmic number of such restrictions is likely to shrink V completely, giving a dimension upper bound of n^{1/4} log n, and yielding the lower bound n ≥ d^4/log d. We note again that this part works over any field, as long as the triples are clustered.
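Again a rough calculation (ours, constants suppressed) shows why clustering lets A shrink to size about n^{1/4}.

```latex
% Fix a cluster S_i with |S_i| \approx \sqrt{n} and pick A \subseteq S_i of
% size a = n^{1/4}. A fixed pair of S_i lands in A with probability
\Pr\big[\{u,v\}\subseteq A\big] \approx (a/\sqrt{n})^2 = 1/\sqrt{n} .
% A typical such pair appears in about \sqrt{n} of the matchings M_v, so the
% expected number of matchings hit by pairs inside A is about
\binom{a}{2}\cdot \sqrt{n} \;\approx\; \sqrt{n}\cdot\sqrt{n} \;=\; n ,
% i.e., a constant number of hits per matching on average. The dimension now
% drops by at most |A| \approx n^{1/4} per round, giving
d \;\le\; O\big(n^{1/4}\log n\big)
  \quad\Longleftrightarrow\quad n \;\ge\; \widetilde{\Omega}(d^{\,4}).
```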
"Balanced" codes.A recurring notion in our proof is that of an LCC in which no large subset of the coordinates lies in a subspace of significantly lower dimension.One can think of such codes as being "balanced" in the sense that they cannot be "compressed" (by projecting the large set of low dimension to zero).Our proof contains a sequence of reductions, used to obtain certain conditions that are used in the clustering and restriction steps.Each of these reductions can only be carried out if the code is "balanced" and this property is used in several different ways in the proof.If the code is not "balanced" we can use an iterative argument that projects the large low-dimensional subset to zero.We find this condition of being balanced a very natural one in the context of LCCs (and other codes) and hope it could be useful as a conceptual tool in future works.
Organization. In Section 2 we state our results formally. Then, in Section 3 we provide a more detailed and technical overview of the proof. The organization of the rest of the paper (which contains a complete proof of our main result) is given at the end of Section 3.
Acknowledgments. We thank the anonymous referees for their careful reading of the paper and for many useful comments. We are grateful to Boaz Barak, Moritz Hardt and Amir Shpilka for their contribution in early stages of this work. In particular, we thank Moritz Hardt for introducing us to Barthe's work.

Definitions and results
For a string y ∈ F^n, we define w(y) to be the number of nonzero entries in y. A q-matching M in [n] is defined to be a set of disjoint unordered q-tuples (i.e., disjoint subsets of size q) of [n].

Definition 2.1 (Linear q-LCC, decoder definition). A linear (q, δ)-LCC of dimension d over a field F is a d-dimensional linear subspace U ⊂ F^n such that there exists a randomized decoding procedure D : F^n × [n] → F with the following properties.
1. For all x ∈ U, for all i ∈ [n] and for all y ∈ F^n with w(y) ≤ δn we have that D(x + y, i) = x_i with probability at least 3/4 (the probability is taken only over the internal randomness of D).
2. For every y ∈ F^n and i ∈ [n], the decoder D(y, i) reads at most q positions in y.
Definition 2.2 (Linear q-LCC, geometric definition). Let V = (v_1, ..., v_n) ∈ (F^d)^n be a list of n vectors spanning F^d. We say that V is a linear (q, δ)-LCC in geometric form if for every v ∈ V there exists a q-matching M_v in [n] of size at least δn such that for every q-tuple {j_1, ..., j_q} ∈ M_v it holds that v ∈ span{v_{j_1}, ..., v_{j_q}}.

It is well known that any linear (q, δ)-LCC (over any field) can be converted into the geometric form given above by replacing δ with δ/q. The transformation is simple: take v_1, ..., v_n ∈ F^d to be the rows of the generating matrix of U. Clearly, this does not change the dimension of the code. This is somewhat surprising, since it implies that the decoder in the first definition can be made non-adaptive without much loss in parameters (for linear codes).
In our results we will assume that the error parameter δ is not too small as a function of n. Specifically, we will require that n ≥ (1/δ)^{ω(1)}. This condition can be replaced with n ≥ (1/δ)^C for a sufficiently large absolute constant C, which can be calculated from the proof.
We now state our main result, which bounds the dimension of 3-query LCCs when the underlying field is R.

Theorem 2.3 (Main Theorem). There exists an absolute constant α > 0 such that every linear (3, δ)-LCC of dimension d and block length n over R, with n ≥ (1/δ)^{ω(1)}, satisfies n ≥ d^{2+α}.
Proof overview: the "Cluster and Restrict" method

At a high level, our proof is divided into two conceptually distinct steps.
1. Clustering step. Show that the triples used in the matchings M_v, v ∈ V, are "clustered" in some precise sense (described below).
2. Restriction step. Use the clustering to find a large subset of V that has low dimension. The name of this step comes from the fact that it uses a random restriction argument (projecting a random subset to zero).
Combining these two steps (in Lemma 10.1) we get that V must have a large subset (of size Ω(n)) with low dimension (at most n^{1/2−ε}). Using this to prove a global dimension bound on V (as in Theorem 2.3) is done using a standard amplification lemma (Lemma 10.2) similar to that in [5, 8]. For simplicity, we will use big-"O" notation to hide constants depending on δ (only for this overview).
We now describe each of these steps in more detail. The fact that V is a code over R is only used in the clustering step. The restriction step works over any field, provided that the triples are already clustered. A recurring theme in the proof is that we are always free to assume that V does not have a large subset of low dimension. Another recurring operation is "sending a subset U of V to zero." By this statement we mean: pick a linear map A whose kernel is span(U) and apply it to all the elements of V. We will use the simple fact that, if dim(U) = r, then applying such a map reduces dim(V) by at most r.

The clustering step is given by Lemma 8.2, which we now state in an informal form. We will elaborate below on the two conditions appearing in the lemma (the well-spread vectors condition and the low triple-multiplicity condition). Recall that V is associated with n 3-matchings M_v, v ∈ V, used in the decoding.
Lemma 3.1 (Informal statement of Lemma 8.2). Suppose V is a (3, δ)-LCC that satisfies the well-spread vectors condition and the low triple-multiplicity condition, and suppose that d > n^{1/2−ε}. Then there are subsets S_1, ..., S_m ⊂ V (not necessarily disjoint) so that

1. m is at most roughly n^{1/2},

2. each set S_i has size at most roughly n^{1/2}, and

3. each triple in each matching M_v has two of its elements in one of the sets S_i.
Before we explain the two conditions in the lemma, i.e., the well-spread vectors condition and the low triple-multiplicity condition, notice that the existence of sets S_1, ..., S_m as above is something that does not hold for a "typical" family of Ω(n^2) triples. In fact, if the triples were chosen at random, there would not be such sets, with probability close to one. Referring to the sets S_i as "clusters" is also justified by the fact that they actually form clusters in R^d (i.e., they are all correlated with some fixed point). This geometric fact, however, is not used anywhere in the proof: all we need is the combinatorial structure. We now explain the two conditions on the code V mentioned in the lemma.
• Well-spread vectors condition. The vectors v_1, ..., v_n comprising V should be, in some sense, well spread. Observe that, by applying a suitable scaling to each vector, we can assume w.l.o.g. that the vectors v_1, ..., v_n are unit vectors, and we will make this assumption. Formally, we require that for every unit vector w ∈ R^d we have ∑_{i=1}^{n} ⟨v_i, w⟩^2 ≤ O(n^{1/2+ε}). This means, in particular, that every small ball can contain at most O(n^{1/2+ε}) vectors. Clearly, a general LCC V need not satisfy this condition. For example, if V has a large subset of low rank, such a statement cannot hold (using a pigeonhole argument on the unit sphere in low dimension). We are able, however, to reduce to this case using Lemma 6.1, which uses a powerful result of Barthe (Lemma 5.1) that is developed in Section 5. Roughly speaking, Barthe's theorem can be used to show that, unless V has a large subset of low rank, there is an invertible linear map M on R^d so that, if we replace each v_i with Mv_i/‖Mv_i‖, the well-spread vectors condition is satisfied. The proof of this result (part of which appears in Section 5) uses tools from convex geometry. We derive a particularly convenient form of Barthe's theorem as Theorem 6.4, which might be of independent interest.
• Low triple-multiplicity condition. This condition requires that a single triple does not appear in "too many" (roughly n^{O(ε)}) different matchings. In Section 7 we prove Lemma 7.2, which shows how to reduce to this case, assuming V does not have a large subset of low rank. The reduction uses the fact that, if a single triple is used in too many matchings, then projecting the elements of this triple to zero causes many other points to go to zero. If a point v is mapped to zero as a result, and if v is used in many triples (say Ω(n)), all of these triples "become" pairs when v maps to zero. Using this observation, we show that we can send a relatively small number of points to zero and construct a 2-query locally decodable code (LDC) of relatively high dimension. We then apply the known bounds for 2-query LDCs (these are variants of LCCs and are described in Section 4) to get a contradiction. This reduction is also field-independent and does not use any properties of the real numbers.
The main observation leading to clustering is that we can assume, w.l.o.g., that all triples (i, j, k) ∈ M_v are such that the three vectors v_i, v_j, v_k are almost orthogonal to v. This follows directly from the well-spread vectors condition by bounding from above the number of vectors correlating with v and discarding the corresponding triples from M_v (for each v ∈ V). Once we have this condition, we observe that, since v, v_i, v_j, v_k are linearly dependent and since v is not correlated with the other three vectors, we must have that v_i, v_j, v_k are close to being in a two-dimensional plane. (Recall that these are all unit vectors.) This means that in each triple there must be two elements that are correlated with each other! This is already a nontrivial fact, in particular since we know (by the well-spread vectors condition) that each point cannot be correlated with many other points.
Proceeding with a more careful analysis of the different types of triples that can arise, and using some graph-theoretic arguments, we arrive at the required clusters. In this step we use the bound on the maximum triple-multiplicity.
Note that the clustering lemma implies that there are many pairs in V × V that appear in many triples. This is due to the simple upper bound of n^{1.5+O(ε)} on the total number of possible pairs in all of the clusters S_1, ..., S_m, and the fact that together they cover pairs from a quadratic number of triples. This should be contrasted with the results of [5, 15], which prove strong lower bounds for q-LCCs (for any constant q) in which every pair is in a bounded number of triples (these are called "design" LCCs).

Restriction step
The restriction step (given in Lemma 9.1) shows that if V satisfies the clustering condition (given in Lemma 8.2) then it contains a large subset of low rank. We now state a simplified form of this lemma.

Lemma 3.2 (Informal statement of Lemma 9.1). Let F be a field.
This step is called the "restriction step" since it uses the "clusters" S_1, ..., S_m found in the clustering step to show (Lemma 9.2) that there is a small set U ⊂ V (of size roughly n^{1/4+7ε}) such that projecting all elements of U to zero reduces the dimension of V to at most n^{10ε}. This will imply a bound of n^{1/4+7ε} + n^{10ε} on the initial dimension of V. (The reason we do not get an n^{1/4+7ε} upper bound on the dimension of V is due to the clustering step.)

The starting point for the proof of this lemma is the following simple observation. If v is spanned by a triple (v_i, v_j, v_k), then projecting two elements of that triple, say v_i, v_j, to zero makes the two vectors v, v_k proportional to each other. (This uses the fact that v is not spanned by any proper subset of the triple, and we can easily reduce to this case.) Now, suppose that there are t triples in the code that have at least two elements in U. Then projecting U to zero makes t pairs of vectors proportional to each other (as in the v, v_k example). Consider the graph on vertex set V in which we add an edge for each proportional pair v, v_k obtained by sending a pair v_i, v_j ∈ U in a triple (v_i, v_j, v_k) ∈ M_v to zero. Since being proportional to each other is an equivalence relation (on nonzero vectors), we can bound the dimension of V after projecting U to zero by the number of connected components of the graph. This leaves us with the task of finding a set U so that the resulting graph has at most n^{10ε} components. To find such a U we use a probabilistic argument. We will pick U at random according to a particular distribution and then argue that the expected number of connected components is small. To pick the random U we proceed in r ∼ n^{4ε} steps as follows. In each step, pick one of the clusters S_i at random and then pick a random subset of S_i of size ∼ n^{1/4+3ε}. The union of these sets will be U. The upper bound on the expected number of components is derived by considering the (expected) reduction in the number of connected components in each of the r steps. Consider some connected component and let v be some vector in it. We can assume the component is not too large, since the number of large components is trivially bounded (large being close to n^{1−ε}). Since each M_v is a matching, the random choice of the vectors in the i-th step will (with good probability) add an edge from v to a neighbor that is not likely to land in the connected component containing v. Hence, with good probability the connected component will "merge" with another component. Carefully analyzing this process gives us the required bound.
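The component-counting step can be illustrated concretely. In the toy sketch below (ours, not the paper's code), each edge records a pair made proportional by the projection; since proportionality is an equivalence relation, the dimension after projecting U to zero is at most the number of connected components.

```python
def count_components(n, edges):
    """Union-find over vertices 0..n-1; one union per proportional pair."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    return len({find(v) for v in range(n)})

# 6 vectors; projecting some U makes (0,1), (1,2) and (3,4) proportional,
# so the projected code spans at most 3 independent directions.
assert count_components(6, [(0, 1), (1, 2), (3, 4)]) == 3
```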

Organization
We begin with some general preliminaries and notation in Section 4. In Section 5 we describe (and sketch the proof of) Barthe's theorem, which is used in Section 6 to reduce to the case that the points in V are well spread. In Section 7 we show how to reduce to the case that V has low triple multiplicities. Section 8 contains the proof of the clustering step and Section 9 contains the proof of the restriction step. Finally, in Section 10 we show how to put all the ingredients together and prove Theorem 2.3.

General preliminaries

Choice of notation
Lists vs. multisets. The reason we are treating V as a list and not as a set is that V might have repetitions. For instance, u and v might be distinct elements in the list V, but might correspond to the same vector in F^d. The repetition corresponds to the fact that there might be repeated columns in the generator matrix of the code, which may potentially make the property of local correction easier to satisfy. Indeed, in the recent lower bounds for 2-query LCCs [8, 5], handling the fact that there might be repetitions added significant complexity to the proofs of the lower bounds. In the current paper, too, we deal with repetitions by treating V as a list. An equivalent treatment would be to treat V as a multiset, and we make no distinction between these notions. We think of a multiset as an ordered list of elements which might contain repeated elements. If A is a multiset/list, we call B a subset of A if B is another multiset/list obtained by taking a subset of A. We will say that B and C are disjoint subsets of A if they are both obtained from sub-lists on disjoint subsets of the indices. When referring to the size of a multiset we will always count the number of elements with multiplicities (unless we state explicitly that we are counting distinct elements).
Although we defined a matching to be a set of tuples in [n], when we are dealing with a specific list V = (v_1, ..., v_n), we might identify a tuple (j_1, ..., j_q) of a matching with the tuple (v_{j_1}, ..., v_{j_q}), and we use these two notions interchangeably. Moreover, a matching M_v denotes the matching corresponding to a particular element v ∈ V, and if u and v are different elements of V, even if they correspond to the same vector in F^d, then M_u and M_v could be different matchings.

Basic operations on LCCs
For a list V ∈ (R^d)^n we denote by span(V) the subspace spanned by elements of V and by dim(V) the dimension of this span.
The following simple observation shows that a sufficiently large subset of an LCC is also an LCC.
Claim 4.1. Let V be a linear (3, δ)-LCC in geometric form and let U ⊆ V be a subset of size at least (1 − δ/2)n. Then U is a linear (3, δ/2)-LCC with dim(U) = dim(V). Moreover, if M_v are any matchings used in the decoding of V, then we can take the matchings for the new code U to be subsets of the old matchings.
Proof. Observe that in each matching M_v there are at most (δ/2)n triples that contain an element outside U. Thus, in U we can construct matchings of size at least (δ/2)n ≥ (δ/2)|U|. The claim about the dimension follows from the fact that U contains triples spanning all of the elements of V (not just those in U).
Another simple observation is that applying an invertible linear map to the elements of V preserves the property of being an LCC.

Lower bounds for 2-query LDCs
One of the ingredients in the proof will be a strong (exponential) lower bound on the length of linear 2-query Locally Decodable Codes (LDCs), which are weaker versions of LCCs. As with LCCs, there are two ways of defining LDCs.

Definition 4.3 (Linear q-LDC, decoder definition). A linear (q, δ)-LDC over a field F is a linear d-dimensional subspace U ⊂ F^n, together with a set of d coordinates j_1, j_2, ..., j_d ∈ [n] such that the projection of U onto those d coordinates is full-dimensional, and such that there exists a randomized decoding procedure D : F^n × [d] → F with the following properties:

1. For all x ∈ U, for all i ∈ [d] and for all y ∈ F^n with w(y) ≤ δn we have that D(x + y, i) = x_{j_i} with probability at least 3/4. (The probability is taken only over the internal randomness of D.)

2. For every y ∈ F^n and i ∈ [d], the decoder D(y, i) reads at most q positions in y.
Let {e_1, e_2, ..., e_d} be the set of standard basis vectors in R^d. As with LCCs, taking the rows of the generating matrix (and possibly applying an invertible linear map that sends them to the e_i's) allows us to move to the geometric form. This might require us to replace δ with δ/q.

Definition 4.4 (Linear q-LDC, geometric definition). Let V = (v_1, ..., v_n) ∈ (F^d)^n be a list of n vectors spanning F^d. We say that V is a linear (q, δ)-LDC in geometric form if for every i ∈ [d] there exists a q-matching M_i in [n] of size at least δn such that for every q-tuple {v_{j_1}, v_{j_2}, ..., v_{j_q}} ∈ M_i it holds that e_i ∈ span{v_{j_1}, v_{j_2}, ..., v_{j_q}}. We denote d = dim(V).

Theorem 4.5 (Lower bounds for 2-LDCs [16]). Let δ ∈ [0, 1], F be a field, and let

Codes in regular form
In the restriction step, it is convenient for us to assume that for each triple (v_i, v_j, v_k) ∈ M_v, each element of the triple is "used" in decoding v. Indeed, in Claim 4.7 we show how we can easily reduce to this case, provided that no large subset of V has low rank. More precisely, for x, y, z ∈ R^d, let us denote by span*{x, y, z} the set of all elements of the form αx + βy + γz with α, β, γ ∈ R such that α, β, γ are all nonzero.
Claim 4.7 (informal). Under the above assumption on V, there is a sub-code V′ ⊆ V of size at least (1 − δ/2)n and dimension d′ = d that is in regular form. Moreover, given any matchings M_v for the code V, we can take the new (regular) matchings for V′ to be sub-matchings of the original ones.
Proof. Call a triple (x, y, z) ∈ M_v bad if there is a proper subset of it that spans v, i.e., if v ∉ span*{x, y, z}. If there were (δ/2)n points v ∈ V, each with at least (δ/10)n bad triples in M_v, then we could use these bad triples to construct a (2, δ/10)-LDC of size less than n decoding ω((1/δ) log(n)) linearly independent elements of V. This would give a contradiction using Theorem 4.5 and the assumption on the dimension of any set of size (δ/2)n in V. Therefore, there are at most (δ/2)n points v ∈ V with many (at least (δ/10)n) bad triples. Throwing away this set, and removing all triples containing them (as well as all bad triples from the other matchings), gives us the code V′ as required (as in Claim 4.1).

Barthe's theorem
The main purpose of this section is to derive Lemma 5.1, a result of F. Barthe [6] which, given a set of points sufficiently close to being in general position, finds a linear transformation that "moves" these points so that their "directions" point in a close-to-uniform way. More precisely, for a list U = (u_1, ..., u_n) ∈ (R^d)^n, let B(U) be the set of all subsets of [n] of size d such that the corresponding vectors of U form a basis of R^d. Suppose that there is a distribution µ supported on B(U) such that, when sampling a random basis from µ, each element of U is chosen with some good probability. Then there is an invertible linear transformation such that, after normalizing, the new points are "approximately isotropic." This result is formalized in Lemma 5.1, which we state below.
Lemma 5.1. Let U = (u_1, . . ., u_n) ∈ (R^d)^n be a list of points spanning R^d, let S ⊆ [n], and suppose µ is a distribution supported on B(U) such that for all j ∈ S, α ≤ Pr_{I∼µ}[j ∈ I]. Then, there exists an invertible linear map M : R^d → R^d so that, denoting û_j = Mu_j/‖Mu_j‖, we have for all unit vectors w ∈ R^d

∑_{j∈S} ⟨û_j, w⟩^2 ≤ 2/α.

Observe that if the vectors are in general position then the uniform distribution on distinct d-tuples gives α = d/n, in which case we would get ∑_{j∈[n]} ⟨û_j, w⟩^2 ≤ 2n/d. One can just assume the lemma above, which follows in a straightforward way from [6], and skip to the next section. However, for completeness, we present a proof here. Before we give the proof, we first set up some notation.

THEORY OF COMPUTING, Volume 13 (11), 2017, pp. 1-36
For a finite set S, a distribution supported on S is a function µ : S → [0, 1] so that ∑_{x∈S} µ(x) = 1. For two vectors u, v ∈ R^d we denote by u ⊗ v the tensor product of u and v, namely the d × d matrix A with entries A_{ij} = u_i v_j. We denote by I_{d×d} the d × d identity matrix. For u ∈ R^d we denote by ‖u‖ the Euclidean (or ℓ_2) norm.
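As a quick numerical illustration of this notation (our own sketch, not part of the paper): the tensor product is the outer product, and a weighted family of unit vectors is "approximately isotropic" when the weighted sum of the matrices û_j ⊗ û_j is close to the identity.

```python
import numpy as np

# Tensor product u (x) v: the d x d matrix with entries (u (x) v)_{ij} = u_i * v_j.
u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])
A = np.outer(u, v)
assert A[0, 1] == u[0] * v[1]  # entry (0, 1) equals u_0 * v_1 = 4.0

# The standard basis of R^d with unit weights is exactly isotropic:
# sum_j e_j (x) e_j = I_{dxd}.
d = 3
moment = sum(np.outer(e, e) for e in np.eye(d))
assert np.allclose(moment, np.eye(d))
```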
We denote by U_I = (u_i)_{i∈I} the sub-list of U with indices in I. We denote by B(U) the set of index sets corresponding to sub-lists of U of length d which are linearly independent (and so span R^d). For each I ⊂ [n] we let 1_I ∈ R^n denote the indicator vector of the set I. Finally, we denote by K(U) ⊂ R^n the convex hull of the vectors 1_I for all I ∈ B(U). We denote by K(U)° the relative interior of K(U).

Claim 5.3 (Properties of K(U)). Let U = (u_1, . . ., u_n) ∈ (R^d)^n be a list of n points spanning R^d. Let µ be a distribution supported on B(U). For each j ∈ [n], let γ_j ∈ [0, 1] be the probability that j ∈ I when I ⊂ [n] is sampled according to µ. Then γ = (γ_1, . . ., γ_n) is in K(U).
Proof. The vector γ is easily seen to be equal to the convex combination γ = ∑_{I∈B(U)} µ(I) · 1_I.

Theorem 5.4. Let U = (u_1, . . ., u_n) ∈ (R^d)^n be a list of n points spanning R^d and let γ ∈ K(U)°. Then there exists a real invertible d × d matrix M such that, denoting û_j = Mu_j/‖Mu_j‖, we have

∑_{j=1}^n γ_j · (û_j ⊗ û_j) = I_{d×d}.    (5.1)

Proof. We will show how the proof follows from one of the propositions proved in [6] (whose proof we will not repeat here). The idea is to define a certain optimization problem parametrized by γ and to show that the maximum is achieved for all γ ∈ K(U)°. Then, the matrix M will arise from equating the gradient to zero at the maximum and solving the resulting equations. We start by defining the optimization problem. For t ∈ R^n we define

X(t) = ∑_{j=1}^n e^{t_j} (u_j ⊗ u_j).

Notice that X(t) has a positive determinant for all t ∈ R^n. The optimization problem is defined as

φ*(γ) = sup_{t∈R^n} f(t), where f(t) = ⟨γ, t⟩ − log det X(t).

We now state a claim from [6] which gives sufficient conditions for the supremum φ*(γ) to be realized.
Let t* ∈ R^n be a maximizer given by the claim. We can now use the fact that the partial derivatives all vanish at the point t*. Recall that ∂/∂t_j log det X(t) = e^{t_j} tr(X(t)^{-1}(u_j ⊗ u_j)) at all points where X(t) is invertible [24, Ch. 9, Thm. 4]. Taking the derivative of f at t* then gives

γ_j = e^{t*_j} tr(X(t*)^{-1}(u_j ⊗ u_j)) for all j ∈ [n].

Since X(t*)^{-1} is positive definite, there exists a symmetric matrix M so that M^2 = X(t*)^{-1}. Plugging this into the last equation and using properties of the trace function, we get γ_j = e^{t*_j} ‖Mu_j‖^2. This means that

∑_{j=1}^n γ_j (û_j ⊗ û_j) = ∑_{j=1}^n e^{t*_j} (Mu_j) ⊗ (Mu_j) = M X(t*) M.

Multiplying X(t*) = M^{-2} by M from both sides, we get ∑_{j=1}^n γ_j (û_j ⊗ û_j) = I_{d×d}, as was required.
Proof of Lemma 5.1. Let γ ∈ R^n be such that γ_j = Pr_{I∼µ}[j ∈ I] for each j ∈ [n]. By Claim 5.3, γ ∈ K(U). This means we can find γ' ∈ K(U)° of distance at most ε from γ, for every ε > 0. Hence, we can choose ε sufficiently small so that α/2 ≤ γ'_j for all j ∈ S. Using Theorem 5.4 we get that there exists an invertible M so that ∑_{j=1}^n γ'_j (û_j ⊗ û_j) = I_{d×d}. Multiplying by the row vector w^t from the left and by the column vector w from the right, we get that ∑_{j=1}^n γ'_j ⟨û_j, w⟩^2 = 1, and hence ∑_{j∈S} ⟨û_j, w⟩^2 ≤ 2/α. This completes the proof.
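A transformation M as in Lemma 5.1 can also be searched for numerically. The following sketch is ours (the paper obtains M from the optimization problem above, not this way): it uses the natural fixed-point iteration M ← A^{-1/2}M, where A is the current weighted moment matrix. Convergence is not guaranteed for badly placed points, but the iteration behaves well on generic inputs.

```python
import numpy as np

def sqrtm_psd(A):
    """Symmetric PSD square root via eigendecomposition."""
    w, Q = np.linalg.eigh(A)
    return (Q * np.sqrt(np.clip(w, 0.0, None))) @ Q.T

def isotropic_transform(U, gamma, iters=2000, tol=1e-10):
    """Search for an invertible M with sum_j gamma_j (u^_j (x) u^_j) = I,
    where u^_j = M u_j / ||M u_j||.  Plain fixed-point iteration; this is
    a sketch only, with no convergence guarantee in general."""
    n, d = U.shape
    M = np.eye(d)
    for _ in range(iters):
        V = U @ M.T                          # rows are M u_j
        V /= np.linalg.norm(V, axis=1, keepdims=True)
        A = (V * gamma[:, None]).T @ V       # sum_j gamma_j u^_j u^_j^T
        if np.linalg.norm(A - np.eye(d)) < tol:
            break
        M = np.linalg.inv(sqrtm_psd(A)) @ M  # push the moment matrix toward I
    return M

rng = np.random.default_rng(0)
n, d = 40, 3
U = rng.normal(size=(n, d))
gamma = np.full(n, d / n)                    # weights summing to d
M = isotropic_transform(U, gamma)
V = U @ M.T
V /= np.linalg.norm(V, axis=1, keepdims=True)
assert np.allclose((V * gamma[:, None]).T @ V, np.eye(d), atol=1e-6)
```

For points in general position with the uniform weights γ_j = d/n, the fixed point exists by Barthe's theorem; the iteration merely tries to find it.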
6 Reducing to the well-spread vectors case

In this section we prove a lemma saying that, when analyzing an LCC V = (v_1, . . ., v_n) over R, we can assume that the elements of V are unit vectors pointing in well-spread directions. The notion of well-spread vectors that we use is the one given by Barthe's theorem (Lemma 5.1). More formally, the lemma will say that any list of vectors can be transformed into a list that is well spread, as long as it does not contain a large subset of low rank. We formalize this result in Theorem 6.4. Below we state a lemma which basically follows as a corollary of the above theorem when the original list of vectors is an LCC. We first state and prove this lemma.
of size at least (1 − δ/2)n, and an invertible linear map M : R^d → R^d so that, denoting û_j = Mu_j/‖Mu_j‖, we have for all unit vectors w ∈ R^d

∑_j ⟨û_j, w⟩^2 ≤ n/(βd).

Recall that (Observation 4.2) applying an invertible linear map to the elements of an LCC V preserves the property of being an LCC. Hence, if we are aiming to prove that a (3, δ)-LCC V has a large subset of low rank, we could use Lemma 6.1 to reduce to the case that the points of V are well spread.
We will prove Lemma 6.1 using Lemma 5.1. Recall that Lemma 5.1 provides us with sufficient conditions under which a linear map M as in the lemma exists. Namely, that there exists a distribution µ on spanning d-tuples of V which hits each element of V with probability not too small. We will show that, if this condition does not hold (that is, if such a µ does not exist), we can find a large low-rank subset V' ⊂ V. The high-level idea is to consider the greedy distribution on d-tuples that is sampled as follows: iteratively pick a random unspanned element from V and add it to the spanning set until we cover all of V. If this distribution gives low probabilities for many elements of V, then we show that it must be due to the fact that these elements lie in some low-dimensional subspace. The following definition will be crucial for this argument.

Definition 6.2 ((η, τ)-independent set). Let U = (u_1, . . ., u_n) ∈ (R^d)^n be a list of n points spanning R^d. We say that U is (η, τ)-independent if there exists a distribution µ supported on B(U), and a set S ⊆ [n] of size |S| ≥ (1 − η)n, such that Pr_{I∼µ}[j ∈ I] ≥ τd/n for all j ∈ S. Since every I ∼ µ has exactly d elements, observe that for every distribution µ,

∑_{j∈[n]} Pr_{I∼µ}[j ∈ I] = d.

Moreover, if the points were in "general position," i. e., every d of the points were linearly independent, then by taking the distribution µ to be the uniform distribution on d-tuples with distinct elements, we would get a (0, 1)-independent set.
Lemma 6.3. If U is not (η, τ)-independent, then there exists a subspace W of dimension at most τd which contains at least ηn elements of U.
Proof. We construct a subset S ⊂ [n] by the following greedy process. Start with S = ∅. At the j-th step we check whether the vectors {u_i | i ∉ S} span a subspace of dimension at least τd. If they do, we add to S a tuple S_j of size τd that is linearly independent. (That is, {u_i | i ∈ S_j} are linearly independent vectors.) If {u_i | i ∉ S} have dimension lower than τd, we halt. Let W be the subspace spanned by the complement of S at the end of this process. Notice that W has dimension at most τd. Now, consider the following distribution on B(U). We first pick uniformly at random one of the sets S_j described above and add to our basis the corresponding (linearly independent) elements of U. Then we complete this set to a basis in some fixed way. For example, this can be done by taking the first basis in some fixed order that contains the elements of S_j. For each element in S, the probability of picking it to be in the basis is τd/|S| ≥ τd/n. Since we are assuming that U is not (η, τ)-independent, the size of S^c must be at least ηn. By the definition of W, this completes the proof.
Proof of Lemma 6.1. Applying Lemma 6.3 we get that V must be (δ/2, 2β)-independent. Otherwise, V would contain a subset V' of size (δ/4)n and dimension at most 4βd (contradicting the assumption in the lemma). Hence, there exists a distribution µ on B(V) and a set S ⊂ [n] as in Definition 6.2. Let U = (u_1, . . ., u_{n'}) be the sub-list of V indexed by S, with n' = |S|. Lemma 5.1 now implies that there exists an invertible linear map M so that, denoting û_j = Mu_j/‖Mu_j‖, we have for all unit vectors w ∈ R^d

∑_{j∈[n']} ⟨û_j, w⟩^2 ≤ n/(βd).

Notice that U is a (3, δ/2)-LCC, since the complement of U can intersect at most δn/2 triples from each matching in V. This completes the proof of the lemma.

A convenient form of Barthe's theorem
The proof of Lemma 6.1 actually gives a more general result (not mentioning LCCs) that might be of independent interest.

Proof. The conditions on V and Lemma 6.3 imply that V is (2α, β/2)-independent. Then, using Lemma 5.1, we get the map M and a set S as required.

Reduction to the low triple-multiplicity case
In this section we prove a lemma showing that, when analyzing a (3, δ)-LCC V over any field F, it is enough to consider codes in which the matchings M_v, v ∈ V used in the decoding are such that each triple appears in a small number of matchings. (Otherwise we can find a large subset of low rank.)

Definition 7.1 (Triple-multiplicity). We say that a (3, δ)-LCC V with matchings M_v, v ∈ V has triple-multiplicity at most r if each triple in each M_v appears in at most r of the matchings.
Lemma 7.2. Let F be a field, let n ≥ (1/δ)^{ω(1)}, and let V be a (3, δ)-LCC as above. Then, there exists a (3, δ/24)-LCC U ⊂ V with |U| ≥ (δ/4)n and matchings M'_v, v ∈ U, so that U (with the matchings M'_v) has triple-multiplicity at most n^β and the matchings M'_v are subsets of the corresponding matchings M_v.
Proof. We first reduce to the situation where every element participates in many triples. Unless mentioned otherwise, we will count triples with multiplicity. Let 0 < γ = δ^2/6 be a real number. Iteratively delete vertices from V that participate in fewer than γn triples (counted with multiplicity), together with the triples they participate in. Let B ⊆ V be the subset of deleted elements, and let V' = V \ B. Since each deleted vertex only gets rid of at most γn triples, the total number of triples which include some vertex of B is at most γn^2. Thus each element in V' participates in at least γn triples, and at least (δ − γ)n^2 > (2δ/3)n^2 of the triples in V are supported entirely in V'. Call this set of triples T.
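The iterative deletion in this step can be sketched concretely as follows (a toy implementation with names of our own choosing):

```python
def prune_low_degree(elements, triples, threshold):
    """Iteratively delete elements that appear in fewer than `threshold`
    surviving triples, together with the triples containing them.  Every
    surviving element then appears in at least `threshold` surviving
    triples, mirroring the refinement step in the proof."""
    alive = set(elements)
    T = list(triples)
    changed = True
    while changed:
        changed = False
        T = [t for t in T if all(x in alive for x in t)]
        deg = {}
        for t in T:
            for x in t:
                deg[x] = deg.get(x, 0) + 1
        for x in list(alive):
            if deg.get(x, 0) < threshold:
                alive.discard(x)
                changed = True
    return alive, T

elements = range(6)
triples = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3), (4, 5, 0)]
alive, T = prune_low_degree(elements, triples, 2)
assert 4 not in alive and 5 not in alive        # degree-1 vertices are removed
assert all(x in alive for t in T for x in t)    # surviving triples lie in `alive`
```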
Proof. This is because there must be some v ∈ V with at least (2δ/3)n triples in its matching that still survive in T; if this were not the case, we would have |T| < (2δ/3)n^2. Since the triples in the matching corresponding to v are disjoint, |V'| ≥ 2δn.

Let B' ⊂ V' be the subset of points in V' which have fewer than δn/2 of the triples in their matching supported in V'. Let V'' = V' \ B'.

Proof. There can be at most δn/3 elements in V' such that δn/2 triples in their matchings include an element from B; if there were more than that, then the total number of triples including an element from B would be greater than (δn/3) · (δn/2) ≥ δ^2 n^2/6 ≥ γn^2, which is not possible. Thus, at least |V'| − δn/3 of the elements in V' have a matching of size at least δn/2 decoding them, lying wholly within V''. Thus |V''| ≥ |V'| − δn/3. Let T' be the union of all the triples in the LCC V''.
We will call a triple in T a high-multiplicity triple if it has multiplicity at least n^β in T; otherwise we will call it a low-multiplicity triple.

Claim 7.5. Fewer than (δ/24)|V''| of the elements in V'' have at least half of their matchings (in T) composed of high-multiplicity triples.

Proof. Suppose the claim does not hold. That is, at least (δ/24)|V''| of the elements in V'' have at least half of their matchings (in T) composed of high-multiplicity triples.
We now delete all the triples of low multiplicity from T. Since there are at least (δ^2/288)|V''|^2 triples (counting multiplicity) of multiplicity at least n^β in the LCC V'', by averaging there exists v ∈ V'' that participates in at least (δ^2/288)|V''| triples (counted with multiplicity), each of which has multiplicity at least n^β. Observe that, since all these triples contain v, no two of them are part of a matching corresponding to the same element.
By greedily choosing distinct triples containing v of highest multiplicity, one can pick a set T* of distinct triples of size at most n^{1/2−β/2} such that together they span at least n^{1/2+β/2} distinct elements of V. This is true since n^{1/2+β/2} ≤ (δ^2/288)|V''|, each triple of multiplicity n^β spans at least n^β distinct elements, and distinct triples sharing an element must span distinct elements. By "distinct" here we mean distinct LCC indices (not necessarily distinct vectors).
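The greedy selection here can be sketched as follows (our toy model: a triple of multiplicity m is credited with m spanned elements, since, as argued above, triples sharing an element decode distinct elements):

```python
def greedy_pick(triples_with_mult, target):
    """Pick distinct triples in decreasing order of multiplicity until the
    picked triples together span `target` elements; a triple of
    multiplicity m is credited with m spanned elements."""
    chosen, spanned = [], 0
    for triple, m in sorted(triples_with_mult.items(), key=lambda kv: -kv[1]):
        chosen.append(triple)
        spanned += m
        if spanned >= target:
            break
    return chosen, spanned

# all triples share the vertex v = 1, as in the argument above
mult = {(1, 2, 3): 5, (1, 4, 5): 4, (1, 6, 7): 2, (1, 8, 9): 1}
chosen, spanned = greedy_pick(mult, 9)
assert spanned >= 9 and len(chosen) == 2   # the two heaviest triples suffice
```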
Let L be a linear transformation of co-rank at most 3n^{1/2−β/2} which maps each element participating in a triple of T* to 0. Since all the elements spanned by the triples of T* also get mapped to 0, at least n^{1/2+β/2} elements of V get mapped to 0 under L. Let this set be V*. Recall that each element of V (and hence of V*) participates in γn triples, which together decode γn distinct elements of V.
Let S ⊂ V be the subset of all elements whose matching contains at least (γ/6)n^{1/2+β/2} triples that each contain some element from V*. Since the total number of triples containing some element from V* is at least |V*| · γn/3, by a simple counting argument we get that |S| ≥ (γ/6)n.
To finish the proof of Claim 7.5 we will now argue that dim(S) ≤ 2n^{1/2−β/3}. For contradiction, assume dim(S) > 2n^{1/2−β/3}. Moreover, since L sends V* to 0, all triples containing some element of V* now have at most 2 nonzero elements, and thus the triples can be replaced by pairs. Thus L(V) is a (2, (γ/6)n^{−1/2+β/2})-LDC of size n, decoding to linearly independent vectors spanning at least n^{1/2−β/3} dimensions. Using Theorem 4.5 (lower bound for 2-query LDCs) we get a bound that, since n ≥ (1/δ)^{ω(1)}, γ = poly(δ) and β = Ω(1), yields a contradiction (for large enough n). Thus, the set S has size at least (γ/6)n = δ^2 n/36 and dimension at most 2n^{1/2−β/3} ≤ n^{1/2−β/4}, contradicting the assumption in Lemma 7.2. This completes the proof of Claim 7.5.

Applying Claim 7.5, we see that one can delete all triples of multiplicity greater than n^β and delete at most δ|V''|/24 elements to get a subset U such that each element of U has a matching of δ|U|/24 triples decoding to it, where the triples are supported in U. Thus U is a (3, δ/24)-LCC with |U| ≥ δn/4, and with all triples of multiplicity at most n^β. This completes the proof of Lemma 7.2.

If M is the multiset of all triples in all matchings used to decode V, then there are at most δ^2 n^2/100 triples in M that are not clustered by S_1, . . ., S_m.
To prove the intermediate clustering lemma we first prove a basic clustering lemma. First, we show how to use the intermediate clustering lemma to prove the final Lemma 8.2. After that, we will prove the basic clustering lemma and, using it, easily derive the intermediate clustering lemma.
Proof of Lemma 8.2. At a high level, the proof follows by first applying Lemma 6.1 to get the well-spread vectors condition on the points in a large sub-LCC V' of V. Then we use Lemma 7.2 on V' to get a subcode V'' with low triple-multiplicity. (This does not ruin the well-spread vectors condition by much.) Finally, we apply Lemma 8.3 on V'' to get clustering for almost all triples. The only reason one of these steps could fail is if we found a large low-dimensional subset in V (which would contradict our assumptions). A final refinement step, using Claim 4.1, shows the existence of a subcode V''' as required. The details follow.
Reducing to the well-spread vectors case. We apply Lemma 6.1 on V, with β = 2δ^6, to obtain a subset V' of size n' ≥ (1 − δ/2)n so that V' is a (3, δ' = δ/2)-LCC and so that for each unit vector w ∈ R^d we have ∑_{v∈V'} ⟨v, w⟩^2 ≤ O(n/(δ^6 d)). If we cannot apply Lemma 6.1, it means that there is a subset U in V of size |U| ≥ (δ/4)n and dimension at most 8δ^6 d, which would contradict our assumptions.
Reducing to low triple-multiplicity. We now apply Lemma 7.2 on the LCC V' to get a (3, δ/48)-LCC V'' ⊂ V' of size n'' ≥ (δ/8)n and with triple-multiplicity at most (n'')^β ≤ (n')^{2β}. If we cannot apply the lemma, it means that there is a subset of V' which would contradict our assumptions. Let d'' = dim(V'') and δ'' = δ^2/2. We can think of V'' as a (3, δ'')-LCC over R^{d''} in which the well-spread vectors condition above holds for all unit vectors w ∈ R^{d''}. (We took δ'' = δ^2/2 to compensate for the drop in n in the above inequality.) Notice that moving from R^d to R^{d''} is not a problem, since we can orthogonally project all vectors on the span of V'' and maintain all inner products with all unit vectors.
Clustering. We can now apply Lemma 8.3 on V'' to find sets S_1, . . ., S_m that cluster all but (δ''^2/100)(n'')^2 of the triples in the decoding matchings of V'', with |S_i| ≤ O(n''/(δ''^6 d'')) for all i ∈ [m], and (using t = n''/(δ''^6 d'')) a corresponding bound on the number m of sets. If we cannot apply the lemma, it means that d'' ≤ (1/δ)^{O(1)}, which would contradict our assumptions on V (since it would have a subset V'' of size n'' ≥ (δ/8)n and dimension (1/δ)^{O(1)} < n^{1/4}).
Refinement. To complete the proof, observe that there are at least (1 − δ''/10)n'' points in V'' that have at least half of their matchings clustered by S_1, . . ., S_m. Hence, we can use Claim 4.1 to find a (3, δ''')-LCC V''' ⊂ V'' of size n''' ≥ (1 − δ''/10)n'' ≥ (δ/10)n, with δ''' ≥ δ''/2 ≥ δ^2/4, so that the sets S_1, . . ., S_m (restricted to indices in V''') cluster all the triples in the matchings of V'''. Notice that, since d''' = d'', δ''' = Θ(δ'') and n''' = Θ(n''), the bounds on the sizes of the sets S_i and on m still hold (the difference in constants will be swallowed by the big "O"). This completes the proof of Lemma 8.2.

Preliminaries for the proof of the clustering lemmas
We denote by ‖v‖ the ℓ_2 norm of a vector v. Notice that for two unit vectors u and v, ‖u − v‖^2 = 2 − 2⟨u, v⟩. We denote the correlation between two unit vectors v, u as |⟨v, u⟩|. Let V be as in Lemma 8.3, with matchings M_v, v ∈ V. The conditions of Lemma 8.3 (which we will assume to hold for the rest of this section) tell us that for all unit vectors u ∈ R^d we have

∑_{v∈V} ⟨v, u⟩^2 ≤ t.

This has the following useful consequence.
Claim 8.5. For every unit vector u ∈ R^d and every α > 0, we have |{v ∈ V : |⟨v, u⟩| ≥ α}| ≤ t/α^2.

We can also bound the number of points in V that correlate with a given plane.

Claim 8.6. Let P be a two-dimensional plane and let K = {v ∈ V : |⟨v, u⟩| ≥ α for some unit vector u ∈ P}. Then |K| ≤ (80/α^3)t.
Proof. For each v ∈ K let u(v) ∈ P be a unit vector with |⟨v, u(v)⟩| ≥ α. Now, cover the boundary of the unit circle in P with at most 20/α balls of radius at most α/2. By a pigeonhole argument, one of these balls must contain at least α|K|/20 of the points u(v). Now, the center of this ball must have correlation at least α/2 with all of the α|K|/20 corresponding vectors v. Applying Claim 8.5 we get that |K| ≤ (80/α^3)t.
For every v ∈ V, let M*_v ⊆ M_v be the set of triples (x, y, z) ∈ M_v such that |⟨x, v⟩|, |⟨y, v⟩|, |⟨z, v⟩| ≤ 10^{-4}. That is, M*_v is the subset of the triples decoding v in which each vector of the triple has low correlation with v. Intuitively, such triples must be close to a two-dimensional plane and hence "almost" dependent.
The following is an immediate corollary of Claim 8.5.

Claim 8.7. For every v ∈ V, |M*_v| ≥ |M_v| − 10^8 t.
Let M* be the (multiset) union of all triples in M*_v for all v ∈ V. By Claim 8.7, M* has size at least δn^2 − 10^8 tn.
The following proposition bounds the number of triples in M * containing a fixed pair of vertices.
Proposition 8.8. For all i ≠ j ∈ [n], there are at most O(tn^β) triples (counting multiplicities) in M* containing the pair (v_i, v_j).
Proof. We will show a bound of O(t) on the number of distinct triples containing (v_i, v_j). The O(tn^β) bound will then follow from our assumption on the maximum multiplicity of triples in M (and so also in M*). Let P = span{v_i, v_j}. Consider a triple (v_i, v_j, v_k) containing v_i, v_j, and suppose this triple belongs to some matching M*_v. Let Π = span{v_k, v} and observe that both planes P and Π (both are indeed planes, since the property of the LCC being regular implies the distinctness of the points in a triple and of the point they are used to decode) are contained in the three-dimensional subspace span{v_i, v_j, v_k}. Therefore, they must intersect in some unit vector w ∈ P ∩ Π. Now, since |⟨v_k, v⟩| ≤ 10^{-4}, a simple calculation shows that w must have correlation at least 1/10 with either v_k or v (since w belongs to their span and they are close to being orthogonal). To summarize, we have shown that in every triple (v_i, v_j, v_k) ∈ M_v, one of the vectors v, v_k has correlation at least 1/10 with the plane P. Now, the number of distinct vectors in the union of {v, v_k}, as we go over all distinct triples containing {v_i, v_j}, is at most O(t) by Claim 8.6. If the total number of distinct triples is r, then at least r/2 of the vectors v correlate with P, or at least r/2 of the vectors v_k correlate with P. In either case we see that r/2 = O(t), and hence r = O(t).

Definition 8.9 (Triple types). We split the triples appearing in M* into two types.
A triple in M* is defined to be of Type A if there exists a pair of vertices in the triple, say (v_i, v_j), with |⟨v_i, v_j⟩| ≥ 9/10; otherwise it is defined to be of Type B. When we refer to a triple as Type A or Type B, we will implicitly assume that this triple is in M*. We first state and prove three simple propositions that will be useful in the proof of the basic clustering lemma. Below, we will sometimes refer to the elements of V as "vertices." The reader not wishing to follow the somewhat tedious calculations can recall the high-level overview given in Section 3.

Proposition 8.10. Let (v_i, v_j, v_k) be a triple of Type B. Then either |⟨v_i, v_j⟩| ≥ 1/100 or |⟨v_i, v_k⟩| ≥ 1/100.

Proof. Suppose the triple decodes to the vector u and, by an appropriate orthogonal change of basis (which does not change distances or inner products), let us assume that the vectors all lie in the three-dimensional space spanned by the unit vectors e_1, e_2 and e_3. We can also assume that u = e_1, that v_i is a linear combination of e_1 and e_2, and that v_j and v_k are linear combinations of e_1, e_2 and e_3.
Since the vectors in the triple are uncorrelated with u, their inner products with e_1 have absolute value at most 1/10^4. Since v_i is a unit vector, ⟨v_i, e_1⟩^2 + ⟨v_i, e_2⟩^2 = 1 and hence ⟨v_i, e_2⟩^2 ≥ 1 − 10^{-8}.

Proposition 8.11. Suppose T is a set of m distinct triples of Type B, each sharing the pair (v_i, v_j). Let S be the set of size m containing all the vertices of the triples in T except v_i and v_j. Then there is a ball of radius at most 5/10^4 containing at least m/10^5 points of S.
Proof. We will first show that every point of S is close to the subspace spanned by v_i and v_j, and then apply a pigeonhole argument.
Let v_k ∈ S. Then (v_i, v_j, v_k) is a triple of Type B, and in particular the triple is in M*_u for some vertex u.
By an appropriate orthogonal change of basis (which does not change distances or inner products), we can assume that the vectors all lie in the three-dimensional space spanned by the unit vectors e_1, e_2 and e_3. We can also assume that v_i = e_1, that v_j is a linear combination of e_1 and e_2, and that v_k is a linear combination of e_1, e_2 and e_3. Now consider the unit circle C in the subspace spanned by e_1 and e_2. We will show that each element of S is at distance at most 4/10^4 from C. To see this, observe that for v_k ∈ S, the projection ṽ_k of v_k onto the subspace spanned by e_1 and e_2 is of length at least 1 − 2/10^4 (by the triangle inequality). Thus ṽ_k is at distance at most 2/10^4 from C, and also at distance at most 2/10^4 from v_k. Thus, again by the triangle inequality, the distance between v_k and C is at most 4/10^4. Now cover C with 10^5 two-dimensional discs of radius 1/10^4; clearly this can be done. Thus each element v_k of S is at distance at most 5/10^4 from the center of one of these discs. Hence, for one of these discs, there are at least m/10^5 points of S at distance at most 5/10^4 from its center.

Proposition 8.12. Let G be an edge-weighted k-uniform hypergraph on n vertices with k ≥ 2. Define the degree of a vertex to be the sum of the weights of all hyperedges containing it. Suppose the average degree of a vertex in G is D. Then, there exists a vertex-induced subgraph G' of G in which every vertex has degree at least D/k.

Proof. To obtain G' we iteratively delete vertices whose current degree is less than D/k. Observe that, after each deletion, the average degree in the hypergraph strictly increases. Indeed, after removing a vertex of degree D' < D/k, the sum of all degrees decreases by at most kD' < D (each deleted hyperedge is counted in at most k degrees), so the new average is more than (nD − D)/(n − 1) = D. Thus the process must terminate, at which point all remaining vertices have degree at least D/k.
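Proposition 8.12's deletion process is constructive; here is a small sketch for the graph case (k = 2), with hypothetical names of our own:

```python
def high_degree_subgraph(n, weighted_edges, k=2):
    """Iteratively delete vertices of degree < D/k, where D is the average
    degree of the ORIGINAL hypergraph; every survivor ends with degree
    at least D/k.  `weighted_edges` is a list of (vertex_tuple, weight)."""
    total = sum(w for _e, w in weighted_edges)
    D = k * total / n                      # average degree = k * sum(w) / n
    alive = set(range(n))
    while True:
        deg = {v: 0.0 for v in alive}
        for e, w in weighted_edges:
            if all(v in alive for v in e):
                for v in e:
                    deg[v] += w
        bad = [v for v in alive if deg[v] < D / k]
        if not bad:
            return alive
        alive -= set(bad)

edges = [((0, 1), 3.0), ((1, 2), 3.0), ((0, 2), 3.0), ((3, 0), 0.1)]
alive = high_degree_subgraph(4, edges, k=2)
assert alive == {0, 1, 2}      # vertex 3 has low degree and is deleted
```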

Basic clustering: proof of Lemma 8.4
At this point, we may assume that V is well spread over the unit sphere, and that its triples have low multiplicity and are nearly orthogonal to the vectors of V that they decode. Each triple is either of Type A, containing a very correlated pair, or of Type B, consisting of pairwise uncorrelated vectors.
We first show that having many triples of the same type implies that we can find a small set of vertices such that many of the triples intersect the set in at least two of their elements. This will be the main step in the proof of Lemma 8.4, which is given below. Recall that we have an upper bound of n^β on the multiplicity of each triple in M*.

Lemma 8.13. Suppose there is a subset T of γn^2 triples (counting multiplicities) in M* of the same type (either Type A or Type B). Then there is a set S ⊆ V such that |S| = O(t), and at least Ω(γ^2 n^{2−β}/t) triples in T intersect S in at least two of their elements.
Proof. We separate into two cases according to the type of the triples in T. In both cases, we will first refine to the situation where every vertex is incident to many (γn) triples. In both cases we will find a cluster V* of nearby vertices, and let S be some kind of neighborhood of V*, such that every triple which intersects V* will also intersect S in two elements. Since V* will be incident to many triples, we will conclude that many triples intersect S in two elements. Moreover, we will ensure that every vertex in S has some constant correlation with some fixed, carefully chosen vertex w. Since every element of S correlates with w, Claim 8.5 implies that S cannot be too large. In the case of Type A triples the argument is fairly straightforward, whereas in the case of Type B triples the argument is more delicate.

Case 1: T has only triples of Type A. Consider the following weighted graph H on vertex set V, in which the edges are all pairs (v_i, v_j) with |⟨v_i, v_j⟩| ≥ 9/10, and the weight of an edge (v_i, v_j) is the number of triples in T, counting multiplicities, that contain this pair (we discard edges of weight zero). We define the degree deg(v) of a vertex as the sum of weights over all edges of H that contain v. Since (1/2) ∑_v deg(v) ≥ |T|, the average degree in H is at least D = 2|T|/n ≥ 2γn.
Let H' be a vertex-induced subgraph of H in which every vertex has degree at least D/2 (such a subgraph exists by Proposition 8.12). Let w be any vertex in H' and observe that, by Proposition 8.8, w must have at least r = Ω(γn^{1−β}/t) distinct neighbors u_1, . . ., u_r (since the maximal weight of an edge is O(tn^β)). Let V* = {u_1, . . ., u_r}. We define the set S to contain the vertices u_1, . . ., u_r ∈ V* as well as all of their neighbors.
First, we argue that S cannot be too large. To see this, observe that if (v_i, v_j) is an edge in H then v_j must have ℓ_2 distance at most 1/√5 from either v_i or −v_i. Thus, since all vertices in S are at (graph) distance less than two from w, they are all contained in the union of two balls of radius 2/√5 around w and around −w. This means that all points in S must have correlation at least 4/6 with w. Using Claim 8.5, we get that |S| ≤ O(t).
To see that there are many triples with two elements in S, observe that the sum over all weights of edges touching u_1, . . ., u_r is at least r · γn ≥ Ω(γ^2 n^{2−β}/t) (using the fact that H' has high minimum degree). Since every triple is counted at most 3 times in this sum, we conclude that there are at least Ω(γ^2 n^{2−β}/t) triples with a pair in S.

Case 2: T has only triples of Type B. Recall that every point v_k in V* is incident to at least γn/3 triples lying within G', and, by Proposition 8.10, for each of these triples there exists a vertex v'_k distinct from v_k in that triple such that |⟨v_k, v'_k⟩| ≥ 1/100. Let S = {u ∈ V : |⟨u, w'⟩| ≥ 1/100 for some w' ∈ V*} be the set of all vertices that have correlation at least 1/100 with some vertex of V*. Fix w ∈ V*. Then for any u ∈ S, by definition of S, there exists w' ∈ V* such that |⟨u, w'⟩| > 1/100. Also, since the radius of V* is at most 5/10^4, we have ‖w − w'‖ ≤ 1/10^3. Together, these imply that |⟨u, w⟩| > 1/10^3. Since this holds for all u ∈ S (and for the same fixed w), by Claim 8.5 we get that |S| < 10^6 t.
Moreover, observe that each triple that intersects V* must intersect S in two elements. Since each vertex of V* is incident to at least γn/3 triples, and each triple is counted at most 3 times, there must be at least |V*| · γn/9 triples with a pair in S.
and so that all triples in T_1 are clustered. We now let M' = M \ T_1 and continue in this manner to generate S_2, S_3, . . ., S_m and (disjoint) T_2, T_3, . . ., T_m, removing the triples in the T_i from M as we proceed, until there are at most δ^2 n^2/100 triples in M that are not clustered.
This only leaves the task of bounding the number of iterations, m. The upper bound follows from the fact that the sets T_i are disjoint, each of size at least Ω(δ^4 n^{2−β}/t), and that |M| ≤ δn^2. The lower bound follows from the observation that, by Proposition 8.8, each T_i can have size at most O(t^3 n^β) (each of the O(t^2) pairs inside S_i is contained in at most O(tn^β) triples). Since the union of the T_i contains at least Ω(|M|) ≥ Ω(δn^2) triples, we get that m ≥ Ω(δn^{2−β}/t^3). This completes the proof of Lemma 8.3.
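Schematically, the iteration in this proof has the following set-cover-like shape, where `find_cluster` stands in for Lemma 8.13 (a toy sketch of ours, not the actual construction):

```python
from collections import Counter

def cluster_triples(M, find_cluster, leftover_bound):
    """Repeatedly extract a cluster S_i together with the batch T_i of
    triples it explains (those meeting S_i in >= 2 elements), removing
    T_i from M, until at most `leftover_bound` triples remain."""
    clusters = []
    M = list(M)
    while len(M) > leftover_bound:
        S = find_cluster(M)                  # Lemma 8.13 supplies such a set
        T = [t for t in M if len(set(t) & S) >= 2]
        if not T:
            break                            # no progress; stop
        clusters.append(S)
        M = [t for t in M if len(set(t) & S) < 2]
    return clusters, M

# toy stand-in: cluster around the two most frequent elements
def most_frequent_pair(M):
    c = Counter(x for t in M for x in t)
    return {x for x, _cnt in c.most_common(2)}

M = [(1, 2, 3), (1, 2, 4), (1, 2, 5), (6, 7, 8)]
clusters, leftover = cluster_triples(M, most_frequent_pair, leftover_bound=1)
assert clusters == [{1, 2}]
assert leftover == [(6, 7, 8)]
```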

Clustering implies large low-rank subset
The main result of this section is the following lemma that gives a dimension upper bound for LCCs in which the triples are "clustered."An informal statement of this lemma was given as Lemma 9.1.
Notice that this lemma works over any field F.

Lemma 9.1. Let n, β, ε, V and S_1, . . ., S_m satisfy the conditions above. Then there is a subset V' ⊂ V of size |V'| ≥ (δ/2)n and rank at most n^{1/2−ε}.
This lemma will be an easy corollary of the following lemma, which shows that there is a small subset of V such that, when projecting this set to zero, the dimension of V drops by a lot.

Lemma 9.2 (Restriction lemma). Let n, β, ε, V and S_1, . . ., S_m satisfy the conditions of Lemma 9.1. Assume further that the matchings M_v are in regular form (no "2-query" triples). If d > n^{1/2−ε}, then there exists a subset U ⊂ V with |U| ≤ n^{1/4+7ε} such that, if we send U to zero by a linear map, the dimension of span{V} goes down to at most n^{10ε}.

We prove the Restriction lemma (Lemma 9.2) below, following the short proof of Lemma 9.1 from Lemma 9.2.
Proof of Lemma 9.1. Using Claim 4.7 we can reduce to the case that the code V and the matchings M_v are in regular form (that is, there are no "2-query" triples). Indeed, replacing V with the code given in Claim 4.7 leaves us with a new code (with n and δ the same up to a constant) satisfying the same clustering requirements (using the same sets S_1, . . ., S_m) and with the same dimension. If we cannot apply Claim 4.7, it is because there is a subset U ⊂ V of size (δ/2)n and dimension at most O((1/δ) log(n)) < n^{1/2−ε}, in which case the proof is done.
Next, suppose towards a contradiction that d > n^{1/2−ε} (otherwise we let V' = V). Apply Lemma 9.2 to get a subset U ⊂ V with |U| ≤ n^{1/4+7ε} such that, if we send U to zero by a linear map, the dimension of span{V} goes down to at most n^{10ε}. The existence of such a U implies that d ≤ n^{1/4+7ε} + n^{10ε}, which gives a contradiction if ε < 1/50.

Proof of Lemma 9.2
Using the assumption d > n^{1/2−ε} we get, for each i ∈ [m], a bound on |S_i|, and upper and lower bounds on the number of sets m.

Claim 9.6. Let i ∈ [m], let A ⊂ S_i, and let f_{A,i} be defined as above. If L : F^d → F^d is any linear map sending A to zero, then L(v) ∼ L(f_{A,i}(v)) for all v for which f_{A,i}(v) ≠ ⊥.
Proof. If f_{A,i}(v) ≠ ⊥, then there is a triple (x, y, z) ∈ M_v with x, y ∈ A and f_{A,i}(v) = z. Since v ∈ span{x, y, z}, we get that L(v) ∈ span{L(x), L(y), L(z)} = span{L(f_{A,i}(v))}. Similarly, since we are assuming that v is not in the span of x, y (as the matchings M_v are in regular form), z is in the span of v, x, y, and so L(z) ∈ span{L(v)}.
Proof. By Observation 9.5, it is enough to analyze the probability for the distribution µ. Fixing v ∈ V, we call a set S_i heavy if it contains at least n^{1/2−2ε} pairs from P_v (recall Claim 9.3). Since we are choosing each element of S_i with probability n^{−1/4+ε}, the probability to "miss" a single pair from P_v is exactly 1 − n^{−1/2+2ε}. If S_i is heavy, then (using the fact that P_v is a matching) the probability that A contains at least one of the pairs in P_v is at least 1 − (1 − n^{−1/2+2ε})^{n^{1/2−2ε}} ≥ 1 − 1/e. We now bound from below the probability that S_i is heavy. Recall that |P_v| ≥ δn and that m ≤ O(δ^{−10}n^{1/2+ε+β}). Let m_h + m_ℓ = m, where m_h is the number of heavy sets S_i. Since each S_i can contain at most |S_i|/2 = O(δ^{−6}n^{1/2+ε}) disjoint pairs, we have that

m_ℓ · n^{1/2−2ε} + m_h · O(δ^{−6}n^{1/2+ε}) ≥ |P_v| ≥ δn.

This implies (since β < ε/2) that m_h ≥ Ω(δ^7 n^{1/2−ε}).
Combining the above two bounds, we get that the probability of picking a heavy cluster and then picking some pair in P_v is at least Ω(δ^{17}n^{−3ε}).

Proof. By Observation 9.5, it is enough to analyze the probability for the distribution µ. Suppose z appears in a triple (u, w, z) ∈ M_v that is associated with S_î for some î ∈ [m]. (If there is no such î, then the probability in question is equal to zero.) By our definition of the functions f_{A,i}, it is only possible for f_{A,i}(v) = z to hold if i = î and both u and w are chosen to be in the set A ⊂ S_î. The probability to pick i = î is 1/m ≤ O(δ^{−19}n^{−1/2+3ε+β}). Now, conditioned on this event, the probability of picking both u and w to be in A is n^{−1/2+2ε}. Multiplying, and using the bound β < ε/2, we get the required bound.

Claim 9.9. Let (A, i) ∼ µ' and let B ⊂ V be a set with |B| ≤ n^{1−10ε}. Then, for every v ∈ V, the analogous bound holds. Indeed, by Claims 9.7 and 9.8, rearranging and using the fact that n ≥ (1/δ)^{ω(1)}, we get that p ≥ Ω(δ^{17}n^{−3ε}).
The set U. To define the set $U$ required in Lemma 9.2, we proceed as follows. Let $r$ be an integer to be determined later, and pick $r$ sets $A_1, \ldots, A_r \subset V$ and $r$ indices $i_1, \ldots, i_r \in [m]$ so that each $(A_j, i_j)$ is sampled independently according to the distribution $\mu$. Let $U = \bigcup_{j=1}^{r} A_j$ and let $f_1 = f_{A_1,i_1}, \ldots, f_r = f_{A_r,i_r}$ be the corresponding (partial) functions on $V$. Our goal is to show that, with probability greater than zero, setting $U$ to zero by a linear map reduces the dimension of $V$ to at most $n^{10\varepsilon}$.
We begin by defining a sequence of undirected graphs $H_0, H_1, \ldots, H_r$ on vertex set $V$, which will depend on the choice of the sets $A_1, \ldots, A_r$. The first graph $H_0$ is the empty graph (containing no edges). We define $H_j$ inductively by adding to $H_{j-1}$ all edges of the form $(v, f_j(v))$ over all $v \in V$ with $f_j(v) \neq \bot$. For $j = 1, \ldots, r$, let $k_j$ denote the number of connected components of $H_j$.

Claim 9.10. If $L : \mathbb{F}^d \to \mathbb{F}^d$ is any linear map sending $U$ to zero, then $\operatorname{span}\{L(V)\}$ has dimension at most $k_r$.
Proof. This is an easy corollary of Claim 9.6. If $L(U) = 0$ then, for every edge $(x, y)$ in $H_r$, we have $L(x) \sim L(y)$. Since the relation $\sim$ is transitive, each connected component of $H_r$ is mapped by $L$ into a single one-dimensional subspace, so $\operatorname{span}\{L(V)\}$ has dimension at most the number of connected components $k_r$.
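Claim 9.10 can be illustrated with a toy computation (hypothetical data, not the paper's construction): count the components of $H_r$ with union-find, place each component on one line as the proof dictates after applying $L$, and verify the rank bound.

```python
import numpy as np

# Toy illustration of Claim 9.10 (hypothetical data, not the paper's LCC):
# every edge (v, f_j(v)) of H_r forces L(v) ~ L(f_j(v)), so after applying L
# each connected component lies on one line through the origin, and hence
# dim span{L(V)} is at most the number of components k_r.

def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

n, d = 10, 6
edges = [(0, 1), (1, 2), (3, 4), (5, 6), (6, 7), (8, 9)]  # edges of H_r
parent = list(range(n))
for u, v in edges:
    ru, rv = find(parent, u), find(parent, v)
    if ru != rv:
        parent[ru] = rv
roots = sorted({find(parent, x) for x in range(n)})
k_r = len(roots)                       # number of connected components

# Place each component on its own line, as the proof guarantees after L.
rng = np.random.default_rng(0)
line = {r: rng.standard_normal(d) for r in roots}
LV = np.array([rng.standard_normal() * line[find(parent, x)] for x in range(n)])
assert np.linalg.matrix_rank(LV) <= k_r
```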
This completes the proof of Theorem 2.3.
10.1 Proof of Lemma 10.2

Observe that, for $v \in V \setminus S$, not all three points of a triple in $M_v$ can lie in $S$, since $\operatorname{span}_V(S) = S$. Thus we may assume that $|S| \le (1 - \delta)n$, since otherwise each vector in $V \setminus S$ would be spanned by the points of $S$ and we would be done. If Case 1 does not hold, then for each $v \in V \setminus S$, at least $3\delta n/4$ of the triples of $M_v$ intersect $S$ in at most one point. Let us call a point $v$ type-zero if it has at least $3\delta n/8$ of its triples contained in $V \setminus S$, and type-one otherwise. Notice that, if $v$ is type-one, then it must have at least $3\delta n/8$ of its triples intersecting $S$ in exactly one point. We now separate into two additional cases. Setting $S' = S \cup U$ we are done.
Case 3: There are at least $\delta n/4$ type-one points. In this case, there are $\delta n/4$ points $v$ in $V \setminus S$, each having at least $3\delta n/8$ of the triples in $M_v$ intersecting $S$ in exactly one point. Let $A$ be a linear transformation whose kernel equals $\operatorname{span}(S)$. After applying $A$ to $V \setminus S$ we obtain a $(2, 3\delta/4)$-LDC decoding the $\delta n/4$ type-one points. Thus the $\delta n/4$ points (after we apply the mapping $A$ to them) must span at most $\operatorname{poly}(1/\delta) \log n \le \max\{\delta^6 d,\, n^{1/2-\varepsilon/16}\}$ dimensions by Theorem 4.5. Thus, adding them to $S$ will increase the dimension of its span by at most this number. This completes the proof also in this case.

Theorem 6.4. Let $V = (v_1, \ldots, v_n) \in (\mathbb{R}^d)^n$ with $\dim(V) = d$ be such that any subset $U \subset V$ of size $|U| \ge \alpha n$ has $\dim(U) \ge \beta d$. Then there exists an invertible linear map $M : \mathbb{R}^d \to \mathbb{R}^d$ and a subset $S \subset V$ of size $|S| \ge (1 - 2\alpha)n$ so that, if we denote $\tilde{v} = Mv/\|Mv\|$, we have for all unit vectors $w \in \mathbb{R}^d$
$$\sum_{v \in S} \langle \tilde{v}, w \rangle^2 \;\le\; \frac{4n}{\beta d} .$$
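A minimal numerical sketch of the kind of conclusion Theorem 6.4 provides: generic random points stand in for the LCC vectors, and whitening by the inverse square root of the covariance is used as one natural stand-in for the map $M$ (for such generic points every large subset is full-dimensional, i.e., effectively $\beta = 1$).

```python
import numpy as np

# Numerical sketch of the near-isotropy conclusion (generic random points;
# M is taken to be the covariance whitening map, a stand-in choice).
rng = np.random.default_rng(1)
n, d = 2000, 20
V = rng.standard_normal((n, d)) @ rng.standard_normal((d, d))  # skewed set

cov = V.T @ V / n
evals, evecs = np.linalg.eigh(cov)
M = evecs @ np.diag(evals ** -0.5) @ evecs.T   # M = cov^(-1/2)

W = V @ M.T                                    # apply M to every vector
tilde = W / np.linalg.norm(W, axis=1, keepdims=True)   # v~ = Mv / ||Mv||

# sup over unit w of sum_v <v~, w>^2 equals the top eigenvalue of the
# d x d matrix tilde^T tilde; for generic points (beta = 1) it should be
# close to n/d and, in particular, within the 4n/(beta*d) bound.
top = float(np.linalg.eigvalsh(tilde.T @ tilde).max())
assert top <= 4 * n / d
```

Note that the whitening map makes the covariance of the images the identity, but Theorem 6.4 normalizes each image to a unit vector first; the point of the theorem (via Barthe) is that a single $M$ achieves near-isotropy even after this normalization, on a large subset $S$.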

Lemma 8.4 (Basic Clustering). Let $n, t, \beta, \delta$ and $V \in (\mathbb{R}^d)^n$ be as in Lemma 8.3 and let $M$ be the multiset of triples obtained by taking the union of all $M_v$, $v \in V$. Let $M' \subset M$ be of size at least $\delta^2 n^2 / 100$ and suppose that $d > 10^8 \cdot 200/\delta^8$. Then there exists a subset $S \subset V$ with $|S| \le O(t)$ and a subset $T \subset M'$ with $|T| \ge \Omega(\delta^4 n^{2-\beta}/t)$ such that each triple in $T$ contains at least two elements from $S$.

Case 2: $T$ has only triples of Type B. Consider the following 3-uniform weighted hypergraph $G$. The set of vertices of $G$ is the set $V$. For each triple $(v_i, v_j, v_k) \in T$ we have a hyperedge in $G$ with weight equal to the multiplicity of that triple in $T$. By Proposition 8.12, there is a subgraph $G'$ of $G$ such that every vertex of $G'$ is incident to at least $\gamma n/3$ triples (counting weights) lying within $G'$. Pick any vertex $v \in G'$. Let $C_v$ be the multiset $\{v' \in V \mid |\langle v, v' \rangle| > 1/100\}$ of vectors that are slightly correlated with $v$. By Claim 8.5 (the well-spread vectors condition) we have $|C_v| < t \cdot 10^4$. By Proposition 8.10, every triple containing $v$ has another vertex $v'$ such that $|\langle v, v' \rangle| > 1/100$ (and thus $v' \in C_v$). Thus, by a simple averaging argument, it must be that for some $v' \in C_v$, the pair $(v, v')$ participates in at least $\gamma n/(3|C_v|)$ triples (counting multiplicities). Using the bound on triple-multiplicity, we get that there is a set $T^*$ of at least $\gamma n/(3|C_v| n^{\beta})$ distinct triples containing $v$ and $v'$. Thus
$$|T^*| \;\ge\; \frac{\gamma n}{3 |C_v| n^{\beta}} \;\ge\; \frac{\gamma n^{1-\beta}}{3 \cdot 10^4 \cdot t}$$
and, by Proposition 8.11, at least $|T^*|/10^5$ vertices (of $G'$) lie in a ball of radius $5/10^4$. Call this set of vertices $V^*$. Thus what we have so far is a set $V^*$ of vertices of $G'$, all lying in a ball of radius $5/10^4$, with
$$|V^*| \;\ge\; \frac{\gamma n^{1-\beta}}{3 \cdot 10^9 \cdot t} .$$
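The averaging step above (some $v' \in C_v$ shares many triples with $v$) is a plain pigeonhole count; a toy sketch with hypothetical triples:

```python
from collections import Counter

# Pigeonhole behind the averaging argument (toy, hypothetical triples):
# every triple through v contains a partner from the small set C_v, so some
# partner must appear in at least (#triples through v) / |C_v| of them.
v = 0
C_v = {1, 2, 3}   # in the proof, |C_v| < 10^4 * t
triples = [(0, 1, 7), (0, 1, 8), (0, 2, 9), (0, 1, 5), (0, 3, 6), (0, 2, 4)]
assert all(tri[0] == v for tri in triples)   # all triples pass through v

counts = Counter(next(u for u in tri if u in C_v) for tri in triples)
partner, hits = counts.most_common(1)[0]
assert hits >= len(triples) / len(C_v)   # some pair (v, v') is in many triples
```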

Case 1: There exists a $v \in V \setminus S$ such that $\delta n/4$ of the triples in $M_v$ have two of their points contained in $S$. In this case let $S' = \operatorname{span}_V(\{v\} \cup S)$. Then $|S'| \ge |S| + (\delta/4)n$ and $\dim(S') \le \dim(S) + 1$.