Arithmetic Circuits with Locally Low Algebraic Rank

In recent years, there has been a flurry of activity towards proving lower bounds for homogeneous depth-4 arithmetic circuits, which has brought us very close to statements that are known to imply $\textsf{VP} \neq \textsf{VNP}$. It is open if these techniques can go beyond homogeneity, and in this paper we make some progress in this direction by considering depth-4 circuits of low algebraic rank, which are a natural extension of homogeneous depth-4 circuits. A depth-4 circuit is a representation of an $N$-variate, degree-$n$ polynomial $P$ as \[ P = \sum_{i = 1}^T Q_{i1}\cdot Q_{i2}\cdot \cdots \cdot Q_{it} \; , \] where the $Q_{ij}$ are given by their monomial expansion. Homogeneity adds the constraint that for every $i \in [T]$, $\sum_{j} \operatorname{deg}(Q_{ij}) = n$. We study an extension, where, for every $i \in [T]$, the algebraic rank of the set $\{Q_{i1}, Q_{i2}, \ldots ,Q_{it}\}$ of polynomials is at most some parameter $k$. Already for $k = n$, these circuits are a generalization of the class of homogeneous depth-4 circuits, where in particular $t \leq n$ (and hence $k \leq n$). We study lower bounds and polynomial identity tests for such circuits and prove the following results. We show an $\exp{(\Omega(\sqrt{n}\log N))}$ lower bound for such circuits for an explicit $N$ variate degree $n$ polynomial family when $k \leq n$. We also show quasipolynomial hitting sets when the degree of each $Q_{ij}$ and the $k$ are at most $\operatorname{poly}(\log n)$. A key technical ingredient of the proofs, which may be of independent interest, is a result which states that over any field of characteristic zero, up to a translation, every polynomial in a set of polynomials can be written as a function of the polynomials in a transcendence basis of the set. We combine this with methods based on shifted partial derivatives to obtain our final results.


Introduction
Arithmetic circuits are natural algebraic analogues of Boolean circuits, with the logical operations being replaced by sum and product operations over the underlying field. Valiant [44] developed the complexity theory for algebraic computation via arithmetic circuits and defined the complexity classes VP and VNP as the algebraic analogs of complexity classes P and NP respectively. We refer the interested reader to the survey by Shpilka and Yehudayoff [42] for more on arithmetic circuits.
Two of the most fundamental questions in the study of algebraic computation are polynomial identity testing (PIT) and proving lower bounds for explicit polynomials. Structural results known as depth reductions [2,24,43] show that strong enough lower bounds or PIT results for just (homogeneous) depth-4 circuits would lead to superpolynomial lower bounds and derandomized PIT for general circuits too. Consequently, depth-4 arithmetic circuits have been the focus of much investigation in the last few years.
Just in the last few years, we have seen rapid progress in proving lower bounds for homogeneous depth-4 arithmetic circuits, starting with the work of Gupta et al. [13], who proved exponential lower bounds for homogeneous depth-4 circuits with bounded bottom fan-in, and culminating in the results of Kayal et al. [18] and of the authors of this paper [29], which showed exponential lower bounds for general homogeneous depth-4 circuits. Any asymptotic improvement in the exponent of these lower bounds would lead to superpolynomial lower bounds for general arithmetic circuits. Most of this progress was based on an understanding of the complexity measure of the family of shifted partial derivatives of a polynomial (this measure was introduced by Kayal [17]), and other closely related measures.
Although we now know how to use these measures to prove such strong lower bounds for homogeneous depth-4 circuits, the best known lower bounds for non-homogeneous depth three circuits over fields of characteristic zero are just cubic [41,39,21], and those for non-homogeneous depth-4 circuits over any field except $\mathbb{F}_2$ are just about superlinear [33]. It remains an extremely interesting question to get improved lower bounds for these circuit classes.
In sharp contrast to this state of knowledge on lower bounds, the problem of polynomial identity testing is very poorly understood even for depth three circuits. Till a few years ago, almost all the PIT algorithms known were for extremely restricted classes of circuits and were based on diverse proof techniques (for instance, [7,23,15,22,14,37,38,36,1,10,30]). The paper by Agrawal et al. [1] gave a unified proof of several of them.
It is a big question to go beyond homogeneity (especially for proving lower bounds) and in this paper we make progress towards this question by considering depth-4 circuits of low algebraic rank, which are a natural extension of homogeneous depth-4 arithmetic circuits.
A depth-4 circuit is a representation of an $N$-variate, degree-$n$ polynomial $P$ as \[ P = \sum_{i = 1}^T Q_{i1}\cdot Q_{i2}\cdot \cdots \cdot Q_{it} \; , \] where the $Q_{ij}$ are given by their monomial expansion. Homogeneity adds the constraint that for every $i \in [T]$, $\sum_{j} \operatorname{deg}(Q_{ij}) = n$. We study an extension where, for every $i \in [T]$, the algebraic rank of the set $\{Q_{i1}, Q_{i2}, \ldots, Q_{it}\}$ of polynomials is at most some parameter $k$. We call this the class of $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits. Already for $k = n$, these circuits are a strong generalization of the class of homogeneous depth-4 circuits, where in particular $t \leq n$ (and hence $k \leq n$). We prove exponential lower bounds for $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits for $k \leq n$ and give quasipolynomial-time deterministic polynomial identity tests for $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits when $k$ and the bottom fan-in are bounded by $\operatorname{poly}(\log N)$. All our results actually hold for a more general class of circuits, where the product gates at the second level can be replaced by arbitrary circuits whose inputs are polynomials of algebraic rank at most $k$. In particular, our results hold for representations of a polynomial $P$ as \[ P = \sum_{i = 1}^T C_i(Q_{i1}, Q_{i2}, \ldots, Q_{it}) \; , \] where, for every $i \in [T]$, $C_i$ is an arbitrary polynomial function of $t$ inputs, and the algebraic rank of the set $\{Q_{i1}, Q_{i2}, \ldots, Q_{it}\}$ of polynomials is at most some parameter $k$.

Some background and motivation
Before we more formally define the model and state our results, we give some background and motivation for studying this class of circuits.
Strengthening of the model of homogeneous depth-4 circuits. As already mentioned, we know very strong exponential lower bounds for homogeneous depth-4 arithmetic circuits. In contrast, for general (non-homogeneous) depth-4 circuits, we know only barely superlinear lower bounds, and it is a challenge to obtain improved bounds. The class of $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits with $k$ as large as $n$ (the degree of the polynomial being computed), which is the class we study in this paper, is already a significant strengthening of the model of homogeneous depth-4 circuits (since the intermediate degrees could be exponentially large). We provide exponential lower bounds for this model. Note that when $k = N$, $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits would capture general depth-4 arithmetic circuits.
Low algebraic rank and lower bounds. In a recent paper, Agrawal et al. [1] studied the notion of circuits of low algebraic rank and, by using the Jacobian to capture the notion of algebraic independence, they were able to prove exponential lower bounds for a certain class of arithmetic circuits (see footnote 4). They showed that over fields of characteristic zero, for any set $\{Q_1, Q_2, \ldots, Q_t\}$ of polynomials of sparsity at most $s$ and algebraic rank $k$, any arithmetic circuit of the form $C(Q_1, Q_2, \ldots, Q_t)$ which computes the determinant polynomial for an $n \times n$ symbolic matrix must have $s \geq \exp(n/k)$. Note that if $k = \Omega(n)$, then the lower bound becomes trivial. The lower bounds in this paper strengthen these results in two ways.
1. Our lower bounds hold for a (potentially) richer class of circuits. In the model considered by [1], one imposes a global upper bound $k$ on the rank of all the $Q_i$ feeding into some polynomial $C$. In our model, we can take exponentially many different sets of polynomials $Q_i$, each with bounded rank, apply some polynomial function to each of them, and then take a sum.
2. Our lower bounds are stronger: we obtain exponential lower bounds even when $k$ is as large as the degree of the polynomial being computed.
Footnote 4: Even more significantly, they also give efficient PIT algorithms for the same class of circuits.
THEORY OF COMPUTING, Volume 13 (6), 2017, pp. 1-33
Algebraic rank and going beyond homogeneity. Even though we know exponential lower bounds for homogeneous depth-4 circuits (these results, in fact, hold for depth-4 circuits with not-too-large formal degree), the best known lower bounds for non-homogeneous depth-4 circuits are barely superlinear [33]. Grigoriev-Karpinski [11], Grigoriev-Razborov [12] and Shpilka-Wigderson [41] outlined a program based on "rank" to prove lower bounds for arithmetic circuits. They used the notion of "linear rank" to prove lower bounds for depth-3 arithmetic circuits in the following way. Let $C = \sum_{i=1}^{T} \prod_{j=1}^{t} L_{ij}$ be a depth three (possibly non-homogeneous) circuit computing a polynomial $P$ of degree $n$. Now, partition the inputs to the top sum gate into two parts, $C_1$ and $C_2$, based on the rank of the inputs feeding into it in the following way. For each $i \in [T]$, if the linear rank of the set $\{L_{ij} : j \in [t]\}$ of polynomials is at most $k$ (for some threshold $k$), then include the gate $i$ in the sum $C_1$, else include it in $C_2$. Therefore, \[ C = C_1 + C_2 \; . \] Their program had two steps.
1. Show that the subcircuit $C_1$ is weak with respect to some complexity measure, and thus prove a lower bound for $C_1$ (and hence $C$) when $C_2$ is trivial.
2. Also, since $C_2$ is "high rank," show that there are many inputs for which $C_2$ is identically zero. Then try to look at restrictions over which $C_2$ is identically zero, and show that the lower bounds for $C_1$ continue to hold.
The following is the natural generalization of this approach to proving lower bounds for depth-4 circuits. Let $C = \sum_{i=1}^{T} \prod_{j=1}^{t} Q_{ij}$ be a depth-4 circuit computing a polynomial $P$ of degree $n$. Note that in general, the formal degree of $C$ could be much larger than $n$. Now, we partition the inputs to the top sum gate into two parts, $C_1$ and $C_2$, based on the algebraic rank of the inputs feeding into it in the following way. For each $i \in [T]$, if the algebraic rank of the set $\{Q_{ij} : j \in [t]\}$ of polynomials is at most $k$ (for some threshold $k$), then we include the gate $i$ in the sum $C_1$, else we include it in $C_2$. Therefore, \[ C = C_1 + C_2 \; . \] To implement the program of Grigoriev-Karpinski, Grigoriev-Razborov and Shpilka-Wigderson, as a first step one would show that the subcircuit $C_1$ is weak with respect to some complexity measure, and thus prove a lower bound for $C_1$ (and hence $C$) when $C_2$ is trivial. The second step would be to try to look at restrictions over which $C_2$ is identically zero, and show that the lower bounds for $C_1$ continue to hold.
For the case of depth-4 circuits, even the first step of proving lower bounds when $C_2$ is trivial was not known prior to this work (even for $k = 2$). Our results in this paper are an implementation of this first step, as we prove exponential lower bounds when the algebraic rank of inputs into each of the product gates is at most $n$ (the degree of the polynomial being computed).
Connections to divisibility testing. Recently, Forbes [9] showed that given two sparse multivariate polynomials $P$ and $Q$, the question of deciding if $P$ divides $Q$ can be reduced to the question of polynomial identity testing for $\Sigma\Pi^{(2)}\Sigma\Pi$ circuits. This question was one of the original motivations for this paper. Although we are unable to answer this question in general, we make some progress towards it by giving quasipolynomial-time identity tests for $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits when the various $Q_{ij}$ feeding into the circuit have degree bounded by $\operatorname{poly}(\log N)$ (and we are also able to handle $k$ as large as $\operatorname{poly}(\log N)$).
Low algebraic rank and PIT. Two very interesting PIT results which are also very relevant to the results in this paper are those of Beecken et al. [3] and those of Agrawal et al. [1]. The key idea explored in both these papers is that of algebraic independence. Together, they imply efficient deterministic PIT for polynomials which can be expressed in the form $C(Q_1, Q_2, \ldots, Q_t)$, where $C$ is a circuit of polynomial degree and the $Q_i$ are either sparse polynomials or products of linear forms, such that the algebraic rank of $\{Q_1, Q_2, \ldots, Q_t\}$ is bounded (see Section 2 for definitions). This approach was extremely powerful: Agrawal et al. [1] demonstrate that they can use it to recover many of the known PIT results, which otherwise had very different proof techniques. The PIT results of this paper hold for a variation of the model just described, and we describe it in more detail in Section 1.3.3.
Polynomials with low algebraic rank. In addition to potential applications to arithmetic circuit complexity, it seems an interesting mathematical question to understand the structure of a set of algebraically dependent polynomials. In general, our understanding of algebraic dependence is not as clear as our understanding of linear dependence. For instance, we know that if a set of polynomials is linearly dependent, then every polynomial in the set can be written as a linear combination of the polynomials in the basis. However, for higher degree dependencies (linear dependence is dependency of degree 1), we do not know any such clean statement. As a significant core of our proofs, we prove a statement of this flavor in Lemma 1.10.
We now formally define the model of computation studied in this paper, and then state and discuss our results.

Model of computation
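We start with the definition of algebraic dependence. See Section 2 for more details.
Definition 1.1 (Algebraic independence and algebraic rank). Let $\mathbb{F}$ be any field. A set $\mathcal{Q} = \{Q_1, Q_2, \ldots, Q_t\}$ of polynomials in $N$ variables over $\mathbb{F}$ is said to be algebraically independent over $\mathbb{F}$ if there is no nonzero $t$-variate polynomial $R$ over $\mathbb{F}$ such that $R(Q_1, Q_2, \ldots, Q_t)$ is identically zero. A maximal subset of $\mathcal{Q}$ which is algebraically independent is said to be a transcendence basis of $\mathcal{Q}$ and the size of such a set is said to be the algebraic rank of $\mathcal{Q}$.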
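As a small illustration of these notions, consider three bivariate polynomials. The polynomials below are chosen here purely for illustration and do not appear in the paper. The sketch exhibits a nonzero polynomial $R$ with $R(Q_1, Q_2, Q_3) = 0$ and uses the Jacobian criterion (valid over fields of characteristic zero) to confirm independence of a subset.

```latex
% Illustrative example (hypothetical polynomials, N = 2); needs amsmath for pmatrix.
\[
  Q_1 = X_1 + X_2, \qquad
  Q_2 = X_1^2 + 2X_1X_2 + X_2^2, \qquad
  Q_3 = X_1 .
\]
% The nonzero polynomial R(Y_1, Y_2, Y_3) = Y_1^2 - Y_2 vanishes on the tuple:
\[
  R(Q_1, Q_2, Q_3) = (X_1 + X_2)^2 - (X_1^2 + 2X_1X_2 + X_2^2) = 0 ,
\]
% so {Q_1, Q_2, Q_3} is algebraically dependent.
% The Jacobian of (Q_1, Q_3) with respect to (X_1, X_2) is nonsingular:
\[
  \det \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} = -1 \neq 0 ,
\]
% so {Q_1, Q_3} is algebraically independent.
```

In this example $\{Q_1, Q_3\}$ is a transcendence basis and the algebraic rank is 2. The set $\{Q_2, Q_3\}$ is another transcendence basis of the same size, since its Jacobian determinant $-2(X_1 + X_2)$ is a nonzero polynomial.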
It is known that algebraic independence satisfies the matroid property [31], and therefore the algebraic rank is well defined. We are now ready to define the model of computation.
Definition 1.2. Let $\mathbb{F}$ be any field. A $\Sigma\Pi^{(k)}\Sigma\Pi$ circuit $C$ in $N$ variables over $\mathbb{F}$ is a representation of an $N$-variate polynomial as \[ C = \sum_{i=1}^{T} \prod_{j=1}^{t} Q_{ij} \] for some $t, T$ such that for each $i \in [T]$, the algebraic rank of the set $\{Q_{ij} : j \in [t]\}$ of polynomials is at most $k$. Additionally, if for every $i \in [T]$ and $j \in [t]$, the degree of $Q_{ij}$ is at most $d$, we say that $C$ is a $\Sigma\Pi^{(k)}\Sigma\Pi^{[d]}$ circuit.
We will state all our results for $\Sigma\Pi^{(k)}\Sigma\Pi$ and $\Sigma\Pi^{(k)}\Sigma\Pi^{[d]}$ circuits. However, the results in this paper hold for a more general class of circuits where the product gates at the second level can be replaced by arbitrary polynomials. This larger class of circuits will be crucially used in our proofs and we define it formally below.
Definition 1.3. Let $\mathbb{F}$ be any field. A $\Sigma\Gamma^{(k)}\Sigma\Pi$ circuit $C$ in $N$ variables over $\mathbb{F}$ is a representation of an $N$-variate polynomial as \[ C = \sum_{i=1}^{T} \Gamma_i(Q_{i1}, Q_{i2}, \ldots, Q_{it}) \] for some $t, T$ such that $\Gamma_i$ is an arbitrary polynomial in $t$ variables, and for each $i \in [T]$, the algebraic rank of the set $\{Q_{ij} : j \in [t]\}$ of polynomials is at most $k$. Additionally, if for every $i \in [T]$ and $j \in [t]$, the degree of $Q_{ij}$ is at most $d$, we say that $C$ is a $\Sigma\Gamma^{(k)}\Sigma\Pi^{[d]}$ circuit.
Definition 1.4 (Size of a circuit). The size of a $\Sigma\Pi^{(k)}\Sigma\Pi$ or a $\Sigma\Gamma^{(k)}\Sigma\Pi$ circuit $C$ is defined as the maximum of $T$ and the number of monomials in the set \[ \bigcup_{i \in [T],\, j \in [t]} \operatorname{Support}(Q_{ij}) \; . \] Here, for a polynomial $Q$, $\operatorname{Support}(Q)$ is the set of all monomials which appear with a non-zero coefficient in $Q$.
(where $P$ is the polynomial being computed; see footnote 7) is the class of homogeneous depth-4 circuits. If we drop the condition of homogeneity, then in general the value of $t$ could be much larger than $\operatorname{deg}(P)$ and the degrees of the $Q_{ij}$ could be much larger than $\operatorname{deg}(P)$. Thus, the class of $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits with $k$ equaling the degree of the polynomial being computed could potentially be a larger class of circuits compared to that of homogeneous depth-4 circuits.
Also note that in the definition of $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits, the bound on the algebraic rank is local for each $i \in [T]$, and in general, the algebraic rank of the entire set $\{Q_{ij} : i \in [T], j \in [t]\}$ can be as large as $N$.

Our results
We now state our results and discuss how they relate to other known results.

Lower bounds
As our first result, we give exponential lower bounds on the size of $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits computing an explicit polynomial when the algebraic rank ($k$) is at most the degree ($n$) of the polynomial being computed.
Theorem 1.5. Let $\mathbb{F}$ be any field of characteristic zero (sufficiently large characteristic suffices). There exists a family $\{P_n\}$ of polynomials in $\textsf{VNP}$, such that $P_n$ is a polynomial of degree $n$ in $N = n^{O(1)}$ variables with $0,1$ coefficients, and for any $\Sigma\Pi^{(k)}\Sigma\Pi$ circuit $C$, if $k \leq n$ and if $C$ computes $P_n$ over $\mathbb{F}$, then \[ \operatorname{Size}(C) \geq \exp{(\Omega(\sqrt{n}\log N))} \; . \]
Remark 1.6. From our proofs it follows that our lower bounds hold for the more general class of $\Sigma\Gamma^{(k)}\Sigma\Pi$ circuits, but for the sake of simplicity, we state our results in terms of $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits. We believe it is likely that the lower bounds also hold for a polynomial in $\textsf{VP}$ and it would be interesting to know if this is indeed true (see footnote 9).
Remark 1.7. Even though we state Theorem 1.5 for $k \leq n$, the proof goes through as long as $k$ is any polynomial in $n$ and $N$ is chosen to be an appropriately large polynomial in $n$.
Footnote 7: Observe that in this case, $k \leq t \leq \operatorname{deg}(P)$.

Comparison to known results
As we alluded to in the introduction, $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits for $k \geq n$ subsume the class of homogeneous depth-4 circuits. Therefore, Theorem 1.5 subsumes the lower bounds for homogeneous depth-4 circuits [18,29] for sufficiently large characteristic. Moreover, it also subsumes and generalizes the lower bounds of Agrawal et al. [1], since their lower bounds hold only if the algebraic rank of the entire set $\{Q_{ij} : i \in [T], j \in [t]\}$ of polynomials is bounded, while for Theorem 1.5 we only need upper bounds on the algebraic rank separately for every $i \in [T]$.

Polynomial identity tests
We show that there is a quasipolynomial-size hitting set for all polynomials $P \in \Sigma\Pi^{(k)}\Sigma\Pi^{[d]}$ for bounded $d$ and $k$. More formally, we prove the following theorem. We now mention some remarks about Theorem 1.8.
Remark 1.9. It follows from our proof that the hitting set works for the more general class of $\Sigma\Gamma^{(k)}\Sigma\Pi^{[d]}$ circuits with $d, k \leq \log N$, size $\operatorname{poly}(N)$ and formal degree at most $\operatorname{poly}(N)$.

Comparison to known results
The two known results closest to our PIT result are those of Forbes [9] and of Agrawal et al. [1]. Forbes [9] studies PIT for the case where the number of distinct inputs to the second-level product gates in a depth-4 circuit with bounded bottom fan-in is also bounded (which naturally also bounds the algebraic rank of the inputs), and constructs quasipolynomial-size hitting sets for this case. On the other hand, we handle the case where there is no restriction on the number of distinct inputs feeding into the second-level product gates, but we need to bound the bottom fan-in as well as the algebraic rank. In this sense, the results in this paper are a generalization of the results of Forbes [9].
Footnote 9: More on this in Section 6.
Footnote 10: Sufficiently large characteristic suffices.
Agrawal et al. [1] give a construction of polynomial-size hitting sets in the case when the total algebraic rank of the set $\{Q_{ij} : i \in [T], j \in [t]\}$ is bounded, but they can work with unbounded $d$. On the other hand, the size of our hitting set depends exponentially on $d$, but requires only local algebraic dependencies for every $i \in [T]$. So, these two results are not comparable, although there are similarities in the sense that both of them aim to use the algebraic dependencies in the circuit. In general, summation is a tricky operation with respect to designing PIT algorithms (as opposed to multiplication), so it is not clear if the ideas in the work of Agrawal et al. [1] can be somehow adapted to prove Theorem 1.8.

From algebraic dependence to functional dependence
Our lower bounds and PIT results crucially use the following lemma, which (informally) shows that over fields of characteristic zero, up to a translation, every polynomial in a set of polynomials can be written as a function of the polynomials in a transcendence basis (see footnote 11). We now state the lemma precisely.
Lemma 1.10 (Algebraic dependence to functional dependence). Let $\mathbb{F}$ be any field of characteristic zero or sufficiently large positive characteristic. Let $\mathcal{Q} = \{Q_1, Q_2, \ldots, Q_t\}$ be a set of polynomials in $N$ variables such that the algebraic rank of $\mathcal{Q}$ equals $k$, and for each $i \in [t]$ let $d_i$ denote the degree of $Q_i$. Let $\mathcal{B} = \{Q_1, Q_2, \ldots, Q_k\}$ be a maximal algebraically independent subset of $\mathcal{Q}$. Then, there exists an $a = (a_1, a_2, \ldots, a_N)$ in $\mathbb{F}^N$ and polynomials $F_{k+1}, F_{k+2}, \ldots, F_t$ in $k$ variables such that for all $i \in \{k+1, k+2, \ldots, t\}$, \[ Q_i(X + a) = \operatorname{Hom}_{\leq d_i}\left[F_i(Q_1(X + a), Q_2(X + a), \ldots, Q_k(X + a))\right] \; . \] Here, for any polynomial $P$, we use $\operatorname{Hom}_{\leq i}[P]$ to refer to the sum of homogeneous components of $P$ of degree at most $i$ (see footnote 12).
Even though the lemma seems a very basic statement about the structure of algebraically dependent polynomials, to the best of our knowledge this was not known before. The proof builds upon a result on the structure of roots of multivariate polynomials by Dvir et al. [8]. Observe that for linear dependence, the statement analogous to that of Lemma 1.10 is trivially true. We believe that this lemma might be of independent interest (in addition to its applications in this paper).
In fact, the lemma holds for a random choice of the vector $a$ chosen uniformly from a large enough grid in $\mathbb{F}^N$.
Remark 1.11. In a recent result, Pandey et al. [32] show that this connection between algebraic dependence and functional dependence continues to hold over fields of small characteristic. Consequently, they show that the results of this paper also hold over fields of small characteristic.
Footnote 11: A transcendence basis of a set of polynomials is a maximal subset of the polynomials with the property that its elements are algebraically independent. For more on this see Section 2.
Footnote 12: For a more precise definition see Definition 2.2.
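To see why the translation in the lemma is needed, here is a small worked example. It is an illustration constructed for this purpose, not taken from the paper, and it assumes that the conclusion of Lemma 1.10 has the form $Q_i(X + a) = \operatorname{Hom}_{\leq d_i}[F_i(Q_1(X + a), \ldots, Q_k(X + a))]$ with $d_i = \operatorname{deg}(Q_i)$.

```latex
% Hypothetical univariate example: N = 1, t = 2, k = 1, basis {Q_1}.
\[ Q_1 = X^2, \qquad Q_2 = X, \qquad Q_2^2 - Q_1 = 0 . \]
% Without translation (a = 0): F(Q_1) = F(X^2) has only even-degree monomials,
% so Hom_{<= 1}[F(X^2)] = F(0) is a constant and can never equal Q_2 = X.
% With translation a = 1:
\[ Q_1(X + 1) = X^2 + 2X + 1, \qquad Q_2(X + 1) = X + 1 . \]
% Take F_2(Y) = (Y + 1)/2. Then
\[ F_2(Q_1(X + 1)) = \tfrac{1}{2}\,(X^2 + 2X + 2) = 1 + X + \tfrac{1}{2}X^2 , \]
\[ \operatorname{Hom}_{\leq 1}\left[F_2(Q_1(X + 1))\right] = 1 + X = Q_2(X + 1) . \]
```

Here $F_2$ is the degree-1 truncation of the Taylor expansion of $\sqrt{Y}$ around $Y = 1$; the operator $\operatorname{Hom}_{\leq 1}$ discards the degree-2 error term $\tfrac{1}{2}X^2$.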

Proof overview
Even though the results in this paper seem related to the results in [1] (both exploit some notion of low algebraic rank), the proof strategy and the way algebraic rank is used are quite different. We now briefly outline our proof strategy. We first give an overview of the proof of our lower bound. Let $P_n$ be the degree-$n$ polynomial we want to compute, and let $C$ be a $\Sigma\Pi^{(k)}\Sigma\Pi$ circuit computing it, with $k = n$. Then $C$ can be represented as \[ C = \sum_{i=1}^{T} \prod_{j=1}^{t} Q_{ij} \; . \] From the definitions, we know that for every $i \in [T]$, the algebraic rank of the set $\{Q_{i1}, Q_{i2}, \ldots, Q_{it}\}$ of polynomials is at most $k$ ($= n$). We want to give a lower bound on the size of $C$.
Instead of proving our result directly for $\Sigma\Pi^{(k)}\Sigma\Pi$ circuits, it will be very useful for us to go to the significantly strengthened class of $\Sigma\Gamma^{(k)}\Sigma\Pi$ circuits and prove our result for that class. Thus we think of our circuit $C$ as being expressed as \[ C = \sum_{i=1}^{T} C_i(Q_{i1}, Q_{i2}, \ldots, Q_{it}) \; , \] where the $C_i$ can be arbitrary polynomial functions of the inputs feeding into them. Note that we define the size of a $\Sigma\Gamma^{(k)}\Sigma\Pi$ circuit to be the maximum of the top fan-in $T$, and the maximum of the number of monomials in any of the polynomials $Q_{ij}$ feeding into the circuit. Thus we completely disregard the complexities of the various polynomial function gates at the second level. If we are able to prove a lower bound for this notion of size, and the original circuit is actually a $\Sigma\Pi^{(k)}\Sigma\Pi$ circuit, then it will be just as good a lower bound for the usual notion of size.
Our lower bound has two key steps. In the first step we prove the result in the special case where $t \leq n^2$. In the second step we show how to "almost" reduce to the case of $t \leq n^2$.
Step (1): $t \leq n^2$. In the representation of $C$ as a $\Sigma\Gamma^{(k)}\Sigma\Pi$ circuit, the value of $t$ is at most $n^2$. Lower bounds for this case turn out to be similar to lower bounds for homogeneous depth-4 circuits. In this case we borrow ideas from prior works [13,18,29] and show that the dimension of projected shifted partial derivatives of $C$ is not too large. Most importantly, we can use the chain rule for partial derivatives to obtain good bounds for this complexity measure, independent of the complexity of the various $C_i$.
Recall however that in our final result, $t$ can be actually much larger than $n^2$. Indeed the circuit $C$ can be very far from being homogeneous, and for general depth-4 circuits, we do not know good upper bounds on the complexity of shifted partial derivatives or projected shifted partial derivatives. Also, in general, it is not clear if these measures are really small for general depth-4 circuits. It is here that the low algebraic rank of $\{Q_{i1}, Q_{i2}, \ldots, Q_{it}\}$ proves to be useful, and that brings us to the crux of our argument.
Step (2): Reducing to the case where $t \leq n^2$. A key component of our proof, which is formalized in Lemma 3.5, shows that over any field of characteristic zero (or sufficiently large characteristic), up to a translation, every polynomial in a set of polynomials can be written as a function of the homogeneous components of the polynomials in the transcendence basis.
More formally, there exists an $a \in \mathbb{F}^N$ such that $C(X + a)$ can be expressed as \[ C(X + a) = \sum_{i=1}^{T} C'_i\left(\operatorname{Hom}[Q_{i1}(X + a)], \operatorname{Hom}[Q_{i2}(X + a)], \ldots, \operatorname{Hom}[Q_{ik}(X + a)]\right) \; , \] where for a degree-$d$ polynomial $F$, $\operatorname{Hom}[F]$ denotes the $(d+1)$-tuple of homogeneous components of $F$. Moreover, $Q_{i1}, Q_{i2}, \ldots, Q_{ik}$ are the polynomials in the transcendence basis. The crucial gain in the above transformation is that the arity of each of the polynomials $C'_i$ is $(d+1) \times k$ and not $t$ (where $d$ is an upper bound on the degrees of the $Q_{ij}$). Now by assumption $k \leq n$, and moreover without loss of generality we can assume $d \leq n$, since homogeneous components of $Q_{ij}$ of degree larger than $n$ can be dropped as they do not contribute to the computation of a degree-$n$ polynomial. Thus we have essentially reduced to the case where $t \leq n^2$.
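The parameter count behind this reduction can be spelled out as follows. This is only a sketch of the arithmetic under the stated assumptions $k \leq n$ and $d \leq n$.

```latex
% Arity of each second-level gate after the transformation:
% (d + 1) homogeneous components for each of the k basis polynomials.
\[
  (d + 1) \cdot k \;\leq\; (n + 1) \cdot n \;=\; n^2 + n \;=\; O(n^2) .
\]
```

So the number of inputs to each second-level gate is $O(n^2)$ regardless of the original fan-in $t$, which is the sense in which the reduction "essentially" reaches the case $t \leq n^2$.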
One loss incurred by this transformation is that the new polynomials $\{C'_i\}$ might be much more complex and of much higher degree than the original polynomials $\{C_i\}$. However, this will not affect the computation of our complexity measure. Another loss is that we have to deal with the translated polynomial $C(X + a)$. This introduces some subtleties into our computation, as it could be that $Q_{ij}(X)$ is a sparse polynomial but $Q_{ij}(X + a)$ is far from being sparse. Neither of these issues is very difficult to deal with, and we are able to get strong bounds for the measure, based on projected shifted partial derivatives, for such circuits. The proof of Lemma 3.5 essentially follows from Lemma 1.10.
The proof of Lemma 1.10 crucially uses a result of Dvir, Shpilka and Yehudayoff [8] which shows that up to some minor technical conditions (which are not very hard to satisfy), factors of a polynomial can be expressed as polynomials in the coefficients when viewing f as an element of F[X 1 , X 2 , . . . , . This is relevant since if a set of t polynomials is algebraically dependent, then there is a non-zero t-variate polynomial which vanishes when composed with this tuple. We use this vanishing to prove the lemma.
The PIT results follow a similar initial setup and use of Lemma 1.10. We then use a result of Forbes [9] to show that the polynomial computed by $C$ has a monomial of small support, which is then detected using the standard idea of using Shpilka-Volkovich generators [40].

Organization of the paper
The rest of the paper is organized as follows. In Section 2, we state some preliminary definitions and results that are used elsewhere in the paper. In Section 3, we describe our use of low algebraic rank and prove Lemma 3.5. We prove Theorem 1.5 in Section 4 and Theorem 1.8 in Section 5. We end with some open questions in Section 6.

Preliminaries
In this section we introduce some notation and definitions for the rest of the paper.
2. By $X$, we mean the set $\{X_1, X_2, \ldots, X_N\}$ of variables.
3. For a field $\mathbb{F}$, we use $\mathbb{F}[X]$ to denote the ring of all polynomials in $X_1, X_2, \ldots, X_N$ over the field $\mathbb{F}$. For brevity, we denote a polynomial P(X 1 ,
4. The support of a monomial $\alpha$ is the set of variables which appear with a non-zero exponent in $\alpha$.
5. We say that a function $f(N)$ is quasipolynomially bounded in $N$ if there exists a positive absolute constant $c$ such that for all $N$ sufficiently large, $f(N) < \exp(\log^c N)$. For brevity, if $f$ is quasipolynomially bounded in $N$, we say that $f$ is quasipolynomial in $N$.
6. In this paper, unless otherwise stated, F is a field of characteristic zero.
7. Given a polynomial P and a valid monomial ordering Π, the leading monomial of P is the monomial with a nonzero coefficient in P which is maximal according to Π. Similarly, the trailing monomial in P is the monomial which is minimal among all monomials in P according to Π.
8. All our logarithms are to the base e.

Algebraic independence
We formally defined the notion of algebraic independence and algebraic rank in Definition 1.1. For more on algebraic independence and related discussions, we refer the reader to the excellent survey by Chen, Kayal and Wigderson [4] and earlier papers [3,1]. For a tuple $\mathcal{Q} = (Q_1, Q_2, \ldots, Q_t)$ of algebraically dependent polynomials, we know that there is a nonzero $t$-variate polynomial $R$ (called a $\mathcal{Q}$-annihilating polynomial) such that $R(Q_1, Q_2, \ldots, Q_t)$ is identically zero. A natural question is what bounds on the degree of $R$ we can show in terms of the degrees of the $Q_i$. The following lemma of Kayal [16] gives an upper bound on the degree of annihilating polynomials of a set of degree-$d$ polynomials. The bound is useful to us in our proof.
Lemma 2.1 (Kayal [16]). Let $\mathbb{F}$ be a field and let $\mathcal{Q} = \{Q_1, Q_2, \ldots, Q_t\}$ be a set of polynomials of degree $d$ in $N$ variables over the field $\mathbb{F}$ having algebraic rank $k$. Then there exists a $\mathcal{Q}$-annihilating polynomial of degree at most $(k + 1) \cdot d^k$.

Complexity of homogeneous components
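We start by defining the homogeneous components of a polynomial.
Definition 2.2 (Homogeneous components). For a polynomial $P$ and a non-negative integer $i$, we denote by $\operatorname{Hom}_i[P]$ the homogeneous component of $P$ of degree exactly $i$, and by $\operatorname{Hom}_{\leq i}[P]$ the sum of the homogeneous components of $P$ of degree at most $i$, that is, \[ \operatorname{Hom}_{\leq i}[P] = \sum_{j=0}^{i} \operatorname{Hom}_j[P] \; . \] We define $\operatorname{Hom}[P]$ as the ordered tuple of homogeneous components of $P$, that is, \[ \operatorname{Hom}[P] = (\operatorname{Hom}_0[P], \operatorname{Hom}_1[P], \ldots, \operatorname{Hom}_d[P]) \; , \] where $d$ is the degree of $P$.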
We will use the following simple lemma, whose proof is fairly standard using interpolation and can be found in [28], for instance. We sketch the proof here for completeness.
Lemma 2.3. Let $\mathbb{F}$ be a field of characteristic zero, and let $P \in \mathbb{F}[X_1, X_2, \ldots, X_N]$ be a polynomial of degree at most $d$, in $N$ variables, such that $P$ can be represented as \[ P = C(Q_1, Q_2, \ldots, Q_t) \; , \] where for every $j \in [t]$, $Q_j$ is a polynomial in $N$ variables, and $C$ is an arbitrary polynomial in $t$ variables. Then, there exist polynomials $\{Q'_{ij} : i \in [d + 1], j \in [t]\}$, and for every $\ell$ such that $0 \leq \ell \leq d$, there exist polynomials $C'_{\ell,1}, C'_{\ell,2}, \ldots, C'_{\ell,d+1}$ satisfying \[ \operatorname{Hom}_{\ell}[P] = \sum_{i=1}^{d+1} C'_{\ell,i}(Q'_{i1}, Q'_{i2}, \ldots, Q'_{it}) \; . \]
Moreover,
• if each of the polynomials in the set $\{Q_j : j \in [t]\}$ is of degree at most $\Delta$, then every polynomial in the set $\{Q'_{ij} : i \in [d + 1], j \in [t]\}$ is also of degree at most $\Delta$;
• if the algebraic rank of the set $\{Q_j : j \in [t]\}$ of polynomials is at most $k$, then for every $i \in [d + 1]$, the algebraic rank of the set $\{Q'_{ij} : j \in [t]\}$ of polynomials is also at most $k$.
Proof. The key idea is to start from $P \in \mathbb{F}[X]$ and obtain a new polynomial $P' \in \mathbb{F}[X][Z]$ such that for every $\ell$ with $0 \leq \ell \leq d$, the coefficient of $Z^{\ell}$ in $P'$ equals $\operatorname{Hom}_{\ell}[P]$. Here, $Z$ is a new variable. Such a $P'$ is obtained by replacing every occurrence of the variable $X_j$ (for each $j \in [N]$) in $P$ by $Z \cdot X_j$. It is not hard to verify that such a $P'$ has the stated property. We now view $P'$ as a univariate polynomial in $Z$ with coefficients coming from $\mathbb{F}(X)$. Notice that the degree of $P'$ in $Z$ is at most $d$. So, to recover the coefficients of a univariate polynomial of degree at most $d$, we can evaluate $P'$ at $d + 1$ distinct values of $Z$ over $\mathbb{F}(X)$ and take an $\mathbb{F}(X)$-linear combination. In fact, if the field $\mathbb{F}$ is large enough, we can assume that all these distinct values of $Z$ lie in the base field $\mathbb{F}$ and we only take an $\mathbb{F}$-linear combination. The properties in the "moreover" part of the lemma immediately follow from this construction, and we skip the details.
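The interpolation step can be checked by hand on a tiny instance. The polynomial and the evaluation points $0, 1, -1$ below are chosen here for illustration only; $P'$ denotes the polynomial obtained from $P$ by substituting $Z \cdot X_j$ for each $X_j$.

```latex
% Hypothetical example with d = 2, so d + 1 = 3 evaluation points suffice.
\[ P = 1 + X_1 + X_1X_2, \qquad
   P'(Z) = P(ZX_1, ZX_2) = 1 + Z\,X_1 + Z^2\,X_1X_2 . \]
% Evaluate at Z = 0, 1, -1:
\[ P'(0) = 1, \qquad P'(1) = 1 + X_1 + X_1X_2, \qquad P'(-1) = 1 - X_1 + X_1X_2 . \]
% Recover each homogeneous component as a fixed linear combination:
\[ \operatorname{Hom}_0[P] = P'(0) = 1 , \]
\[ \operatorname{Hom}_1[P] = \tfrac{1}{2}\bigl(P'(1) - P'(-1)\bigr) = X_1 , \]
\[ \operatorname{Hom}_2[P] = \tfrac{1}{2}\bigl(P'(1) + P'(-1)\bigr) - P'(0) = X_1X_2 . \]
```

If $P = C(Q_1, \ldots, Q_t)$, then each evaluation $P'(\alpha)$ equals $C(Q_1(\alpha X), \ldots, Q_t(\alpha X))$, so every homogeneous component is a linear combination of $d + 1$ copies of $C$ applied to rescaled inputs. Rescaling the variables changes neither the degree bound nor the algebraic rank of the inputs, which is what the "moreover" part asserts.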

Roots of polynomials
We will crucially use the following result of Dvir, Shpilka, Yehudayoff [8].
Lemma 2.5 (Schwartz, Zippel, DeMillo, Lipton [5]). Let $P$ be a non-zero polynomial of degree $d$ in $N$ variables over a field $\mathbb{F}$. Let $S$ be an arbitrary subset of $\mathbb{F}$, and let $x_1, x_2, \ldots, x_N$ be random elements from $S$ chosen independently and uniformly at random. Then
\[ \Pr[P(x_1, x_2, \ldots, x_N) = 0] \leq \frac{d}{|S|} \; . \]
The following corollary easily follows from the lemma above by a union bound.

Corollary 2.6. Let $P_1, P_2, \ldots, P_t$ be non-zero polynomials of degree $d$ in $N$ variables over a field $\mathbb{F}$. Let $S$ be an arbitrary subset of $\mathbb{F}$ of size at least $2td$, and let $x_1, x_2, \ldots, x_N$ be random elements from $S$ chosen independently and uniformly at random. Then
\[ \Pr[\exists\, i \in [t] : P_i(x_1, x_2, \ldots, x_N) = 0] \leq \frac{1}{2} \; . \]
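As a sanity check of the two statements above, the following Python sketch counts, by exhaustive enumeration over a small grid, the points where at least one of a few polynomials vanishes. The polynomials, the grid and the helper names are our own illustrative choices.

```python
from itertools import product

def evaluate(poly, point):
    """Evaluate a polynomial given as {exponent tuple: coefficient}."""
    total = 0
    for mono, c in poly.items():
        term = c
        for x, e in zip(point, mono):
            term *= x ** e
        total += term
    return total

def count_bad_points(polys, S, n_vars):
    """Number of points of S^n_vars on which some polynomial in polys vanishes."""
    return sum(
        1
        for pt in product(S, repeat=n_vars)
        if any(evaluate(p, pt) == 0 for p in polys)
    )

P1 = {(1, 1): 1}               # x*y, degree 2
P2 = {(1, 0): 1, (0, 1): -1}   # x - y, degree at most 2
t, d = 2, 2
S = range(2 * t * d)           # |S| = 2td = 8
total_points = len(S) ** 2

zeros_of_P1 = count_bad_points([P1], S, 2)
bad_for_some = count_bad_points([P1, P2], S, 2)
```

Here `zeros_of_P1 / total_points` stays below $d/|S|$, and fewer than half of the grid points are bad for some polynomial, so a common non-root exists.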

Approximations
We will use the following lemma of Saptharishi [35] for numerical approximations in our calculations.

Utilizing low algebraic rank
Let $\mathcal{Q} = \{Q_1, Q_2, \ldots, Q_t\}$ be a set of polynomials in $N$ variables and degree at most $d$ such that the algebraic rank of $\mathcal{Q}$ equals $k$. Without loss of generality, let us assume that $\mathcal{B} = \{Q_1, Q_2, \ldots, Q_k\}$ is an algebraically independent subset of $\mathcal{Q}$ of maximal size. We now show that, in some sense, this implies that all the polynomials in $\mathcal{Q}$ can be represented as functions of the polynomials in the set $\mathcal{B}$. We make this notion formal in the lemma below, which is a restatement of Lemma 1.10.

Proof. Let $d$ be defined as $\max_i \{d_i\}$. Let us consider any $i \in \{k+1, k+2, \ldots, t\}$. From the statement of the lemma, it follows that the polynomials in the set $\mathcal{B} \cup \{Q_i\}$ are algebraically dependent. Therefore, there exists a nonzero polynomial $A_i$ in $k+1$ variables such that $A_i(Q_1, Q_2, \ldots, Q_k, Q_i) \equiv 0$. Without loss of generality, we choose such a polynomial with the smallest total degree. From the upper bound on the degree of the annihilating polynomial from Lemma 2.1, we can assume that the degree of $A_i$ is at most $(k+1)d^k$. Consider the polynomial $A'_i(X, Y)$ defined by
\[ A'_i(X, Y) = A_i(Q_1(X), Q_2(X), \ldots, Q_k(X), Y) \; . \]
We have the following observation about properties of $A'_i$.

Observation 3.2. $A'_i$ satisfies the following conditions.
• $A'_i$ is not identically zero.
• The $Y$-degree of $A'_i$ is at least one.
• $Q_i(X)$ is a root of the polynomial $A'_i$, when viewing it as a polynomial in the $Y$ variable with coefficients coming from $\mathbb{F}(X)$.
Proof. We prove the items in sequence.
• If $A'_i$ is identically zero, then it follows that $Q_1, Q_2, \ldots, Q_k$ are algebraically dependent, which is a contradiction.
• If $A'_i(X, Y)$ does not depend on the variable $Y$, then by definition, it follows that $A_i(Q_1, Q_2, \ldots, Q_k, Y)$ does not depend on $Y$. Hence, $A_i(Q_1, Q_2, \ldots, Q_k, Q_i)$ does not depend on $Q_i$ but is identically zero. This contradicts the algebraic independence of $Q_1, Q_2, \ldots, Q_k$.
• This item follows from the fact that the polynomial obtained by substituting $Y$ by $Q_i$ in $A'_i$ equals $A_i(Q_1, Q_2, \ldots, Q_k, Q_i)$, which is identically zero.
Our aim now is to invoke Lemma 2.4 for the polynomial $A'_i$, but first, we need to verify that the conditions in the hypothesis of Lemma 2.4 are satisfied. Let the polynomial $A''_i$ be defined as the first order derivative of $A'_i$ with respect to $Y$. Formally,
\[ A''_i(X, Y) = \frac{\partial A'_i(X, Y)}{\partial Y} \; . \]
We proceed with the following claim, the proof of which we defer to the end.

Claim 3.3. The polynomial $A''_i$ is not identically zero, and $A''_i|_{Y = Q_i}$ is not identically zero.
For ease of notation, we define
\[ L_i(X) = A''_i(X, Q_i(X)) \; . \]
Observe that $L_i$ is a polynomial in the variables $X$ which is not identically zero and is of degree at most $(k+1)d^{k+1}$. Let $H$ be a subset of $\mathbb{F}$ of size $2t(k+1)d^{k+1}$. Then, for a uniformly random point $a_i$ picked from $H^N$, the probability that $L_i$ vanishes at $a_i$ is at most $1/(2t)$. We call a point $a_i \in H^N$ bad if $L_i$ vanishes at it. Then, with probability at least $1 - 1/(2t)$, a uniformly random element of $H^N$ is not bad. Let $a_i \in \mathbb{F}^N$ be a "not bad" element. We can replace each $X_j$ by $X_j + \gamma_j$, where $\gamma_j$ is the $j$-th coordinate of $a_i$, and then for the resulting polynomial $L_i(X + a_i)$, the point $(0, 0, \ldots, 0)$ is not bad.
We are now ready to apply Lemma 2.4. Let
\[ A'_i(X, Y) = \sum_{j=0}^{(k+1)d^k} C'_j(X) \cdot Y^j \; . \]
Here, for every $j$, $C'_j(X) = C_j(Q_1(X), Q_2(X), \ldots, Q_k(X))$ is a polynomial in the $X$ variables and is the coefficient of $Y^j$ in $A'_i(X, Y)$ when viewed as an element of $\mathbb{F}[X][Y]$. From the discussion above, we know that the following are true.
1. The polynomial $A'_i(X + a_i, Q_i(X + a_i))$ is identically zero.
2. The polynomial $A''_i(X + a_i, Q_i(X + a_i)) = L_i(X + a_i)$ does not vanish at the point $(0, 0, \ldots, 0)$.
Therefore, by Lemma 2.4, it follows that there is a polynomial $G_i$ such that
\[ Q_i(X + a_i) = \mathrm{Hom}^{\leq d_i}\left[G_i\left(C'_0(X + a_i), C'_1(X + a_i), \ldots, C'_{(k+1)d^k}(X + a_i)\right)\right] \; . \]
We also know that for every $j \in \{0, 1, \ldots, (k+1)d^k\}$, $C'_j(X + a_i)$ is a polynomial in the polynomials $Q_1(X + a_i), Q_2(X + a_i), \ldots, Q_k(X + a_i)$. In other words,
\[ Q_i(X + a_i) = \mathrm{Hom}^{\leq d_i}\left[F_i\left(Q_1(X + a_i), Q_2(X + a_i), \ldots, Q_k(X + a_i)\right)\right] \]
for a polynomial $F_i$. In order to prove the lemma for all values of $i \in \{k+1, k+2, \ldots, t\}$, we observe that we can pick a single translation $a$ which works for every $i \in \{k+1, k+2, \ldots, t\}$. Such an $a$ exists because the probability that a uniformly random $p \in H^N$ is bad for some $i$ is at most $t \cdot 1/(2t) = 1/2$, and the translation corresponding to any element $a$ of $H^N$ which is not bad for every $i$ will work. The statement of the lemma then immediately follows.
We now prove Claim 3.3.
Proof of Claim 3.3. We observed in the second item of Observation 3.2 that the degree of $Y$ in $A'_i$ is at least 1. Hence, $A''_i$ is not identically zero. If $A''_i|_{Y = Q_i}$ is identically zero, then it follows that $\{Q_1, Q_2, \ldots, Q_k, Q_i\}$ have an annihilating polynomial of degree smaller than the degree of $A_i$, which contradicts the choice of $A_i$ as a minimum degree annihilating polynomial.

Lemma 3.1 lets us express all polynomials in a set of polynomials as a function of the polynomials in the transcendence basis. However, the functional form obtained is slightly cumbersome for us to use in our applications. We now derive the following corollary, which is easier to use in our applications.

Proof. Let $i \in \{k+1, k+2, \ldots, t\}$. From Lemma 3.1, we know that there exists an $a \in \mathbb{F}^N$ and a polynomial $W_i$ such that
\[ Q_i(X + a) = \mathrm{Hom}^{\leq d_i}\left[W_i\left(Q_1(X + a), Q_2(X + a), \ldots, Q_k(X + a)\right)\right] \; . \quad (3.1) \]
We will now show that $\mathrm{Hom}^{\leq d_i}[W_i(Q_1(X + a), Q_2(X + a), \ldots, Q_k(X + a))]$ is actually a polynomial in the homogeneous components of the various $Q_j(X + a)$ by the following procedure, which is essentially univariate polynomial interpolation.
• Let $R(X) = W_i(Q_1(X + a), Q_2(X + a), \ldots, Q_k(X + a))$. We replace every variable $X_j$ in $R$ by $Z \cdot X_j$ for a new variable $Z$. We view the resulting polynomial $R'$ as an element of $\mathbb{F}(X)[Z]$, i.e., a univariate polynomial in $Z$ with coefficients coming from the field of rational functions in the $X$ variables.
• Now, observe that for any $\ell$, the homogeneous component of degree $\ell$ of $R$ is precisely the coefficient of $Z^{\ell}$ in $R'$. Hence, we can evaluate $R'$ for sufficiently many distinct values of $Z$ in $\mathbb{F}(X)$, and then take an $\mathbb{F}(X)$-linear combination of these evaluations to express the homogeneous components. Moreover, since $\mathbb{F}$ is an infinite field, without loss of generality, we can pick the values of $Z$ to be scalars in $\mathbb{F}$, and in this case, we will just be taking an $\mathbb{F}$-linear combination.
The catch here is that after replacing $X_j$ by $Z \cdot X_j$ and substituting different values of $Z \in \mathbb{F}$, the polynomials $Q_j(X + a)$ could possibly lead to distinct polynomials. In general, this is bad, since our goal is to show that every polynomial in a set of algebraically dependent polynomials is a function of few polynomials. However, the following observation comes to our rescue. Let $P$ be any polynomial in $\mathbb{F}[X]$ of degree $\Delta$ and let $P'$ be the polynomial obtained from $P$ by replacing $X_j$ by $Z \cdot X_j$. Then,
\[ P'(X, Z) = \sum_{\ell = 0}^{\Delta} Z^{\ell} \cdot \mathrm{Hom}^{\ell}[P] \; . \]
In particular, the polynomials obtained from $P'$ for different values of $Z$ are all in the linear span of the homogeneous components of $P$. Therefore, any homogeneous component of $R$ can be expressed as a function of the polynomials in the set $\bigcup_{j \in [k]} \mathrm{Hom}[Q_j(X + a)]$. This completes the proof of the corollary.
We now prove the following lemma, which will be directly useful in our applications to polynomial identity testing and lower bounds in the following sections.
Lemma 3.5. Let $\mathbb{F}$ be any field of characteristic zero or sufficiently large. Let $P \in \mathbb{F}[X]$ be a polynomial in $N$ variables, of degree equal to $n$, such that $P$ can be represented as
\[ P = \sum_{i=1}^{T} F_i(Q_{i1}, Q_{i2}, \ldots, Q_{it}) \; , \]
and such that the following are true.
• For each $i \in [T]$, $F_i$ is a polynomial in $t$ variables.
• For each $i \in [T]$ and $j \in [t]$, $Q_{ij}$ is a polynomial in $N$ variables of degree at most $d$.
• For each $i \in [T]$, the algebraic rank of the set $\{Q_{ij} : j \in [t]\}$ of polynomials is at most $k$ and
Then, there exists an $a \in \mathbb{F}^N$ and polynomials $F'_i$ in at most $k(d+1)$ variables such that
\[ P(X + a) = \sum_{i=1}^{T} F'_i\left(\mathrm{Hom}[Q_{i1}(X + a)], \mathrm{Hom}[Q_{i2}(X + a)], \ldots, \mathrm{Hom}[Q_{ik}(X + a)]\right) \; . \]

Proof. The proof essentially follows from applying Corollary 3.4 to each of the summands on the right hand side. The only catch is that the translations $a$ could be different for each one of them. Since we are working over infinite fields, without loss of generality, we can assume that there is a good translation $a$ which works for all the summands.

Application to lower bounds
In this section, we prove Theorem 1.5. First, we discuss the complexity measure used in the proof, the notion of random restrictions, and the family of hard polynomials that we work with.

Projected shifted partial derivatives
The complexity measure that we use to prove the lower bounds in this paper is the notion of projected shifted partial derivatives of a polynomial, introduced by Kayal et al. in [18] and subsequently used in a number of papers [29,19,28]. For a polynomial $P$ and a monomial $\gamma$, $\frac{\partial P}{\partial \gamma}$ is the partial derivative of $P$ with respect to $\gamma$, and for a set of monomials $M$, $\partial_M(P)$ is the set of partial derivatives of $P$ with respect to monomials in $M$. The space of $(M, m)$-projected shifted partial derivatives of a polynomial $P$ is defined below.
Here, $\mathrm{Mult}[P]$ is the projection of a polynomial $P$ onto the multilinear monomials in its support. We use the dimension of the projected shifted partial derivative space of $P$ with respect to some set of monomials $M$ and a parameter $m$ as a measure of the complexity of a polynomial. We denote this dimension by $\Phi_{M,m}(P)$. From the definitions, it is straightforward to see that the measure is subadditive. In the proof of Theorem 1.5, we need to upper bound the dimension of the span of projected shifted partial derivatives of the homogeneous component of a fixed degree of polynomials. The following lemma comes to our rescue there.

Proof. Since $M$ is a set of monomials of degree equal to $r$, all the partial derivatives are shifted by monomials of degree equal to $m$, and the operation $\mathrm{Mult}[\cdot]$ either sets a monomial to zero or leaves it unchanged, it follows that the span of projected shifted partial derivatives of $\mathrm{Hom}^{i}[P]$ coincides with the span of the homogeneous components of degree $i - r + m$ of the polynomials in the span of projected shifted partial derivatives of $P$ itself. The lemma then follows from the fact that the dimension of a linear space of polynomials is at least as large as the dimension of the space obtained by restricting all polynomials to some fixed homogeneous component.
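For small examples, the measure can be computed by brute force. The sketch below assumes the usual form of the definition from the literature, namely the span of $\mathrm{Mult}[\gamma \cdot \partial P / \partial \alpha]$ over $\alpha \in M$ and monomials $\gamma$ of degree $m$; since the formal definition is not reproduced in this text, this form and all helper names are assumptions of the sketch.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def derivative(poly, alpha):
    """Partial derivative of poly with respect to the monomial alpha."""
    out = {}
    for mono, c in poly.items():
        if all(e >= a for e, a in zip(mono, alpha)):
            coef = Fraction(c)
            for e, a in zip(mono, alpha):
                for step in range(a):
                    coef *= e - step
            new = tuple(e - a for e, a in zip(mono, alpha))
            out[new] = out.get(new, 0) + coef
    return {m: c for m, c in out.items() if c != 0}

def shift_and_project(poly, gamma):
    """Multiply by the monomial gamma and keep only multilinear monomials."""
    out = {}
    for mono, c in poly.items():
        new = tuple(e + g for e, g in zip(mono, gamma))
        if all(e <= 1 for e in new):
            out[new] = c
    return out

def monomials_of_degree(n_vars, deg):
    for combo in combinations_with_replacement(range(n_vars), deg):
        exps = [0] * n_vars
        for v in combo:
            exps[v] += 1
        yield tuple(exps)

def rank(vectors):
    """Rank over the rationals of sparse vectors given as dicts."""
    basis = []  # pairs (pivot monomial, reduced row)
    for vec in vectors:
        row = dict(vec)
        for pivot, b in basis:
            if pivot in row:
                factor = row[pivot] / b[pivot]
                for key, val in b.items():
                    row[key] = row.get(key, 0) - factor * val
                row = {k: v for k, v in row.items() if v != 0}
        if row:
            basis.append((next(iter(row)), row))
    return len(basis)

def pspd_dimension(poly, M, m, n_vars):
    vectors = []
    for alpha in M:
        d = derivative(poly, alpha)
        for gamma in monomials_of_degree(n_vars, m):
            vectors.append(shift_and_project(d, gamma))
    return rank(vectors)

# P = x1*x2 + x3*x4, derivatives with respect to x1 and x3, shifts of degree 1
P1 = {(1, 1, 0, 0): 1}
P2 = {(0, 0, 1, 1): 1}
P = {**P1, **P2}
M = [(1, 0, 0, 0), (0, 0, 1, 0)]
dim_P = pspd_dimension(P, M, 1, 4)
```

On this toy instance the measure of the sum is at most the sum of the measures of `P1` and `P2`, in line with the subadditivity noted above.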
In the next lemma, we prove an upper bound on the complexity of polynomials which are obtained by composing low arity polynomials with polynomials of small support. Gupta et al. [13] first proved such a bound for homogeneous depth-4 circuits with bounded bottom fan-in.
Lemma 4.4. Let $s$ be a parameter and $Q_1, Q_2, \ldots, Q_t$ be polynomials in $\mathbb{F}[X]$ such that for every $i \in [t]$, the support of every monomial in $Q_i$ is of size at most $s$. Then, for every polynomial $F$ in $t$ variables, every choice of parameters $r, m$ such that $m + rs \leq N/2$, and every set $M$ of monomials of degree equal to $r$,

Proof. By the chain rule for partial derivatives, every derivative of order $r$ of $F(Q_1, Q_2, \ldots, Q_t)$ can be written as a linear combination of products of the form
3. for every $1 \leq j \leq r$, the polynomial $P_j$ is an element of $\{Q_1, Q_2, \ldots, Q_t\}$, and
4. for every $1 \leq j \leq r$, $\beta_j$ is a monomial in the variables $X_1, X_2, \ldots, X_N$.
Since every monomial in each $Q_i$ is of support at most $s$, every monomial in each of the products is of support at most $rs$. Therefore, for shifts of degree $m$, the projected shifted partial derivatives of $F(Q_1, Q_2, \ldots, Q_t)$ (with respect to monomials in $M$, which are of degree $r$) are in the linear span of polynomials of the form where $\alpha$ is a multilinear monomial of degree at most $m + rs$. Therefore, the dimension of this space is upper bounded by the number of possible choices of $\beta_0$ and $\alpha$. Hence

Target polynomials for the lower bound
In this section, we define the family of polynomials for which we prove our lower bounds. The family is a variant of the Nisan-Wigderson polynomials which were introduced by Kayal et al. in [20], and subsequently used in many other results [29,19,28]. We start with the following definition.
Definition 4.5 (Nisan-Wigderson polynomial families). Let $n, q, e$ be arbitrary parameters with $q$ being a power of a prime, and $n, e \leq q$. We identify the set $[q]$ with the field $\mathbb{F}_q$ of $q$ elements. Observe that since $n \leq q$, we have that $[n] \subseteq \mathbb{F}_q$. The Nisan-Wigderson polynomial with parameters $n, q, e$, denoted by $\mathrm{NW}_{n,q,e}$, is defined as
\[ \mathrm{NW}_{n,q,e}(X) = \sum_{\substack{f(z) \in \mathbb{F}_q[z] \\ \deg(f) < e}} \ \prod_{i \in [n]} X_{i, f(i)} \; . \]
The number of variables in $\mathrm{NW}_{n,q,e}$ as defined above is $N = q \cdot n$. The lower bounds in this paper will be proved for the polynomial $\mathrm{NW} \circ \mathrm{Lin}$, which is a variant of the polynomial $\mathrm{NW}_{n,q,e}$ defined as follows.
Definition 4.6 (Hard polynomials for the lower bound). Let $\delta \in (0, 1)$ be an arbitrary constant, and let $p = N^{-\delta}$. Let The polynomial NW • Lin q,n,e,p is defined as NW • Lin q,n,e,p = NW q,n,e

For brevity, we will denote $\mathrm{NW} \circ \mathrm{Lin}_{q,n,e,p}$ by $\mathrm{NW} \circ \mathrm{Lin}$ for the rest of the discussion. The advantage of using this trick of composing with linear forms is that it becomes cleaner to show that the polynomial $\mathrm{NW} \circ \mathrm{Lin}$ is robust under random restrictions where every variable is kept alive with probability $p$. Since $\delta$ is an absolute constant, the number of variables in $\mathrm{NW} \circ \mathrm{Lin}$ is at most $N^{O(1)}$. We now formally define our notion of random restrictions.
Let V be the set of variables in the polynomial NW • Lin. We now define a distribution D p over the subsets of V.
The distribution $D_p$: Each variable in $\mathcal{V}$ is independently kept alive with probability $p = N^{-\delta}$.
The random restriction procedure samples a set $V \leftarrow D_p$ and then keeps only the variables in $V$ alive. The remaining variables are set to 0. We denote the polynomial obtained by such a restriction by $\mathrm{NW} \circ \mathrm{Lin}|_V$. Observe that a random restriction also results in a distribution over the restrictions of a circuit computing the polynomial $\mathrm{NW} \circ \mathrm{Lin}$. We denote by $C|_V$ the restriction of a circuit $C$ obtained by setting every input gate in $C$ which is labeled by a variable outside $V$ to 0.
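The restriction procedure is simple enough to sketch directly. In the Python fragment below, a multilinear polynomial is represented as a dictionary from monomials (frozensets of variable names) to coefficients; this representation, the toy polynomial and the function names are assumptions made only for illustration.

```python
import random

def sample_alive(variables, p, rng):
    """Sample V from D_p: keep each variable alive independently with probability p."""
    return {v for v in variables if rng.random() < p}

def restrict(poly, alive):
    """Set every variable outside `alive` to zero.

    A monomial survives exactly when all of its variables are alive.
    """
    return {mono: c for mono, c in poly.items() if mono <= alive}

rng = random.Random(0)  # fixed seed so that the run is reproducible
poly = {
    frozenset({"x1", "x2"}): 1,
    frozenset({"x2", "x3"}): 2,
    frozenset({"x4"}): 5,
}
variables = set().union(*poly)
alive = sample_alive(variables, 0.5, rng)
restricted = restrict(poly, alive)
```

A monomial of support $s$ survives with probability exactly $p^s$, which is the fact used later to remove all monomials of large support from the circuit.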
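We now show that with high probability over restrictions sampled according to $D_p$, the projected shifted partial derivative complexity of $\mathrm{NW} \circ \mathrm{Lin}$ remains high. We need the following lower bound on the dimension of projected shifted partial derivatives of $\mathrm{NW}_{n,q,e}$.

Lemma 4.7 ([29,25]). For every $n$ and $r = O(\sqrt{n})$ there exist parameters $q, e, \varepsilon$ such that $q = \Omega(n^2)$, $N = qn$ and $\varepsilon = \Theta(\log n / \sqrt{n})$ with
\[ q^r \geq (1 + \varepsilon)^{2(n-r)} \; , \quad q^{e-r} = \left(\frac{2}{1 + \varepsilon}\right)^{n-r} \cdot \operatorname{poly}(q) \; . \]
For any $\{n, q, e, r, \varepsilon\}$ satisfying the above constraints, and for $m = \frac{N}{2}(1 - \varepsilon)$, over any field $\mathbb{F}$, we have
\[ \Phi(\mathrm{NW}_{n,q,e}) \geq \binom{N}{m + n - r} \cdot \exp(-O(\log^2 n)) \; . \]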
We will instantiate the lemma above with the following choice of parameters.
• We will set the parameter $s$ to be equal to $\sqrt{n}/100$.
It is straightforward to check that for the above choice of parameters, there is a choice of $e$ such that
\[ q^r \geq (1 + \varepsilon)^{2(n-r)} \; , \quad q^{e-r} = \left(\frac{2}{1 + \varepsilon}\right)^{n-r} \cdot \operatorname{poly}(q) \; . \]
We are now ready to prove our main lemma for this section.
Proof. To prove the lemma, we first show that with high probability over the random restrictions, the polynomial $P|_V$ has the polynomial $\mathrm{NW}_{n,q,e}$ as a projection, obtained by setting some variables to zero. Combining this with Lemma 4.7 would complete the proof. We now fill in the details. Let $i \in [n]$ and $j \in [q]$. Then, the probability that all the variables in the set $A_{i,j} = \{X_{i,j,\ell} : \ell \in [\gamma]\}$ are set to zero by the random restrictions is equal to $(1 - p)^{\gamma} \leq \exp(-\Theta(N))$. Therefore, the probability that there exist $i \in [n]$, $j \in [q]$ such that all the variables in the set $A_{i,j}$ are set to zero by the random restrictions is at most $N \cdot \exp(-\Theta(N)) = o(1)$. We now argue that if this event does not happen (which is the case with probability at least $1 - o(1)$), then the dimension of the projected shifted partial derivatives is large.
For every $i, j$, let $A'_{i,j}$ be the subset of $A_{i,j}$ which has not been set to zero. We know that for every $i, j$, $A'_{i,j}$ is non-empty. Now, for every $i, j$, we set all the elements of $A'_{i,j}$ to zero except one. Observe that the polynomial obtained from $\mathrm{NW} \circ \mathrm{Lin}$ after this restriction is exactly the polynomial $\mathrm{NW}_{n,q,e}$ up to a relabeling of variables. Now, from Lemma 4.7, our claim follows.
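The argument above reduces to the polynomial $\mathrm{NW}_{n,q,e}$ of Definition 4.5, whose monomials can be enumerated explicitly for small parameters. The sketch below makes two simplifying assumptions of its own: $q$ is a prime (so that arithmetic modulo $q$ realizes $\mathbb{F}_q$), and $[n]$ is identified with $\{0, 1, \ldots, n-1\}$. The function name is hypothetical.

```python
from itertools import combinations, product

def nw_monomials(n, q, e):
    """Monomials of NW_{n,q,e} for prime q.

    Each monomial is the set of index pairs (i, f(i)), one per polynomial f
    over F_q of degree less than e, and stands for the product of X_{i, f(i)}.
    """
    monomials = []
    for coeffs in product(range(q), repeat=e):
        def f(x, coeffs=coeffs):
            return sum(c * pow(x, k, q) for k, c in enumerate(coeffs)) % q
        monomials.append(frozenset((i, f(i)) for i in range(n)))
    return monomials

n, q, e = 3, 5, 2
monos = nw_monomials(n, q, e)
max_overlap = max(len(a & b) for a, b in combinations(monos, 2))
```

Two distinct polynomials of degree less than $e$ agree on fewer than $e$ points, so any two distinct monomials share at most $e - 1$ variables; `max_overlap` confirms this for the chosen parameters.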

Proof of Theorem 1.5
To prove our lower bound, we show that under a random restriction from the distribution $D_p$, the dimension of the linear span of projected shifted partial derivatives of any $\Sigma\Pi^{(n)}\Sigma\Pi$ circuit $C$ is small with high probability if the size of $C$ is not too large. Comparing this with the lower bound on the dimension of projected shifted partials of the polynomial $\mathrm{NW} \circ \mathrm{Lin}$ under random restrictions from Lemma 4.8, the lower bound follows. We now proceed along this outline and prove the following lemma.

Lemma 4.9 (Upper bound on complexity of circuits). Let $m, r, s$ be parameters such that $m + rs \leq N/2$. Let $M$ be any set of multilinear monomials of degree $r$. Let $C$ be an arithmetic circuit computing a homogeneous polynomial of degree $n$ such that

Proof. We prove the lemma by first using random restrictions to simplify the circuit into one with bounded bottom support, and then utilizing the tools developed in Section 3 and Section 4.1 to conclude that the dimension of the space of projected shifted partial derivatives of the resulting circuit is small.
Step (1): Random restrictions. From the definition of random restrictions, every variable is kept alive independently with probability $p = N^{-\delta}$. So, the probability that a monomial of support at least $s$ survives the restrictions is at most $N^{-\delta s}$. Therefore, by linearity of expectation, the expected number of monomials of support at least $s$ in $\bigcup_{i \in [T], j \in [t]} S_{ij}$ which survive the random restrictions is at most So, by Markov's inequality, the probability that at least one monomial of support at least $s$ in $\bigcup_{i \in [T], j \in [t]} S_{ij}$ survives the random restrictions is $o(1)$. Let $V$ be any subset of the surviving set of variables of size $N$.
For the rest of the proof, we assume that all the variables outside the set $V$ are set to zero. Restrictions which set all monomials of support at least $s$ in $\bigcup_{i \in [T], j \in [t]} S_{ij}$ to zero are said to be good.
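The expectation computed in Step (1) can be checked on a toy instance with exact rational arithmetic. In this sketch the list of support sizes and the value of $p$ are made-up inputs, and `expected_survivors` is a hypothetical helper.

```python
from fractions import Fraction

def expected_survivors(support_sizes, p, s):
    """Expected number of monomials of support at least s that survive.

    A monomial whose support has size u survives with probability p**u,
    so by linearity of expectation we sum p**u over the large monomials.
    """
    return sum(Fraction(p) ** u for u in support_sizes if u >= s)

support_sizes = [3, 4, 1]   # supports of the monomials in a toy circuit
p = Fraction(1, 2)
s = 3
expectation = expected_survivors(support_sizes, p, s)
large = sum(1 for u in support_sizes if u >= s)
crude_bound = large * p ** s   # (number of large monomials) * p^s
```

By Markov's inequality, the probability that at least one monomial of support at least $s$ survives is at most `expectation`, which in turn is at most `crude_bound`.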
Step (2): Using low algebraic rank. In this step, we assume that we are given a good restriction $C'$ of the circuit $C$. Let where for every $i \in [T]$, $j \in [t]$, all monomials of $Q_{ij}$ have support at most $s$. Observe that random restrictions cannot increase the algebraic rank of a set of polynomials. Therefore, for every $i \in [T]$, the algebraic rank of the set $\{Q_{ij} : j \in [t]\}$ of polynomials is at most $k$. For ease of notation, let us assume that the algebraic rank is equal to $k$. Without loss of generality, let the set $\mathcal{B}_i = \{Q_{i1}, Q_{i2}, \ldots, Q_{ik}\}$ be the set guaranteed by Lemma 3.5. We know that there exists an $a \in \mathbb{F}^N$ and polynomials $\{F'_i : i \in [T]\}$ such that
\[ C'(X + a) = \sum_{i=1}^{T} F'_i\left(\mathrm{Hom}[Q_{i1}(X + a)], \mathrm{Hom}[Q_{i2}(X + a)], \ldots, \mathrm{Hom}[Q_{ik}(X + a)]\right) \; . \]
Moreover, since $C(X)$ (and hence $C'(X)$) is a homogeneous polynomial of degree $n$, the following is true.
An important observation here is that for the rest of the argument, we can assume that the degree of every polynomial $Q_{ij}(X + a)$ is at most $n$. If not, we can simply replace any such high degree $Q_{ij}(X + a)$ by $\mathrm{Hom}^{\leq n}[Q_{ij}(X + a)]$.
We claim that the equality (4.3) continues to hold. This is because the higher degree monomials of $Q_{ij}$ do not participate in the computation of the lower degree monomials. The only monomials which could potentially change by this substitution are the ones with degree strictly larger than $n$.
Step (3): Upper bound on $\Phi_{M,m}(C'(X))$. Let $R$ be defined as the polynomial
\[ R = \sum_{i=1}^{T} F'_i\left(\mathrm{Hom}[Q_{i1}(X + a)], \mathrm{Hom}[Q_{i2}(X + a)], \ldots, \mathrm{Hom}[Q_{ik}(X + a)]\right) \; . \]
Note that if the support of every monomial in a polynomial $Q_{ij}(X)$ is at most $s$, then for every translation $a \in \mathbb{F}^N$ the support of every monomial in $Q_{ij}(X + a)$ is also at most $s$. From Lemma 4.4 and from Lemma 4.2, it is easy to see that

From Lemma 4.3, it follows that
Observe that steps (2) and (3) of the proof are always successful if the restriction in step 1 is good, which happens with a probability at least 1 − o(1). So, the lemma follows.
We now complete the proof of Theorem 1.5.
Proof of Theorem 1.5. If the size of the circuit $C$ is at least $N^{(\delta/2)\sqrt{n}}$, then we are done. Else, the size of $C$ is at most $N^{(\delta/2)\sqrt{n}}$. This implies that the total number of monomials in all the polynomials $Q_{ij}$ together is at most $N^{(\delta/2)\sqrt{n}}$. From Lemma 4.9 and Lemma 4.8, it follows that there exists a subset $V$ of variables of size $N$ such that both the following inequalities are true. Plugging in the value of the parameters from Section 4.2, and approximating using Lemma 2.7, we immediately get Moreover, $\binom{k(n+1)+r}{r} \leq (enk)^r \leq \exp(2\sqrt{n} \cdot \log n)$. Taking the ratio and substituting the values of the parameters, we get $T \geq \exp(\Omega(\sqrt{n} \log N))$.
In this section we give an application of the ideas developed in Section 3 to the question of polynomial identity testing and prove Theorem 1.8. We start by formally defining the notion of a hitting set.
Hitting set. Let $\mathcal{S}$ be a set of polynomials in $N$ variables over a field $\mathbb{F}$. Then, a set $\mathcal{H} \subseteq \mathbb{F}^N$ is said to be a hitting set for the class $\mathcal{S}$ if for every polynomial $P \in \mathcal{S}$ such that $P$ is not identically zero, there exists a $p \in \mathcal{H}$ such that $P(p) \neq 0$. For our PIT result, we show that any nonzero polynomial $P$ in the circuit class we consider has a monomial of low support. A hitting set can then be constructed by standard techniques using the Shpilka-Volkovich generator [40]. The following lemma is our main technical claim.
Lemma 5.2. Let $\mathbb{F}$ be a field of characteristic zero. Let $P$ be a homogeneous polynomial of degree $\Delta$ in $N$ variables such that $P$ can be represented as
\[ P = \sum_{i=1}^{T} C_i(Q_{i1}, Q_{i2}, \ldots, Q_{it}) \; , \]
such that the following are true.
• For each i ∈ [T ], C i is a polynomial in t variables.
• For each i ∈ [T ] and j ∈ [t], Q i j is a polynomial of degree at most d in N variables.
• For each i ∈ [T ], the algebraic rank of the set {Q i j : j ∈ [t]} of polynomials is at most k.
Then, the trailing monomial of $P$ has support at most
\[ 2e^3 d \cdot \left(\ln(T(\Delta + 1)) + (d+1)k \ln(2(d+1)k) + 1\right) \; . \]
Here, $e$ is Euler's number, the base of the natural logarithm.
In order to prove Lemma 5.2, we follow the outline of proving robust lower bounds for arithmetic circuits, described and used by Forbes [9]. This essentially amounts to showing that the trailing monomial of $P$ has small support. We use the following result of Forbes [9] in a blackbox manner, which greatly simplifies our proof. See Corollary 3.15 in [9].

Lemma 5.3 (Proposition 4.18 in Forbes [9]). Let $R(X)$ be a polynomial in $\mathbb{F}[X]$ such that
\[ R(X) = \sum_{i=1}^{T} F_i(Q_{i1}, Q_{i2}, \ldots, Q_{it}) \; , \]
and for each $i \in [T]$ and $j \in [t]$, the degree of $Q_{ij}$ is at most $d$. Let $\alpha$ be the trailing monomial of $R$. Then, the support of $\alpha$ is at most $2e^3 d(\ln T + t \ln 2t + 1)$, where $e$ is Euler's number.
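To get a feel for the size of the bound in Lemma 5.3, it can be evaluated numerically. The function below simply transcribes the expression $2e^3 d(\ln T + t \ln 2t + 1)$; the function name and the sample parameter values are illustrative assumptions, not values from the paper.

```python
import math

def trailing_support_bound(T, t, d):
    """Evaluate 2 e^3 d (ln T + t ln(2t) + 1), the bound of Lemma 5.3."""
    return 2 * math.e ** 3 * d * (math.log(T) + t * math.log(2 * t) + 1)

# Illustrative parameters: top fan-in T, inner degree d, algebraic rank k,
# and degree Delta of the computed polynomial.
T, d, k, Delta = 2 ** 20, 4, 4, 100

# Substitution used in the proof of Lemma 5.2: top fan-in T(Delta + 1)
# and arity (d + 1) k.
bound = trailing_support_bound(T * (Delta + 1), (d + 1) * k, d)
```

The bound grows only logarithmically in the top fan-in $T$ and nearly linearly in $dk$, which is why parameters $d, k \leq \operatorname{poly}(\log n)$ lead to monomials of polylogarithmic support.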
We now proceed to prove Lemma 5.2.
Proof of Lemma 5.2. Recall that our goal is to show that the polynomial P, which can be represented as has a trailing monomial of small support.
For every i ∈ [T ], let Q i = {Q i1 , Q i2 , . . . , Q it } and let Q i be of algebraic rank k i . Without loss of generality, let us assume the sets B i = {Q i1 , Q i2 , . . . , Q ik i } are the sets guaranteed by Lemma 3.5. This implies that there exist polynomials F 1 , F 2 , . . . , F T and a ∈ F N such that Since each k i ≤ k, for the ease of notation, we assume that each k i = k. Observe that if P is a homogeneous polynomial of degree deg(P) ≤ ∆, then, Moreover, every polynomial in the set {Q i j : i ∈ [T (∆+1)], j ∈ [k]} has degree at most d. Now, Lemma 5.3 implies that the trailing monomial α of P(X) has support at most 2e 3 d · (ln (T (∆ + 1)) + (d + 1)k ln (2(d + 1)k) + 1) .
We are now ready to complete the proof of Theorem 1.8.
Proof of Theorem 1.8. From Definition 1.2, it follows that there could be non-homogeneous polynomials $P \in \mathcal{C}$. So, we cannot directly use Lemma 5.2 to say something about them, since the proof relies on homogeneity. But this is not a problem, since a polynomial is identically zero if and only if all its homogeneous components are identically zero. Moreover, by applying Lemma 2.3 to every summand feeding into the top sum gate of the circuit, we get that every homogeneous component of $P$ can also be computed by a circuit similar in structure to that of $P$, at the cost of a blow up by a factor of $\Delta + 1$ in the top fan-in. We can then apply Lemma 5.2 to each of these homogeneous components to conclude that if $P$ is not identically zero, then it contains a monomial of support at most
\[ 2e^3 d \cdot \left(\ln(T(\Delta + 1)^2) + (d+1)k \ln(2(d+1)k) + 1\right) \; . \]
Theorem 1.8 immediately follows by detecting the low support monomial using Lemma 5.2 and Lemma 5.1.
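The last step, detecting a monomial of low support, can be illustrated by a brute-force construction that we use here as a simple stand-in for the Shpilka-Volkovich generator (it is not that generator). If a nonzero polynomial of individual degree at most $d$ has a monomial of support at most $\ell$, then setting all variables outside that support to zero leaves a nonzero polynomial in at most $\ell$ variables, which cannot vanish on the whole grid $\{0, 1, \ldots, d\}^{\ell}$. The helper names below are hypothetical.

```python
from itertools import combinations, product

def low_support_points(n_vars, ell, values):
    """All points with at most ell nonzero coordinates, each taken from values."""
    nonzero = [v for v in values if v != 0]
    points = set()
    for size in range(ell + 1):
        for support in combinations(range(n_vars), size):
            for assignment in product(nonzero, repeat=size):
                pt = [0] * n_vars
                for idx, val in zip(support, assignment):
                    pt[idx] = val
                points.add(tuple(pt))
    return points

def evaluate(poly, point):
    """Evaluate a polynomial given as {exponent tuple: coefficient}."""
    total = 0
    for mono, c in poly.items():
        term = c
        for x, e in zip(point, mono):
            term *= x ** e
        total += term
    return total

def hits(poly, points):
    return any(evaluate(poly, pt) != 0 for pt in points)

# Individual degree d = 1, so the values {0, 1} suffice; support bound ell = 2.
H = low_support_points(4, 2, range(2))
P_low = {(1, 1, 0, 0): 1, (1, 1, 1, 1): 1}   # contains x1*x2, support 2
P_high = {(1, 1, 1, 0): 1}                   # only a monomial of support 3
```

The set `H` has size at most $\binom{N}{\leq \ell}(d+1)^{\ell}$, which is quasipolynomial when $\ell$ is polylogarithmic. It hits `P_low` but not `P_high`, showing that the low-support guarantee is what makes the construction work.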

Open questions
We conclude with some open questions.
• Prove the lower bounds in the paper for a polynomial in VP. We believe this is true, but it seems that we need a strengthening of the bounds proved in [29]. In particular, it needs to be shown that the lower bound for IMM (Iterated matrix multiplication) continues to hold when a depth-4 circuit is not homogeneous but the formal degree is at most the square of the degree of the polynomial itself.
• It would be interesting to see if there are other applications of Lemma 1.10 to questions in complexity theory. The Jacobian characterization of algebraic independence has several very interesting applications [1,6].
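For readers unfamiliar with the Jacobian characterization mentioned in the last item: over fields of characteristic zero, the algebraic rank of a set of polynomials equals the rank of their Jacobian matrix over the field of rational functions. The sketch below evaluates the Jacobian at one fixed rational point, which yields a lower bound on the algebraic rank that is tight for a generic point; the representation and helper names are our own assumptions.

```python
from fractions import Fraction

def partial(poly, var):
    """Partial derivative with respect to the variable with index var."""
    out = {}
    for mono, c in poly.items():
        if mono[var] > 0:
            new = mono[:var] + (mono[var] - 1,) + mono[var + 1:]
            out[new] = out.get(new, 0) + c * mono[var]
    return out

def evaluate(poly, point):
    total = Fraction(0)
    for mono, c in poly.items():
        term = Fraction(c)
        for x, e in zip(point, mono):
            term *= Fraction(x) ** e
        total += term
    return total

def matrix_rank(rows):
    """Rank of a matrix of Fractions by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(rank + 1, len(rows)):
            factor = rows[i][col] / rows[rank][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def jacobian_rank_at(polys, point):
    """Rank of the Jacobian of polys evaluated at the given point."""
    n_vars = len(point)
    jac = [[evaluate(partial(q, v), point) for v in range(n_vars)] for q in polys]
    return matrix_rank(jac)

# Variables (x, y, z): Q3 = x*y is a function of Q1 = x and Q2 = y.
Q1, Q2, Q3 = {(1, 0, 0): 1}, {(0, 1, 0): 1}, {(1, 1, 0): 1}
Z = {(0, 0, 1): 1}
dependent_rank = jacobian_rank_at([Q1, Q2, Q3], (2, 3, 5))
independent_rank = jacobian_rank_at([Q1, Q2, Z], (2, 3, 5))
```

For the dependent triple the rank is 2 even though there are three polynomials, while replacing $xy$ by $z$ gives full rank 3.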
SHUBHANGI SARAF grew up in Pune, India. She received her Ph. D. in computer science from the Massachusetts Institute of Technology in 2011 under the guidance of Madhu Sudan. Shubhangi is broadly interested in complexity theory, coding theory and pseudorandomness. Recently she has been captivated by questions related to understanding the power and limitations of algebraic computation, as well as to understanding the potential of locality in algorithms for codes.
Shubhangi discovered her love for mathematics in her high school years at the Bhaskaracharya Pratishthana, an educational and research institute in mathematics in Pune, under the guidance and mentoring of her teacher Mr. Prakash Mulabagal. Mr. Prakash ran an amazing program aimed at getting high school students from across Pune introduced to the joy of math and the sciences beyond what any school curriculum in Pune could possibly attempt to do. Shubhangi owes a great deal of her enthusiasm for math problem solving to Mr. Prakash, and also to being able, through the Bhaskaracharya Pratishthana program, to make close friends in Pune who were into the same thing.
Thanks to this nurturing environment, Shubhangi got involved in math competitions and represented India twice at the International Mathematical Olympiad (IMO), once winning a bronze medal (2002) and once a silver (2003).
She went on to do her undergraduate studies in Mathematics at MIT, graduating in 2007. She did not really know that she wanted to stay on in academia until her junior year when she spent a year abroad as a mathmo at Cambridge University in the UK where she took fantastic courses by Tim Gowers and Imre Leader. Once back at MIT, in summer 2006, she did a research project with Igor Pak at MIT, which gave her a lot of confidence and encouragement. She was also fortunate to take some more great courses at MIT; "Randomized algorithms" by David Karger and "Complexity theory" by Madhu Sudan were particularly influential. The support and encouragement from her MIT mentors eventually got her on the path to theoretical computer science.
In her spare time Shubhangi enjoys reading, cooking, long walks, and exploring cafés and restaurants. Her little toddler is a constant source of joy and amazement, and she also makes sure there isn't much time to spare.