Closure Results for Polynomial Factorization

In a sequence of fundamental results in the 1980s, Kaltofen (SICOMP 1985, STOC'86, STOC'87, RANDOM'89) showed that factors of multivariate polynomials with small arithmetic circuits have small arithmetic circuits. In other words, the complexity class VP is closed under taking factors. A natural question in this context is whether other natural classes of multivariate polynomials, for instance, arithmetic formulas, algebraic branching programs, bounded-depth arithmetic circuits or the class VNP, are closed under taking factors. In this paper, we show that all factors of degree $\log^a n$ of polynomials with poly(n)-size depth-k circuits have poly(n)-size circuits of depth O(k + a). This partially answers a question of Shpilka–Yehudayoff (Found. Trends in TCS, 2010) and has applications to hardness–randomness tradeoffs for bounded-depth arithmetic circuits. As direct applications of our techniques, we also obtain simple proofs of the following results.

• The complexity class VNP is closed under taking factors. This confirms Conjecture 2.1 in Bürgisser's monograph (2000) and improves upon a recent result of Dutta, Saxena and Sinhababu (STOC '18) who showed a quasipolynomial upper bound on the number of auxiliary variables and the complexity of the verifier circuit of factors of polynomials in VNP.
• A factor of degree d of a polynomial P which can be computed by an arithmetic formula (or an algebraic branching program) of size s has a formula (an algebraic branching program, resp.) of size poly(s, $d^{\log d}$, deg(P)). This result was first shown by Dutta et al. (STOC '18) and we obtain a slightly different proof as an easy consequence of our techniques.
Our proofs rely on a combination of the ideas, based on Hensel lifting, developed in the polynomial factoring literature, and the depth-reduction results for arithmetic circuits, and hold over fields of characteristic zero or of sufficiently large characteristic.

Introduction
A fundamental question in computational algebra is the question of polynomial factorization: Given a polynomial P, can we efficiently compute the factors of P? In this paper, we will be interested in the following closely related question: Given a structured polynomial P, what can we say about the structure of factors of P?
In a sequence of seminal papers, Kaltofen [15,16,17,18] showed that if a polynomial P of degree d in n variables has an arithmetic circuit of size s, then each of its factors has an arithmetic circuit of size poly(s, n, d). Moreover, he also showed that given the circuit for P, the circuits for its factors can be computed in time poly(s, n, d) by a randomized algorithm.
Another way of stating this result is that the complexity class VP, which we now define, is uniformly closed under taking factors.
Definition 1.1 (VP). A family $\{f_n\}$ of polynomials over a field F is said to be in the class $\mathrm{VP}_F$ if there exist polynomially bounded functions d, s : N → N and a circuit family $\{g_n\}$ such that $\deg(f_n) \le d(n)$, $\mathrm{size}(g_n) \le s(n)$, and $f_n$ is computed by $g_n$ for every sufficiently large n ∈ N.
We remark that factorization is a fundamental algebraic notion, and so closure under factorization indicates that a complexity class is algebraically nice in some sense. Thus, it is natural to ask whether any of the other frequently occurring classes of polynomials like VF (polynomials with small formulas), VBP (polynomials with small algebraic branching programs), bounded-depth arithmetic circuits, or the class VNP (the algebraic analog of NP or #P) are closed under taking factors.
In recent years, there has been some progress on the question of closure under factorization for bounded-depth arithmetic circuits (see [8,29]) and for the classes VF, VBP and VNP (see [7]). We will discuss these results in a later part of this section.
In addition to being basic questions in algebraic complexity, some of these closure results also have applications to extending the hardness vs. randomness framework of Kabanets and Impagliazzo [13] to formulas, branching programs or bounded-depth arithmetic circuits. Indeed, Kaltofen's closure result for arithmetic circuits is a crucial ingredient in the proof of Kabanets and Impagliazzo [13].

Hardness and randomness
Two of the most basic questions in algebraic complexity theory are the question of proving superpolynomial lower bounds on the size of arithmetic circuits computing some explicit family of polynomials and that of designing efficient deterministic algorithms for Polynomial Identity Testing (PIT).
The progress on these questions for general arithmetic circuits has been painfully slow. To date, there are no non-trivial algorithms for PIT for general arithmetic circuits, while the best known lower bound on the circuit size for explicit families of polynomials, due to Baur and Strassen [3], is a slightly superlinear lower bound Ω(n log n), proved over three decades ago. In fact, even for the class of bounded-depth arithmetic circuits, no non-trivial deterministic PIT algorithms are known, and the best circuit lower bounds known are just slightly superlinear [32].
In a very influential work, Kabanets and Impagliazzo [13] showed that the questions of derandomizing PIT and that of proving lower bounds for arithmetic circuits are equivalent in some sense. Their result adapts the Hardness vs. Randomness framework of Nisan and Wigderson [27] to the algebraic setting. In their proof, Kabanets and Impagliazzo combine the use of the Nisan generator [26] with Kaltofen's result that all factors of a low-degree (degree poly(n)) polynomial with a poly(n)-size circuit are computable by poly(n)-size circuits [18]. They showed that given an explicit family of hard polynomials, one can obtain a non-trivial deterministic algorithm for PIT.
The extremely slow progress on the circuit lower bound and PIT questions for general circuits has led to a lot of attention on understanding these questions for more structured subclasses of arithmetic circuits. Arithmetic formulas [14], algebraic branching programs [21], multilinear circuits [31,35,34], and bounded-depth arithmetic circuits [28,32,11,10,23] are some examples of such circuit classes. An intriguing question is whether the equivalence of PIT and lower bounds also carries over to these more structured circuit classes. For example, do superpolynomial lower bounds for arithmetic formulas imply non-trivial deterministic algorithms for PIT for arithmetic formulas, and vice versa?
The answers to these questions do not follow directly from the results in [13], and extending the approach of Kabanets and Impagliazzo to answer these questions seems to be intimately related to the questions about closure of arithmetic formulas and bounded-depth circuits under polynomial factorization.
We now describe our results, and discuss how they relate to prior work.
Results and prior work

Factors of polynomials with bounded-depth circuits
For our first set of results, we study the bounded-depth circuit complexity of factors of polynomials which have small bounded-depth circuits. We prove the following result.
Thus, low-degree factors of polynomials with small shallow circuits have small shallow circuits. This partially answers an open problem (Open Problem 19 in [38]) of Shpilka and Yehudayoff, who asked whether the factors of a multivariate polynomial with a small shallow circuit can also be computed by small shallow circuits.
Our proof gives a smooth tradeoff between the depth of the circuit for the factor and its size. The tradeoff is governed by the depth-reduction results for arithmetic circuits (see Theorem 4.3). We remark that the result is also true when the characteristic of the underlying field is sufficiently large.
The result in the literature most closely related to Theorem 2.1 is due to Oliveira [29]. He studied the question of bounded-depth circuit complexity of factors of polynomials with small bounded-depth circuits, for polynomials of low individual degree. He showed that if a polynomial P of individual degree r is computable by a circuit of size s and depth ∆, then every factor of P of degree d can be computed by a circuit of size poly(s, r, $d^r$) and depth ∆ + 5. Thus, for polynomials with small individual degree, the results in [29] are strictly better than ours, whereas for polynomials with unbounded individual degree, we get a better upper bound on the complexity of factors of total degree poly(log n).
One of our main motivations for studying this question is the connection to hardness-randomness tradeoffs for bounded-depth arithmetic circuits. In the next section, we describe the implications of our results in this context.

Hardness vs. randomness for bounded-depth circuits
Dvir, Shpilka and Yehudayoff [8] initiated the study of the question of the equivalence between PIT and lower bounds for bounded-depth circuits. Dvir et al. observed that a part of the proof in [13] can be generalized to show that non-trivial PIT for bounded-depth circuits implies lower bounds for such circuits. For the converse, the authors only showed a weaker statement; they proved that superpolynomial lower bounds for depth-∆ arithmetic circuits imply non-trivial PIT for depth-(∆ − 5) arithmetic circuits with bounded individual degree. The bounded individual degree condition is a bit unsatisfying, and so, the following question is of interest: Does a superpolynomial lower bound for depth-∆ arithmetic circuits imply non-trivial deterministic PIT for depth-∆ arithmetic circuits? In particular, can we get rid of the "bounded individual degree" condition from the results in [8]?
In this paper, we partially answer this question in the affirmative. Here is an informal statement of the result.

Theorem 2.2 (Informal).
A superpolynomial lower bound for depth-∆ arithmetic circuits for an explicit family of low-degree polynomials implies non-trivial deterministic PIT for depth-(∆ − 5) arithmetic circuits.
Here, by low-degree polynomials, we mean polynomials in n variables and of degree at most $O(\log^2 n / \log^2 \log n)$. Thus, by strengthening the hardness hypothesis in [8], we remove the bounded individual degree restriction from the implication. We now state the result in Theorem 2.2 formally. (In the question posed before Theorem 2.2, the depth of the circuits for which PIT is sought should be thought of as ∆ − O(1).)
Theorem 2.3. Let ∆ ≥ 6 be a positive integer, and let ε > 0 be any real number. Let $\{f_m\}$ be an explicit family of polynomials such that $f_m$ is an m-variate multilinear polynomial of degree $d = O(\log^2 m / \log^2 \log m)$ which cannot be computed by an arithmetic circuit of depth ∆ and size poly(m). Then, there is a deterministic algorithm, which, given as input a circuit C ∈ Q[x] of size s, depth ∆ − 5 and degree D on n variables, runs in time $(snD)^{O(n^{2\varepsilon})}$ and determines if the polynomial computed by C is identically zero.
Some remarks on the above theorem statement.
Remark 2.4. The running time of the PIT algorithm gets better as the lower bound gets stronger. Also, the constraint on the degree of the family of hard polynomials can be further relaxed a bit, at the cost of strengthening the hardness assumption, and increasing the running time of the resulting PIT algorithm. We leave it to the interested reader to work out these details. We also note that the multilinearity assumption on the family of hard polynomials is without loss of generality.
As discussed earlier, Theorem 2.3 is closely related to the main result in [8]. We now discuss their similarities and differences.
• Degree constraint on the hard polynomial. While Theorem 2.3 requires that the hard polynomial on m variables has degree $O(\log^2 m / \log^2 \log m)$, Dvir et al. [8] did not have a similar constraint.
• Individual degree constraint for PIT. In [8], the authors get PIT for shallow circuits with bounded individual degree, whereas our Theorem 2.3 does not make any assumptions on individual degrees in this context.
The key technical challenge for extending the known hardness-randomness tradeoffs for general circuits [13] to restricted circuit classes like formulas or bounded-depth circuits is the following question: Let P(x, y) ∈ F[x, y] be a polynomial of degree r and let f ∈ F[x] be a polynomial of degree d such that P(x, f ) ≡ 0. Assuming P can be computed by a shallow circuit (or arithmetic formula) of size s, can f be computed by a shallow circuit (or arithmetic formula) of size poly(s, n, d, r)?
In [8], the authors partially answer this question by showing that the polynomial f can be computed by a shallow circuit of size poly(s, r, $d^{\deg_y(P)}$). Thus, for the case of polynomials P which have small individual degree with respect to y, they answer the question in the affirmative.
Our main technical observation, which we state next, gives an upper bound on the shallow circuit complexity of polynomials f(x) ∈ F[x] of low degree such that there is a polynomial P(x, y) ∈ F[x, y] with small shallow circuits satisfying P(x, f) = 0. In other words, if we view P as a univariate polynomial in y with coefficients coming from the ring F[x], then f is a low-degree polynomial that is a root of P. We now state the theorem.
Theorem 2.5. Let F be a field of characteristic zero. Let P ∈ F[x, y] be a polynomial of degree r in n + 1 variables that can be computed by an arithmetic circuit of size s and depth ∆. Let f ∈ F[x] be a polynomial of degree d such that P(x, f) = 0.
Then, f can be computed by a circuit of depth ∆ + 3 and size $O((srn)^{10} d^{O(\sqrt{d})})$. Furthermore, for any natural number k, f can be computed by a circuit of depth ∆ + O(k) and size $O((srn)^{10} d^{O(d^{1/k})})$.
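As a rough sanity check of how the size bound $d^{O(d^{1/k})}$ in Theorem 2.5 behaves for polylogarithmic degree, the following calculation takes $d = \log^a n$ for a fixed constant a. The choice k = 2a and the name c for the hidden constant in the exponent are illustrative assumptions made here, not parameters taken from the proofs; log denotes the natural logarithm.

```latex
\begin{align*}
d^{\,c\, d^{1/k}}
  &= \bigl(\log^{a} n\bigr)^{\,c\,(\log^{a} n)^{1/(2a)}}
   = \exp\!\bigl(c\,a\,\log\log n \cdot \sqrt{\log n}\,\bigr) \\
  &= n^{\,c\,a\,\log\log n \,/\, \sqrt{\log n}}
   = n^{o(1)} .
\end{align*}
```

Under these assumptions the factor $d^{O(d^{1/k})}$ is at most poly(n), while the depth overhead O(k) is O(a), which is in line with the depth O(k + a) statement for factors of degree $\log^a n$.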
We conclude this section with a short discussion on the low-degree condition in the hypothesis of Theorem 2.3.

The low-degree condition
The low-degree condition in the hypothesis of Theorem 2.3 appears to be extremely restrictive. It is natural to wonder if the question of proving superpolynomial lower bounds for bounded-depth circuits for an explicit family of polynomials of low degree is much harder than the question of proving superpolynomial lower bounds for bounded-depth circuits for an explicit family of polynomials of potentially larger degree. Currently, we do not even know quadratic lower bounds for arithmetic circuits of bounded depth, and so, perhaps we are quite far from understanding this question.
It is, however, easy to see that some of the known lower bounds for shallow circuits carry over to the low-degree regime.For instance, the proofs of superpolynomial lower bounds for homogeneous depth-3 circuits by Nisan and Wigderson [28], superpolynomial lower bounds for homogeneous depth-4 circuits based on the idea of shifted partial derivatives (see for example, [11,19,10,23]) and superlinear lower bound due to Raz [32] do not require the degree of the hard function to be large.
There are some known exceptions to this. For instance, lower bounds for homogeneous depth-5 circuits over finite fields due to Kumar and Saptharishi [22] are of the form $2^{\Omega(\sqrt{d})}$ and become trivial if $d < \log^2 n$. Another result which distinguishes the low-degree and high-degree regimes is a separation between homogeneous depth-5 and homogeneous depth-4 circuits [22], which is only known to be true in the low-degree regime (degree less than $\log^2 n$).
Another result of relevance is a result of Raz [33], which shows that constructing an explicit family of tensors $T_n : [n]^d \to F$ of rank at least $n^{d(1-o(1))}$ implies superpolynomial lower bounds for arithmetic formulas, provided d ≤ O(log n / log log n). We are not aware of such connections in the high-degree regime.
One prominent family of lower bound results which do not seem to generalize to this low-degree regime are the superpolynomial lower bounds for multilinear formulas [31], and multilinear bounded-depth circuits [35]. In fact, the results in [33] show that superpolynomial lower bounds for set-multilinear formulas for polynomials of degree O(log n / log log n) imply superpolynomial lower bounds for general arithmetic formulas.
In the context of polynomial factorization, low-degree factors of polynomials with small circuits have been considered before. For instance, Forbes [9] gave a quasipolynomial-time deterministic algorithm to test if a given polynomial of bounded degree divides a given sparse polynomial. Extending this result to even testing if a given sparse polynomial divides another given sparse polynomial remains an open problem.

Factors of polynomials in VNP
We start by formally defining the complexity class VNP.
Definition 2.6 (VNP). A family of polynomials $\{f_n\}$ over a field F is said to be in the class $\mathrm{VNP}_F$ if there exist polynomially bounded functions k, w, v : N → N and a family $\{g_n\}$ in $\mathrm{VP}_F$ such that for every sufficiently large n ∈ N,
$f_n(x_1, \ldots, x_{k(n)}) = \sum_{\mathbf{y} \in \{0,1\}^{v(n)}} g_n(x_1, \ldots, x_{k(n)}, y_1, \ldots, y_{v(n)})$.
We refer to the y variables in the definition above as auxiliary variables, and the polynomial family $g_n$ as the family of verifier polynomials. Essentially, VNP can be thought of as the algebraic analog of NP, and understanding if VNP is different from VP is the algebraic analog of the famous P vs. NP question.
As discussed earlier in this section, Kaltofen showed that VP is closed under taking factors, and Bürgisser conjectured (Conjecture 2.1 in his monograph) that the same holds for VNP. As a direct application of our proof of Theorem 2.1, we confirm this conjecture over fields of characteristic zero or of sufficiently large characteristic. We obtain a simple proof of the following statement.
Theorem 2.8 (Informal). The class VNP is closed under taking factors.
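To make the shape of Definition 2.6 concrete, here is a minimal Python sketch of a polynomial written as a sum of a simple verifier polynomial over all Boolean settings of the auxiliary variables. The function names and the toy verifier are hypothetical illustrations, not objects from the paper; the example uses the identity that summing the products of $x_i^{y_i}$ over all y in $\{0,1\}^n$ gives the product of $(1 + x_i)$.

```python
from itertools import product
from math import prod


def exponential_sum(g, x, m):
    """Evaluate sum over y in {0,1}^m of g(x, y) at the point x."""
    return sum(g(x, y) for y in product((0, 1), repeat=m))


def subset_product(x, y):
    """Toy verifier polynomial g(x, y) = prod_i x_i^{y_i}."""
    return prod(xi if yi else 1 for xi, yi in zip(x, y))


x = (2, 3, 5)
value = exponential_sum(subset_product, x, len(x))
expected = prod(1 + xi for xi in x)  # (1 + 2)(1 + 3)(1 + 5)
```

The verifier is cheap to evaluate, while the sum ranges over $2^m$ Boolean assignments; this is the sense in which the y variables play the role of witnesses.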
The main technical statement which immediately gives us this closure result is the following theorem.
Theorem 2.9. Let F be a field of characteristic zero. Let P(x) be a polynomial of degree r over F, and let Q(x, y) be a polynomial in n + m variables such that
$P(\mathbf{x}) = \sum_{\mathbf{y} \in \{0,1\}^m} Q(\mathbf{x}, \mathbf{y})$
and Q can be computed by a circuit of size s. Let f be any irreducible factor of P of degree d. Then, there exists an $m' \le$ poly(s, r, d, n, m) and a polynomial $h(x_1, x_2, \ldots, x_n, z_1, z_2, \ldots, z_{m'})$, where h(x, z) can be computed by a circuit of size $s' \le$ poly(s, r, d, n, m), such that
$f(\mathbf{x}) = \sum_{\mathbf{z} \in \{0,1\}^{m'}} h(\mathbf{x}, \mathbf{z})$.
THEORY OF COMPUTING, Volume 15 (13), 2019, pp. 1-34
We remark that in the proof of the above theorem, our techniques can be replaced by analogous statements from [8,29]. Although this is a simple observation, this does not appear to have been noticed prior to this work. The best upper bound on the complexity of factors of polynomials in VNP in prior work is a recent result of Dutta, Saxena, Sinhababu [7], who showed a bound of poly(n, r, s, m, $d^{O(\log d)}$) on the number of auxiliary variables and the circuit complexity of verifier polynomials h.
As an easy consequence of our proofs, we also obtain another (slightly different) proof of the following result of Dutta et al. [7].
Theorem 2.10 (Dutta, Saxena, Sinhababu). Let P(x) be a polynomial of degree r in n variables which can be computed by an arithmetic formula (or algebraic branching program) of size s, and let f(x) be a factor of P of degree d. Then, f(x) can be computed by an arithmetic formula (algebraic branching program, resp.) of size poly(s, r, n, $d^{O(\log d)}$).

Proof overview
The key technical ingredient of our results in this paper is Theorem 2.5. We start by describing the main steps in its proof.
Proof sketch of Theorem 2.5. Our proof of Theorem 2.5 follows the outline of the proof of the analogous theorem about the structure of roots in [8]. We now outline the main steps, and point out the differences between the proofs. The first step in the proof is to show that one can use the standard Hensel Lifting to iteratively obtain better approximations of the root f given a circuit for P(x, y). More formally, in the k-th step, we start with a polynomial $h_k$ which agrees with f on all monomials of degree at most k, and use it to obtain a polynomial $h_{k+1}$ which agrees with f on all monomials of degree at most k + 1. Moreover, the proof shows that if $h_k$ has a small circuit, then $h_{k+1}$ has a circuit which is only slightly larger than that of $h_k$. This iterative process starts with the constant term of f, which trivially has a small circuit. Thus, after d iterations, we have a polynomial $h_d$ such that the root f is the sum of the homogeneous components of $h_d$ of degree at most d. This lifting step is exactly the same as that in [8] or in some of the earlier works on polynomial factorization [5], and is formally stated in Lemma 5.1.
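The lifting step described above can be tried out on a small example. The sketch below is a simplified illustration and not the construction of Lemma 5.1: it assumes a single variable x, exact rational arithmetic, and the Newton-style update h ← h − P(x, h)/δ with δ = (∂P/∂y)(0, f(0)) ≠ 0, truncated one degree higher in each round. The helper names (`mul_trunc`, `eval_trunc`, `lift_root`) and the test polynomial are hypothetical choices made for this example.

```python
from fractions import Fraction


def mul_trunc(a, b, n):
    """Product of two polynomials in x (coefficient lists), keeping degrees < n."""
    out = [Fraction(0)] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[:n - i]):
            out[i + j] += ai * bj
    return out


def add_trunc(a, b, n):
    """Sum of two polynomials in x, keeping degrees < n."""
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]


def eval_trunc(coeffs, h, n):
    """P(x, h(x)) mod x^n, where P = sum_j coeffs[j](x) * y^j (Horner in y)."""
    acc = [Fraction(0)] * n
    for c in reversed(coeffs):
        acc = add_trunc(mul_trunc(acc, h, n), c, n)
    return acc


def lift_root(coeffs, f0, d):
    """Lift the constant term f0 of a root f to f mod x^(d+1)."""
    # delta = (dP/dy)(0, f0); the lifting needs it to be non-zero.
    delta = sum(j * c[0] * Fraction(f0) ** (j - 1)
                for j, c in enumerate(coeffs) if j >= 1)
    assert delta != 0, "root must be simple at x = 0"
    h = [Fraction(f0)]
    for k in range(1, d + 1):
        residual = eval_trunc(coeffs, h, k + 1)   # P(x, h) mod x^(k+1)
        h = h + [Fraction(0)]
        h = [h[i] - residual[i] / delta for i in range(k + 1)]
    return h


# P(x, y) = (y - (1 + x + x^2)) * (y + 3), stored as coefficients of y^0, y^1, y^2.
coeffs = [[-3, -3, -3], [2, -1, -1], [1]]
root = lift_root(coeffs, 1, 2)   # coefficients of the root 1 + x + x^2
```

Each round only corrects the coefficient of the newly exposed degree, which mirrors the statement that $h_{k+1}$ agrees with f on one more degree than $h_k$.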
The key insight of Dvir et al. [8] was that if $\deg_y(P) = t$, and $C_0(\mathbf{x}), C_1(\mathbf{x}), \ldots, C_t(\mathbf{x})$ are polynomials such that $P(\mathbf{x}, y) = \sum_{i=0}^{t} C_i(\mathbf{x}) y^i$, then for every k ∈ {0, 1, ..., d}, we have a polynomial $A_k$ on t + 1 variables such that $h_k(\mathbf{x}) = A_k(C_0(\mathbf{x}), C_1(\mathbf{x}), \ldots, C_t(\mathbf{x}))$. Now, consider the case when $t \ll n$ (for instance t = O(1)). It follows from standard interpolation results for shallow circuits (see Lemma 4.9) that each of the polynomials $C_i(\mathbf{x})$ has a circuit of size O(sr) and depth ∆, since P has a circuit of size s and depth ∆. Thus, $h_d(\mathbf{x})$ can be written as a sum of $\binom{d+t}{t} = O(d^t)$ monomials if we treat each $C_i$ as a formal variable. Plugging in the small depth-∆ circuits for each $C_i$, and using standard interpolation (Lemma 4.9), it follows that f has a circuit of size poly(s, n, $d^t$) of depth ∆ + O(1).
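The monomial count in the paragraph above is easy to tabulate. The following sketch (the function name is a hypothetical helper) counts monomials of a given degree in a given number of formal variables, and shows how the count behaves both for a constant number of variables and when the number of variables grows with the degree.

```python
from math import comb


def num_monomials(degree, num_vars):
    """Number of monomials of degree exactly `degree` in `num_vars` variables."""
    return comb(degree + num_vars - 1, num_vars - 1)


d = 10
few_vars = num_monomials(d, 3)        # t = 2, i.e. t + 1 = 3 formal variables
many_vars = num_monomials(d, d + 1)   # d + 1 formal variables: binom(2d, d)
```

With t + 1 = 3 variables the count is $\binom{d+2}{2}$, polynomial in d, whereas with d + 1 variables it is $\binom{2d}{d} \le 4^d$, which is the source of an exponential dependence on d when a degree-d polynomial in d + 1 variables is expanded into monomials.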
Observe that this size bound of poly(s, n, $d^t$) is small only when t is small. For instance, when t > n, this bound becomes trivial. Our key observation is that independently of t, there is a set of d + 1 polynomials $g_0(\mathbf{x}), g_1(\mathbf{x}), \ldots, g_d(\mathbf{x})$ of degree d, and polynomials $A_0, A_1, \ldots, A_d$ on d + 1 variables such that for every k ∈ {0, 1, ..., d}, $h_k(\mathbf{x}) = A_k(g_0(\mathbf{x}), g_1(\mathbf{x}), \ldots, g_d(\mathbf{x}))$. Moreover, for every k, $A_k$ has degree k and is computable by a circuit of size $O(d^3)$. Also, each of these generators $g_i$ can be computed by a circuit of size poly(s, r) and depth ∆. Thus, expressing $A_d(z_0, z_1, \ldots, z_d)$ as a sum of monomials, and then composing this representation with the circuits for $g_0, g_1, \ldots, g_d$ would give us a circuit of size poly(s, n, r, d, $4^d$) of depth ∆ + O(1). To get a subexponential dependence on d in the size, we do not write $A_d$ as a sum of monomials, and instead rely on the depth-reduction result of [12]. One point to note is that just from Kaltofen's result [18], it follows that f has an arithmetic circuit of size poly(s, n, d).
Proof sketch of Theorem 2.1. To get Theorem 2.1 from Theorem 2.5, we also have to upper bound the complexity of factors which are not of the form y − f(x), i.e., are non-linear in every variable. This involves the use of some standard techniques in this area. We first preprocess P such that it is monic in y, and then we work over the algebraic closure of the field F(x), and view P as a univariate in y over this field. We then use Lemma 5.1 to approximate the roots of P by polynomials, and eventually combine them using Lemma 6.3 from [29] to obtain the factor f. We get bounds on the circuit size and depth of the factor f by keeping track of the growth of these parameters in each step of the outlined algorithm.
Proof sketch of Theorem 2.3. Theorem 2.5, when combined with the standard machinery of Nisan-Wigderson designs, immediately yields Theorem 2.3.
Proof sketch of Theorem 2.9. For the proof of Theorem 2.9, we follow the same outline as above to conclude that every factor f of a polynomial $P = \sum_{\mathbf{y} \in \{0,1\}^m} Q(\mathbf{x}, \mathbf{y})$ can be written as the sum of the homogeneous components of degree at most d of $B(g_0(\mathbf{x}), g_1(\mathbf{x}), \ldots, g_d(\mathbf{x}))$, where B has a circuit of size poly(d) and degree d, and each polynomial $g_i$ can be expressed as $\sum_{\mathbf{y} \in \{0,1\}^{m'}} Q_i(\mathbf{x}, \mathbf{y})$, where the number of auxiliary variables m' and the circuit size of $Q_i$ are each less than poly(s, n, m, d, r), where s is the circuit size of Q and r is the degree of P. The proof follows from a result of Valiant [40], where he showed that compositions such as $B(g_0(\mathbf{x}), g_1(\mathbf{x}), \ldots, g_d(\mathbf{x}))$ can be written in the form $\sum_{\mathbf{y} \in \{0,1\}^{m''}} Q''(\mathbf{x}, \mathbf{y})$ with m'' and the circuit complexity of Q'' being poly(s, n, m, d, r).
Note that composing B and the $g_i$ into the above form is not straightforward, since direct replacement of $g_i$ with $Q_i$ might not work. For completeness, we include a proof of this using the depth-reduction results in [41]. (See Theorem 8.2, Claim 8.4 and the appendix for the proof.)
We remark that the proof outlined above bounds the complexity of the factor f once at the end of the lifting, whereas in [7], the authors prove an upper bound on the number of auxiliary variables and the circuit complexity of the verifier circuit for the approximation of the factor of P at the end of each step of the lifting process. They show that in every step of lifting, these parameters grow only by a multiplicative factor of $d^2$, and there are O(log d) steps of lifting in total, hence the total blowup of $d^{O(\log d)}$ in the process. In contrast, we get a polynomial upper bound on the blowup in the number of auxiliary variables, and the circuit size of the verifier circuit for the factor f, by a one-step analysis.
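The arithmetic behind the $d^{O(\log d)}$ blowup attributed to [7] above is a one-line calculation; here c is an assumed name for the hidden constant in the O(log d) bound on the number of lifting steps.

```latex
\[
\underbrace{d^{2} \cdot d^{2} \cdots d^{2}}_{c \log d \ \text{lifting steps}}
  \;=\; d^{\,2c \log d} \;=\; d^{O(\log d)} .
\]
```

A per-step multiplicative loss therefore compounds to a quasipolynomial factor, which is what an analysis carried out once at the end of the lifting avoids.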
Another crucial point to note is that Theorem 2.9 also follows if, in the approach outlined above, we replace our structure theorem for low-degree factors by an analogous statement in [8] and [29]. This is because the degree of the factor we are seeking and the depth of the circuit obtained for the factor do not play a critical role in this proof as long as they are not too large. Thus, closure of VNP under taking factors follows from the results known prior to this work, although as far as we know, this does not seem to have been noticed before.

Preliminaries
We start by setting up some notation and stating some basic definitions and results from prior work which will be used in our proofs.

Notation
• We use boldface letters x, y, z to denote a list of variables.
• For a function s : N → N, we say that s(n) ≤ poly(n), if there are constants $n_0, a \in N$ such that for all $n > n_0$, $s(n) \le n^a$.
• For a (multivariate) polynomial P, deg(P) denotes the total degree of P and $\deg_y(P)$ denotes the degree of P with respect to the variable y.
• We say that a polynomial f is a factor of a polynomial P of multiplicity equal to m, if $f^m$ divides P, and $f^{m+1}$ does not divide P.

Arithmetic circuits
Definition 4.1 (Arithmetic circuits). An arithmetic circuit Ψ over a field F and variables $\mathbf{x} = (x_1, \ldots, x_n)$ is a directed acyclic graph, the vertices of which we refer to as gates. The gates of in-degree zero (or input gates) are labeled by elements in F and variables in x, and the internal gates are labeled by + (sum gates) or × (product gates). The gates of out-degree zero in Ψ are called output gates. The circuit Ψ computes a polynomial in F[x] in a natural way: each input gate computes the polynomial equal to its label.
A sum gate computes the polynomial equal to the sum of the polynomials computed at its inputs, while a product gate computes the polynomial equal to the product of the polynomials computed at its inputs.
For an arithmetic circuit Ψ, we use size(Ψ) to denote the number of edges in Ψ. The depth of Ψ is the length of the longest path from any input gate to any output gate. Throughout this paper, we assume that all our circuits are layered with alternating layers of addition and multiplication gates, with the input gates forming the bottom layer and the output gates forming the top layer. The directed edges should be thought of as pointing upward. Moreover, we always assume that the top layer is of addition gates. For instance, a depth-3 circuit is of the form ∑ ∏ ∑ and a depth-4 circuit is of the form ∑ ∏ ∑ ∏. Let P be the polynomial computed by Ψ. For every k ∈ N, we use $H_k[P]$ to denote the homogeneous component of P of degree k, and $H_{\le k}[P]$ to denote the sum of the homogeneous components of P of degree at most k.

Taylor's expansion
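A crucial tool for our proofs is the following classical lemma.
Lemma 4.2 (Taylor's expansion). Let P(y) be a univariate polynomial of degree r over a field of characteristic zero. Then,
$P(y + z) = P(y) + \sum_{k=1}^{r} \frac{P^{(k)}(y)}{k!} z^k$,
where, for every k, $P^{(k)}(y) = \frac{\partial^k P(y)}{\partial y^k}$ is the derivative of order k of P with respect to y.
At a later point in the paper, we work with multivariate polynomials P(x, y) ∈ F[x, y] and view them as univariate polynomials in y with the coefficients coming from the field of fractions F(x). In this case, the derivatives $P^{(k)}(y) = \frac{\partial^k P}{\partial y^k}$ as defined above are elements of F[x].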
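The Taylor expansion of a polynomial is a finite, exact identity, which the following short Python check illustrates. The representation of a polynomial as a list of coefficients (lowest degree first) and the helper names are assumptions made for this sketch.

```python
from fractions import Fraction
from math import factorial


def evaluate(coeffs, y):
    """Evaluate sum_k coeffs[k] * y^k."""
    return sum(c * y**k for k, c in enumerate(coeffs))


def derivative(coeffs):
    """Coefficient list of the derivative with respect to y."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def taylor_sum(coeffs, a, b):
    """Compute sum_k P^(k)(a) / k! * b^k for k = 0, ..., deg(P)."""
    total, current = Fraction(0), list(coeffs)
    for k in range(len(coeffs)):
        total += Fraction(evaluate(current, a), factorial(k)) * b**k
        current = derivative(current)
    return total


P = [1, -2, 0, 3]                 # P(y) = 1 - 2y + 3y^3
lhs = evaluate(P, 2 + 5)          # P(a + b) with a = 2, b = 5
rhs = taylor_sum(P, 2, 5)         # Taylor expansion around a = 2
```

Dividing by k! is the reason the expansion is stated over fields of characteristic zero (or sufficiently large characteristic).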

Depth reduction
We will use the following depth-reduction theorems as black boxes for our proofs. The first result is by Agrawal-Vinay [1], Koiran [20], and Tavenas [39].
Invoked with k = 2, the above theorem gives a circuit of depth 4 for the polynomial P of size $s^{O(\sqrt{d})}$. The next depth-reduction result, due to Gupta, Kamath, Kayal, and Saptharishi [12], gives a further reduction to depth 3, as long as the field is of characteristic zero, and will be useful for our proof.
We will also need the following two results which give formula upper bounds for polynomials with small circuits. The results immediately follow from a classical depth-reduction result of Valiant, Skyum, Berkowitz, and Rackoff [41].
Theorem 4.5 (Valiant et al.). Let P(x) be a polynomial of degree d in n variables which can be computed by a circuit C of size s. Then, P can also be computed by a homogeneous circuit C′ of size poly(s, n, d), with the following properties.
• Every product gate in C′ has fan-in at most 5.
• For every product gate g in C′, the degree of the polynomial computed by any child of g is at most half of the degree of the polynomial computed at g.
• C′ has alternating layers of sum and product gates, where the sum fan-ins can be unbounded.
Theorem 4.6 (Valiant et al.). Let P(x) be a polynomial of degree d in n variables which can be computed by a circuit of size s. Then, P can also be computed by a formula of size $(sn)^{O(\log d)}$.

Explicit family of polynomials
The following definition is from Dvir, Shpilka, and Yehudayoff [8].
Definition 4.7. Let $\{f_m\}$ be a family of multilinear polynomials such that $f_m$ is a polynomial in m variables for every m. Then, the family $\{f_m\}$ is said to be explicit if the following two conditions hold.
• All the coefficients of $f_m$ have bit complexity polynomial in m.
• There is an algorithm which on input m outputs the list of all $2^m$ coefficients of $f_m$ in time $2^{O(m)}$.

Extracting homogeneous components
For our proofs, we will also rely on the following classical result of Strassen, which shows that if a polynomial P has a small circuit, then all its low-degree homogeneous components also have small circuits.
Theorem 4.8 (Homogenization). Let F be any field, and let Ψ ∈ F[x] be an arithmetic circuit of size s. Then, for every k ∈ N, there is a homogeneous circuit $Ψ_k$ of formal degree k and size $O(k^2 s)$, such that $Ψ_k$ computes the homogeneous component of degree k of the polynomial computed by Ψ.
Theorem 4.8 gives us a way of extracting homogeneous components of the polynomial computed by a given circuit. We also need the following related well-known lemma, whose proof we briefly sketch.
Lemma 4.9 (Interpolation). Let F be any field with at least d + 1 elements. Let P(x, y) ∈ F[x, y] be a polynomial of degree d. Let $C_0(\mathbf{x}), C_1(\mathbf{x}), \ldots, C_d(\mathbf{x}) \in F[\mathbf{x}]$ be polynomials such that $P(\mathbf{x}, y) = \sum_{j=0}^{d} y^j \cdot C_j(\mathbf{x})$. Then, if P(x, y) has a circuit of size s and depth ∆, then for every j ∈ {0, 1, ..., d}, $C_j(\mathbf{x})$ has a circuit of size O(sd) and depth ∆.
Proof sketch. For the proof, we view P as a univariate polynomial of degree d in y with coefficients in the ring F[x]. Thus, each $C_j$ can be written as an appropriate linear combination of $P(\mathbf{x}, α_0), P(\mathbf{x}, α_1), \ldots, P(\mathbf{x}, α_d)$, where $α_0, α_1, \ldots, α_d$ are distinct elements of the field F. Observe that for every α ∈ F, P(x, α) has a circuit of size s and depth ∆. To compute $C_j$, we have to take an appropriate linear combination of these circuits, but the linear combination can be absorbed in the top sum gate, and hence this process does not incur an increase in depth, while the size grows by a factor of at most d.
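The linear combination used in the proof sketch can be made explicit with Lagrange interpolation. The sketch below works with evaluations rather than circuits: P is treated as a black-box Python function, the interpolation points are assumed to be 0, 1, ..., d over the rationals, and the helper names are hypothetical.

```python
from fractions import Fraction


def lagrange_weights(alphas):
    """weights[j][i] = coefficient of y^j in the i-th Lagrange basis polynomial."""
    n = len(alphas)
    weights = [[Fraction(0)] * n for _ in range(n)]
    for i, a_i in enumerate(alphas):
        basis = [Fraction(1)]                      # coefficients, lowest degree first
        for k, a_k in enumerate(alphas):
            if k == i:
                continue
            scale = Fraction(1) / (a_i - a_k)
            shifted = [Fraction(0)] + basis        # y * basis
            scaled = [-a_k * c for c in basis] + [Fraction(0)]  # -a_k * basis
            basis = [(s + t) * scale for s, t in zip(shifted, scaled)]
        for j in range(n):
            weights[j][i] = basis[j]
    return weights


def coefficients_in_y(P, x_point, d):
    """Return [C_0(x_point), ..., C_d(x_point)] using d + 1 evaluations of P."""
    alphas = list(range(d + 1))
    weights = lagrange_weights(alphas)
    values = [P(x_point, a) for a in alphas]
    return [sum(weights[j][i] * values[i] for i in range(d + 1))
            for j in range(d + 1)]


def example_poly(x, y):
    """P(x, y) = (x + 1) y^2 + 3 x y + 5, of degree 2 in y."""
    return (x + 1) * y**2 + 3 * x * y + 5


coeffs_at_2 = coefficients_in_y(example_poly, 2, 2)   # C_0(2), C_1(2), C_2(2)
```

The weights depend only on the points $α_i$ and not on x, which is why the combination can be folded into a single top sum gate.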
The following corollary of Lemma 4.9 will also be useful for us. The proof follows immediately from the proof of Lemma 4.9.
Lemma 4.10 (Interpolation for formulas). Let F be any field with at least d + 1 elements. Let P(x, y) ∈ F[x, y] be a polynomial of degree d. Let $C_0(\mathbf{x}), C_1(\mathbf{x}), \ldots, C_d(\mathbf{x}) \in F[\mathbf{x}]$ be polynomials such that $P(\mathbf{x}, y) = \sum_{j=0}^{d} y^j \cdot C_j(\mathbf{x})$. Then, if P(x, y) has a formula of size s, then for every j ∈ {0, 1, ..., d}, $C_j(\mathbf{x})$ has a formula of size O(sd).

Hitting sets
Definition 4.11. A set of points P is said to be a hitting set for a class C of circuits, if for every C ∈ C which is not identically zero, there is an a ∈ P such that C(a) ≠ 0.
Clearly, deterministic and efficient construction of a hitting set of small size for a class C of circuits immediately implies a deterministic PIT algorithm for C. PIT algorithms designed in this way are also black-box, in the sense that they do not have to look into the wiring of the circuit to decide if it computes a polynomial which is identically zero. The PIT algorithms in this paper are all black-box in this sense.

Nisan designs
We state the following well known result of Nisan [26] on the explicit construction of combinatorial designs.
Theorem 4.12 (Nisan). Let n, m be positive integers such that $n < 2^m$. Then, there is a family of subsets S_1, S_2, ..., S_n ⊆ [ℓ] with the following properties.
• For each i, j ∈ [n] such that i ≠ j, |S_i ∩ S_j| ≤ log n.
Moreover, such a family of sets can be constructed via a deterministic algorithm in time $\mathrm{poly}(n, 2^{\ell})$.
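To make the notion of a combinatorial design concrete, here is a minimal sketch of a standard polynomial-based (Reed-Solomon style) design. This is an illustrative assumption, not necessarily the construction of [26], and it is not claimed to match the parameters of Theorem 4.12. For a prime q and an integer k ≤ q, each set is the graph of a polynomial of degree less than k over the integers modulo q, so every set has size q and two distinct sets intersect in at most k − 1 points. The names `polynomial_design` and `max_pairwise_intersection` are hypothetical.

```python
from itertools import combinations, product

def polynomial_design(q, k):
    """Return q**k subsets of range(q * q), assuming q is prime and k <= q.

    Each subset is the graph {(a, p(a))} of a polynomial p of degree < k over
    the integers mod q, with the pair (a, b) encoded as the integer a * q + b.
    """
    sets = []
    for coeffs in product(range(q), repeat=k):
        graph = set()
        for a in range(q):
            value = sum(c * pow(a, e, q) for e, c in enumerate(coeffs)) % q
            graph.add(a * q + value)
        sets.append(frozenset(graph))
    return sets

def max_pairwise_intersection(sets):
    """Largest intersection size over all pairs of distinct sets."""
    return max(len(s & t) for s, t in combinations(sets, 2))

design = polynomial_design(5, 2)   # 25 sets of size 5 inside a universe of size 25
```

Two distinct polynomials of degree less than k agree on at most k − 1 points, which is what bounds the pairwise intersections.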

The Polynomial Identity Lemma
We now state the well-known Polynomial Identity Lemma.

Lemma 4.13 (Polynomial Identity Lemma). Let F be a field, and let P ∈ F[x] be a non-zero polynomial of degree (at most) d in n variables. Then, for any finite set S ⊂ F we have $|\{a \in S^n : P(a) = 0\}| \leq d|S|^{n-1}$.
In particular, if |S| ≥ d + 1, then there exists some $a \in S^n$ satisfying P(a) ≠ 0. This gives us a brute-force deterministic algorithm, running in time $(d + 1)^n$, to test if an arithmetic circuit computing a polynomial of degree d in n variables is identically zero.
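The brute-force test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the polynomial is given as a black-box Python function over the integers (so the grid points 0, ..., d are distinct), and the name `is_identically_zero` is hypothetical.

```python
from itertools import product

def is_identically_zero(poly, n, d):
    """Black-box zero test for an n-variate polynomial of total degree at most d.

    By the Polynomial Identity Lemma with |S| = d + 1, a non-zero polynomial
    of degree at most d cannot vanish on the whole grid S^n.
    """
    grid = range(d + 1)
    return all(poly(*point) == 0 for point in product(grid, repeat=n))

# (x + y)^2 - x^2 - 2xy - y^2 is the zero polynomial; x*y - 1 is not.
zero_case = is_identically_zero(lambda x, y: (x + y) ** 2 - x * x - 2 * x * y - y * y, 2, 2)
nonzero_case = is_identically_zero(lambda x, y: x * y - 1, 2, 2)
```

The grid must really have at least d + 1 points per coordinate: the polynomial x(x − 1) has degree 2 and vanishes on the two-point grid {0, 1}.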

Low-degree roots of polynomials with shallow circuits
In this section, we prove Theorem 2.5, which is also our main technical observation. We start with the following lemma, which gives us a way of approximating the root of a polynomial to higher and higher accuracy, in an iterative manner. The lemma is a standard example of Hensel Lifting, which appears in many of the prior works in this area, including [8]. The statement and the proof below are from Dvir et al. [8].
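Lemma 5.1 (Hensel Lifting [8]). Let P ∈ F[x, y] and f ∈ F[x] be polynomials such that P(x, f) = 0 and $H_0\left[\frac{\partial P}{\partial y}(x, f(x))\right] = \delta \neq 0$. Let i ≥ 1, and let h ∈ F[x] be a polynomial such that $H_{\leq i-1}[h] = H_{\leq i-1}[f]$. Then $H_{\leq i}[f] = H_{\leq i}\left[h - \frac{P(x, h)}{\delta}\right]$.

Proof. For the rest of the proof, we think of P(x, y) as an element of F[x][y]. Henceforth, we drop the variables x everywhere, and think of P as a univariate in y. Thus, P(y) = P(x, y). For brevity, we denote $H_j[f]$ by $f_j$ and $H_j[h]$ by $h_j$ for every j ∈ N.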
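From the hypothesis, we know that P(f) = 0. Therefore, $H_{\leq i}[P(f)] = 0$. We first observe that if $f_i = h_i$, then $H_{\leq i}[P(h)] = 0$, and the lemma is trivially true. So, for the rest of the argument, we assume that $f_i - h_i \neq 0$. Now, since f and h agree on all monomials of degree less than i, by using Lemma 4.2 we get the following equality.

$$0 = H_{\leq i}\left[P(h) + P'(h) \cdot (f_i - h_i) + \sum_{j=2}^{r} \frac{P^{(j)}(h)}{j!} \cdot (f_i - h_i)^j\right] \qquad (1)$$

Here, r denotes the degree of P. Since $f_i - h_i$ is non-zero, and every monomial in $f_i - h_i$ has degree equal to i, any term in the above sum which is divisible by $(f_i - h_i)^2$ does not contribute any monomial of degree at most i. Thus, we have the following.

$$0 = H_{\leq i}\left[P(h) + P'(h) \cdot (f_i - h_i)\right]$$

Since $H_{\leq i-1}[P(h)]$ is identically zero and the constant term of $P'(h)$ is δ, we get $H_i[P(h)] + \delta \cdot (f_i - h_i) = 0$, that is, $f_i = h_i - H_i[P(h)]/\delta$. This completes the proof.

For our proof, we shall look at the structure of the outcome of the lifting operation in Lemma 5.1 more closely. Before proceeding further, we need the following crucial lemma.

Lemma 5.2. Let P(x, y) ∈ F[x, y] be a polynomial of degree r, let α ∈ F be a field element and d ∈ N be a positive integer. Let $\widetilde{G}_y(P, \alpha, d)$ be the set of polynomials defined as follows.

$$\widetilde{G}_y(P, \alpha, d) = \left\{ H_{\leq d}\left[\frac{\partial^j P}{\partial y^j}(x, \alpha)\right] - H_0\left[\frac{\partial^j P}{\partial y^j}(x, \alpha)\right] : j \in \{0, 1, \ldots, d\} \right\}$$

Let $G_y(P, \alpha, d)$ be the subset of $\widetilde{G}_y(P, \alpha, d)$ consisting of all non-zero polynomials. Then, the following statements are true.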
• For every g ∈ G y (P, α, d), the degree of every non-zero monomial in g is at least 1 and at most d.
• If P has a circuit of size s and depth ∆, then every g ∈ G_y(P, α, d) has a circuit of size $O(sr^4)$ and depth ∆.
Proof. The first two items follow immediately from the definition of G_y(P, α, d). We focus on the proof of the third item. Let C_0(x), C_1(x), ..., C_r(x) be polynomials such that $P(x, y) = \sum_{i=0}^{r} C_i(x) \cdot y^i$.

Now, for any j ∈ {0, 1, 2, ..., d}, by Lemma 4.2, $\frac{\partial^j P}{\partial y^j}(x, y)$ is a scalar multiple of the coefficient of $z^j$ in P(x, y + z). Moreover, $P(x, y + z) = \sum_{i=0}^{r} C_i(x) \cdot (y + z)^i = \sum_{i=0}^{r} C_i(x) \sum_{j=0}^{i} \binom{i}{j} y^{i-j} z^j$. Thus, for every j ∈ {0, 1, ..., d}, the coefficient of $z^j$ in P(x, y + z) is given by $\sum_{i=j}^{r} \binom{i}{j} C_i(x) \cdot y^{i-j}$.

From Lemma 4.9, we know that each C_i(x) has a circuit of depth ∆ and size at most O(sr). Thus, we can obtain a circuit for $\binom{i}{j} C_i(x) \cdot y^{i-j}$ by adding an additional layer of × gates on top of the circuit for C_i(x). This increases the size by an additive term of r, and the depth by 1. However, observe that this increase in depth is not necessary, since an expression of the form $y^{i-j} \cdot \sum_a \prod_b Q_{a,b}$ can be rewritten as $\sum_a \left( y^{i-j} \cdot \prod_b Q_{a,b} \right)$. Thus, the multiplication by the power of y can be absorbed in the product layer below the topmost layer of the circuits for C_i(x), and this does not incur any additional increase in size. Thus, the polynomials $\frac{\partial^j P}{\partial y^j}(x, y)$, and hence $\frac{\partial^j P}{\partial y^j}(x, \alpha)$, have a circuit of size $O(sr^3)$ and depth ∆. To compute the homogeneous components of these polynomials of degree at most d, we use Lemma 4.9. This increases the size by a factor of at most $O(r^2)$ while keeping the depth the same.
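The relation between derivatives and the coefficients of the shifted polynomial used in the proof above can be checked on a small example. The following is a minimal sketch for a univariate polynomial with integer coefficients (the coefficients C_i(x) are treated as plain numbers), and the helper names `shifted_coefficient` and `derivative` are hypothetical. It verifies that the j-th derivative equals j! times the coefficient of z^j in p(y + z).

```python
from math import comb, factorial

def shifted_coefficient(C, j):
    """Coefficient of z^j in p(y + z), where p(y) = sum_i C[i] * y^i.

    The result is a coefficient list in y: entry t is the coefficient of y^t,
    namely binom(t + j, j) * C[t + j].
    """
    return [comb(i, j) * C[i] for i in range(j, len(C))]

def derivative(C, j):
    """j-th derivative of p(y) = sum_i C[i] * y^i, as a coefficient list in y."""
    out = list(C)
    for _ in range(j):
        out = [i * out[i] for i in range(1, len(out))]
    return out

C = [5, 0, 3, 2]                      # p(y) = 5 + 3y^2 + 2y^3
lhs = derivative(C, 2)                # p''(y) = 6 + 12y
rhs = [factorial(2) * c for c in shifted_coefficient(C, 2)]
```

In this example `lhs` and `rhs` coincide, matching the identity ∂^j P/∂y^j = j! · ∑_{i ≥ j} binom(i, j) C_i y^{i−j}.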
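We now state our key technical observation.

Lemma 5.3. Let P ∈ F[x, y] and f ∈ F[x] be polynomials of degree r and d respectively such that P(x, f) = 0 and $H_0\left[\frac{\partial P}{\partial y}(x, f(x))\right] = \delta \neq 0$. Let the polynomials in the set $G_y(P, H_0[f], d)$ be denoted by g_0, g_1, ..., g_d. Then, for every i ∈ {1, 2, ..., d}, there is a polynomial A_i(z) in d + 1 variables such that the following are true.
• $H_{\leq i}[f] = H_{\leq i}[A_i(g_0, g_1, \ldots, g_d)]$, and
• A_i(z) is computable by a circuit of size at most $10 d^2 i$.

This is an analog of the main technical lemma in [8], which we state below.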
The difference between these lemmas is that in [8], it is shown that there is a set of polynomials of size deg_y(P) + 1 which generate every homogeneous component of the root f. Thus, in the regime of bounded individual degree, the size of this generating set is very small. However, when deg_y(P) ≥ n, Lemma 5.4 does not say anything non-trivial, since f can be trivially written as a polynomial in the n original variables. In contrast, Lemma 5.3 continues to say something non-trivial as long as d ≪ n, regardless of the value of deg_y(P). We now proceed with the proof.
Proof of Lemma 5.3. For the rest of the proof, we think of P(x, y) as an element of F[x][y]. So, we drop the variables x everywhere, and think of P as a univariate in y. Thus, P(y) = P(x, y). For brevity, we denote $H_j[f]$ by $f_j$ for every j ∈ N. We also use G_y for $G_y(P, f_0, d)$. The proof is by induction on i and crucially uses Lemma 5.1.
• Base case. We first prove the lemma for i = 1. We invoke Lemma 5.1 with i = 1 and h = f_0. We get that $H_{\leq 1}[f] = H_{\leq 1}\left[f_0 - \frac{P(f_0)}{\delta}\right]$. The proof follows by observing that f_0 and δ are constants and that $P(f_0)$ agrees with an affine form in g_0 on all monomials of degree at most d.
• Induction step. We assume that the claim in the lemma holds up to homogeneous components of degree at most i − 1, and argue that it holds for $H_{\leq i}[f]$. We invoke Lemma 5.1 with $h = A_{i-1}(g_0, g_1, \ldots, g_d)$, which exists by the induction hypothesis.
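Recall that $H_0(h) = H_0(f)$. Thus, $h = f_0 + \tilde{h}$, where the constant term of $\tilde{h}$ is 0 and thus every monomial of $\tilde{h}$ has degree at least 1. By Lemma 4.2, $P(h) = P(f_0 + \tilde{h}) = \sum_{j=0}^{r} \frac{P^{(j)}(f_0)}{j!} \cdot \tilde{h}^j$. Thus, as every monomial of $\tilde{h}$ has degree at least 1, we have $H_{\leq i}[P(h)] = H_{\leq i}\left[\sum_{j=0}^{i} \frac{P^{(j)}(f_0)}{j!} \cdot \tilde{h}^j\right]$. Since we are only interested in i ≤ d, the following equality is also true.

$$H_{\leq i}[P(h)] = H_{\leq i}\left[\sum_{j=0}^{i} \frac{H_{\leq d}\left[P^{(j)}(f_0)\right]}{j!} \cdot \tilde{h}^j\right]$$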
Observe that for every j ∈ {0, 1, ..., d}, $H_{\leq d}\left[P^{(j)}(f_0)\right]$ is an affine form in the elements of G_y. For every j ∈ {0, 1, 2, ..., i}, let ℓ_j(z) be an affine form such that $\ell_j(g_0, g_1, \ldots, g_d) = H_{\leq d}\left[P^{(j)}(f_0)\right]$. Now, we define A_i(z) as

$$A_i(z) = A_{i-1}(z) - \frac{1}{\delta} \sum_{j=0}^{i} \frac{\ell_j(z)}{j!} \cdot \left(A_{i-1}(z) - f_0\right)^j .$$

The first item in the statement of the lemma is true, just by the definition of A_i(z) above. We now argue about the circuit size of A_i(z). Each affine form ℓ_j(z) can be computed by a circuit of size O(d). Thus, given a circuit for A_{i−1}(z), we can obtain a circuit for A_i(z) by adding at most $10d^2$ additional gates. Thus, A_i(z) can be computed by a circuit of size at most $10d^2(i - 1) + 10d^2 = 10d^2 i$ gates.
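The iterative lifting used in this proof can be illustrated on a toy example. The following is a minimal sketch under simplifying assumptions that are not part of the paper: a single x variable, rational coefficients, and the update h ← H_{≤i}[h − P(x, h)/δ] applied for i = 1, ..., d starting from the constant term f_0. Polynomials in x are coefficient lists, P is given as the list of its coefficients C_j(x) in y, and the names `mul_trunc`, `eval_trunc` and `hensel_lift` are hypothetical.

```python
from fractions import Fraction

def mul_trunc(a, b, deg):
    """Product of two polynomials in x (coefficient lists), truncated to degree <= deg."""
    out = [Fraction(0)] * (deg + 1)
    for i, ai in enumerate(a[: deg + 1]):
        for j, bj in enumerate(b[: deg + 1 - i]):
            out[i + j] += ai * bj
    return out

def eval_trunc(P, h, deg):
    """H_{<=deg}[P(x, h(x))], where P = [C_0(x), C_1(x), ...] encodes sum_j C_j(x) * y^j."""
    result = [Fraction(0)] * (deg + 1)
    power = [Fraction(1)]                      # h^0
    for C in P:
        term = mul_trunc(C, power, deg)
        result = [r + t for r, t in zip(result, term)]
        power = mul_trunc(power, h, deg)       # next power of h
    return result

def hensel_lift(P, f0, delta, d):
    """Lift the constant term f0 of a root to its truncation of degree <= d.

    Assumes P(0, f0) = 0 and delta = dP/dy(0, f0) != 0.
    """
    h = [Fraction(f0)]
    for i in range(1, d + 1):
        value = eval_trunc(P, h, i)            # only the degree-i part is non-zero
        h = h + [Fraction(0)] * (i + 1 - len(h))
        h = [hc - vc / delta for hc, vc in zip(h, value)]
    return h

# Example: P(x, y) = y^2 - (1 + x), root f = sqrt(1 + x), f0 = 1, delta = 2.
P = [[-1, -1], [0], [1]]
root = hensel_lift(P, 1, 2, 3)
```

For this example the lifted polynomial is the degree-3 truncation of the power series of sqrt(1 + x), namely 1 + x/2 − x²/8 + x³/16.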
The following is an easy corollary of Lemma 5.3.
Corollary 5.5. Let P(x, y) ∈ F[x, y] be a polynomial of degree r, and α ∈ F be such that P(0, α) = 0 and $\frac{\partial P}{\partial y}(0, \alpha) \neq 0$. Then, for every k ∈ N, there is a unique polynomial h_k(x) of degree at most k such that h_k(0) = α and $H_{\leq k}[P(x, h_k(x))] = 0$. Moreover, if the polynomials in the set G_y(P, α, k) are denoted by g_0, g_1, ..., g_k, then there is a polynomial A_k(z) in k + 1 variables such that the following are true.
• $h_k = H_{\leq k}[A_k(g_0, g_1, \ldots, g_k)]$, and
• A_k(z) is computable by a circuit of size $O(k^3)$.

The corollary starts with an α ∈ F such that α is a root of multiplicity 1 of P(0, y). Starting from this α, we can lift uniquely to a polynomial h_k which is an approximate root of the polynomial P. This corollary will be useful later in the paper, when we study the structure of factors of P which are not linear in y, and the uniqueness will be important for this.
We are now ready to complete the proof of Theorem 2.5.
Proof of Theorem 2.5. The first step is to massage the circuit for P so that the hypothesis of Lemma 5.3 holds. We will have to keep track of the size and depth blowups incurred in the process. We begin by ensuring that f is a root of multiplicity 1 of some polynomial related to P.
Reducing multiplicity of the root f. Let $P(x, y) = \sum_{i=0}^{r} y^i C_i(x)$. Let m ≥ 1 be the multiplicity of f as a root of P(x, y). Thus, $\frac{\partial^j P}{\partial y^j}(x, f) = 0$ for every j ∈ {0, 1, ..., m − 1}, and $\frac{\partial^m P}{\partial y^m}(x, f) \neq 0$. The idea is to just work with the polynomial $\tilde{P} = \frac{\partial^{m-1} P}{\partial y^{m-1}}(x, y)$ for the rest of the proof. Clearly, f is a root of multiplicity exactly 1 of $\tilde{P}$. We only need to ensure that $\tilde{P}$ can also be computed by a small shallow circuit. This follows from the proof of the third item in Lemma 5.2, where we argued that $\frac{\partial^j P}{\partial y^j}(x, y)$ has a depth-∆ circuit of size poly(s, r).
Translating the origin. From the step above, we can assume without loss of generality that $\frac{\partial P}{\partial y}(x, f) \neq 0$. Thus, there is a point $a \in F^n$ such that $\frac{\partial P}{\partial y}(a, f(a)) \neq 0$. By translating the origin, we will assume that $\frac{\partial P}{\partial y}(0, f(0)) \neq 0$. This increases the depth of the circuit by at most 1, as it could involve replacing every variable x_i by x_i + a_i, and the size by a factor of at most n.
Degree of A_d. From Lemma 5.3, we know that the polynomial A_d(z) has a circuit of size $O(d^3)$. To obtain a circuit for f, we first prune away all the homogeneous components of A_d(z) of degree larger than d. Recall that by definition, for every polynomial g_i ∈ G_y, every non-zero monomial in g_i has degree at least 1, and that $f = H_{\leq d}[A_d(g_1, g_2, \ldots, g_d)]$. Thus, any monomial of degree strictly greater than d in A_d(z) contributes no monomial of degree at most d in the variables x in the composed polynomial A_d(g_1, g_2, ..., g_d), and hence does not contribute anything to the computation of f. So, we can confine ourselves to working with the homogeneous components of A_d(z) of degree at most d. By Theorem 4.8, we know that given a circuit for A_d(z), we can construct a circuit for $H_i[A_d(z)]$ by increasing the size of the circuit by a multiplicative factor of at most $O(i^2)$. Thus, $H_{\leq d}[A_d(z)]$ can be computed by a circuit of size $O(d^3)$ × size(A_d(z)). Thus, for the rest of this argument, we will assume that A_d(z) has a circuit of size $O(d^6)$ and degree at most d, and that $f = H_{\leq d}[A_d(g_1, g_2, \ldots, g_d)]$.

Shallow circuit for f. Since A_d(z) has a circuit of size $O(d^6)$ and degree at most d, by Theorem 4.4, we know that A_d(z) can be computed by a ∑∏∑ circuit Ψ of size at most $d^{O(\sqrt{d})}$. Similarly, by Theorem 4.3, we know that for any k ∈ N, A_d(z) can be computed by a depth-(2k) circuit of size $d^{O(d^{1/k})}$. Composing the ∑∏∑ circuit Ψ for A_d(z) with the circuits of g_1, ..., g_d ∈ G_y, we get a circuit Ψ′ with the following properties.
• The size of Ψ′ is at most $(srn)^{10} \cdot d^{O(\sqrt{d})}$.
• The depth of Ψ′ is at most ∆ + 3. This follows by combining the bottom ∑ layer of the ∑∏∑ circuit for A_d(z) with the top ∑ layer of the circuits for g_i ∈ G_y.
• The degree of Ψ′ is at most $d^2$. This is true because the degree of A_d(z) is at most d (as argued earlier in this proof), and the degree of every polynomial in G_y is at most d (first item in Lemma 5.2).
Note that for any k ∈ N, we can get Ψ′ of size $(srn)^{10} \cdot d^{O(d^{1/k})}$ and of depth at most ∆ + O(k).
To obtain a circuit for f, we apply Lemma 4.9 to Ψ′. This increases the size of Ψ′ by a multiplicative factor of at most $O(d^2)$, while the depth remains the same. This completes the proof of the theorem.

From roots to arbitrary factors
In this section, we show that Theorem 2.5 essentially generalizes to arbitrary factors, and not necessarily factors of the form y − f (x), up to some loss in the size and depth parameters.The techniques for this generalization are quite standard and well known in this literature, and our presentation here follows the approach of Oliveira [29].We sketch the main steps towards obtaining circuits for arbitrary factors.
Making the polynomial monic in y. Starting with an arbitrary polynomial P(x, y), we first make sure that it is monic in y. We do this by applying an invertible linear transformation x_i → x_i + a_i · y, where the vector a is chosen randomly from some large enough grid. Indeed, assume that deg(P) = r, and let $P_r = H_r[P(x, y)]$ be the homogeneous component of degree r of P(x, y). Since P_r is homogeneous in (x, y) of degree r, we have $P_r(x, y) = P_r(x/y, 1) \cdot y^r$, and since P_r is non-zero, P_r(x/y, 1) is not the zero polynomial. So we can write $P(x + ay, y) = P_r(a, 1) \cdot y^r$ + lower order terms (in y).
By Lemma 4.13, there exists some $a \in [r + 1]^n$ with $P_r(a, 1) \neq 0$. Thus, in the transformed coordinate system, the leading coefficient of P(x + ay, y) (as a polynomial in y) is some non-zero element of the field F, and, without loss of generality, we can take it to be 1.
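The shift that makes P monic in y can be sketched on a small bivariate example. This is a minimal illustration under assumptions that are not in the paper: a single x variable, integer coefficients, polynomials stored as dictionaries mapping exponent pairs (i, j) to the coefficient of x^i y^j, and a deterministic search over the grid {0, ..., r} instead of a random choice. The names `shift_x_by_ay` and `make_leading_coefficient_nonzero` are hypothetical.

```python
from math import comb

def shift_x_by_ay(P, a):
    """Return P(x + a*y, y) for P given as {(i, j): c}, meaning sum of c * x^i * y^j."""
    out = {}
    for (i, j), c in P.items():
        for k in range(i + 1):
            # (x + a*y)^i = sum_k binom(i, k) * x^(i - k) * (a*y)^k
            key = (i - k, j + k)
            out[key] = out.get(key, 0) + c * comb(i, k) * a ** k
    return {mono: c for mono, c in out.items() if c != 0}

def make_leading_coefficient_nonzero(P):
    """Search a in {0, ..., r} such that the coefficient of y^r in P(x + a*y, y) is non-zero.

    Here r is the total degree of the non-zero polynomial P. The coefficient of
    y^r is P_r(a, 1), a non-zero polynomial of degree at most r in a, so some
    a in a grid of size r + 1 works.
    """
    r = max(i + j for (i, j) in P)
    for a in range(r + 1):
        Q = shift_x_by_ay(P, a)
        lead = Q.get((0, r), 0)
        if lead != 0:
            return a, Q, lead
    raise ValueError("no suitable shift found; P should be non-zero")

# Example: P(x, y) = x*y + 1 has degree 2 but is not monic in y.
a, Q, lead = make_leading_coefficient_nonzero({(1, 1): 1, (0, 0): 1})
```

In the example the shift a = 1 gives P(x + y, y) = y² + xy + 1, whose leading coefficient in y is a non-zero constant. Dividing by `lead` would make the shifted polynomial monic in general.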
If P(x, y) is monic, then so are its factors. To see this, first recall the Gauss Lemma. We shall use it to deduce that the factors of P(x, y) are elements of F[x, y]. Thus, for the rest of this section, we will assume that all the factors of P(x, y) are also monic in y.
Working over the algebraic closure of F(x). As above, P is monic in y with deg_y(P) = r, that is, $P(x, y) = y^r + \sum_{i=0}^{r-1} C_i(x) \cdot y^i$. Assume that P does not factor into linear factors in y, and that f(x, y) is one of its factors, of degree d in y. Since P is monic in y, we know that f must also be monic in y. Working over the algebraic closure of F(x) (that is, the field $\overline{F(x)}$), we can factor P (and f) into linear factors in y, say $P(x, y) = \prod_{i=1}^{r} (y - \phi_i(x))$. The algebraic closure of F(x) is a complicated object, but we only need to think of elements of the closure as "functions" over the variables in x. Since f divides P, we may assume without loss of generality that the first d of these ϕ_i correspond to roots of f, so we have $f(x, y) = \prod_{i=1}^{d} (y - \phi_i(x))$. We note that the ϕ_i(x) may not be polynomials in x. Still, the fact that P and f share some roots in the closure of F(x) gives us a way to approximate them, using Hensel lifting, similar to Lemma 5.3. For the rest of our argument, we first need to ensure some non-degeneracy conditions.
Reducing the multiplicity of f in P. We first make sure that f is a factor of multiplicity 1 of P; if f is a factor of multiplicity m > 1, we can replace P by $\tilde{P} = \frac{\partial^{m-1} P}{\partial y^{m-1}}(x, y)$. Clearly, f is a factor of multiplicity exactly 1 of $\tilde{P}$. That $\tilde{P}$ can also be computed by a small shallow circuit follows from the proof of the third item in Lemma 5.2, where we argued that $\frac{\partial^j P}{\partial y^j}(x, y)$ has a depth-∆ circuit of size $O(sr^3)$. So, for the rest of the proof, we will assume that f is a factor of P of multiplicity equal to 1.
Properly separating shifts. To proceed further, we want a shift in x such that each factor has no repeating roots in y and distinct factors share no root in y. This follows from the lemma below from [29], which we state without proof.

Lemma 6.2 ([29, Lemma 3.6]). Let f(x, y), g(x, y) ∈ F[x, y] be polynomials such that deg_y(f) ≥ 1, deg_y(g) ≥ 1, f is irreducible and f does not divide g. Then, there is a $c \in F^n$ such that
• f(c, y) is a polynomial with exactly deg_y(f) distinct roots in F, and
• f(c, y) and g(c, y) have no common roots.
Now, let us consider the polynomial g = P/f. Since f is a factor of multiplicity 1 of P, f does not divide g. From Lemma 6.2, we know that there is a $c \in F^n$ such that f(c, y) and g(c, y) do not share a root, and all the roots of f(c, y) are distinct. At the cost of increasing the depth of the circuit of P by 1, we can assume without loss of generality that c is the origin. So, for the rest of the proof, we assume that f(0, y) has no repeating roots, and that f(0, y) and g(0, y) share no common roots. Let α_1, α_2, ..., α_r be the roots of P(0, y), ordered so that α_1, α_2, ..., α_d are the roots of f(0, y).
Approximating the roots of P. The goal of this step is to approximate the roots of P by low-degree polynomials with small circuits. From the previous paragraph, we know that for i ∈ [d], P(0, α_i) = 0 and $\frac{\partial P}{\partial y}(0, \alpha_i) \neq 0$. Thus, from Corollary 5.5, there are polynomials q_1, q_2, ..., q_d of degree at most d such that for every i ∈ [d], there is a polynomial A_{i,d}(z) in d + 1 variables such that the following are true.
• q_i(0) = α_i, and
• $q_i = H_{\leq d}[A_{i,d}(g_{i,0}, g_{i,1}, \ldots, g_{i,d})]$.

Here, for every i ∈ [d], g_{i,0}, g_{i,1}, ..., g_{i,d} are the polynomials in the set G_y(P, α_i, d). Thus, we have degree-d polynomials which are approximations of the roots of P, the constant terms of these polynomials agree with the roots of f(0, y), and these approximate roots have small shallow circuits. Moreover, $H_{\leq d}[P(x, q_i(x))] = 0$ for every i ∈ [d]. We will now combine these approximations to obtain a circuit for f.

Obtaining a circuit for f. In this final step, we obtain a circuit for f from the polynomials q_1, q_2, ..., q_d of the previous step. The first observation is that q_1, q_2, ..., q_d are also approximate roots of the polynomial f. To see this, observe that by our choice, α_1, α_2, ..., α_d are distinct roots of f(0, y). Thus, for each i ∈ [d], f(0, α_i) = 0 and $\frac{\partial f}{\partial y}(0, \alpha_i) \neq 0$. Thus, by Corollary 5.5, there are polynomials $\tilde{q}_1, \tilde{q}_2, \ldots, \tilde{q}_d$ of degree at most d such that $H_{\leq d}[f(x, \tilde{q}_i(x))] = 0$. Thus, we also have $H_{\leq d}[P(x, \tilde{q}_i(x))] = 0$. So, by the uniqueness condition in Corollary 5.5, we get that the set of polynomials $\{\tilde{q}_i : i \in [d]\}$ must be the same as $\{q_i : i \in [d]\}$. Next, to obtain a circuit for f, we now claim that $f(x, y) = H_{\leq d}\left[\prod_{i=1}^{d} (y - q_i(x))\right]$. The proof of this fact follows immediately from Lemma 5.4 in [29]. We state a special case, which suffices for our application.
Lemma 6.3 ([29, Lemma 5.4]). Let P(x, y) and f(x, y) be polynomials of degree r and d respectively, such that P and f are monic in y, f is a factor of P, and all the roots of f(0, y) are distinct and are roots of multiplicity exactly one of P(0, y). Let α_1, α_2, ..., α_d be the roots of f(0, y) and let q_1, q_2, ..., q_d ∈ F[x] be polynomials of degree at most d such that for every i ∈ [d],
• q_i(0) = α_i, and
• $H_{\leq d}[P(x, q_i(x))] = 0$.
Then $f(x, y) = H_{\leq d}\left[\prod_{i=1}^{d} (y - q_i(x))\right]$.

Thus, given the circuits for q_i(x), we can obtain a circuit for f(x, y) by increasing the depth by at most two (a product layer, and then a sum layer for interpolation), and the size by a poly(d) factor. In summary, we have the following two statements.

Lemma 6.4. Let P ∈ F[x, y] and f ∈ F[x, y] be polynomials of degree r and d respectively such that P is monic in y and f is an irreducible factor of P. Then, there exist $c \in F^n$, α_1, α_2, ..., α_d ∈ F and a polynomial B(z) of degree at most d in $t = O(d^2)$ variables, such that the following are true.
• B(z) is computable by a circuit of size poly(d).

Theorem 6.5. Let P ∈ F[x, y] be a polynomial of degree r in n + 1 variables that can be computed by an arithmetic circuit of size s and depth ∆. Let f ∈ F[x, y] be an irreducible polynomial of degree d such that f divides P. Then, f can be computed by a circuit of depth ∆ + O(1) and size $\mathrm{poly}(s, r, n) \cdot d^{O(\sqrt{d})}$.

Theorem 8.1. Let P(x) be a polynomial of degree r in n variables such that $P(x) = \sum_{y \in \{0,1\}^m} Q(x, y)$ and Q can be computed by a circuit of size s. Let f be an irreducible factor of P of degree d. Then, there exists an m′ ≤ poly(s, r, d, n, m) and a polynomial h(x, z_1, z_2, ..., z_{m′}), such that h(x, z) can be computed by a circuit of size s′ ≤ poly(s, r, d, n, m) and $f(x) = \sum_{z \in \{0,1\}^{m'}} h(x, z)$.

For our proof, we use the following structure theorem of Valiant [40], and its consequences (Claim 8.4). Below, we state the theorem, and then use it to prove Theorem 8.1. For completeness, we include a proof using the depth-reduction results in [41] in the appendix.

Theorem 8.2 (Valiant [40]). Let P(x) be a homogeneous polynomial of degree r in n variables that can be computed by an arithmetic circuit C of size s. Then, there is an m ≤ poly(s, r) and a polynomial Q(x, y_1, y_2, ..., y_m) such that $P(x) = \sum_{y \in \{0,1\}^m} Q(x, y)$ and Q(x, y) can be computed by an arithmetic formula of size poly(s, r).
We now proceed with the proof of Theorem 8.1.
Proof of Theorem 8.1. Without loss of generality, we will assume that P is monic in a variable z. This can be guaranteed by applying a linear transformation that replaces every variable x_i by x_i + a_i z, where the a_i are chosen from a large enough grid, based on the degree of P. Note that this preserves the form of P in the hypothesis of the theorem. Moreover, using Theorem 4.8, we will assume that the degree of Q(x, y) in the variables x and z is r, up to a polynomial blowup in the circuit size of Q.
From Lemma 6.4, we know that there is a c ∈ F n and a polynomial B in at most t = O(d 2 ) variables, and polynomials g 1 , g 2 , . . ., g t such that f (x + c, z) = H ≤d [B(g 1 , g 2 , . . ., g t )] .
For the rest of this proof, we assume that we have shifted the origin, so that c = 0. Again, this just requires replacing every variable x i by x i + c i , and this shift of coordinates does not affect the structure of P in the hypothesis of the theorem.Thus, f (x, z) = H ≤d [B(g 1 , g 2 , . . ., g t )] .
Moreover, B has a circuit of size poly(d) and each g_i belongs to some set G_z(P, α, d) for some α ∈ F. We now need the following two structural claims, which follow from direct applications of properties of polynomials in VNP as shown by Valiant [40].

Claim 8.3 (Valiant [40]). For every choice of α ∈ F and g_j ∈ G_z(P, α, k), there is a polynomial Q_j(x, y_1, y_2, ..., y_m) such that $g_j(x) = \sum_{y \in \{0,1\}^m} Q_j(x, y)$.
Moreover, Q_j can be computed by a circuit of size poly(s, r, d).
Moreover, Q can be computed by a circuit of size poly(s, r, d, n, m).
For completeness, we provide a sketch of the proofs of the claims and of Theorem 8.2 in the appendix. We now use the claims above to complete the proof of Theorem 8.1.
Observe that if we view $\tilde{Q}$ (from Claim 8.4) as a polynomial in the x variables with coefficients coming from F[y], then, for every k ∈ N, it follows that $H_k[B(g_1, g_2, \ldots, g_t)] = \sum_{y \in \{0,1\}^{m'}} H_{k,x}[\tilde{Q}(x, y)]$.
Here, $H_{k,x}[\tilde{Q}(x, y)]$ denotes the homogeneous component of degree k of $\tilde{Q}(x, y)$ when viewing $\tilde{Q}(x, y)$ as a polynomial in the x variables. It follows from Theorem 4.8 that by blowing up the size of the circuit for $\tilde{Q}$ by a factor of $O(k^2)$, we can obtain a circuit which computes $H_{k,x}[\tilde{Q}(x, y)]$, and this does not affect the y variables in any way. This gives us a representation of f(x, z) as $f(x, z) = \sum_{y \in \{0,1\}^{m'}} h(x, z, y)$, where m′ ≤ poly(m, d), and h can be computed by a circuit of size poly(s, r, d, n, m). This completes the proof of the theorem.

Factors of polynomials with small formulas
In this section, we prove the following theorem, which gives an upper bound on the formula complexity of factors of polynomials which have small formulas. We note that this result is not new and was also proved by Dutta et al. in [7]. Since the proof essentially follows from the techniques developed so far and differs from the proof in [7], we include the statement and a proof sketch.

Theorem 9.1 ([7]). Let P(x) be a polynomial of degree r in n variables which can be computed by an arithmetic formula of size s, and let f(x) be a factor of P of degree d. Then, f(x) can be computed by an arithmetic formula of size $\mathrm{poly}(s, r, n, d^{O(\log d)})$.
Proof. The proof is again along the lines of the proof of Theorem 8.1. We first observe that the polynomials in G_y(P, α, k) have small formulas. This follows from the proof of Item 3 in Lemma 5.2 and Lemma 4.10. Now, recall that from Lemma 6.4, we know that B is a polynomial in at most $O(d^2)$ variables, and can be computed by a circuit of size poly(d). Thus, by Theorem 4.6, we get that B can be computed by a formula Φ of size at most $d^{O(\log d)}$. Composing Φ with the formulas for the polynomials in G_y(P, α, k), we get a formula for B(g_1, g_2, ..., g_t) of size $\mathrm{poly}(r, s, m, n, d^{O(\log d)})$, and also, $f = H_{\leq d}[B(g_1, g_2, \ldots, g_t)]$.
All we need now to complete the proof, is a formula for H ≤d [B(g 1 , g 2 , . . ., g t )], and this follows from Lemma 4.10.
We remark that the proof extends to the model of algebraic branching programs. More precisely, the following statement is true.

Theorem 9.2. Let P(x) be a polynomial of degree r in n variables which can be computed by an algebraic branching program of size s, and let f(x) be a factor of P of degree d. Then, f(x) can be computed by an algebraic branching program of size $\mathrm{poly}(s, r, n, d^{O(\log d)})$.

Proofs of claims
We now include the proofs of Theorem 8.2, Claim 8.3 and Claim 8.4.We follow the notation set up in the proof of Theorem 8.1.
Proof of Claim 8.3. We relabel one of the variables in x as z. Let C_0(x), C_1(x), ..., C_r(x) be polynomials such that $P(x, z) = \sum_{i=0}^{r} C_i(x) \cdot z^i$. Recall that $\frac{\partial^j P}{\partial z^j}(x, z)$ equals $j! \cdot \sum_{i=j}^{r} \binom{i}{j} C_i(x) \cdot z^{i-j}$. Now, we know that $P(x, z) = \sum_{y \in \{0,1\}^m} Q(x, y, z)$.
Expressing Q(x, y, z) as a univariate in z, we get $Q(x, y, z) = \sum_{i=0}^{r} C_i(x, y) \cdot z^i$. Recall that Q(x, y, z) has a circuit of size poly(s) and degree at most r. By viewing Q as a univariate in z and applying Theorem 4.8, we get that each C_i(x, y) has a circuit of size poly(s, r). In particular, for every j ∈ N, we can write C_j(x) as $C_j(x) = \sum_{y \in \{0,1\}^m} C_j(x, y)$. Therefore, for every j ∈ {0, 1, 2, ..., d}, we get $\frac{\partial^j P}{\partial z^j}(x, z) = \sum_{y \in \{0,1\}^m} j! \cdot \sum_{i=j}^{r} \binom{i}{j} C_i(x, y) \cdot z^{i-j}$. Moreover, the polynomial $\sum_{i=j}^{r} \binom{i}{j} C_i(x, y) \cdot z^{i-j}$ has a circuit of size poly(n, r). This completes the proof of the claim.

Theorem 2.1. Let F be a field of characteristic zero. Let P ∈ F[x] be a polynomial of degree r in n variables that can be computed by an arithmetic circuit of size s and depth ∆. Let f ∈ F[x] be an irreducible polynomial of degree d such that f divides P. Then f can be computed by a circuit of depth ∆ + O(1) and size $\mathrm{poly}(s, r, n) \cdot d^{O(\sqrt{d})}$. Furthermore, for any k ∈ N, f can be computed by a circuit of depth ∆ + O(k) and size $\mathrm{poly}(s, r, n) \cdot d^{O(d^{1/k})}$.
Kaltofen's closure result for VP does not seem to immediately extend to VNP, and whether or not the factors of polynomials in VNP are in VNP was an open question. Bürgisser made the following conjecture [4, Conjecture 2.1].

Conjecture 2.7 (Bürgisser). The class VNP is closed under taking factors.
of size poly(n). Thus, from Theorem 4.4, it follows that f has a circuit of depth 3 of size $n^{O(\sqrt{d})}$. The key advantage of Theorem 2.5 over this bound is that the exponential term is $d^{O(\sqrt{d})}$ and not of the form $n^{d^{\varepsilon}}$. For $d \leq \log^2 n / \log^2 \log n$, $d^{O(\sqrt{d})}$ is bounded by a polynomial in n and so the final bound is poly(n).

Theorem 4.3 (Depth reduction to depth (2k)). Let k be a positive integer and F be any field. If P(x) ∈ F[x] is an n-variate polynomial of degree d that can be computed by an arithmetic circuit Ψ of size s, then P can be computed by a depth-(2k) circuit of size $(snd)^{O(d^{1/k})}$.

Theorem 4.4 (Depth reduction to depth 3). Let F be a field of characteristic zero. Let P(x) ∈ F[x] be an n-variate polynomial of degree d that can be computed by an arithmetic circuit Ψ of size s. Then, P can be computed by a ∑∏∑ circuit of size $(snd)^{O(\sqrt{d})}$.

Lemma 5.3. Let P ∈ F[x, y] and f ∈ F[x] be polynomials of degree r and d respectively such that P(x, f) = 0 and $H_0\left[\frac{\partial P}{\partial y}(x, f(x))\right] = \delta \neq 0$. Let the polynomials in the set G_y(P, H_0[f],

Lemma 5.4 ([8, Lemma 3.1]). Let P ∈ F[x, y] and f ∈ F[x] be polynomials of degree r and d respectively such that P(x, f) = 0 and H_0

has a circuit of size $O(d^6)$ and degree at most d, by Theorem 4.4, we know that A_d(z) can be computed by a ∑∏∑ circuit Ψ of size at most $d^{O(\sqrt{d})}$. Similarly, by Theorem 4.3, we know that for any k ∈ N, A_d(z) can be computed by a depth-(2k) circuit of size $d^{O(d^{1/k})}$.

Lemma 6.1 (Gauss Lemma). Let R be a Unique Factorization Domain with a field of fractions F and let f(y) ∈ R[y]. If f(y) is reducible over F[y], then f is reducible over R[y].

Now, we have the following simple observation.

Observation 1. Let R = F[x]. If P ∈ R[y] is a monic polynomial in y, and P = g · h, where g, h ∈ R[y], then the leading coefficients of g and h in y belong to F \ {0}.