Identity Testing for constant-width, and commutative, read-once oblivious ABPs

We give improved hitting-sets for two special cases of Read-once Oblivious Arithmetic Branching Programs (ROABP). First is the case of an ROABP with known variable order. The best hitting-set known for this case had cost $(nw)^{O(\log n)}$, where $n$ is the number of variables and $w$ is the width of the ROABP. Even for a constant-width ROABP, nothing better than a quasi-polynomial bound was known. We improve the hitting-set complexity for the known-order case to $n^{O(\log w)}$. In particular, this gives the first polynomial time hitting-set for constant-width ROABP (known-order). However, our hitting-set works only over those fields whose characteristic is zero or large enough. To construct the hitting-set, we use the concept of the rank of partial derivative matrix. Unlike previous approaches whose basic building block is a monomial map, we use a polynomial map. The second case we consider is that of commutative ROABP. The best known hitting-set for this case had cost $d^{O(\log w)}(nw)^{O(\log \log w)}$, where $d$ is the individual degree. We improve this hitting-set complexity to $(ndw)^{O(\log \log w)}$. We get this by achieving rank concentration more efficiently.


Introduction
The polynomial identity testing (PIT) problem asks if a given multivariate polynomial is identically zero. The input to the problem is given via an arithmetic model computing a polynomial, for example, an arithmetic circuit, which is the arithmetic analogue of a Boolean circuit. The degree of the given polynomial is assumed to be polynomially bounded in the circuit size. Typically, any such circuit can compute a polynomial with exponentially many monomials (exponential in the circuit size). Thus, one cannot hope to write down the polynomial in a sum-of-monomials form. However, given such an input, it is possible to efficiently evaluate the polynomial at a point in the field. This property enables a randomized polynomial identity test with one-sided error. It is known that evaluating a small-degree nonzero polynomial at a random point gives a nonzero value with a good probability [?, ?, ?]. This gives us a randomized PIT: just evaluate the input polynomial, given as an arithmetic circuit, at random points.
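The randomized test described above can be sketched in a few lines. This is only an illustration under stated assumptions: the blackbox is modelled as a Python callable over the integers, and the names `is_probably_zero`, `zero_poly` and `nonzero_poly` are hypothetical, not taken from the paper.

```python
import random

def is_probably_zero(poly, num_vars, degree_bound, trials=20, seed=0):
    """One-sided-error identity test: evaluate a blackbox at random points.

    `poly` is any callable taking a tuple of integers. A nonzero polynomial of
    total degree at most `degree_bound` vanishes at a uniformly random point of
    S^n with probability at most degree_bound / |S|.
    """
    rng = random.Random(seed)
    sample_size = 100 * max(1, degree_bound)  # |S|; each trial errs w.p. <= 1/100
    for _ in range(trials):
        point = tuple(rng.randrange(sample_size) for _ in range(num_vars))
        if poly(point) != 0:
            return False  # a nonzero value certifies that the polynomial is nonzero
    return True  # zero at every sampled point: identically zero with high probability

# (x1 + x2)^2 - x1^2 - 2*x1*x2 - x2^2 is identically zero.
zero_poly = lambda p: (p[0] + p[1]) ** 2 - p[0] ** 2 - 2 * p[0] * p[1] - p[1] ** 2
# x1 * x2 - x3 is not.
nonzero_poly = lambda p: p[0] * p[1] - p[2]

print(is_probably_zero(zero_poly, 2, 2), is_probably_zero(nonzero_poly, 3, 2))
```

The error is one-sided: the answer "nonzero" is always correct, while the answer "zero" can be wrong only with small probability.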
Finding an efficient deterministic algorithm for PIT has been a major open question in complexity theory. The question is also related to arithmetic circuit lower bounds [?, ?, ?]. The PIT problem has been studied in two paradigms: (i) blackbox test, where one can only evaluate the polynomial at chosen points and (ii) whitebox test, where one has access to the description of the input circuit. A blackbox test for a family of polynomials is essentially the same as finding a hitting-set, that is, a set of points such that any nonzero polynomial in that family evaluates to a nonzero value on at least one of the points in the set. This work concerns finding hitting-sets for a special model called read-once oblivious arithmetic branching programs (ROABP).
An arithmetic branching program (ABP) is a specialized arithmetic circuit. It is the arithmetic analogue of a Boolean branching program (also known as a binary decision diagram). It is a directed layered graph, with edges going from a layer of vertices to the next layer. The first and the last layers have one vertex each, called the source and the sink respectively. Each edge of the graph has a label, which is a 'simple' polynomial, for example, a univariate polynomial. For any path p, its weight is defined to be the product of labels on all the edges in p. The ABP computes a polynomial which is the sum of weights of all the paths from the source to the sink. Apart from its size, another important parameter for an ABP is its width. The width of an ABP is the maximum number of vertices in any of its layers. See Definition ?? for a formal definition of ABP.
ABPs are a strong model for computing polynomials. It is known that for any size-s arithmetic circuit with degree bounded by poly(s), one can find an ABP of size quasi-poly(s) computing the same polynomial [?, ?, ?] (see [?] for a complete proof). Even when the width is restricted to a constant, the ABP model is quite powerful. Ben-Or and Cleve [?] have shown that width-3 ABPs have the same expressive power as polynomial sized arithmetic formulas.
An ABP is a read-once oblivious ABP or ROABP if each variable occurs in at most one layer of the edges and every layer has exactly one variable. The read-once property severely restricts the power of the ABP. There is an explicit family of polynomials that can be computed by simple depth-3 (ΣΠΣ) circuits but requires exponential size ROABPs [?] to compute it. The order of the variables in the consecutive layers is said to be the variable order of the ROABP. The variable order affects the size of the minimal ROABP computing a given polynomial. There are polynomials which have a small ROABP in one variable order but require exponential size in another variable order. Nisan [?] gave an exact characterization of the polynomials computed by width-$w$ ROABPs in a certain variable order. In particular, he gave exponential lower bounds for this model.
The question of whitebox identity testing of ROABPs has been settled by Raz and Shpilka [?], who gave a polynomial time algorithm for this. However, though ROABPs are a relatively well-understood model, we still do not have a polynomial time blackbox algorithm. The blackbox PIT question is studied with two variations: one where we know the variable order of the ROABP and the other where we do not know it. For known-order ROABPs, Forbes and Shpilka [?] gave the first efficient blackbox test with $(ndw)^{O(\log n)}$ time complexity, where $n$ is the number of variables, $w$ is the width of the ROABP and $d$ is the individual degree bound of each variable. For the unknown-order case, Forbes et al. [?] gave an $n^{O(d \log w \log n)}$-time blackbox test. Observe that the complexity of their algorithm is quasi-polynomial only when $d$ is small. Subsequently, Agrawal et al. [?] removed the exponential dependence on the individual degree. They gave an $(ndw)^{O(\log n)}$-time blackbox test for the unknown-order case. Note that these results remain quasi-polynomial even in the case of constant width. Studying ROABPs has also led to PIT results for other computational models, for example, sub-exponential size hitting-sets for depth-3 multilinear circuits [?] and sub-exponential time whitebox test for read-$k$ oblivious ABPs [?].
Another motivation to study ROABPs comes from their Boolean analogues, called read-once ordered branching programs (ROBP). ROBPs have been studied extensively, with regard to the RL versus L question (randomized log-space versus log-space). The problem of finding hitting-sets for ROABP can be viewed as an analogue of finding pseudorandom generators (PRG) for ROBP. A pseudorandom generator for a Boolean function $f$ is an algorithm which can generate a probability distribution (with a small sample space) with the property that $f$ cannot distinguish it from the uniform random distribution (see [?] for details). Constructing an optimal PRG for ROBP, i.e., with $O(\log n)$ seed length or polynomial sized sample space, would imply RL = L. Although the known pseudorandom generators for ROBPs and hitting-set generators for ROABPs in similar settings have similar complexity, there is no known way to translate the construction of one to another. The best known PRG is of seed length $O(\log^2 n)$ ($n^{O(\log n)}$ size sample space), when variable order is known [?, ?, ?]. On the other hand, in the unknown-order case, the best known seed length is of size $n^{1/2+o(1)}$ [?]. Finding an $O(\log n)$-seed PRG even for constant-width known-order ROBPs has been a challenging open question. However, some special cases of this question have been solved (width-2 ROBPs [?]) or nearly solved (permutation and regular ROBPs [?, ?, ?, ?, ?]).
Our first result addresses the analogous question in the arithmetic setting. We give the first polynomial time blackbox test for constant-width known-order ROABPs. However, it works only for zero or large characteristic fields. Our idea is inspired by the pseudorandom generator for ROBPs by Impagliazzo, Nisan and Wigderson [?]. While their result does not give better PRGs for the constant-width case, we are able to achieve this in the arithmetic setting.
Theorem (Theorem ??). Let $C$ be the class of $n$-variate, individual degree $d$ polynomials in $F[x]$ computed by a width-$w$ ROABP in the variable order $(x_1, x_2, \ldots, x_n)$. Then a hitting-set of size $dn^{O(\log w)}$ can be constructed for $C$, when $\mathrm{char}(F) = 0$ or $\mathrm{char}(F) > ndw^{\log n}$.
When $w < n$, the size of our hitting set is smaller than the previously known hitting sets. Furthermore, even in the regime when $w \ge n$, the size of our hitting set matches the previously best known hitting sets. We show that for a nonzero bivariate polynomial $f(x_1, x_2)$ computed by a width-$w$ ROABP, the univariate polynomial $f(t^w, t^w + t^{w-1})$ is nonzero. For this, we use the notion of rank of the partial derivative matrix of a polynomial, defined by Nisan [?]. Our argument is that the rank of the partial derivative matrix of any nonzero bivariate polynomial which becomes zero on $(t^w, t^w + t^{w-1})$ is more than $w$, while for a polynomial computed by a width-$w$ ROABP, this rank is at most $w$. We use the map $(x_1, x_2) \mapsto (t^w, t^w + t^{w-1})$ recursively in $\log n$ rounds to achieve the above mentioned hitting-set. Our technique has a crucial difference from the previous works on ROABPs [?, ?, ?]. The starting point in all the previous techniques is a monomial map, i.e., each variable is mapped to a monomial. On the other hand, we argue with a polynomial map directly (where each variable is mapped to a univariate polynomial). We believe that our approach could lead to a polynomial sized hitting set for ROABPs and we now describe a concrete construction that we conjecture works. The goal would be to obtain a univariate $n$-tuple $(p_1(t), \ldots, p_n(t))$, such that any nonzero polynomial which becomes zero on $(p_1(t), \ldots, p_n(t))$ must have rank or evaluation dimension higher than $w$. We conjecture that $(t^r, (t+1)^r, \ldots, (t+n-1)^r)$ is one such tuple, where $r$ is polynomially large (Conjecture ??).
We believe that these ideas from the arithmetic setting can help in constructing an optimal PRG for constant-width ROBP.
Our second result is for a special case of ROABPs, called commutative ROABPs. A polynomial $f(x)$ is computed by a width-$w$ commutative ROABP if for every permutation of the variables, there exists an ROABP of width $w$ that computes $f(x)$ in that variable order. In particular, if in an ROABP all of the paths from the source to the sink are vertex disjoint, then the ROABP is commutative. Note that for a commutative ROABP, knowing the variable order is irrelevant. Commutative ROABPs have slightly better hitting-sets than the general case, but still no polynomial size hitting-set is known. The previously best known hitting-set for them has size $d^{O(\log w)}(nw)^{O(\log \log w)}$ [?]. We improve this to $(ndw)^{O(\log \log w)}$.
Theorem (Theorem ??). For $n$-variate, individual degree $d$ polynomials computed by width-$w$ commutative ROABPs, a hitting-set of size $(ndw)^{O(\log \log w)}$ can be constructed.
To get this result we follow the approach of Forbes et al. [?], which used the notion of rank concentration or low-support concentration, a technique introduced by Agrawal et al. [?]. We achieve rank concentration more efficiently using the basis isolation technique of Agrawal et al. [?]. The same technique also yields a more efficient concentration in depth-3 set-multilinear circuits (see Section ?? for the definition). However, it is not clear if it gives better hitting-sets for them. The best known hitting-set for them has size $n^{O(\log n)}$ [?].

Definitions and Notations
$[n]$ denotes the set $\{1, 2, \ldots, n\}$. $[[d]]$ denotes the set $\{0, 1, \ldots, d\}$. $x$ will denote a set of variables, usually the set $\{x_1, x_2, \ldots, x_n\}$. $F[x]$ denotes the ring of polynomials over the field $F$. $F(t)$ denotes the field of rational functions over the field $F$. For a set of $n$ variables $x$ and for an exponent $a = (a_1, a_2, \ldots, a_n) \in \{0, 1, 2, \ldots\}^n$, $x^a$ will denote the monomial $\prod_{i=1}^{n} x_i^{a_i}$. The support of a monomial $x^a$, denoted by $\mathrm{Supp}(a)$, is the set of variables appearing in that monomial, i.e., $\{x_i \mid i \in [n], a_i > 0\}$. The support size of a monomial is the cardinality of its support, denoted by $\mathrm{supp}(a)$. A monomial is said to be $\ell$-support if its support size is $\ell$ and $(<\ell)$-support if its support size is $< \ell$. For a polynomial $P(x)$, the coefficient of a monomial $x^a$ in $P(x)$ is denoted by $\mathrm{coef}_P(x^a)$.
For a monomial $x^a$, $\sum_i a_i$ is said to be its degree and $a_i$ is said to be its degree in variable $x_i$ for each $i$. Similarly, for a polynomial $P$, its degree (or degree in $x_i$) is the maximum degree (or maximum degree in $x_i$) of any monomial in $P$ with a nonzero coefficient. We define the individual degree of $P$ to be $\max_{i \in [n]} \deg_{x_i}(P)$, where $\deg_{x_i}(P)$ denotes the degree of $P$ in $x_i$.
To better understand polynomials computed by ROABPs, we often use polynomials over an algebra $A$, i.e., polynomials whose coefficients come from $A$. Matrix algebra is the vector space of matrices equipped with the matrix product. $F^{m \times n}$ represents the set of all $m \times n$ matrices over the field $F$. Note that the algebra of $w \times w$ matrices has dimension $w^2$.
We often view a vector or matrix with polynomial entries as a polynomial with vector or matrix coefficients. For a matrix $D(x, y)$ viewed this way, the $\mathrm{coef}_D$ operator returns a matrix for any monomial; for example, $\mathrm{coef}_D(y) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ when $D$ is a $2 \times 2$ matrix in which $y$ appears with coefficient 1 in both off-diagonal entries and does not appear in the diagonal entries. For a polynomial $D(x) \in A[x]$ over an algebra, its coefficient space is the space spanned by its coefficients. For a matrix $R$, $R(i, j)$ denotes its entry in the $i$-th row and $j$-th column.
As mentioned earlier, a deterministic blackbox PIT is equivalent to constructing a hitting-set. A set of points $H \subseteq F^n$ is called a hitting-set for a class $C$ of $n$-variate polynomials if for any nonzero polynomial $P$ in $C$, there exists a point in $H$ where $P$ evaluates to a nonzero value.

Arithmetic Branching Programs
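Definition 2.1 (Arithmetic Branching Program (ABP)). An ABP is a layered directed acyclic graph with $q + 1$ layers of vertices $\{V_0, V_1, \ldots, V_q\}$ and a source $a$ and a sink $b$ such that all the edges of the graph only go from $a$ to $V_0$, from $V_{i-1}$ to $V_i$ for any $i \in [q]$, and from $V_q$ to $b$. The edges have univariate polynomials as their weights and, as a convention, the edges going out of $a$ and the edges going into $b$ have constant weights, i.e., weights from the field $F$. The ABP is said to compute the polynomial
$$f(x) = \sum_{p \in \mathrm{paths}(a, b)} \ \prod_{e \in p} W(e),$$
where $W(e)$ is the weight of the edge $e$. The ABP has width $w$ if $|V_i| \le w$ for all $0 \le i \le q$.
It is well-known that the sum over all paths in a layered graph can be represented by an iterated matrix multiplication. To see this, let the set of nodes in $V_i$ be $\{v_{i,j} \mid j \in [w]\}$. It is easy to see that the polynomial computed by the ABP is the same as
$$A^T \Big( \prod_{i=1}^{q} D_i \Big) B,$$
where $A, B \in F^{w \times 1}$ and each $D_i$ is a $w \times w$ matrix, with $A(k) = W(a, v_{0,k})$, $D_i(j, k) = W(v_{i-1,j}, v_{i,k})$ and $B(k) = W(v_{q,k}, b)$ (the weight of an absent edge is taken to be 0).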
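As a sanity check of the path-sum and matrix-product views of an ABP, the following sketch evaluates a small width-2 program both ways at a few points. The concrete weights and the helper names (`eval_abp_matrix`, `eval_abp_paths`) are illustrative assumptions, not part of the paper.

```python
from itertools import product

def matmul(X, Y):
    """Plain matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def eval_abp_matrix(A, layers, B, point):
    """Evaluate A^T * D_1(x_1) * ... * D_q(x_q) * B at `point`."""
    row = [list(A)]
    for D, x in zip(layers, point):
        row = matmul(row, [[entry(x) for entry in r] for r in D])
    return sum(row[0][k] * B[k] for k in range(len(B)))

def eval_abp_paths(A, layers, B, point):
    """Same value, computed as the sum of weights of all source-to-sink paths."""
    w = len(A)
    total = 0
    for path in product(range(w), repeat=len(layers) + 1):
        weight = A[path[0]] * B[path[-1]]  # constant edges at source and sink
        for i, (D, x) in enumerate(zip(layers, point)):
            weight *= D[path[i]][path[i + 1]](x)  # edge from layer i to layer i+1
        total += weight
    return total

# A hypothetical width-2 ABP with two layers of univariate edge weights.
A = [1, 2]
B = [3, 1]
layers = [
    [[lambda x: 1 + x, lambda x: x * x], [lambda x: 2, lambda x: x]],
    [[lambda x: x, lambda x: 1], [lambda x: 1 - x, lambda x: 3 * x]],
]
points = [(0, 0), (1, 2), (2, 3), (-1, 5)]
print([eval_abp_matrix(A, layers, B, p) for p in points])
```

Because each layer here uses its own variable, this particular example is also read-once and oblivious.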

Read-once Oblivious ABP
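An ABP is called a read-once oblivious ABP (ROABP) if the edge weights in different layers are univariate polynomials in distinct variables. Formally, there is a permutation $\pi$ on the set $[q]$ such that the entries in the $i$-th matrix $D_i$ are univariate polynomials over the variable $x_{\pi(i)}$, i.e., they come from the polynomial ring $F[x_{\pi(i)}]$. Here, $q$ is the same as $n$, the number of variables. The order $(x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(n)})$ is said to be the variable order of the ROABP.
Viewing $D_i(x_{\pi(i)}) \in F^{w \times w}[x_{\pi(i)}]$ as a polynomial over the matrix algebra, we can write the polynomial computed by an ROABP as
$$f(x) = A^T D_1(x_{\pi(1)}) D_2(x_{\pi(2)}) \cdots D_n(x_{\pi(n)}) B.$$
An equivalent representation of a width-$w$ ROABP can be
$$f(x) = D_1(x_{\pi(1)}) D_2(x_{\pi(2)}) \cdots D_n(x_{\pi(n)}),$$
where $D_1 \in F^{1 \times w}[x_{\pi(1)}]$, $D_i \in F^{w \times w}[x_{\pi(i)}]$ for $2 \le i \le n-1$, and $D_n \in F^{w \times 1}[x_{\pi(n)}]$.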

Commutative ROABP
A polynomial f (x) is computed by a width-w commutative ROABP if, for every permutation σ of the variables, there exists a width-w ROABP in the variable order σ that computes the polynomial f (x). Note that the order of the variables becomes insignificant for a commutative ROABP.

Set-multilinear Circuits
A depth-3 set-multilinear circuit is a circuit of the form
$$f(x) = \sum_{i=1}^{k} l_{i,1}(x_1)\, l_{i,2}(x_2) \cdots l_{i,q}(x_q),$$
where the $l_{i,j}$s are linear polynomials and $x_1, x_2, \ldots, x_q$ form a partition of the set of variables $x$. It is known that these circuits are subsumed by ROABPs [?]. However, they are incomparable to commutative ROABPs. That is, neither class of circuits is contained in the other. For example, the $2n$-variate polynomial $(x_1 + y_1)(x_2 + y_2) \cdots (x_n + y_n)$ has a linear-size set-multilinear circuit. But, every ROABP in the variable sequence $(x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n)$ that computes it has width $\ge 2^n$ (follows from Nisan's characterization [?]). Thus, it is not computed by a small-width commutative ROABP. In the other direction, commutative ROABPs can compute polynomials with individual degree greater than 1, but set-multilinear circuits cannot. It is not known whether all multilinear polynomials computed by commutative ROABPs can be computed by polynomial-sized set-multilinear circuits.
A set-multilinear circuit has a corresponding polynomial over a commutative algebra. For the polynomial $f(x)$ above, consider the polynomial over a $k$-dimensional algebra
$$D(x) = D_1(x_1) D_2(x_2) \cdots D_q(x_q),$$
where $D_j = (l_{1,j}, l_{2,j}, \ldots, l_{k,j})$ and the algebra product is the coordinate-wise product. It is easy to see that $f = (1, 1, \ldots, 1) \cdot D$. Note that the polynomials $D_j$ are over a commutative algebra, that is, the order of the $D_j$s in the product does not matter. Hence, some of our techniques for commutative ROABPs also work for set-multilinear circuits.

Hitting-set for Known-order ROABP

Bivariate ROABP
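To construct a hitting-set for ROABPs, we start with the bivariate case. Recall that a bivariate ROABP is of the form $f(x_1, x_2) = D_1(x_1) D_2(x_2)$ with $D_1 \in F^{1 \times w}[x_1]$ and $D_2 \in F^{w \times 1}[x_2]$, or equivalently,
$$f(x_1, x_2) = \sum_{r=1}^{w} g_r(x_1)\, h_r(x_2).$$
To construct a hitting-set for this polynomial, we will use the notion of a partial derivative matrix, defined by Nisan [?] in the context of lower bounds. Let the individual degree of the polynomial $f$ be $d$. The partial derivative matrix $M_f$ is the $(d+1) \times (d+1)$ matrix with $M_f(i, j) = \mathrm{coef}_f(x_1^i x_2^j)$ for all $i, j \in [[d]]$. It is known that the rank of the matrix $M_f$ equals the smallest possible width of any ROABP computing $f$ [?].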
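To make the partial derivative matrix concrete, here is a small sketch that builds the matrix of coefficients of $x_1^i x_2^j$ and computes its rank by Gaussian elimination over the rationals. The dictionary encoding of polynomials and the function names are assumptions made for illustration.

```python
from fractions import Fraction

def coefficient_matrix(coeffs, d):
    """M_f(i, j) = coefficient of x1^i * x2^j, for 0 <= i, j <= d.

    `coeffs` maps exponent pairs (i, j) to coefficients.
    """
    return [[Fraction(coeffs.get((i, j), 0)) for j in range(d + 1)]
            for i in range(d + 1)]

def rank(matrix):
    """Rank over the rationals via Gaussian elimination."""
    rows = [row[:] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# (1 + x1)(1 + x2) is a single product g(x1) * h(x2).
product_form = {(0, 0): 1, (1, 0): 1, (0, 1): 1, (1, 1): 1}
# 1 + x1*x2 is a sum of two such products.
two_terms = {(0, 0): 1, (1, 1): 1}
# 1 + x1*x2 + x1^2*x2^2 is a sum of three.
three_terms = {(0, 0): 1, (1, 1): 1, (2, 2): 1}

print(rank(coefficient_matrix(product_form, 1)),
      rank(coefficient_matrix(two_terms, 1)),
      rank(coefficient_matrix(three_terms, 2)))
```

In these three examples the rank equals the number of product terms used, in line with the rank characterization quoted above.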
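Lemma. For any polynomial $f(x_1, x_2) = \sum_{r=1}^{w} g_r(x_1) h_r(x_2)$, $\mathrm{rank}(M_f) \le w$.
Proof. Let $f_r = g_r(x_1) h_r(x_2)$ for each $r \in [w]$, so that $f = \sum_{r} f_r$ and $M_f = \sum_{r} M_{f_r}$. We will show that $\mathrm{rank}(M_{f_r}) \le 1$, for all $r \in [w]$. As $f_r = g_r(x_1) h_r(x_2)$, its coefficients can be written as a product of coefficients from $g_r$ and $h_r$, i.e.,
$$M_{f_r}(i, j) = \mathrm{coef}_{g_r}(x_1^i) \cdot \mathrm{coef}_{h_r}(x_2^j).$$
Thus, $M_{f_r}$ is the outer product of two vectors and has rank at most 1. Hence $\mathrm{rank}(M_f) \le \sum_{r} \mathrm{rank}(M_{f_r}) \le w$.
One can also show that if $\mathrm{rank}(M_f) = w$ then there exists a width-$w$ ROABP computing $f$. We skip this proof as we will not need it. Now, using the above lemma we give a hitting-set for bivariate ROABPs.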
Lemma. Suppose $\mathrm{char}(F) = 0$ or $\mathrm{char}(F) > d$. Let $f(x_1, x_2) = \sum_{r=1}^{w} g_r(x_1) h_r(x_2)$ be a nonzero bivariate polynomial over $F$ with individual degree $d$. Then $f(t^w, t^w + t^{w-1}) \ne 0$.
Proof. Let $\tilde{f}(t)$ be the polynomial after the substitution, i.e., $\tilde{f}(t) = f(t^w, t^w + t^{w-1})$. Any monomial $x_1^i x_2^j$ will be mapped to the polynomial $t^{wi}(t^w + t^{w-1})^j$ under the mentioned substitution. The highest power of $t$ coming from this polynomial is $t^{w(i+j)}$. We will cluster together all the monomials for which this highest power is the same, i.e., $i + j$ is the same. The set of coefficients corresponding to any such cluster of monomials will form a diagonal in the matrix $M_f$. The set $\{M_f(i, j) \mid i + j = k\}$ is defined to be the $k$-th diagonal of $M_f$, for all $0 \le k \le 2d$. Let $\ell$ be the largest number such that the $\ell$-th diagonal has at least one nonzero element, i.e., $\ell = \max\{i + j \mid M_f(i, j) \ne 0\}$.
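As $\mathrm{rank}(M_f) \le w$ (from Lemma ??), we claim that the $\ell$-th diagonal has at most $w$ nonzero elements. To see this, let $\{(i_1, j_1), (i_2, j_2), \ldots, (i_{w'}, j_{w'})\}$ be the set of indices where the $\ell$-th diagonal of $M_f$ has nonzero elements, i.e., the set $\{(i, j) \mid M_f(i, j) \ne 0,\ i + j = \ell\}$, ordered so that $i_1 < i_2 < \cdots < i_{w'}$. As $M_f(i, j) = 0$ for any $i + j > \ell$, the submatrix of $M_f$ on the rows $i_1, \ldots, i_{w'}$ and the columns $j_1, \ldots, j_{w'}$ is triangular with nonzero diagonal entries. Hence these rows are linearly independent, and $w' \le \mathrm{rank}(M_f) \le w$.
Now, we claim that there exists an $r$ with $w(\ell - 1) < r \le w\ell$ such that $\mathrm{coef}_{\tilde{f}}(t^r) \ne 0$. To see this, first observe that the highest power of $t$ to which any monomial $x_1^i x_2^j$ with $i + j < \ell$ can contribute is $t^{w(\ell-1)}$. Thus, for any $w(\ell - 1) < r \le w\ell$, the term $t^r$ can come only from the monomials $x_1^i x_2^j$ with $i + j \ge \ell$. We can ignore the monomials with $i + j > \ell$, as their coefficients are zero by the choice of $\ell$. A monomial $x_1^i x_2^j$ with $i + j = \ell$ is mapped to $t^{wi}(t^w + t^{w-1})^j = t^{w\ell - j}(t + 1)^j$, in which the coefficient of $t^{w\ell - p}$ is $\binom{j}{p}$. Hence, for any $0 \le p < w$,
$$\mathrm{coef}_{\tilde{f}}(t^{w\ell - p}) = \sum_{b=1}^{w'} M_f(i_b, j_b) \binom{j_b}{p}.$$
Here we assume that if $p > j_b$, then $\binom{j_b}{p} = 0$. Writing the above equation in the matrix form, we get
$$\big[\, \mathrm{coef}_{\tilde{f}}(t^{w\ell}) \ \cdots \ \mathrm{coef}_{\tilde{f}}(t^{w\ell - w' + 1}) \,\big] = \big[\, M_f(i_1, j_1) \ \cdots \ M_f(i_{w'}, j_{w'}) \,\big]\, C,$$
where $C$ is the $w' \times w'$ matrix with $C(b, p) = \binom{j_b}{p-1}$ for all $b, p \in [w']$. If all the columns of $C$ are linearly independent, then clearly, $\mathrm{coef}_{\tilde{f}}(t^r) \ne 0$ for some $w(\ell - 1) < r \le w\ell$. We show the linear independence of the columns in Claim ??. To show this linear independence we need to assume that the numbers $\{j_b\}_b$ are all distinct as elements of $F$. Hence, we need the field characteristic to be zero or strictly greater than $d$, as $j_b$ can be as high as $d$ for some $b \in [w']$.
Claim. Let $C$ be the $w' \times w'$ matrix with $C(b, p) = \binom{j_b}{p-1}$ for all $b, p \in [w']$, where the numbers $\{j_b\}_b$ are all distinct. Then $C$ has full rank.
Proof. We will show that for any nonzero vector $\alpha := (\alpha_1, \alpha_2, \ldots, \alpha_{w'}) \in F^{w' \times 1}$, $C\alpha \ne 0$. Consider the polynomial
$$h(y) = \sum_{a=1}^{w'} \alpha_a \binom{y}{a-1} = \sum_{a=1}^{w'} \alpha_a \frac{y(y-1)\cdots(y-a+2)}{(a-1)!}.$$
The polynomials $\binom{y}{a-1}$ have distinct degrees, so $h(y)$ is nonzero. As $h(y)$ is a nonzero polynomial with degree bounded by $w' - 1$, it can have at most $w' - 1$ roots. Thus, there exists a $b \in [w']$ such that $h(j_b) = \sum_{a=1}^{w'} \alpha_a \binom{j_b}{a-1} \ne 0$, i.e., the $b$-th entry of $C\alpha$ is nonzero.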
As mentioned above, the hitting-set proof works only when the field characteristic is zero or greater than $d$. We give an example over a small characteristic field, which demonstrates that the problem is not with the proof technique, but with the hitting-set itself. Let the field characteristic be 2. Consider the polynomial $f(x_1, x_2) = x_2^2 + x_1^2 + x_1$. Clearly, $f$ has a width-2 ROABP. For a width-2 ROABP, the map in Lemma ?? would be $(x_1, x_2) \mapsto (t^2, t^2 + t)$. However, $f(t^2, t^2 + t) = 0$ (over $F_2$). Hence, the hitting-set does not work. Now, we move on to getting a hitting-set for an n-variate ROABP.
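The characteristic-2 example can be checked mechanically. The sketch below expands $f(t^w, t^w + t^{w-1})$ for a bivariate polynomial given as a dictionary of coefficients, first over the integers and then modulo 2. The function name `substitute` and the dictionary encoding are assumptions made for illustration.

```python
from math import comb

def substitute(coeffs, w, modulus=None):
    """Coefficients of f(t^w, t^w + t^(w-1)) as a dict {power of t: coefficient}.

    `coeffs` maps exponent pairs (i, j) of x1^i * x2^j to integer coefficients.
    If `modulus` is given, coefficients are reduced modulo it.
    """
    result = {}
    for (i, j), c in coeffs.items():
        # x1^i x2^j -> t^(w*i) * (t^w + t^(w-1))^j
        #            = sum over q of C(j, q) * t^(w*i + (w-1)*j + q)
        for q in range(j + 1):
            power = w * i + (w - 1) * j + q
            result[power] = result.get(power, 0) + c * comb(j, q)
    if modulus is not None:
        result = {p: c % modulus for p, c in result.items()}
    return {p: c for p, c in result.items() if c != 0}

# f(x1, x2) = x2^2 + x1^2 + x1, which has a width-2 ROABP.
f = {(0, 2): 1, (2, 0): 1, (1, 0): 1}

print(substitute(f, 2))             # over the integers: 2t^2 + 2t^3 + 2t^4
print(substitute(f, 2, modulus=2))  # over F_2: the zero polynomial
```

Over the integers every coefficient of the image is even, which is exactly why the image vanishes in characteristic 2 while staying nonzero in characteristic zero.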

n-variate ROABP
Observe that the map given in Lemma ?? works irrespective of the degree of the polynomial, as long as the field characteristic is large enough. We plan to obtain a hitting-set for general $n$-variate ROABP by applying this map recursively. For this, we use the standard divide and conquer technique. First, we make pairs of consecutive variables in the ROABP. For each pair $(x_{2i-1}, x_{2i})$, we apply the map from Lemma ??, using a new variable $t_i$. Thus, we go to $n/2$ variables from $n$ variables. In Lemma ??, we use a hybrid argument to show that after this substitution the polynomial remains nonzero. Moreover, the new polynomial can be computed by a width-$w$ ROABP. Thus, we can again use the same map on pairs of new variables. By repeating the halving procedure $\log n$ times we get a univariate polynomial. In each round the degree of the polynomial gets multiplied by $w$. Hence, after $\log n$ rounds, the degree of the univariate polynomial is bounded by $w^{\log n}$ times the original degree. Without loss of generality, let us assume that $n$ is a power of 2.
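Lemma. Suppose $\mathrm{char}(F) = 0$ or $\mathrm{char}(F) > d$. Let $f(x) = D_1(x_1) D_2(x_2) \cdots D_n(x_n)$ be a nonzero polynomial of individual degree $d$ computed by a width-$w$ ROABP. Let the map $\phi$ be such that for every index $1 \le i \le n/2$, $\phi(x_{2i-1}) = t_i^w$ and $\phi(x_{2i}) = t_i^w + t_i^{w-1}$. Then $f(\phi(x)) \ne 0$. Moreover, the polynomial $f(\phi(x)) \in F[t_1, \ldots, t_{n/2}]$ is computed by a width-$w$ ROABP in the variable order $(t_1, t_2, \ldots, t_{n/2})$.
Proof. Let us apply the map in $n/2$ rounds, i.e., define a sequence of polynomials $(f = f_0, f_1, \ldots, f_{n/2} = f(\phi(x)))$ such that the polynomial $f_i$ is obtained by replacing $(x_{2i-1}, x_{2i})$ with $(\phi(x_{2i-1}), \phi(x_{2i}))$ in $f_{i-1}$ for each $1 \le i \le n/2$. We will show that for each $1 \le i \le n/2$, if $f_{i-1} \ne 0$ then $f_i \ne 0$. Clearly this proves the first part of the lemma. Note that $f_{i-1}$ is a polynomial over variables $\{t_1, \ldots, t_{i-1}, x_{2i-1}, \ldots, x_n\}$. As $f_{i-1} \ne 0$, there exists a constant tuple $\alpha \in F^{n-i-1}$ such that after replacing the variables $(t_1, \ldots, t_{i-1}, x_{2i+1}, \ldots, x_n)$ with $\alpha$, $f_{i-1}$ remains nonzero. After this replacement we get a polynomial $\tilde{f}_{i-1}$ in the variables $(x_{2i-1}, x_{2i})$. As $f$ is computed by the ROABP $D_1 D_2 \cdots D_n$, the polynomial $\tilde{f}_{i-1}$ can be written as
$$\tilde{f}_{i-1}(x_{2i-1}, x_{2i}) = A^T D_{2i-1}(x_{2i-1})\, D_{2i}(x_{2i})\, B$$
for some constant vectors $A$ and $B$ over $F$. In other words, $\tilde{f}_{i-1}$ has a bivariate ROABP of width $w$. Thus, $\tilde{f}_{i-1}(\phi(x_{2i-1}), \phi(x_{2i}))$ is nonzero from Lemma ??. But $\tilde{f}_{i-1}(\phi(x_{2i-1}), \phi(x_{2i}))$ is nothing but the polynomial obtained after replacing the variables $(t_1, \ldots, t_{i-1}, x_{2i+1}, \ldots, x_n)$ in $f_i$ with $\alpha$. Thus, $f_i$ is nonzero. This finishes the proof of the first part. Now, we argue that $f(\phi(x))$ has a width-$w$ ROABP.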
Let $\tilde{D}_i := D_{2i-1}(t_i^w)\, D_{2i}(t_i^w + t_i^{w-1})$ for all $1 \le i \le n/2$. Clearly, $\tilde{D}_1 \tilde{D}_2 \cdots \tilde{D}_{n/2}$ is a width-$w$ ROABP computing $f(\phi(x))$ in variable order $(t_1, t_2, \ldots, t_{n/2})$.
By applying the map $\phi$ in Lemma ??, we reduced an $n$-variate ROABP to an $(n/2)$-variate ROABP, while preserving the non-zeroness. The resulting ROABP has the same width $w$, but the individual degree goes up to become $2dw$, where $d$ is the original individual degree. As our map $\phi$ is degree insensitive, we can apply a similar map again on the variables $\{t_i\}$, namely $(t_{2i-1}, t_{2i}) \mapsto (s_i^w, s_i^w + s_i^{w-1})$, for variables $\{s_1, s_2, \ldots, s_{n/4}\}$. Now, we get an $(n/4)$-variate ROABP with individual degree $4dw^2$. It is easy to see that when the map $\phi$ is repeatedly applied in this way $\log n$ times, we get a nonzero univariate polynomial of degree at most $ndw^{\log n}$. The next lemma puts it formally. For ease of notation, we use the variable numbering from 0 to $n - 1$.
Lemma. Suppose $\mathrm{char}(F) = 0$ or $\mathrm{char}(F) \ge ndw^{\log n}$. Let $f(x)$ be a nonzero polynomial of individual degree $d$ computed by a width-$w$ ROABP in the variable order $(x_0, x_1, \ldots, x_{n-1})$. Let $p_0(t) = t^w$ and $p_1(t) = t^w + t^{w-1}$. Define the map $\phi$ by $\phi(x_i) = p_{i_1}(p_{i_2}(\cdots(p_{i_{\log n}}(t))))$, where $i_{\log n}\, i_{\log n - 1} \cdots i_1$ is the binary representation of $i$.
Then $f(\phi(x))$ is a nonzero univariate polynomial with degree at most $ndw^{\log n}$.
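The composed map can be computed explicitly for small parameters. The following sketch represents univariate polynomials as coefficient lists (lowest degree first) and builds $\phi(x_i)$ from the bits of $i$; all helper names are hypothetical and $n$ is assumed to be a power of 2.

```python
def poly_mul(p, q):
    """Product of two coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for a, x in enumerate(p):
        for b, y in enumerate(q):
            out[a + b] += x * y
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[k] if k < len(p) else 0) + (q[k] if k < len(q) else 0)
            for k in range(n)]

def poly_pow(p, e):
    out = [1]
    for _ in range(e):
        out = poly_mul(out, p)
    return out

def apply_p(bit, inner, w):
    """p_0(u) = u^w and p_1(u) = u^w + u^(w-1), applied to the polynomial `inner`."""
    top = poly_pow(inner, w)
    return top if bit == 0 else poly_add(top, poly_pow(inner, w - 1))

def phi(i, n, w):
    """phi(x_i) = p_{i_1}(p_{i_2}(... p_{i_log n}(t))), with i_1 the lowest bit of i."""
    rounds = n.bit_length() - 1  # log2(n); n is assumed to be a power of 2
    result = [0, 1]              # the polynomial t
    for k in reversed(range(rounds)):  # the innermost map uses the highest bit
        result = apply_p((i >> k) & 1, result, w)
    return result

# For n = 4 and w = 2: t^4, t^4 + t^2, (t^2 + t)^2, (t^2 + t)^2 + (t^2 + t).
for i in range(4):
    print(i, phi(i, 4, 2))
```

Each image has degree $w^{\log n}$ (here 4), which is the source of the $w^{\log n}$ factor in the degree bound.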
Note that the map $\phi$ crucially uses the knowledge of the variable order. In the last round, when we are going from two variables to one, the individual degree is at most $ndw^{\log n - 1}$ and Lemma ?? requires $\mathrm{char}(F)$ to be higher than the individual degree. Thus, having $\mathrm{char}(F) \ge ndw^{\log n}$ suffices. Hence, we get the following theorem.
Theorem 3.6. Let $C$ be the class of $n$-variate, individual degree $d$ polynomials computed by width-$w$ ROABPs. Then a hitting-set for $C$ of size $O(ndw^{\log n})$ can be constructed, when the variable order is known and the field characteristic is zero or at least $ndw^{\log n}$.
Proof. Let $f(x)$ be a nonzero polynomial in class $C$. From Lemma ??, $f(\phi(x)) \in F[t]$ is a nonzero univariate polynomial with degree at most $ndw^{\log n}$. Thus, if we substitute $1 + ndw^{\log n}$ distinct field values for the variable $t$, one of them will keep $f(\phi(x))$ nonzero.
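The resulting hitting-set can be written down directly: evaluate the composed map at $1 + ndw^{\log n}$ distinct values of $t$. The sketch below does this over the integers (so characteristic zero is assumed), with $n$ a power of 2 and variables numbered from 0; the function names and the test polynomial are illustrative choices, not taken from the paper.

```python
def phi_value(i, rounds, w, tau):
    """Value of phi(x_i) at t = tau, using the bits of i from highest to lowest."""
    value = tau
    for k in reversed(range(rounds)):
        bit = (i >> k) & 1
        # p_0(u) = u^w, p_1(u) = u^w + u^(w-1)
        value = value ** w + (value ** (w - 1) if bit else 0)
    return value

def hitting_set(n, d, w):
    """Points (phi(x_0)(tau), ..., phi(x_{n-1})(tau)) for 1 + n*d*w^(log n) values of tau."""
    rounds = n.bit_length() - 1          # log2(n); n is assumed to be a power of 2
    degree_bound = n * d * w ** rounds   # bound on the degree of f(phi(x))
    return [tuple(phi_value(i, rounds, w, tau) for i in range(n))
            for tau in range(degree_bound + 1)]

# x0*x1 - x2*x3 has a width-2 ROABP in the order (x0, x1, x2, x3), with d = 1.
f = lambda p: p[0] * p[1] - p[2] * p[3]

H = hitting_set(4, 1, 2)
print(len(H), H[1], f(H[1]))
```

Here the point obtained from $t = 1$ already witnesses that the polynomial is nonzero.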
From this, we immediately get the following result for constant-width ROABPs. Note that when w is constant, the lower bound on the characteristic also becomes poly(n). Corollary 3.7. For the class of n-variate, individual degree d polynomials computed by constant width ROABPs (known variable order), a poly(n, d)-size hitting-set can be constructed, when the field characteristic is zero (or larger than poly(n, d)).
As mentioned earlier, our approach can potentially lead to a polynomial size hitting-set for ROABPs. We make the following conjecture for which we hope to get a proof on the lines of Lemma ??.

Commutative ROABP
In this section, we give better hitting-sets for commutative ROABPs. Recall that a polynomial $f(x) \in F[x]$ is said to be computed by a width-$w$ commutative ROABP if it can be computed by a width-$w$ ROABP in every order. That is, for any permutation $\sigma : [n] \to [n]$, $f(x)$ can be written as
$$f(x) = D_1(x_{\sigma(1)})\, D_2(x_{\sigma(2)}) \cdots D_n(x_{\sigma(n)}),$$
where $D_1 \in F^{1 \times w}[x_{\sigma(1)}]$, $D_i \in F^{w \times w}[x_{\sigma(i)}]$ for $2 \le i \le n-1$, and $D_n \in F^{w \times 1}[x_{\sigma(n)}]$ (the matrices $D_i$ may depend on $\sigma$). We will also consider ROABPs which compute a polynomial over the matrix algebra, that is, polynomials whose coefficients are matrices.
Forbes et al. [?] gave a hitting-set of size $d^{O(\log w)}(nw)^{O(\log \log w)}$ for width-$w$, $n$-variate commutative ROABPs with individual degree bound $d$. Note that when $d$ is small, this hitting-set size is much better than that for general ROABP, i.e., $(ndw)^{O(\log n)}$ [?]. However when $d$ is $\Omega(n)$, the size is comparable to the general case. We improve the hitting-set size for the commutative case to $(ndw)^{O(\log \log w)}$. This is significantly better than the general case for all values of $d$.

Rank-concentration
Forbes et al. [?] constructed the hitting-set using the notion of rank-concentration defined by Agrawal et al. [?]. Recall that D(x) is a polynomial over an algebra if its coefficients come from the algebra.
Definition 4.1 ([?]). A polynomial $D(x)$ over an algebra is said to be $\ell$-concentrated if its coefficients of $(<\ell)$-support monomials span all its coefficients. That is, for all $a \in \{0, 1, 2, \ldots\}^n$,
$$\mathrm{coef}_D(x^a) \in \mathrm{span}\{\mathrm{coef}_D(x^b) \mid b \in \{0, 1, 2, \ldots\}^n,\ \mathrm{supp}(b) < \ell\}. \tag{4.1}$$
Note that for a nonzero polynomial over a field, $\ell$-concentration simply means that one of its monomials of support $< \ell$ has a nonzero coefficient. As we will see later, it is easy to construct hitting-sets for a polynomial which has low-support concentration. However, not every polynomial has a low-support concentration, for example, consider the following polynomial over a field: $f(x) = x_1 x_2 \cdots x_n$. Agrawal et al. [?] observed that concentration can be achieved by a shift of variables, e.g., $f(x + 1) = (x_1 + 1)(x_2 + 1) \cdots (x_n + 1)$ has 1-concentration. For a polynomial $f(x)$, shift by a tuple $s = (s_1, s_2, \ldots, s_n)$ would mean $f(x + s) = f(x_1 + s_1, x_2 + s_2, \ldots, x_n + s_n)$.
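The effect of a shift on the example $x_1 x_2 \cdots x_n$ is easy to see by expanding the product. The sketch below lists the coefficients of $\prod_i (x_i + s_i)$ and reports the smallest support among monomials with a nonzero coefficient; the function names are assumptions made for illustration.

```python
from itertools import product as cartesian

def shifted_product_coefficients(shift):
    """Coefficients of prod_i (x_i + s_i), keyed by the 0/1 exponent vector."""
    coeffs = {}
    for exponent in cartesian((0, 1), repeat=len(shift)):
        c = 1
        for e, s in zip(exponent, shift):
            c *= 1 if e else s  # pick x_i (coefficient 1) or the constant s_i
        coeffs[exponent] = c
    return coeffs

def min_support(coeffs):
    """Smallest support size among monomials with a nonzero coefficient."""
    return min(sum(1 for e in a if e) for a, c in coeffs.items() if c != 0)

unshifted = shifted_product_coefficients((0, 0, 0))  # x1 * x2 * x3
shifted = shifted_product_coefficients((1, 1, 1))    # (x1 + 1)(x2 + 1)(x3 + 1)

print(min_support(unshifted), min_support(shifted))
```

Without the shift, the only nonzero monomial has full support; after shifting by the all-ones tuple the constant term is nonzero, which is what 1-concentration means for a polynomial over a field.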
To achieve concentration, it is often useful to consider shifts which are polynomials. In particular, we will be considering shifts by bivariate polynomials, i.e., $s(t_1, t_2) \in F[t_1, t_2]^n$. As ultimately we are interested in hitting-sets, the variables $t_1$ and $t_2$ can later be replaced by field values. The size of the hitting-set, in this case, will be multiplied by $\delta^2$, where $\delta$ is the maximum degree of any $s_i(t_1, t_2)$. Thus, for a bivariate shift $s(t_1, t_2)$, its degree will be viewed as the complexity measure. Note that for a polynomial $D(x) \in F^{w \times w}[x]$, the coefficient of a monomial $x^a$ in $D(x + s(t_1, t_2))$ will be from $F[t_1, t_2]^{w \times w}$. So, when we talk of low-support concentration in $D(x + s(t_1, t_2))$, the span in (??) is taken over the field $F(t_1, t_2)$.
Forbes et al. [?] construct the hitting-set for commutative ROABPs in two steps. Let $f(x)$ be an $n$-variate individual degree-$d$ polynomial computed by a width-$w$ commutative ROABP. Their first step is to construct a tuple $s(t_1, t_2)$ of bivariate polynomials with degree $\mathrm{poly}(n)\, d^{O(\log w)}$ such that $f(x + s)$ has $O(\log w)$-concentration. We improve this step by constructing a new tuple $s(t_1, t_2)$ with degree $(ndw)^{O(\log \log w)}$, which has the same property.
We follow the second step of Forbes et al. [?] as it is. It is easy to see that $f(x + s)$ can also be computed by a width-$w$ commutative ROABP (over the field $F(t_1, t_2)$). They show that if a given commutative ROABP is $\ell$-concentrated then there is a hitting-set for it of size $(ndw)^{O(\log \ell)}$. This implies a hitting-set $H$ of size $(ndw)^{O(\log \log w)}$ for $f(x + s)$. Clearly, the set $\{h + s \mid h \in H\}$ is a hitting-set for $f(x)$. One can obtain a hitting-set in $F^n$ by replacing $t_1$ and $t_2$ with sufficiently many field values. By the Schwartz-Zippel-DeMillo-Lipton Lemma, it will suffice to take more than $\deg_{t_1, t_2}(f(h + s)) \le \deg(f) \cdot \deg(s)$ values. Thus, the final hitting-set size becomes $\deg(s) \cdot (ndw)^{O(\log \log w)}$. With our improved bound on $\deg(s)$, we get a hitting-set of the desired size.
Now, we elaborate the first step of Forbes et al. [?], i.e., the construction of the shift $s(t_1, t_2)$. To achieve concentration they use the idea of Agrawal, Saha and Saxena [?], i.e., achieving concentration in small sub-ROABPs implies concentration in the given ROABP. For the sake of completeness, we rewrite the lemma using the terminology of this paper. We first clarify a notation which will be used often: for an $n$-tuple $s$ and a polynomial $D(x)$ which depends only on the variables $(x_{i_1}, x_{i_2}, \ldots, x_{i_\ell})$, $D(x + s)$ will denote $D(x_{i_1} + s_{i_1}, x_{i_2} + s_{i_2}, \ldots, x_{i_\ell} + s_{i_\ell})$.
Lemma 4.2 ([?, ?]). Let $\ell < n$ be any number. Let $s$ be an $n$-tuple such that for any distinct $i_1, i_2, \ldots, i_\ell \in [n]$ and any individual degree-$d$ polynomial $D(x) = D_1(x_{i_1}) D_2(x_{i_2}) \cdots D_\ell(x_{i_\ell})$ over the matrix algebra $F^{w \times w}$, $D(x + s)$ is $\ell$-concentrated. Then for any individual degree-$d$ polynomial $f(x) \in F[x]$ computed by a width-$w$ commutative ROABP, $f(x + s)$ is $\ell$-concentrated.
Proof. Let f (x) = f (x +s). Consider any monomial x a with support ≥ . We will show that its coefficient in f (x) is in the span of smaller support coefficients in f (x). Let S = {x i 1 , x i 2 , . . . , x i } be a set of variables contained in the support of monomial x a . Let S = {x i +1 , . . . , x i n } be the rest of the variables. Let us write x a = x b x c with Supp(b) = S and Supp(c) ⊆ S. Since, f (x) is computed by a commutative ROABP, it has an ROABP in the variable order (x i 1 , . . . , x i , x i +1 , . . . , x i n ). That is, (4.2) Note that we have Supp(b ) ⊆ S because each monomial in D (x) comes from set S. It is easy to see that for any monomial Thus, by left multiplying A T and right multiplying coef D (x c )B in (??), we get Note that supp(b ) + supp(c) < supp(b) + supp(c) = supp(a). So, we can write In other words, for any monomial x a with supp(a) ≥ , coef f (x a ) is in the span of coefficients of support smaller than supp(a). This would mean that, in fact, all coefficients of f (x) are in the span of coefficients with support < . Now, for some ≤ n, the goal is to construct an n-tuple s such that for any distinct i 1 , i 2 , . . . , i ∈ n, shifting by s ensures -concentration in any -variate ROABP of the form D( Note that Lemma ?? holds for any value of ≤ n. However, one cannot choose to be arbitrary small. The reason is that for an -variate polynomial over a k-dimensional algebra, one can hope to achieve -concentration only when ≥ log(k + 1). To see this, consider the polynomial D(x) = ∏ i=1 (1 + v i x i ) over the algebra of k × k diagonal matrices, with k = 2 . Here, 1 stands for the matrix diag (1, 1, . . . , 1).
Define $v_1 = \mathrm{diag}(\alpha_1, \alpha_2, \dots, \alpha_k)$ for some distinct $\alpha_i$'s, and define $v_i = v_1^{2^{i-1}}$ for $2 \leq i \leq \ell$. It is not hard to see that the $2^{\ell}$ coefficients of the polynomial $D$ are $\{1, v_1, v_1^2, \dots, v_1^{2^{\ell}-1}\}$, which are linearly independent. Note that since shifting is an invertible operation, the $2^{\ell}$ coefficients of $D(x + s)$ will also be linearly independent for any $s$. But there are only $2^{\ell} - 1$ monomials with support $< \ell$. Hence, the coefficients of $(< \ell)$-support monomials cannot span all the coefficients in $D(x + s)$, for any shift $s$.
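The counting argument above can be checked numerically on a small instance. The following is a minimal sketch, assuming $\ell = 3$, the illustrative choice $\alpha_j = j$, and an arbitrary integer shift; the helper names `rank` and `shifted_coefficients` are hypothetical and not from the paper. Diagonal matrices are stored as vectors of their diagonal entries, and ranks are computed over the rationals.

```python
from fractions import Fraction
from itertools import product

def rank(rows):
    """Rank of a list of row vectors over the rationals (Gaussian elimination)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        piv = next((r for r in range(rk, len(m)) if m[r][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][c] != 0:
                f = m[r][c] / m[rk][c]
                m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def shifted_coefficients(ell, alphas, shift):
    """Coefficients of D(x+s) = prod_i (1 + v_i (x_i + s_i)) with v_i = v_1^(2^(i-1)).

    Returns a dict mapping a 0/1 tuple T to the coefficient (a diagonal,
    stored as a vector) of the multilinear monomial prod_{i in T} x_i.
    """
    coeffs = {}
    for T in product((0, 1), repeat=ell):
        vec = []
        for a in alphas:
            entry = Fraction(1)
            for i in range(ell):
                v_i = Fraction(a) ** (2 ** i)  # diagonal entry of v_{i+1}
                entry *= v_i if T[i] else (1 + v_i * shift[i])
            vec.append(entry)
        coeffs[T] = vec
    return coeffs

ell = 3
k = 2 ** ell
alphas = range(1, k + 1)   # distinct alpha_j (illustrative choice)
shift = [5, -2, 7]         # arbitrary shift s (illustrative choice)

coeffs = shifted_coefficients(ell, alphas, shift)
rank_all = rank(list(coeffs.values()))
rank_low = rank([v for T, v in coeffs.items() if sum(T) < ell])
print(rank_all, rank_low)
```

In this instance all $2^{\ell} = 8$ coefficients are linearly independent, while the $2^{\ell} - 1 = 7$ coefficients of support $< \ell$ can span only a 7-dimensional space, so $\ell$-concentration fails regardless of the shift.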
Note that the dimension of the algebra $\mathbb{F}^{w \times w}$ is bounded by $w^2$. To reiterate the goal, given $n$ and $w$, we fix $\ell = \lceil \log(w^2 + 1) \rceil$ and we want to achieve $\ell$-concentration in all polynomials computed by an ROABP of the form $D(x) = D_1(x_{i_1}) D_2(x_{i_2}) \cdots D_\ell(x_{i_\ell})$ over $\mathbb{F}^{w \times w}$. Since we are now dealing with polynomials in a small number of variables, it should be easier to achieve the concentration.
Towards this goal, Forbes et al. [?] give a slightly more general result. For any $\ell \geq \log(w^2 + 1)$, they construct a tuple $s \in \mathbb{F}[t_1, t_2]^n$ of degree $\mathrm{poly}(n)\, d^{O(\ell)}$ which has the following property: for any polynomial $D(x) \in \mathbb{F}^{w \times w}[x]$ which uses at most $\ell$ of the $n$ variables and has individual degree bound $d$, $D(x + s)$ has $\ell$-concentration. Here, Forbes et al. [?] do not need that $D(x)$ is computed by an ROABP. We, on the other hand, use the property that $D(x)$ is computed by a width-$w$, $\ell$-variate ROABP and reduce the degree of $s(t_1, t_2)$ to $(ndw)^{O(\log \ell)}$. Our construction of $s(t_1, t_2)$ comes from the basis isolating weight assignment for ROABPs from Agrawal et al. [?]. We use the fact that for any polynomial over a $k$-dimensional algebra, a shift by a basis isolating map achieves $\log(k + 1)$-concentration [?].

Basis Isolation
Let us first recall the definition of a basis isolating weight assignment. Let $M$ denote the set of all monomials over the variable set $x$ with individual degree $\leq d$. Any function $\mathrm{w} : x \to \{0, 1, 2, \dots\}$ can be naturally extended to the set of all monomials as follows:
$$\mathrm{w}\Big(\prod_{i=1}^{n} x_i^{\gamma_i}\Big) = \sum_{i=1}^{n} \gamma_i \, \mathrm{w}(x_i).$$
Note that if the variable $x_i$ is replaced with $t^{\mathrm{w}(x_i)}$ for each $i$, then any monomial $m$ just becomes $t^{\mathrm{w}(m)}$. Let $A_k$ denote a $k$-dimensional algebra.
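The extension of a weight function to monomials, and the substitution $x_i \mapsto t^{\mathrm{w}(x_i)}$, can be illustrated with a short sketch. The representation (a polynomial as a dictionary from exponent tuples to coefficients) and the names `monomial_weight` and `substitute` are assumptions made for this example only.

```python
def monomial_weight(w, exponents):
    """Weight of the monomial prod_i x_i^{e_i}: the sum of e_i * w(x_i)."""
    return sum(wi * e for wi, e in zip(w, exponents))

def substitute(poly, w):
    """Replace each x_i by t^{w(x_i)}.

    poly maps exponent tuples to coefficients; the result maps powers of t
    to coefficients, with terms that cancel removed.
    """
    out = {}
    for exps, c in poly.items():
        p = monomial_weight(w, exps)
        out[p] = out.get(p, 0) + c
    return {p: c for p, c in out.items() if c != 0}

# f = 3 * x1^2 * x2 - x3 (illustrative polynomial)
f = {(2, 1, 0): 3, (0, 0, 1): -1}

print(substitute(f, [1, 4, 6]))  # both monomials get weight 6 and merge
print(substitute(f, [1, 4, 7]))  # weights 6 and 7 stay separate
```

The first weight assignment sends two distinct monomials to the same power of $t$, so information about individual coefficients is lost; the second keeps them apart. This is the kind of collision a well-chosen weight assignment has to avoid.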
Lemma 4.4 (Isolation to concentration). Let $D(x)$ be a polynomial over a $k$-dimensional algebra. Let $\mathrm{w}$ be a basis isolating weight assignment for $D(x)$. Then $D(x + t^{\mathrm{w}})$ is $\ell$-concentrated (over $\mathbb{F}(t)$), where $\ell = \lceil \log(k + 1) \rceil$.
We now recall the construction complexity of a basis isolating weight assignment for ROABP from [?]. Here, we present a slightly modified version of their Lemma 8 (without proof), which easily follows from it.
Lemma 4.5. For any numbers $\ell$, $n$, $k$ and $d$, we can construct a family $W$ of $(knd)^{O(\log \ell)}$ integer weight assignments on the variables $\{x_1, x_2, \dots, x_n\}$, with weights bounded by $(knd)^{O(\log \ell)}$, which has the following property: let $D(x)$ be an individual degree-$d$ polynomial over $A_k$ of the form $D_1(x_{i_1}) D_2(x_{i_2}) \cdots D_\ell(x_{i_\ell})$ for some distinct $i_1, i_2, \dots, i_\ell \in [n]$. Then one of the weight assignments in $W$ is basis isolating for $D(x)$.
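Let $W$ be the family constructed in Lemma ?? with $k = w^2$ and $\ell = \lceil \log(w^2 + 1) \rceil$. From Lemma ?? and Lemma ??, for any $D(x) = D_1(x_{i_1}) D_2(x_{i_2}) \cdots D_\ell(x_{i_\ell})$ over $\mathbb{F}^{w \times w}$, there exists a weight assignment $\mathrm{w} \in W$ such that $D(x + t^{\mathrm{w}})$ is $\ell$-concentrated (over $\mathbb{F}(t)$), where $t^{\mathrm{w}}$ denotes the tuple $(t^{\mathrm{w}(x_1)}, \dots, t^{\mathrm{w}(x_n)})$. However, we want a single tuple $s$ which works for every $D(x)$. To get a single tuple, we combine the tuples in $\{t^{\mathrm{w}}\}_{\mathrm{w} \in W}$ using the standard technique of Lagrange interpolation (also used in [?, ?]). Let $\{\alpha_{\mathrm{w}}\}_{\mathrm{w} \in W}$ be distinct constants. Define
$$s(t_1, t_2) = \sum_{\mathrm{w} \in W} t_1^{\mathrm{w}} \prod_{\substack{\mathrm{w}' \in W \\ \mathrm{w}' \neq \mathrm{w}}} \frac{t_2 - \alpha_{\mathrm{w}'}}{\alpha_{\mathrm{w}} - \alpha_{\mathrm{w}'}}.$$
Note that $s(t_1, \alpha_{\mathrm{w}}) = t_1^{\mathrm{w}}$. The following claim shows that if $D(x + t_1^{\mathrm{w}})$ is $\ell$-concentrated for some $\mathrm{w} \in W$, then $D(x + s(t_1, t_2))$ is also $\ell$-concentrated.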
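Proof. Write $D'(x) = D(x + t_1^{\mathrm{w}})$ and $D''(x) = D(x + s(t_1, t_2))$. It is easy to see that for any tuple $s$, the coefficients of $D(x + s)$ are linear combinations of the coefficients of $D$ and vice versa (over an appropriate field). Since shifting is an invertible operation, it preserves the rank of the set of all coefficients. That is,
$$\mathrm{rank}_{\mathbb{F}}\{\mathrm{coef}_{D}(x^a) \mid x^a \in M\} = \mathrm{rank}_{\mathbb{F}(t_1)}\{\mathrm{coef}_{D'}(x^a) \mid x^a \in M\} = \mathrm{rank}_{\mathbb{F}(t_1, t_2)}\{\mathrm{coef}_{D''}(x^a) \mid x^a \in M\}.$$
Let this rank be $k'$. Let us represent each coefficient of $D$ as a vector in $\mathbb{F}^{k'}$. Then the coefficients of $D'$ and $D''$ come from $\mathbb{F}[t_1]^{k'}$ and $\mathbb{F}[t_1, t_2]^{k'}$, respectively. Let $M' = \{x^a \in M \mid \mathrm{supp}(a) < \ell\}$. Since $D'$ has $\ell$-concentration, $\mathrm{rank}_{\mathbb{F}(t_1)}\{\mathrm{coef}_{D'}(x^a) \mid x^a \in M'\} = k'$.
Hence, one can form a full-rank matrix $L(t_1) \in \mathbb{F}[t_1]^{k' \times k'}$ which is given by
$$L(t_1) = \big(\mathrm{coef}_{D'}(x^{a_1}) \;\; \mathrm{coef}_{D'}(x^{a_2}) \;\; \cdots \;\; \mathrm{coef}_{D'}(x^{a_{k'}})\big)$$
for some $x^{a_1}, x^{a_2}, \dots, x^{a_{k'}} \in M'$. Define $L'(t_1, t_2) \in \mathbb{F}[t_1, t_2]^{k' \times k'}$ to be the matrix
$$L'(t_1, t_2) = \big(\mathrm{coef}_{D''}(x^{a_1}) \;\; \mathrm{coef}_{D''}(x^{a_2}) \;\; \cdots \;\; \mathrm{coef}_{D''}(x^{a_{k'}})\big).$$
From the definition of $D'$ and $D''$, it is clear that $L'(t_1, \alpha_{\mathrm{w}}) = L(t_1)$. Since $\det(L) \neq 0$, we get that $\det(L') \neq 0$. Thus, $\mathrm{rank}_{\mathbb{F}(t_1, t_2)}\{\mathrm{coef}_{D''}(x^a) \mid x^a \in M'\} \geq k'$.
However, $k'$ is the rank of all coefficients of $D''$. Hence, $D''$ has $\ell$-concentration.
Now, since $s(t_1, t_2)$ has the desired property from Lemma ??, $f(x + s(t_1, t_2))$ is $\ell$-concentrated for any polynomial $f(x)$ computed by a width-$w$ commutative ROABP. Recall that $\deg_{t_1}(s)$ is bounded by $(ndw)^{O(\log \log w)}$ from the construction in Lemma ??. The same bound also holds for $\deg_{t_2}(s)$ because $|W| = (ndw)^{O(\log \log w)}$.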
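The interpolation step that merges the family of shifts into one bivariate tuple can be sketched as follows. This is an illustrative sketch under stated assumptions: the weight vectors and the constants $\alpha_{\mathrm{w}}$ below are made-up toy values, the function name `combined_shift` is hypothetical, and the tuple is only evaluated at rational points rather than built symbolically.

```python
from fractions import Fraction

def combined_shift(weights, alphas, t1, t2):
    """Evaluate the Lagrange-interpolated tuple at the point (t1, t2).

    weights[j] is the j-th weight assignment (one integer per variable) and
    alphas[j] is the distinct constant attached to it. Coordinate i of the
    result is sum_j t1^{weights[j][i]} * prod_{m != j} (t2 - a_m) / (a_j - a_m).
    """
    n = len(weights[0])
    s = [Fraction(0)] * n
    for j, w in enumerate(weights):
        lagrange = Fraction(1)
        for m, a in enumerate(alphas):
            if m != j:
                lagrange *= Fraction(t2 - a, alphas[j] - a)
        for i in range(n):
            s[i] += lagrange * Fraction(t1) ** w[i]
    return s

weights = [[1, 2, 5], [0, 3, 1], [4, 4, 2]]  # toy family W on 3 variables
alphas = [0, 1, 2]                           # distinct constants, one per weight

# Setting t2 = alpha_j recovers the j-th shift t1^w.
for j, w in enumerate(weights):
    assert combined_shift(weights, alphas, 3, alphas[j]) == [3 ** e for e in w]
```

Each Lagrange factor has degree $|W| - 1$ in $t_2$, which is why the bound on $\deg_{t_2}(s)$ follows from the size of the family.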
Lemma 4.7. Given $n$, $d$, $w$, one can compute a tuple $s(t_1, t_2) \in \mathbb{F}[t_1, t_2]^n$ with degree $(ndw)^{O(\log \log w)}$ such that for any $n$-variate, individual degree-$d$ polynomial $f(x) \in \mathbb{F}[x]$ computed by a width-$w$ commutative ROABP, $f(x + s(t_1, t_2))$ is $O(\log w)$-concentrated.
As mentioned before, $O(\log w)$-concentration in $f(x + s)$ means that it has an $O(\log w)$-support monomial with a nonzero coefficient. Lemma ?? gives a bivariate tuple $s(t_1, t_2)$ for the shift. We argue that one can substitute field values for $t_1$ and $t_2$ such that any chosen nonzero coefficient in $f(x + s)$ remains nonzero after the substitution. Note that any coefficient of $f(x + s)$ is a polynomial in $t_1$ and $t_2$ of degree at most $\deg(f) \cdot \deg(s)$, which is $(ndw)^{O(\log \log w)}$. Thus, by the Schwartz-Zippel-DeMillo-Lipton lemma, substituting $(ndw)^{O(\log \log w)}$ many field values for $t_1$ and $t_2$ suffices.
Now, we move on to the second step of Forbes, Shpilka and Saptharishi [?]. They give an $(ndw)^{O(\log \log w)}$-size hitting-set for an already $O(\log w)$-concentrated commutative ROABP. They do this by reducing the PIT question to an $O(\log w)$-variate ROABP [?, Lemma 7.6].
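The substitution of field values for $t_1$ and $t_2$ can be pictured as a grid search. The sketch below assumes the polynomial is given as a black-box Python function over the integers (so characteristic zero) and that its degree in each variable is at most `deg_bound`; the name `find_nonzero_point` and the example polynomial are hypothetical. It relies on the standard fact that a nonzero polynomial of individual degree at most $D$ cannot vanish on all of $S \times S$ when $|S| > D$.

```python
def find_nonzero_point(g, deg_bound):
    """Search the grid {0, ..., deg_bound}^2 for a point where g is nonzero.

    Returns the first such point in row-major order, or None if g vanishes
    on the whole grid (which, under the degree assumption, means g is zero).
    """
    for a in range(deg_bound + 1):
        for b in range(deg_bound + 1):
            if g(a, b) != 0:
                return (a, b)
    return None

# Illustrative coefficient polynomial with individual degree 2 in t1 and t2.
g = lambda t1, t2: t1 * (t1 - 1) * t2 * (t2 - 1)

print(find_nonzero_point(g, 2))  # → (2, 2)
```

The grid has $(D + 1)^2$ points, so with $D = (ndw)^{O(\log \log w)}$ the number of substitutions stays within the stated bound.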
The map described in Conjecture ?? is a possible candidate for a polynomial size hitting-set for ROABPs and proving this conjecture would resolve two of the points above.
As mentioned earlier, we believe the ideas here may help in finding a better PRG for ROBPs. Studying such connections would in particular take us closer towards resolving a major open question of finding an O(log n)-seed-length PRG for constant width ROBPs.

Acknowledgement
We thank the anonymous reviewer for suggesting that our techniques might work for a more general (the current) definition of commutative ROABP. We are thankful to Hervé Fournier, Sumanta Ghosh, and Ramprasad Saptharishi for helpful discussions on the same.