Pseudorandom Generators from Polarizing Random Walks

Abstract. We propose a new framework for constructing pseudorandom generators for n-variate Boolean functions. It is based on two new notions. First, we introduce fractional pseudorandom generators, which are pseudorandom distributions taking values in [−1,1]^n. Next, we use a fractional pseudorandom generator as steps of a random walk in [−1,1]^n that converges to {−1,1}^n. We prove that this random walk converges fast (in time logarithmic in n) due to polarization. As an application, we construct pseudorandom generators for Boolean functions with bounded Fourier tails. We use this to obtain a pseudorandom generator for functions with sensitivity s, whose seed length is polynomial in s. Other examples include functions computed by branching programs of various sorts or by bounded-depth circuits.


Introduction
Pseudorandom generators (PRGs) are widely studied in complexity theory. There are several general frameworks used to construct PRGs. One is based on basic building blocks, such as small-bias generators [16,2], k-wise independence, or expander graphs [11]. Another approach is based on the hardness vs. randomness paradigm, which was introduced by Nisan and Wigderson [18] and has been very influential. Many of the hardness results used in the latter framework are based on random restrictions and the analysis of how they simplify the target class of functions. The number of papers in these lines of work is on the order of hundreds, so we do not even attempt to give a comprehensive survey of them all; instead we refer the reader to the survey articles [7,24].
The purpose of this paper is to introduce a new framework for constructing PRGs based on polarizing random walks. We develop the theory in this paper and give a number of applications; perhaps the most notable one is a PRG for functions of sensitivity s whose seed length is polynomial in s. But, as this is a new framework, there are many questions that arise, both technical and conceptual, and we view this paper as mostly preliminary, with the hope that many more applications would follow.

PRGs and fractional PRGs
Let F be a class of Boolean functions f: {−1,1}^n → {−1,1}. The standard definition of a PRG for F with error ε > 0 is a random variable X ∈ {−1,1}^n such that
|E[f(X)] − E[f(U)]| ≤ ε for all f ∈ F,
where U denotes a random variable with the uniform distribution in {−1,1}^n. We relax this definition by introducing a new object called a fractional PRG, defined in the next paragraph.
To prepare the notation for the definition, identify f with a real multilinear polynomial, namely its Fourier expansion. This extends f: {−1,1}^n → {−1,1} to f: R^n → R, although we will only be interested in inputs from [−1,1]^n. Observe that if x ∈ [−1,1]^n then f(x) = E_X[f(X)], where X ∈ {−1,1}^n is a random variable sampled as follows: for every i ∈ [n], sample X_i ∈ {−1,1} independently with E[X_i] = x_i. In particular, f on [−1,1]^n is bounded, namely f: [−1,1]^n → [−1,1]. The following is a key definition.
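As a sanity check, the identity f(x) = E_X[f(X)] can be verified numerically on a small example. The sketch below (plain Python, with majority on 3 bits as a stand-in function; helper names are ours, not the paper's) computes the Fourier expansion by brute force and compares the multilinear extension with the exact expectation under the biased product distribution.

```python
import itertools

def prod(vals):
    p = 1.0
    for v in vals:
        p *= v
    return p

def fourier_coeffs(f, n):
    """Exact Fourier coefficients f_hat(S) = E_x[f(x) * prod_{i in S} x_i]."""
    cube = list(itertools.product([-1, 1], repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in itertools.combinations(range(n), k):
            coeffs[S] = sum(f(x) * prod(x[i] for i in S) for x in cube) / 2 ** n
    return coeffs

def multilinear_ext(coeffs, x):
    """f(x) = sum_S f_hat(S) * prod_{i in S} x_i for x in [-1,1]^n."""
    return sum(c * prod(x[i] for i in S) for S, c in coeffs.items())

def biased_expectation(f, x):
    """E[f(X)] with X_i in {-1,1} independent and E[X_i] = x_i."""
    n = len(x)
    return sum(prod((1 + b[i] * x[i]) / 2 for i in range(n)) * f(b)
               for b in itertools.product([-1, 1], repeat=n))

maj = lambda v: 1 if sum(v) > 0 else -1   # majority on an odd number of bits
coeffs = fourier_coeffs(maj, 3)
x = (0.2, -0.5, 0.7)
assert abs(multilinear_ext(coeffs, x) - biased_expectation(maj, x)) < 1e-12
```

For majority on 3 bits the extension is (x_1 + x_2 + x_3)/2 − x_1 x_2 x_3/2, so both quantities above equal 0.235 at this x.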
A random variable X ∈ [−1,1]^n is a fractional PRG for F with error ε if |E[f(X)] − f(0)| ≤ ε for all f ∈ F (note that f(0) = E[f(U)]). Moreover, X has seed length r if X = G(U′) for some function G: {−1,1}^r → [−1,1]^n, where U′ ∈ {−1,1}^r is uniform. One trivial construction of a fractional PRG is X ≡ 0, but this is not going to be useful for our purpose of constructing PRGs. To disallow such examples, we require each coordinate of X to be far from zero with some noticeable probability. Formally, X ∈ [−1,1]^n is called p-noticeable if E[X_i^2] ≥ p for all i = 1, ..., n.
A good example to keep in mind is the following. Let G: {−1,1}^r → {−1,1}^n be a (Boolean-valued) function, and set X = p·G(U), where U ∈ {−1,1}^r is uniform. Notice that X is p^2-noticeable.
Fractional PRGs are easier to construct than standard PRGs, as they can take values in [−1,1]^n. For example, assume that f has Fourier tails bounded in L1. That is, there exist parameters a, b ≥ 1 for which
∑_{S: |S|=k} |f̂(S)| ≤ a·b^k for all k ≥ 1.
We show (in Lemma 4.4) that if X ∈ {−1,1}^n is roughly (ε/a)-biased, then pX is a fractional PRG for f with p ≈ 1/b and error ε. The reason is that this choice of p controls all the Fourier coefficients of f with large Hamming weight, while X controls the ones with small weight. (In fact, to optimize parameters one can choose X to be almost k-wise independent; see Lemma 4.4 for details.) In any case, note that pX is p^2-noticeable, as pX takes values in {−p, p}^n.

Fractional PRG as steps in a random walk
Let X ∈ [−1,1]^n be a fractional PRG for f with error ε. That is, |E[f(X)] − f(0)| ≤ ε. The goal is to construct a Boolean random variable Y ∈ {−1,1}^n with E[f(Y)] close to E[f(U)], where the fractional PRG X provides a "small step" towards this approximation. If we can combine these small steps in a way that they converge fast to {−1,1}^n, then we would be done. To be a bit more precise, consider a random walk starting at 0 with the following properties:
1. The value of f at each step on average does not change by too much.
2. The random walk converges fast to {−1,1}^n.
Observe that if we take X as the first step, then property 1 is satisfied for the first step. Considering later steps leads to the following question: given a point y ∈ [−1,1]^n, can we find a random variable A ∈ [−1,1]^n such that E[f(A)] ≈ f(y) and such that A takes values closer to Boolean values? We show that this is indeed the case if we assume that X not only fools f, but also fools any possible restriction of f. To formalize this, let F be a family of n-variate Boolean functions f: {−1,1}^n → {−1,1}. We say that F is closed under restrictions if for any f ∈ F, if we fix some inputs of f to constants in {−1,1}, then the new restricted function is still in F. Most natural families of Boolean functions studied satisfy this condition. Some examples are functions computed by small-depth circuits, functions computed by bounded-width branching programs, and functions of low sensitivity.
We show that if X is a fractional PRG for such an F, then it can be used to approximate f(y) for any y ∈ [−1,1]^n. Concretely (see Claim 3.3), if X ∈ [−1,1]^n is a fractional PRG for F which is closed under restrictions, then for any f ∈ F and any y ∈ [−1,1]^n it holds that
|E_X[f(y + δ_y ∘ X)] − f(y)| ≤ ε,
where δ_y = (1 − |y_1|, ..., 1 − |y_n|) and ∘ denotes the coordinatewise product.
THEORY OF COMPUTING, Volume 15 (10), 2019, pp. 1-26
Technically, we need to also assume that X is symmetric, which means that Pr[X = x] = Pr[X = −x] for all x. This is easy to achieve from any X which is not symmetric, for example by multiplying X by a uniform bit (thus increasing its seed length by 1 bit).

Polarization and fast convergence
Our next goal is to show fast convergence of the random walk to {−1,1}^n. To that end, we need to analyze the following martingale: Y_0 = 0 and Y_i = Y_{i−1} + (1 − |Y_{i−1}|) ∘ X_i, where 1 − |Y_{i−1}| denotes the vector with coordinates 1 − |(Y_{i−1})_j|, the product ∘ is coordinatewise, and X_1, X_2, ... are independent copies of a fractional PRG. We show that for some t not too large, Y_t is close to a point in {−1,1}^n. But why would that be true? This turns out to be the result of polarization in the random walk. It suffices to show this for every coordinate individually.
So, let Z_1, Z_2, ... ∈ [−1,1] be independent random variables (which are the i-th coordinates of X_1, X_2, ... for some fixed i), and define the following one-dimensional martingale: B_0 = 0 and B_j = B_{j−1} + (1 − |B_{j−1}|)·Z_j. Claim 3.5 shows that if (i) Z_j is symmetric, and (ii) E[Z_j^2] ≥ p (which follows from our assumption that the fractional PRG is p-noticeable), then E[B_t^2] ≥ 1 − 2^{−Ω(pt)}; in particular, after t = O(log(1/δ)/p) steps each coordinate is, on average, δ-close to {−1,1}. Setting δ = ε/n guarantees that with probability 1 − ε all the coordinates of Y_t are ε/n-close to {−1,1}. Then a simple argument shows that rounding the coordinates gives a PRG with error O(ε), as desired.
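The polarization phenomenon is easy to observe empirically. The following sketch (an illustrative simulation of ours, not part of the paper's proof) estimates E[B_t^2] for the one-dimensional walk with steps A_i uniform in {−1/2, +1/2}, so that p = E[A_i^2] = 1/4.

```python
import random

def polarize(t, trials, seed=0):
    """Monte Carlo estimate of E[B_t^2] for the one-dimensional walk
    B_i = B_{i-1} + (1 - |B_{i-1}|) * A_i with symmetric steps A_i = +-1/2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        b = 0.0
        for _ in range(t):
            a = rng.choice([-0.5, 0.5])
            b += (1 - abs(b)) * a
        total += b * b
    return total / trials
```

After one step E[B_1^2] is exactly 0.25, while already for a few hundred steps the estimate is very close to 1, in line with the exponential convergence of Claim 3.5.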
We now state our main theorem.
Informally (see Theorem 2.5 for the formal version): let F be a family of n-variate Boolean functions closed under restrictions, let X ∈ [−1,1]^n be a symmetric p-noticeable fractional PRG for F with error ε, and run the random walk above for t = O(log(n/ε)/p) steps with independent copies of X. Let G = sign(Y_t) ∈ {−1,1}^n be obtained by taking the sign of the coordinates of Y_t. Then G is a PRG for F with error (t + 1)ε.
Note that computing this PRG only involves basic operations such as addition and multiplication over the reals with bounded error.
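To make the pipeline concrete, here is a minimal end-to-end sketch. The "fractional PRG" used is a toy stand-in (p times uniform random signs, which is symmetric and p^2-noticeable but does not fool any interesting class); the point is only to exercise the gadget-plus-sign construction, not to reproduce the paper's seed-length guarantees.

```python
import random

def gadget_column(steps):
    """g_t on one coordinate: fold steps into [-1,1] via y <- y + (1 - |y|) * x."""
    y = 0.0
    for x in steps:
        y += (1 - abs(y)) * x
    return y

def rw_prg(fractional_samples):
    """G = sign(g^n_t(X_1,...,X_t)); rows are t fractional samples in [-1,1]^n."""
    t, n = len(fractional_samples), len(fractional_samples[0])
    out = []
    for j in range(n):
        y = gadget_column([fractional_samples[i][j] for i in range(t)])
        out.append(1 if y >= 0 else -1)   # sign of 0 chosen arbitrarily as +1
    return out

# Toy stand-in fractional PRG: X = p * (uniform signs), symmetric, p^2-noticeable.
rng = random.Random(1)
p, n, t = 0.25, 8, 60
samples = [[p * rng.choice([-1, 1]) for _ in range(n)] for _ in range(t)]
G = rw_prg(samples)
assert all(g in (-1, 1) for g in G)
```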

PRG for functions with bounded Fourier tails
As mentioned above, the families of Boolean functions that are fooled by our PRG include ones that satisfy the following two properties: (i) being closed under restrictions; (ii) having bounded L1 Fourier tails. Tal [22] showed that the latter condition follows from a widely studied condition, that of bounded L2 Fourier tails. Thus, using existing bounds for L2 Fourier tails, we get that our PRG fools several classes of Boolean functions. Below we list the results for error ε = O(1), and refer the reader to the corresponding claims for the details of the full range of parameters:
1. Low-sensitivity functions: if f has sensitivity s, our PRG has seed length polynomial in s (up to a log log n term). See Corollary 4.6 for details.
2. Unordered read-once branching programs: if f is computed by a width-w oblivious read-once branching program reading its inputs in an arbitrary order, our PRG has seed length comparable to (although slightly worse than) the PRG of Chattopadhyay et al. [6]. See Section 4.2 for details.
3. Permutation branching programs: if f is computed by a width-w permutation branching program, our PRG has seed length O(w^4 · log n · log log n). See Corollary 4.8 for details.
4. Bounded-depth circuits: if f is computed by AC^0 circuits of depth d and size poly(n), our PRG has seed length O(log^{2d−1} n · log log n). This is quadratically worse than the best known PRG due to Tal [22]. See Corollary 4.9 for details.
Other than the PRG for functions of low sensitivity, all the other PRGs are comparable to the best known tailored PRGs. However, the main message is that they are all the same PRG. Our general theorem is the following.
Theorem 1.3 (PRG for functions of bounded L1 Fourier tail, informal version of Theorem 4.5). Let F be a family of n-variate Boolean functions closed under restrictions. Assume that there exist a, b ≥ 1 such that for every f ∈ F,
∑_{S: |S|=k} |f̂(S)| ≤ a·b^k for all k ≥ 1.
Then, for any ε ≤ 1/poly(b log n) there exists an explicit PRG X ∈ {−1,1}^n which fools F with error ε > 0, whose seed length is O(log(n/ε)·(log log n + log(a/ε))·b^2).
We note again that by a result of Tal [22], Theorem 1.3 holds also if we instead assume a bound on the L2 Fourier tails (which are more common), namely if we assume that for every f ∈ F and every k ≥ 1 it holds that ∑_{S: |S|≥k} f̂(S)^2 ≤ a·2^{−k/b}.

PRG for functions which simplify under random restriction
A major component in prior constructions of PRGs that are based on random restrictions is finding a much smaller set of "pseudorandom restrictions." Ajtai and Wigderson [1] proposed such a PRG for low-depth circuits. Much follow-up work has been based on this framework to build PRGs for various classes of functions, including low-depth circuits, branching programs, and low-sensitivity functions [23,8,20,6,10]; a major component of the analysis is proving that the derandomized random restrictions work.
Our framework for constructing PRGs directly applies to function families that simplify under random restrictions, without the need to derandomize the restrictions. Let F be a family of functions f: {−1,1}^n → {−1,1}, extended multilinearly to [−1,1]^n. Fix a parameter 0 < p < 1 and define the p-averaged function of f, denoted f_p: {−1,1}^n → [−1,1], as
f_p(x) = E_{A,U}[ f(x_A, U_{[n]∖A}) ],
where A ⊆ [n] is a random subset with Pr[i ∈ A] = p independently for each i ∈ [n], x_A ∈ {−1,1}^A is the restriction of the input x to the coordinates in A, and U ∈ {−1,1}^n is independently and uniformly chosen. The crucial observation (Claim 5.1) is that for every x ∈ {−1,1}^n it holds that f_p(x) = f(px), where f(px) is the multilinear extension of f evaluated at px ∈ [−1,1]^n. Suppose now we have a standard PRG X′ for the class of p-averaged functions F_p = { f_p : f ∈ F }. (A PRG for the p-random restrictions of functions in F would do, as f_p is a convex combination of p-random restrictions of f, namely, averaging over U.) Then, using our observation above, X = pX′ is a fractional PRG for the class F. Now, by using our framework of viewing this fractional PRG as a random walk step, one can derive a standard PRG for F using O(log(n/ε)/p^2) independent copies of X′.
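The identity f_p(x) = f(px) of Claim 5.1 can be checked by brute force on small examples. The sketch below (helper names are ours; majority on 3 bits serves as the test function) computes f_p exactly by enumerating the random subset A and the uniform bits, and compares it with the multilinear extension at px.

```python
import itertools

def prod(vals):
    p = 1.0
    for v in vals:
        p *= v
    return p

def p_averaged(f, n, p, x):
    """f_p(x): each coordinate of x is kept with prob p, rerandomized otherwise."""
    total = 0.0
    for keep in itertools.product([0, 1], repeat=n):
        w = prod(p if k else 1 - p for k in keep)       # Pr of this subset A
        free = [i for i, k in enumerate(keep) if not k]
        inner = 0.0
        for u in itertools.product([-1, 1], repeat=len(free)):
            y = list(x)
            for i, b in zip(free, u):
                y[i] = b
            inner += f(tuple(y))
        total += w * inner / 2 ** len(free)             # average over U
    return total

def multilinear(f, n, z):
    """Multilinear extension: E[f(X)] with X_i independent, E[X_i] = z_i."""
    return sum(prod((1 + b[i] * z[i]) / 2 for i in range(n)) * f(b)
               for b in itertools.product([-1, 1], repeat=n))

maj = lambda v: 1 if sum(v) > 0 else -1
x, p = (1, 1, -1), 0.3
assert abs(p_averaged(maj, 3, p, x) - multilinear(maj, 3, [p * xi for xi in x])) < 1e-12
```

The equality holds because keeping x_i with probability p and rerandomizing otherwise yields independent coordinates with mean p·x_i, exactly the product distribution defining f(px).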

Fourier tails of low-degree F 2 polynomials
Viola [25] gave a construction of a pseudorandom generator which fools n-variate degree-d polynomials over F_2. The construction is the XOR of d independent small-bias generators. We wonder whether our framework can be used to achieve similar bounds. In particular, we raise the following problem: does the class of low-degree polynomials over F_2 have bounded L1 Fourier tails? It is trivially true for d = 1, and it can be shown to hold for d = 2. However, to the best of our knowledge, nothing was known for d ≥ 3.
We show (see Theorem 6.1 for more details) that for any Boolean function f: {−1,1}^n → {−1,1} computed by an F_2-polynomial of degree at most d, the following L1 Fourier tail bound holds:
∑_{S: |S|=k} |f̂(S)| ≤ (k·2^{3d})^k for every k ≥ 1.
This bound, however, falls short of implying a PRG using our techniques, and we conjecture that the correct bound is c_d^k, for some constant c_d = 2^{O(d)}.

PRGs with respect to arbitrary product distributions
We note the following interesting generalization of our results, which follows almost directly from our techniques.
Consider the problem of "fooling" a family of functions with respect to an arbitrary product distribution D on {−1,1}^n (the uniform distribution being a special case). More formally, given a distribution D on {−1,1}^n and a family of functions F, we say that a random variable X is a PRG for F with respect to D with error ε if |E[f(X)] − E_{x∼D}[f(x)]| ≤ ε for all f ∈ F. We show a way to fool functions with respect to arbitrary product distributions.
Corollary 1.4. Let F be a family of n-variate Boolean functions which is closed under restrictions and let D be any product distribution on {−1,1}^n. Let X ∈ [−1,1]^n be a symmetric p-noticeable fractional PRG for F with error ε and seed length ℓ. Let t = O(log(n/ε)/p). Then there exists an explicit PRG for F with respect to D with error tε and seed length t·ℓ.
Thus, we now start our random walk (defined by the fractional PRG) from the point α = (E_D[x_1], ..., E_D[x_n]) instead of from 0, and the convergence follows from polarization in exactly the same way.
Thus all our PRG results in fact generalize to PRGs with respect to arbitrary product distributions. We are not aware of any non-trivial PRGs against arbitrary product distributions for the classes of functions we study. We wonder if this notion of fooling arbitrary product distributions has interesting applications.

Related work
The line of research closest in spirit to our paper, and which motivated our results, is that of using random and pseudorandom restrictions to construct PRGs. A good example is the work of Gopalan et al. [8], which uses pseudorandom restrictions to construct PRGs. Our framework can be seen as extending this approach: we do not need to analyze pseudorandom restrictions; instead, we analyze fractional PRGs, where the restriction happens automatically from the fractional PRG structure, and no derandomization is necessary.
Another line of work is the use of random walks in combinatorial optimization, for example in the algorithmic versions of Spencer's theorem [3,13] and follow-up work. It would be interesting to see if polarization can be used to speed up random walks in combinatorial optimization as well.

Open problems
As we give a new framework for constructing PRGs, there are many open problems that arise, both conceptual and technical.

Early termination
Our analysis requires a random walk with t = O(log(n/ε)/p) steps, each coming from a p-noticeable fractional PRG. We believe that for some natural families of functions shorter random walks might also suffice, but we do not know how to show this. We discuss this further in Section 7.
Open problem 1.5. Find conditions on classes of Boolean functions so that short random walks can be used to construct PRGs. In particular, are there nontrivial classes where the number of steps is independent of n?

Less independence

Our analysis of Theorem 2.5 currently requires t independent copies of a fractional PRG X. It might be possible that these copies can be chosen in a less independent fashion, where the analysis still holds.
Open problem 1.6. Can the fractional PRGs X_1, ..., X_t in Theorem 2.5 be chosen non-independently, such that the conclusion still holds? Concrete examples to consider are k-wise independence for k ≪ t, or using an expander random walk.

More applications
Our current applications follow from the construction of a fractional PRG for functions with bounded Fourier tails. The fractional PRG itself follows from standard constructions in pseudorandomness (almost k-wise independence) adapted to our scenario. It would be interesting to try and find other classes of Boolean functions for which different constructions of fractional PRGs work.

Gadgets
We can view the random walk as a "gadget construction." Given independent p-noticeable fractional PRGs X_1, ..., X_t ∈ [−1,1]^n, view them as the rows of a t × n matrix, and then apply a gadget g: [−1,1]^t → {−1,1} to each column to obtain the outcome in {−1,1}^n. We show that the random walk gives such a gadget which converges for t = O(log(n/ε)/p). Many constructions of PRGs can be viewed in this framework, where typically X_i ∈ {−1,1}^n. Ours is the first construction which allows X_i to take non-Boolean values. It would be interesting to know whether other gadgets can be used instead of the random walk gadget, and whether there are general properties of gadgets that would suffice.

Low-degree polynomials
As discussed above, we wonder if our techniques can be used to construct a PRG for low-degree F_2 polynomials. In particular, we ask if one could improve the bounds we obtain (see Theorem 6.1) on the L1 Fourier tails of low-degree F_2 polynomials.

Paper organization
We describe the general framework in detail in Section 2. We prove Theorem 2.5 in Section 3. We describe applications in Section 4. Our framework also applies to function families that simplify under random restrictions; we describe this in Section 5. We prove L1 Fourier tail bounds for low-degree F_2 polynomials in Section 6. We partially answer the question related to early termination of the random walk in Section 7.
General framework

Boolean functions

Let f: {−1,1}^n → {−1,1} be an n-variate Boolean function, identified with its multilinear extension, also known as its Fourier expansion:
f(x) = ∑_{S⊆[n]} f̂(S) ∏_{i∈S} x_i, where f̂(S) = E_{x∈{−1,1}^n}[f(x) ∏_{i∈S} x_i].
A family F of n-variate Boolean functions is said to be closed under restrictions if for any f ∈ F and any function f′: {−1,1}^n → {−1,1} obtained from f by fixing some of its inputs to values in {−1,1}, it holds that f′ ∈ F.

Pseudorandom generators
Let F be a family of n-variate Boolean functions. The following is the standard definition of a pseudorandom generator (PRG) for F, adapted to our notation: a random variable X ∈ {−1,1}^n is a PRG for F with error ε if
|E[f(X)] − E[f(U)]| ≤ ε for all f ∈ F,
where U is uniform in {−1,1}^n; the seed length of X is r if X = G(U′) for some function G: {−1,1}^r → {−1,1}^n, with U′ uniform in {−1,1}^r.
We introduce the notion of a fractional PRG. It is the same as a PRG, except that the random variable is allowed to take values in [−1,1]^n instead of only Boolean values. We assume that X has finite support.
Our main goal will be to "amplify" fractional PRGs for F in order to obtain PRGs for F. To that end, we need to enforce some non-triviality conditions on the fractional PRG. For example, X ≡ 0 is a fractional PRG for any function. We require that for any coordinate i ∈ [n], the value of X_i is far from zero with noticeable probability. Formally, we require a noticeable second moment.

Definition 2.3 (p-noticeable random variable). A random variable X ∈ [−1,1]^n is p-noticeable if E[X_i^2] ≥ p for all i ∈ [n].
For technical reasons, we also need X to be symmetric, which means that the distribution of −X is the same as the distribution of X. This is easy to achieve, for example by multiplying all coordinates of X by a single uniformly chosen sign.

Polarizing random walks
The main idea is to view a fractional PRG as steps in a random walk in [−1,1]^n that converges to {−1,1}^n. To that end, we define a gadget that implements the random walk and, moreover, allows for fast convergence. As we will see later, the fast convergence is an effect of polarization.
Definition 2.4 (Random walk gadget). For any t ≥ 1 define the random walk gadget g_t: [−1,1]^t → [−1,1] recursively by
g_1(x_1) = x_1 and g_t(x_1, ..., x_t) = g_{t−1}(x_1, ..., x_{t−1}) + (1 − |g_{t−1}(x_1, ..., x_{t−1})|)·x_t.
We extend the definition to act on vectors. Define g^n_t: ([−1,1]^n)^t → [−1,1]^n by
g^n_t(x_1, ..., x_t) = (g_t(x_{1,1}, ..., x_{t,1}), ..., g_t(x_{1,n}, ..., x_{t,n})).
Equivalently, we can view g^n_t as follows: construct a t × n matrix whose rows are x_1, ..., x_t, and then apply g_t to each column of the matrix to obtain a resulting vector in [−1,1]^n. The following theorem shows how to "amplify" fractional PRGs using the random walk gadget to obtain a PRG. Below, for x ∈ [−1,1]^n we denote by sign(x) ∈ {−1,1}^n the Boolean vector obtained by taking the sign of each coordinate (the sign of 0 can be chosen arbitrarily).

Theorem 2.5 (Amplification theorem). Let F be a family of n-variate Boolean functions closed under restrictions, and let X_1, ..., X_t ∈ [−1,1]^n be independent symmetric p-noticeable fractional PRGs for F with error ε, where t = O(log(n/ε)/p). Then sign(g^n_t(X_1, ..., X_t)) is a PRG for F with error (t + 1)ε.
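A direct implementation of the gadget, following the recursion above, might look as follows (an illustrative sketch; the recursion is unrolled into a loop, and function names are ours).

```python
def g_t(xs):
    """Random walk gadget g_t: [-1,1]^t -> [-1,1], defined recursively by
    g_1(x_1) = x_1 and g_t(...) = g_{t-1}(...) + (1 - |g_{t-1}(...)|) * x_t."""
    y = 0.0
    for x in xs:
        y += (1 - abs(y)) * x   # |y| <= 1 is preserved at every step
    return y

def g_n_t(rows):
    """g^n_t: apply g_t to each column of the t x n matrix with rows x_1..x_t."""
    t, n = len(rows), len(rows[0])
    return [g_t([rows[i][j] for i in range(t)]) for j in range(n)]
```

Note that once a coordinate reaches ±1 it is absorbed there: the remaining steps have length (1 − |y|)·|x| = 0.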

Proof of Amplification Theorem
We prove Theorem 2.5 in this section. From here onwards, we fix a family F of n-variate Boolean functions which is closed under restrictions. The proof is based on the following two lemmas. The first lemma amplifies a p-noticeable fractional PRG to a (1 − q)-noticeable fractional PRG. The second lemma shows that, setting q = ε/n, the latter fractional PRG can be rounded to a Boolean-valued PRG without incurring too much error.
Lemma 3.1 (Amplification lemma). Let X_1, ..., X_t ∈ [−1,1]^n be independent symmetric p-noticeable fractional PRGs for F with error ε. Define a random variable Y ∈ [−1,1]^n as Y := g^n_t(X_1, ..., X_t). Then Y is a (1 − q)-noticeable fractional PRG for F with error tε, where q = 2^{−Ω(pt)}.
Lemma 3.2 (Rounding lemma). Let Y ∈ [−1,1]^n be a (1 − q)-noticeable fractional PRG for F with error ε. Then sign(Y) is a PRG for F with error ε + nq.

Proof of Lemma 3.1
We prove Lemma 3.1 in this section. We need to prove two claims: that g^n_t(X_1, ..., X_t) is a fractional PRG for F with error tε, and that it is (1 − q)-noticeable. This is achieved in the following sequence of claims.
First we need some notation. For y ∈ [−1,1]^n let δ_y := (1 − |y_1|, ..., 1 − |y_n|), and let ∘ denote the coordinatewise product of vectors.
Claim 3.3. Let X ∈ [−1,1]^n be a fractional PRG for F with error ε. Then for any f ∈ F and any y ∈ [−1,1]^n,
|E_X[f(y + δ_y ∘ X)] − f(y)| ≤ ε.
Proof. Consider a distribution over functions F ∈ F obtained from f by fixing the i-th input to sign(y_i) with probability |y_i|, independently for each i. That is, F(x) = f(R(x)), where R(x) ∈ {−1,1}^n is a random variable obtained by sampling R_1, ..., R_n independently, where each R_i is chosen as follows: pick R_i(x) = sign(y_i) with probability |y_i|, and with probability 1 − |y_i| do as follows: pick R_i(x) = 1 with probability (x_i + 1)/2 and pick R_i(x) = −1 otherwise. It is easy to check that E_R[R(x)] = y + δ_y ∘ x. By multilinearity of f, and as R(x) is a product distribution, E_R[f(R(x))] = f(y + δ_y ∘ x) for all x ∈ [−1,1]^n. Setting x = X and averaging over X gives
|E_X[f(y + δ_y ∘ X)] − f(y)| = |E_F[E_X[F(X)] − F(0)]| ≤ ε,
since F ∈ F with probability one and X is a fractional PRG for F with error ε.
Claim 3.4. Let X_1, ..., X_t ∈ [−1,1]^n be independent fractional PRGs for F with error ε each. Then for all f ∈ F, |E[f(g^n_t(X_1, ..., X_t))] − f(0)| ≤ tε.
Proof. The proof is by induction on t. The base case t = 1 follows by definition, as g^n_1(X_1) = X_1. For t > 1 we will show that
|E[f(g^n_t(X_1, ..., X_t))] − E[f(g^n_{t−1}(X_1, ..., X_{t−1}))]| ≤ ε,
from which the claim follows by the triangle inequality. In fact, we will show a stronger inequality: for any fixing of x_1, ..., x_{t−1},
|E_{X_t}[f(g^n_t(x_1, ..., x_{t−1}, X_t))] − f(g^n_{t−1}(x_1, ..., x_{t−1}))| ≤ ε.
The first inequality then follows by averaging over x_1 = X_1, ..., x_{t−1} = X_{t−1}. To see why this latter inequality holds, set y = g^n_{t−1}(x_1, ..., x_{t−1}). Then by definition, g^n_t(x_1, ..., x_{t−1}, X_t) = y + δ_y ∘ X_t. The claim now follows from Claim 3.3.
We have so far proved that g^n_t(X_1, ..., X_t) is a fractional PRG for F with slightly worse error. Although we do not need it, it is worth noting that it is symmetric, since X_1, ..., X_t are symmetric and −g^n_t(x_1, ..., x_t) = g^n_t(−x_1, ..., −x_t). To conclude, we show that it converges fast to a value close to {−1,1}^n. This is the effect of polarization. It will be enough to analyze this for one-dimensional random variables.
Claim 3.5. Let A_1, A_2, ... ∈ [−1,1] be independent symmetric random variables with E[A_i^2] ≥ p, and let B_0 = 0 and B_i = B_{i−1} + (1 − |B_{i−1}|)·A_i. Then E[B_t^2] ≥ 1 − q where q = 3 exp(−tp/16).
Proof. Let C_i := 1 − |B_i| be the distance to {−1,1} at step i. We show that C_i converges to 0 exponentially fast. Observe that C_i satisfies the following recursive definition:
C_i = C_{i−1}·(1 − A_i·sign(B_{i−1})) if B_i has the same sign as B_{i−1}, and C_i = 2 − C_{i−1}·(1 − A_i·sign(B_{i−1})) otherwise.
In either case one can verify that C_i ∈ [0,1] and that C_i ≤ C_{i−1}·(1 − A_i·sign(B_{i−1})). Now observe that C_{i−1} and A_i·sign(B_{i−1}) are independent. This is because B_{i−1} is symmetric (because the A_j's are symmetric), and so |B_{i−1}| and sign(B_{i−1}) are independent. So we can write
E[√C_i] ≤ E[√C_{i−1}] · E[√(1 − A_i·sign(B_{i−1}))].
The Taylor expansion of (√(1−x) + √(1+x))/2 is 1 − x^2/8 − (5/128)x^4 − ⋯. In particular, all the coefficients except for the constant term are negative. As E[A_i^2] ≥ p and A_i·sign(B_{i−1}) is symmetric, this gives E[√(1 − A_i·sign(B_{i−1}))] ≤ 1 − p/8 ≤ exp(−p/16), hence E[√C_t] ≤ exp(−tp/16), and therefore E[B_t^2] = E[(1 − C_t)^2] ≥ 1 − 2E[C_t] ≥ 1 − 2E[√C_t] ≥ 1 − 3 exp(−tp/16).
To provide a piece of intuition explaining the fast convergence of this random walk, notice that once C_i becomes sufficiently small, it gets more and more difficult to increase the value of C_i again. This is best explained with an example. Suppose all A_i's take values in {−0.5, 0.5}. We start at B_0 = 0 and take a step, say A_1 = 0.5, and therefore B_1 = 0.5. Now observe that the length of the next step would be only (1 − |B_1|)·|A_2| = 0.25. So even if A_2 = −0.5, we get B_2 = 0.25, which means we still need to take one more step to go below 0. In other words, once we get close to the boundary {−1,1}, the random walk converges faster, as it gets more difficult to move away from the boundary.
Corollary 3.6. Let X_1, ..., X_t ∈ [−1,1]^n be independent symmetric p-noticeable random variables. Define Y = g^n_t(X_1, ..., X_t). Then Y is (1 − q)-noticeable for q = 3 exp(−tp/16).
Proof. Apply Claim 3.5 to each coordinate of Y.
Lemma 3.1 follows by combining Claim 3.4 and Corollary 3.6.
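The numeric example in the intuition above can be traced step by step; the sketch below reproduces the trajectory B_1 = 0.5, B_2 = 0.25 from the text and shows how an adversarial alternating step sequence still moves slowly near the boundary.

```python
def walk_trajectory(steps):
    """Trajectory B_1, B_2, ... of B_i = B_{i-1} + (1 - |B_{i-1}|) * A_i, B_0 = 0."""
    b, traj = 0.0, []
    for a in steps:
        b += (1 - abs(b)) * a
        traj.append(b)
    return traj

# The example from the text: A_1 = 0.5, A_2 = -0.5 gives B_1 = 0.5, B_2 = 0.25.
assert walk_trajectory([0.5, -0.5]) == [0.5, 0.25]
# Even three consecutive adverse steps only slowly pull the walk back:
traj = walk_trajectory([0.5, -0.5, -0.5, -0.5])   # 0.5, 0.25, -0.125, -0.5625
```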

Proof of Lemma 3.2
We prove Lemma 3.2 in this section. Let x ∈ [−1,1]^n be a possible outcome of X (X plays the role of Y in the statement of Lemma 3.2). Let W := W(x) ∈ {−1,1}^n be a random variable, where W_1, ..., W_n are independent and E[W_i] = x_i. By multilinearity, E[f(W)] = f(x); moreover, as f is bounded by 1 in absolute value,
|f(x) − f(sign(x))| = |E[f(W)] − f(sign(x))| ≤ 2·Pr[W ≠ sign(x)].
The last term can be bounded by the union bound,
Pr[W ≠ sign(x)] ≤ ∑_i Pr[W_i ≠ sign(x_i)] = ∑_i (1 − |x_i|)/2.
Setting x = X and averaging over X gives
|E[f(sign(X))] − f(0)| ≤ ε + ∑_i E[1 − |X_i|] ≤ ε + ∑_i E[1 − X_i^2] ≤ ε + nq,
where the first inequality follows as X is a fractional PRG with error ε combined with the discussion above, and the last inequality uses that X is (1 − q)-noticeable.
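The two quantities in the rounding argument, the exact probability that W disagrees with sign(x) somewhere and the union bound ∑_i (1 − |x_i|)/2, can both be computed in closed form; a small sketch (helper names are ours):

```python
def prod(vals):
    p = 1.0
    for v in vals:
        p *= v
    return p

def flip_prob(x):
    """Pr[W != sign(x)] for the randomized rounding W_i in {-1,1} with
    E[W_i] = x_i, together with the union bound sum_i (1 - |x_i|)/2."""
    # Each coordinate matches sign(x_i) with probability (1 + |x_i|)/2.
    exact = 1.0 - prod((1 + abs(xi)) / 2 for xi in x)
    union = sum((1 - abs(xi)) / 2 for xi in x)
    return exact, union
```

For x = (0.9, 0.8), say, the exact disagreement probability is 0.145 while the union bound gives 0.15; as the coordinates approach ±1 (the (1 − q)-noticeable regime), both vanish.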
PRG for functions with bounded Fourier tails

Several natural families of Boolean functions have bounded Fourier tails, such as AC^0 circuits [12,15], functions with bounded sensitivity [9,14], and functions computed by branching programs of various forms [20,6]. Our goal is to construct a universal PRG which fools any such function. We consider two variants: L1 bounds and L2 bounds. We say that f ∈ L1(a,b) if ∑_{S: |S|=k} |f̂(S)| ≤ a·b^k for all k ≥ 1, and that f ∈ L2(a,b) if ∑_{S: |S|≥k} f̂(S)^2 ≤ a·2^{−k/b} for all k ≥ 1.
Tal [22] showed that L2 bounds imply L1 bounds with polynomially related parameters. The reverse direction is false, as can be witnessed by the PARITY function. So, the class of functions with L1-bounded Fourier tails is richer, and we focus on it.
In the following lemma, we construct a fractional PRG for this class, which we will then amplify to a PRG. We note that this lemma holds also for bounded functions, not just Boolean functions. The construction is based on a scaling of almost d-wise independent random variables, whose definition we now recall. Naor and Naor [16] gave an explicit construction of a δ-almost d-wise independent random variable Z ∈ {−1,1}^n with seed length O(log log n + d + log(1/δ)). We note that this seed length is optimal, up to the hidden constants.
Lemma 4.4. Let f: {−1,1}^n → [−1,1] satisfy ∑_{S: |S|=k} |f̂(S)| ≤ a·b^k for all k ≥ 1, where a, b ≥ 1. Set β = 1/(2b), δ = ε/(2a), and d = ⌈log(2a/ε)⌉. Let Z ∈ {−1,1}^n be a δ-almost d-wise independent random variable, and set X = βZ, which takes values in {−β, β}^n. Then (i) X is β^2-noticeable, (ii) X has seed length O(log log n + log(a/ε)), and (iii) X fools f with error ε.
Proof. We claim that X satisfies the requirements of the lemma. Claim (i) clearly holds, and claim (ii) holds by the Naor–Naor construction. We thus focus on proving that X fools F with error ε.
Fix f ∈ F and consider its Fourier expansion f(x) = ∑_S f̂(S) ∏_{i∈S} x_i. We need to show that E[f(X)] is close to f(0) = f̂(∅). Averaging over X gives
|E[f(X)] − f(0)| = |∑_{S≠∅} f̂(S)·β^{|S|}·E[∏_{i∈S} Z_i]| ≤ ∑_{1≤|S|≤d} |f̂(S)|·β^{|S|}·δ + ∑_{|S|>d} |f̂(S)|·β^{|S|} ≤ δ·a·∑_{k≥1} 2^{−k} + a·∑_{k>d} 2^{−k} ≤ δ·a + a·2^{−d} ≤ ε,
where we used the choice of β = 1/(2b). The claim follows as we set δ = ε/(2a) and 2^{−d} ≤ ε/(2a). Applying Theorem 2.5 with the fractional PRG constructed in Lemma 4.4 gives the following PRG construction. Note that we still need to require that F is closed under restrictions.

Applications
We apply our PRG from Theorem 4.5 to several well studied classes of Boolean functions that are known to satisfy a Fourier tail bound.

Functions of bounded sensitivity
Let f: {−1,1}^n → {−1,1} be a Boolean function. Its sensitivity at an input x ∈ {−1,1}^n, denoted s(f, x), is the number of neighbors x′ of x (that is, x and x′ differ at exactly one coordinate) such that f(x′) ≠ f(x). The (max) sensitivity of f is s(f) = max_x s(f, x). The sensitivity conjecture speculates that functions of sensitivity s can be computed by decision trees of depth poly(s). A corollary would be that almost poly(s)-wise independent distributions fool functions of low sensitivity. So, one may ask to construct comparable PRGs for functions of low sensitivity.
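For intuition, sensitivity can be computed by brute force on small functions; the sketch below (illustrative and exponential-time, with our own helper names) checks that PARITY on 3 bits has sensitivity 3 while MAJORITY on 3 bits has sensitivity 2.

```python
import itertools

def sensitivity(f, n):
    """Max over inputs x of the number of coordinates whose flip changes f(x)."""
    best = 0
    for x in itertools.product([-1, 1], repeat=n):
        s = 0
        for i in range(n):
            y = list(x)
            y[i] = -y[i]                 # flip the i-th coordinate
            if f(tuple(y)) != f(x):
                s += 1
        best = max(best, s)
    return best

parity3 = lambda v: v[0] * v[1] * v[2]          # flips on every coordinate change
maj3 = lambda v: 1 if sum(v) > 0 else -1
assert sensitivity(parity3, 3) == 3
assert sensitivity(maj3, 3) == 2
```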
This question was first considered by Hatami and Tal [10], who constructed a PRG with subexponential seed length exp(O(√s)). Theorem 4.5 gives an improved construction that essentially matches the consequence of the sensitivity conjecture. Our PRG uses the recent bounds of Gopalan et al. [9] on the Fourier tail of functions of low sensitivity. Concretely, Gopalan et al. [9] show that if s(f) = s then f ∈ L1(1, t) for t = O(s). It is straightforward to verify that a restriction can only decrease the sensitivity of a function, so the class of functions of sensitivity at most s is closed under restrictions. A direct application of Theorem 4.5 gives a PRG with seed length O(s^2·log(n/ε)·(log log n + log(1/ε))).
To get a somewhat improved bound, one can apply a result of Simon [21], which shows that if s(f) = s then f depends on at most m = 2^{O(s)} of its inputs. In this case, the analysis of Theorem 2.5 can be applied with m variables instead of n variables, so that we only need O(log(m/ε)/p) iterations. Note that the fractional PRG still requires a seed length which depends on the original n. We obtain:
Corollary 4.6. For any n, s ≥ 1 and ε ≤ 1/poly(s), there exists an explicit PRG which fools n-variate Boolean functions of sensitivity s with error ε, whose seed length is O(s^2·(s + log(1/ε))·(log log n + log(1/ε))).
We note that the log log n term cannot be removed. Indeed, even if we restrict attention to functions which are the XOR of at most 2 bits (for which s = 2), the required seed length is Ω(log log n + log(1/ε)).

Unordered branching programs
An oblivious read-once branching program (abbreviated ROBP) B of width w is a non-uniform model of computation that captures randomized algorithms with space log w. A branching program B maintains a state in the set {1, ..., w} and reads the input bits in a known fixed order. At time step i = 1, ..., n, B reads a bit, and based on the time step, the read bit, and the current state, it transitions to a new state. Thus, B can be thought of as a layered directed graph with w nodes in each layer and two edges going out of each node to the immediately next layer, one labeled with 1 and the other labeled with −1.
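The model can be made concrete with a small evaluator; the sketch below (a hypothetical representation of ours: per-step transition tables) builds a width-2 ROBP computing PARITY on 3 bits, the classic example of a function easy for this model.

```python
def make_robp(n, w, order, transitions, accept):
    """Evaluate a width-w oblivious ROBP: at step i read x[order[i]] and move
    via transitions[i][state][bit] (bit = 0 for input -1, bit = 1 for input +1);
    output +1 iff the final state lies in `accept`."""
    def f(x):
        state = 0
        for i in range(n):
            bit = 0 if x[order[i]] == -1 else 1
            state = transitions[i][state][bit]
        return 1 if state in accept else -1
    return f

# Width-2 program tracking the parity of +1 inputs: new state = state XOR bit.
xor_step = [[0, 1], [1, 0]]
parity3 = make_robp(3, 2, [0, 1, 2], [xor_step] * 3, {1})
assert parity3((1, -1, -1)) == 1    # one +1 input: odd parity
assert parity3((1, 1, -1)) == -1    # two +1 inputs: even parity
```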
Let B^n(w) be the class of n-variate Boolean functions computed by read-once oblivious branching programs of width w, where the order of the inputs is arbitrary. A recent work of Chattopadhyay et al. [6] showed that these functions have L1-bounded Fourier tails. Concretely, B^n(w) ⊂ L1(1, t) for t = O((log n)^w). They used this to construct a PRG with seed length O((log n)^{w−1}·log^2(n/ε)·log log n). Using our PRG from Theorem 4.5, we get a comparable (although slightly worse) seed length. Note that B^n(w) is closed under restrictions.

Permutation branching programs
A special case of read-once branching programs is permutation branching programs, where the transition function from level i to level i + 1 in the graph is a permutation of the states for every choice of the input bit. We denote this class by B^n_perm(w) ⊂ B^n(w). Reingold et al. [20] showed that if a Boolean function is computed by a permutation branching program of width w, then it has L2-bounded Fourier tails with parameter 2w^2. Note that permutation branching programs are also closed under restrictions. Thus we obtain the following result.
Corollary 4.8. Fix n, w ≥ 1 and ε ≤ 1/poly(w, log n). There is an explicit PRG which fools B^n_perm(w) with error ε > 0, whose seed length is O(log(n/ε)·(log log n + log(1/ε))·w^4).
The dependence on n in our PRG is better than in the result of Reingold et al. [20], who obtained seed length O(w^2·log(w)·log(n)·log(nw/ε) + w^4·log^2(w/ε)).
Reingold et al. [20] actually show the Fourier tail bounds for a more general class of branching programs, called regular branching programs. However, these are not closed under restrictions, and hence our PRG construction fails to work for them (the same problem occurs in the construction from [20]).

Bounded-depth circuits
The class of bounded-depth Boolean circuits, AC^0, has been widely studied. In particular, Linial, Mansour and Nisan [12] showed that it has bounded L2 Fourier tails, and Tal [22] obtained improved bounds: if f is an n-variate Boolean function computed by an AC^0 circuit of depth d and size s, then f ∈ L2(n, t) for t = 2^{O(d)}·log^{d−1} s. Theorem 4.5 provides a new PRG for AC^0 which is comparable with the existing PRGs of Nisan [17], Braverman [4], Tal [22], and Trevisan and Xue [23].

PRG for functions which simplify under random restriction
Recall the definition of the p-averaged function, f_p(x) = E_{A,U}[ f(x_A, U_{[n]∖A}) ], where x_A ∈ {−1,1}^A is the restriction of the input x to the coordinates in A, and U ∈ {−1,1}^n is independently and uniformly chosen.
Claim 5.1. For every f and every x ∈ {−1,1}^n it holds that f_p(x) = f(px).
Proof. Let A, U be random variables as defined above. Define a random variable Y ∈ {−1,1}^n as follows: Y_i = x_i if i ∈ A, and Y_i = U_i otherwise. Then Y_1, ..., Y_n are independent with E[Y_i] = p·x_i, and hence by multilinearity f_p(x) = E[f(Y)] = f(px).
Overall, using Claim 5.1, we obtain the following corollary of Theorem 2.5.
Corollary 5.2. Suppose that we have a standard PRG X′ for the class of p-averaged functions F_p = { f_p : f ∈ F } with error ε. Then X = pX′ is a fractional PRG for the class F. Therefore, we get a standard PRG for F using t = O(log(n/ε)/p^2) independent copies of X′, with error tε.
Spectral tail bounds for low-degree F_2-polynomials

In this section, we prove L1 Fourier tail bounds for functions computed by low-degree polynomials over F_2. However, our bounds fall short of implying PRGs for the class of low-degree F_2 polynomials in our framework.
Theorem 6.1. Let p : F_2^n → F_2 be a polynomial of degree d, and let f(x) = (−1)^{p(x)}. Then Σ_{|S|=k} |f̂(S)| ≤ (k · 2^{3d})^k for every k ≥ 1.

We note that L_2 bounds do not hold for low-degree polynomials, as can be witnessed by taking a high-rank quadratic polynomial. We prove Theorem 6.1 in the remainder of this section. For convenience, with a slight abuse of notation, we view f as a Boolean function f : {−1,1}^n → {−1,1}. We first introduce some notation to simplify the presentation. Let W_k(f) = Σ_{|S|=k} |f̂(S)| denote the weight of the level-k Fourier coefficients of a Boolean function f, and let W(d, k) be the maximum of W_k over degree-d polynomials. Note that we make no assumption on the number of variables n. We prove the following lemma, from which Theorem 6.1 follows relatively easily.

Lemma 6.2. For any d, k ≥ 1, W(d, k) satisfies a recursive upper bound in terms of W(d − 1, k − 1).

We first show that Theorem 6.1 follows easily from Lemma 6.2.
Proof of Theorem 6.1 given Lemma 6.2. The proof is by induction, first on d and then on k. The base case d = 1 is straightforward, so assume d ≥ 2, and assume towards a contradiction that W(d, k) > (k · 2^{3d})^k. Applying the bound of Lemma 6.2 and dividing both sides by W(d, k), we argue as follows. If k = 1 we reach a contradiction, as 2^{3d−1} + 1 ≤ 2^{3d}. If k > 1 then, as (k − 1) · 2^{3d} ≥ k · 2^{3d−1}, the first term is canceled by the third term, and the second term is at most (k · 2^{3d})^k. In either case we reach a contradiction.
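To make the quantities W_k concrete, and to see why L_2-style bounds fail for high-rank quadratics, the following sketch brute-forces the spectrum of f = (−1)^p for the inner-product polynomial p(x) = x1 x2 + x3 x4 + x5 x6 over F_2. This f is bent, so every Fourier coefficient has magnitude 2^{−3}:

```python
from itertools import product

# Level-k Fourier L1 weights W_k of f = (-1)^p for the inner-product
# polynomial on 6 bits (a high-rank quadratic).  All 64 coefficients have
# magnitude 1/8, so W_k = C(6, k) / 8.
n = 6
points = list(product([0, 1], repeat=n))

def f(x):
    return (-1) ** (x[0] * x[1] + x[2] * x[3] + x[4] * x[5])

W = [0.0] * (n + 1)
for S in product([0, 1], repeat=n):
    coeff = sum(f(x) * (-1) ** sum(x[i] for i in range(n) if S[i])
                for x in points) / len(points)
    W[sum(S)] += abs(coeff)

print(W)  # W[k] = C(6, k) / 8
```

The total L1 mass is 2^{n/2} = 8, spread evenly over all sets, which is exactly the flat-spectrum behavior that defeats L_2 tail bounds while remaining consistent with the L_1 bounds of Theorem 6.1.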
From now on we focus on proving Lemma 6.2. To that end, fix a function f computed by a polynomial of degree d which maximizes W_k(f), and write g(S) = |f̂(S)| for short. The following claims are used in the proof of Lemma 6.2. We first show how to prove Lemma 6.2 using these claims, and then prove the claims themselves.
Proof of Lemma 6.2. The bound follows from a chain of inequalities whose last step uses the bounds from Claim 6.4 and Claim 6.5.
We now proceed to prove the remaining claims.
Proof of Claim 6.4.
This follows as (Y, Z) and (X · Y, X · Z) are identically distributed. Now consider any fixing of Y = y and Z = z, and define the function h_{y,z}(x) = f(x · y) · f(x · z), which is (up to translating the input) the multiplicative derivative of f in direction y · z. In particular, the degree of h_{y,z} is at most d − 1, since taking derivatives always reduces the degree; more information about properties of derivatives can be found in [19]. The claimed bound therefore holds for any choice of y and z, and the proof follows by averaging over y = Y, z = Z.
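The degree-reduction fact used above can be checked mechanically: for p of degree d over F_2, the derivative x ↦ p(x ⊕ a) ⊕ p(x) has degree at most d − 1. A small sketch, reading degrees off the algebraic normal form via a Möbius transform:

```python
import random
from itertools import product

def anf_degree(truth, n):
    """Degree of the F_2 polynomial with the given truth table on {0,1}^n."""
    coeffs = dict(truth)
    # Moebius transform: convert truth table to ANF coefficients.
    for i in range(n):
        for x in product([0, 1], repeat=n):
            if x[i] == 1:
                below = x[:i] + (0,) + x[i + 1:]
                coeffs[x] ^= coeffs[below]
    return max((sum(x) for x, c in coeffs.items() if c), default=0)

random.seed(0)
n, d = 5, 3
points = list(product([0, 1], repeat=n))

# A random polynomial of degree at most 3: random subset of small monomials.
monomials = [m for m in product([0, 1], repeat=n)
             if sum(m) <= d and random.random() < 0.5]

def p(x):
    return sum(all(x[i] for i in range(n) if m[i]) for m in monomials) % 2

truth = {x: p(x) for x in points}
assert anf_degree(truth, n) <= d

# Every derivative p(x XOR a) XOR p(x) has degree at most d - 1.
for _ in range(5):
    a = tuple(random.randint(0, 1) for _ in range(n))
    deriv = {x: truth[x] ^ truth[tuple(xi ^ ai for xi, ai in zip(x, a))]
             for x in points}
    assert anf_degree(deriv, n) <= d - 1
```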
Proof of Claim 6.5. This follows by a direct calculation.

7 Early termination of the random walk

In this section we provide a partial answer to Open Problem 1.5, regarding early termination of the random walk. Let Y_t ∈ [−1, 1]^n be the location of the random walk at time t. We would like to guarantee that if Y_t is close enough to sign(Y_t), then we can round Y_t to sign(Y_t) without changing the value of f by much. Therefore, given f : {−1,1}^n → {−1,1}, we seek a smoothness parameter W such that the multilinear extension of f satisfies |f(α) − f(β)| ≤ W · ‖α − β‖_∞ for all α, β ∈ [−1, 1]^n. Observe that should such a W exist, then if at some step t we have ‖Y_t − sign(Y_t)‖_∞ ≤ ε/W, we can terminate the random walk immediately and guarantee that |f(Y_t) − f(sign(Y_t))| ≤ ε. We show that such a smoothness property holds for functions with bounded sensitivity.

Bounded sensitivity functions
We show that smoothness follows from a bound on the (maximum) sensitivity of a Boolean function. We first consider the case that ‖α − β‖_∞ is very small, and then deduce the general case from a suitable joint random variable (a, c): given such (a, c), the desired bound follows, where the last inequality uses the first case in the proof, as a ∈ {−1,1}^n. Now let us construct the joint random variable (a, c). Fix i ∈ [n] and suppose without loss of generality that α_i ≥ 0; note that by construction −α_i ≤ γ_i ≤ α_i. First sample a_i ∈ {−1,1} so that E[a_i] = α_i. If a_i = −1 then set c_i = −1; otherwise set c_i = 1 with probability (1 + γ_i)/(1 + α_i) and c_i = −1 otherwise, which gives E[c_i] = γ_i. It is easy to check that this choice of (a, c) satisfies the required conditions, finishing the proof.

Definition 4.1 (L_1 bounds). For a, b ≥ 1, we denote by L_1^n(a, b) the family of n-variate Boolean functions f satisfying Σ_{|S|=k} |f̂(S)| ≤ a · b^k for all k ≥ 1.

Definition 4.2 (L_2 bounds). For a, b ≥ 1, we denote by L_2^n(a, b) the family of n-variate Boolean functions f satisfying (Σ_{|S|=k} f̂(S)^2)^{1/2} ≤ a · b^k for all k ≥ 1.

Definition 4.3 (Almost d-wise independence). A random variable Z ∈ {−1, 1}^n is ε-almost d-wise independent if, for any restriction of Z to d coordinates, the marginal distribution has statistical distance at most ε from the uniform distribution on {−1, 1}^d.
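Definition 4.3 can be checked directly for small supports. The sketch below computes the largest statistical distance of any d-coordinate marginal from uniform; the even-parity code on 4 bits is exactly 3-wise independent but maximally far from 4-wise independent:

```python
from itertools import combinations, product

# Direct check of the definition: given the support of a distribution on
# {-1,1}^n (uniform over its support here), compute the largest statistical
# distance of any d-coordinate marginal from uniform on {-1,1}^d.
def max_marginal_distance(support, n, d):
    worst = 0.0
    for coords in combinations(range(n), d):
        counts = {}
        for x in support:
            key = tuple(x[i] for i in coords)
            counts[key] = counts.get(key, 0) + 1
        dist = 0.5 * sum(abs(counts.get(y, 0) / len(support) - 0.5 ** d)
                         for y in product([-1, 1], repeat=d))
        worst = max(worst, dist)
    return worst

# Uniform over even-parity strings of length 4: any 3 coordinates are
# exactly uniform, but all 4 together are far from uniform.
n = 4
support = [x for x in product([-1, 1], repeat=n) if x.count(-1) % 2 == 0]
print(max_marginal_distance(support, n, 3))  # 0.0
print(max_marginal_distance(support, n, 4))  # 0.5
```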
Another generic application of our framework is constructing PRGs for classes that simplify under random restriction. Let F be a family of functions f : {−1, 1}^n → {−1, 1}, extended multilinearly to [−1, 1]^n. Fix a parameter 0 < p < 1 and define the p-averaged function of f, denoted f_p : {−1, 1}^n → [−1, 1], as follows: sample A ⊆ [n] by including each i ∈ [n] independently with probability p, and define f_p(x) = E_{A,U}[f(x_A, U)], where x_A is the restriction of x to the coordinates in A and U ∈ {−1, 1}^n is uniform and independent.

1. Functions of sensitivity s: seed length O(s^3 · log n · log log n).

2. Permutation unordered read-once branching programs of width w: seed length O(w^4 · log n · log log n). This improves the dependence on n quadratically compared to the previous best PRG, due to Reingold et al. [20]. See Corollary 4.8 for details.
POOYA HATAMI is a postdoctoral researcher at UT Austin hosted by David Zuckerman. He received his Ph.D. from the University of Chicago in 2015 under the supervision of Alexander Razborov and Madhur Tulsiani. He has also spent two years as a postdoctoral researcher at the Institute for Advanced Study at Princeton and DIMACS at Rutgers University. His research interests lie broadly in theoretical computer science, particularly the role of pseudorandomness and randomness in computational complexity. He spends most of his spare time climbing rocks and plastic.

KAAVE HOSSEINI is a Ph.D. candidate at UC San Diego under the supervision of Shachar Lovett. He obtained his undergraduate degree in mathematics and computer science at Sharif University of Technology, Iran. His research interests lie in additive combinatorics and computational complexity. He is mostly interested in approximate algebraic structures, the structure vs. randomness dichotomy, and pseudorandomness. He likes world music and plays djembe (a West African drum), and also thinks he is still young enough to be a good gymnast.

SHACHAR LOVETT received his Ph.D. from the Weizmann Institute of Science in 2010 under the supervision of Omer Reingold and Ran Raz. He was a postdoctoral researcher at the Institute for Advanced Study until 2012. Since 2012, he has been a faculty member at the University of California, San Diego. He is a recipient of an NSF CAREER award and a Sloan fellowship. His research is broadly in theoretical computer science and combinatorics, in particular: computational complexity, randomness and pseudorandomness, algebraic constructions, coding theory, additive combinatorics, and combinatorial aspects of high-dimensional geometry. He is happily married and has three kids.