The Need for Structure in Quantum Speedups

Is there a general theorem that tells us when we can hope for exponential speedups from quantum algorithms, and when we cannot? In this paper, we make two advances toward such a theorem, in the black-box model where most quantum algorithms operate. First, we show that for any problem that is invariant under permuting inputs and outputs (like the collision or the element distinctness problems), the quantum query complexity is at least the 7th root of the classical randomized query complexity. (An earlier version of this paper gave the 9th root.) This resolves a conjecture of Watrous from 2002. Second, inspired by recent work of O'Donnell et al. (2005) and Dinur et al. (2006), we conjecture that every bounded low-degree polynomial has a "highly influential" variable. Assuming this conjecture, we show that every T-query quantum algorithm can be simulated on most inputs by a poly(T)-query classical algorithm, and that one essentially cannot hope to prove P ≠ BQP relative to a random oracle.


Introduction
Perhaps the central lesson gleaned from fifteen years of quantum algorithms research is this: Quantum computers can offer superpolynomial speedups over classical computers, but only for certain "structured" problems.
The key question, of course, is what we mean by "structured." In the context of most existing quantum algorithms, "structured" basically means that we are trying to determine some global property of an extremely long sequence of numbers, assuming that the sequence satisfies some global regularity. As a canonical example, consider Period-Finding, the core of Shor's algorithms for factoring and discrete logarithm [22]. Here we are given black-box access to an exponentially long sequence of integers $X = (x_1, \ldots, x_N)$; that is, we can compute $x_i$ for a given $i$. We are asked to find the period of $X$ (that is, the smallest $k > 0$ such that $x_i = x_{i-k}$ for all $i > k$), promised that $X$ is indeed periodic, with period $k \ll N$. The requirement of periodicity is crucial here: it is what lets us use the Quantum Fourier Transform to extract the information we want from a superposition of the form […]

Let $S \subseteq [M]^N$ be a collection of inputs, and let $f : S \to \{0, 1\}$ be a function that we are trying to compute. In this paper, we assume for simplicity that the range of $f$ is $\{0, 1\}$; in other words, that we are trying to solve a decision problem. It will also be convenient to think of $f$ as a function from $[M]^N$ to $\{0, 1, *\}$, where $*$ means 'undefined' (that is, that a given input $X \in [M]^N$ is not in $f$'s domain $S$).
We will work in the well-studied decision-tree model. In this model, given an input X = (x 1 , . . . , x N ), an algorithm can at any time choose an i and receive x i . We count only the number of queries the algorithm makes to the x i 's, ignoring other computational steps. Then the deterministic query complexity of f , or D (f ), is the number of queries made by an optimal deterministic algorithm on a worst-case input X ∈ S. The (bounded-error) randomized query complexity R (f ) is the expected number of queries made by an optimal randomized algorithm that, for every X ∈ S, computes f (X) with probability at least 2/3. The (bounded-error) quantum query complexity Q (f ) is the same as R (f ), except that we allow quantum algorithms. Clearly Q (f ) ≤ R (f ) ≤ D (f ) ≤ N for all f . See Buhrman and de Wolf [11] for detailed definitions as well as a survey of these measures.
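To make the cost measure concrete, the following minimal Python sketch models black-box access to $X$ and counts queries. The names `QueryOracle` and `deterministic_or` are hypothetical and chosen only for illustration; the sketch shows that only calls to `query` are charged, and that a deterministic algorithm for the OR function may have to make all $N$ queries on a worst-case input.

```python
from typing import Sequence


class QueryOracle:
    """Black-box access to X = (x_1, ..., x_N); only queries are counted."""

    def __init__(self, xs: Sequence[int]) -> None:
        self._xs = list(xs)
        self.queries = 0  # number of queries made so far

    def __len__(self) -> int:
        return len(self._xs)

    def query(self, i: int) -> int:
        """Return x_i (0-indexed here) and charge one query."""
        self.queries += 1
        return self._xs[i]


def deterministic_or(oracle: QueryOracle) -> int:
    """Compute OR(x_1, ..., x_N) by scanning left to right.

    Stops at the first 1, so the all-zero input forces N queries.
    """
    for i in range(len(oracle)):
        if oracle.query(i) == 1:
            return 1
    return 0
```

On the all-zero input of length $N$ this algorithm makes exactly $N$ queries, which is consistent with $D(\mathrm{OR}) = N$; all other computation is free in this model.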
Footnote 1: A variant asks us to find an $i$ such that $x_i = 1$, under the mild promise that such an $i$ exists.

Footnote 2: Here we exclude BQP-complete problems, such as simulating quantum physics (the "original" application of quantum computers), approximating the Jones polynomial [4], and estimating a linear functional of the solution of a well-conditioned linear system [16].

If $S = [M]^N$, then we say $f$ is total, while if $M = 2$, then we say $f$ is Boolean. The case of total $f$ is relatively well-understood. Already in 1998, Beals et al. [6] showed the following:

Theorem 2 (Beals et al. [6]) $D(f) = O(Q(f)^6)$ for all total Boolean functions $f : \{0, 1\}^N \to \{0, 1\}$.
Furthermore, it is easy to generalize Theorem 2 to show that $D(f) = O(Q(f)^6)$ for all total functions $f : [M]^N \to \{0, 1\}$, not necessarily Boolean. In other words, for total functions, the quantum query complexity is always at least the 6th root of the classical query complexity. The largest known gap between $D(f)$ and $Q(f)$ for a total function is quadratic, and is achieved by the OR function (because of Grover's algorithm).
On the other hand, as soon as we allow non-total functions, we can get enormous gaps. Aaronson [2] recently gave a Boolean function $f$ for which $R(f) = N^{\Omega(1)}$, yet $Q(f) = O(1)$. Other large gaps, with $Q(f) = O(\log N \log \log N)$, follow easily from Simon's algorithm [23] and Shor's algorithm [22]. Intuitively, these functions $f$ achieve such large separations by being highly structured: that is, their domain $S$ includes only inputs that satisfy a stringent promise, such as encoding a periodic function, or (in the case of [2]) encoding two Boolean functions, one of which is correlated with the Fourier transform of the other one.
By contrast with these highly-structured problems, consider the collision problem: that of deciding whether a sequence of numbers $(x_1, \ldots, x_N) \in [M]^N$ is one-to-one (each number appears once) or two-to-one (each number appears twice). Let $\mathrm{Col}(X) = 1$ if $X$ is one-to-one and $\mathrm{Col}(X) = 2$ if $X$ is two-to-one, promised that one of these is the case. Then $\mathrm{Col}(X)$ is not a total function, since its definition involves a promise on $X$. Intuitively, however, the collision problem seems much less "structured" than Simon's and Shor's problems. One way to formalize this intuition is as follows. Call a partial function $f : [M]^N \to \{0, 1, *\}$ permutation-invariant if

$$f(x_1, \ldots, x_N) = f\left(\tau(x_{\sigma(1)}), \ldots, \tau(x_{\sigma(N)})\right)$$

for all inputs $X \in [M]^N$ and all permutations $\sigma \in S_N$ and $\tau \in S_M$. Then $\mathrm{Col}(X)$ is permutation-invariant: we can permute a one-to-one sequence and relabel its elements however we like, but it is still a one-to-one sequence, and likewise for a two-to-one sequence. Because of this symmetry, attempts to solve the collision problem using (for example) the Quantum Fourier Transform seem unlikely to succeed. And indeed, in 2002 Aaronson [1] proved that $Q(\mathrm{Col}) = \Omega(N^{1/5})$: that is, the quantum query complexity of the collision problem is at most polynomially better than its randomized query complexity of $\Theta(\sqrt{N})$. The quantum lower bound was later improved to $\Omega(N^{1/3})$ by Aaronson and Shi [3], matching an upper bound of Brassard, Høyer, and Tapp [10].
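As an illustration of this symmetry, the following Python sketch applies a random position permutation and a random relabeling to an input and checks that the value of Col is unchanged. The function names (`col`, `random_symmetry`, `permute_and_relabel`) are hypothetical, values are 0-indexed for convenience, and returning `None` for inputs outside the promise is an assumption of the sketch.

```python
import random
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def col(xs: Sequence[int]) -> Optional[int]:
    """Return 1 if xs is one-to-one, 2 if two-to-one, None off the promise."""
    multiplicities = set(Counter(xs).values())
    if multiplicities == {1}:
        return 1
    if multiplicities == {2}:
        return 2
    return None


def random_symmetry(n: int, m: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw a permutation sigma of the n positions and tau of the m values."""
    sigma = list(range(n))
    tau = list(range(m))
    rng.shuffle(sigma)
    rng.shuffle(tau)
    return sigma, tau


def permute_and_relabel(xs: Sequence[int], sigma: Sequence[int],
                        tau: Sequence[int]) -> List[int]:
    """Map (x_1, ..., x_N) to (tau(x_sigma(1)), ..., tau(x_sigma(N)))."""
    return [tau[xs[sigma[i]]] for i in range(len(xs))]
```

For example, with `sigma, tau = random_symmetry(4, 8, random.Random(0))`, the inputs `[0, 1, 2, 3]` and `permute_and_relabel([0, 1, 2, 3], sigma, tau)` receive the same value of `col`, since permuting positions and applying a bijection to values preserves every multiplicity.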
Generalizing boldly from this example, John Watrous (personal communication) conjectured that the randomized and quantum query complexities are polynomially related for every permutation-invariant problem:

Conjecture 3 (Watrous 2002) $R(f) \le Q(f)^{O(1)}$ for every partial function $f : [M]^N \to \{0, 1, *\}$ that is permutation-invariant.

Let us make two remarks about Conjecture 3. First, the conjecture talks about randomized versus quantum query complexity, since in this setting, it is easy to find functions $f$ for which $R(f)$ and $Q(f)$ are both tiny but $D(f)$ is huge. As an example, consider the Deutsch-Jozsa problem: given a Boolean input $(x_1, \ldots, x_N)$, decide whether the $x_i$'s are all equal or whether half of them are 1 and the other half are 0, promised that one of these is the case.
Second, if $M = 2$ (that is, $f$ is Boolean), then Conjecture 3 follows relatively easily from known results: indeed, we prove in Appendix 6 that $R(f) = O(Q(f)^2)$ in that case. So the interesting case is when $M \gg 2$, as it is for the collision problem.
Conjecture 3 provides one natural way to formalize the idea that classical and quantum query complexities should be polynomially related for all "unstructured" problems. A different way is provided by the following conjecture, which we have been aware of since about 1999:

Conjecture 4 (folklore) Let $Q$ be a quantum algorithm that makes $T$ queries to a Boolean input $X = (x_1, \ldots, x_N)$, and let $\varepsilon, \delta > 0$. Then there exists a deterministic classical algorithm that makes $\mathrm{poly}(T, 1/\varepsilon, 1/\delta)$ queries to the $x_i$'s, and that approximates $Q$'s acceptance probability to within an additive error $\varepsilon$ on a $1 - \delta$ fraction of inputs.
Loosely speaking, while Conjecture 3 said that there was no property of a symmetric oracle string that quantum algorithms can evaluate superpolynomially faster than classical ones, Conjecture 4 says that there is no such property of a random oracle string.
Conjecture 4 would imply a far-reaching generalization of the result of Beals et al. [6] that $D(f) = O(Q(f)^6)$ for all total Boolean functions $f$. In particular, define the $\varepsilon$-approximate query complexity of a Boolean function $f : \{0, 1\}^N \to \{0, 1\}$, or $D_\varepsilon(f)$, to be the minimum number of queries made by a deterministic algorithm that evaluates $f$ on at least a $1 - \varepsilon$ fraction of inputs $X$. Likewise, let $Q_\varepsilon(f)$ be the minimum number of queries made by a quantum algorithm that evaluates $f$ on at least a $1 - \varepsilon$ fraction of inputs. Then Conjecture 4 implies that $D_\varepsilon(f)$ and $Q_\delta(f)$ are polynomially related for all Boolean functions $f$ and all $\varepsilon > \delta > 0$. This would provide a quantum counterpart to a beautiful 2002 result of Smyth [24], who solved an old open problem of Steven Rudich by showing that $D_\varepsilon(f) = O(C_{\varepsilon^3/30}(f)^2 / \varepsilon^3)$ for all $\varepsilon > 0$ (where $C_\delta(f)$ denotes the "$\delta$-approximate certificate complexity" of $f$).
More dramatically, if Conjecture 4 holds, then we basically cannot hope to prove $\mathsf{P} \ne \mathsf{BQP}$ relative to a random oracle. This would answer a question raised by Fortnow and Rogers [14] in 1998, and would contrast sharply with the situation for non-random oracles: we have had oracles relative to which $\mathsf{P} \ne \mathsf{BQP}$, and indeed $\mathsf{BQP} \not\subset \mathsf{MA}$, since the work of Bernstein and Vazirani [8] in the early 1990s. More precisely, under some suitable complexity assumption (such as $\mathsf{P} = \mathsf{P}^{\#\mathsf{P}}$), we would get $\mathsf{BQP}^A \subset \mathsf{AvgP}^A$ with probability 1 for a random oracle $A$. Here $\mathsf{AvgP}$ is the class of languages for which there exists a polynomial-time algorithm that solves a $1 - o(1)$ fraction of instances of size $n$. In other words, separating $\mathsf{BQP}$ from $\mathsf{AvgP}$ relative to a random oracle would be as hard as separating complexity classes in the unrelativized world. This would provide a quantum counterpart to a theorem of Impagliazzo and Rudich (credited in [17]), who used the powerful results of Kahn, Saks, and Smyth [17] to show that if $\mathsf{P} = \mathsf{NP}$, then $\mathsf{NP}^A \cap \mathsf{coNP}^A \subset \mathsf{ioAvgP}^A$ with probability 1 for a random oracle $A$.

Our Results
Our main contribution in this paper is to prove Watrous's conjecture, that randomized and quantum query complexities are polynomially related for every symmetric problem.

Theorem 5 $R(f) = O(Q(f)^9 \,\mathrm{polylog}\, Q(f))$ for every partial function $f : [M]^N \to \{0, 1, *\}$ that is permutation-invariant.

We conjecture that $R(f)$ and $Q(f)$ are polynomially related even for functions $f$ satisfying one of the two symmetries: namely, $f(x_1, \ldots, x_N) = f(x_{\sigma(1)}, \ldots, x_{\sigma(N)})$ for all $\sigma \in S_N$. We also conjecture that the exponent of 9 can be improved to 2: in other words, that Grover's algorithm once again provides the optimal separation between the quantum and classical models.
While the proof of Theorem 5 is somewhat involved, it can be entirely understood by those unfamiliar with quantum computing: the difficulties lie in getting the problem into a form where existing quantum lower bound technology can be applied to it. Let us stress that it was not at all obvious a priori that existing quantum lower bounds would suffice here; that they did came as a surprise to us.
We first define and analyze a simple randomized algorithm, which tries to compute $f(X)$ for a given $X = (x_1, \ldots, x_N)$ by estimating the multiplicity of each element $x_i$. Next, by considering where this randomized algorithm breaks down, we show that one can identify a "hard core" within $f$: roughly speaking, two input types $A^*$ and $B^*$, such that the difficulty of distinguishing $A^*$ from $B^*$ accounts for a polynomial fraction of the entire difficulty of computing $f$. The rest of the proof consists of lower-bounding the quantum query complexity of distinguishing $A^*$ from $B^*$. We do so using a hybrid argument: we develop a "chopping procedure" that gradually deforms $A^*$ and $B^*$ to make them more similar to each other, creating sequences of intermediate types $A_0 = A^*, A_1, \ldots, A_L$ and $B_0 = B^*, B_1, \ldots, B_L$ with $A_L = B_L$. We then show that, for every $\ell \in [L]$, distinguishing $A_\ell$ from $A_{\ell-1}$ (or $B_\ell$ from $B_{\ell-1}$) requires many quantum queries, either by a reduction from Midrijanis's quantum lower bound for the Set Equality problem [18] (which was a nontrivial extension of Aaronson and Shi's collision lower bound [3]), or else by an application of Ambainis's general quantum adversary theorem [5].
Doing this hybrid argument in the "obvious" way produces a bound of the form $R(f) \le Q(f)^{O(1)} \,\mathrm{polylog}\, N$, which fails to imply a polynomial relationship between $R(f)$ and $Q(f)$ when $Q(f) \le (\log N)^{o(1)}$. However, a more sophisticated hybrid argument gets rid of the $\mathrm{polylog}\, N$ factor.
Our second contribution is more exploratory, something we put forward in the hope of inspiring followup work. We study Conjecture 4, the one that stated that every $T$-query quantum algorithm can be simulated on most inputs using $T^{O(1)}$ classical queries. We relate this conjecture to a fundamental open problem in Fourier analysis and approximation theory. Given a real polynomial $p$, let $\mathrm{Inf}_i[p]$ be the influence of the $i$th variable, where $X^i$ means $X$ with the $i$th bit flipped. Then we conjecture that every bounded low-degree polynomial has a "highly influential" variable. More precisely:

Conjecture 6 […] Then there exists an $i$ such that […]

We show the following:

Theorem 7 Assume Conjecture 6. Then (i) Conjecture 4 holds.
The main evidence for Conjecture 6 (besides the fact that all the Fourier analysis experts we asked were confident of it!) is that extremely similar statements have recently been proved. Firstly, O'Donnell, Saks, Schramm, and Servedio [19] proved an analogue of Conjecture 6 for decision trees, which are a special case of bounded real polynomials:

Theorem 8 (O'Donnell et al. [19]) […]

Unfortunately, Theorem 8 does not directly imply anything about our problem, even though Beals et al. [6] showed that $D(f)$ and $Q(f)$ are polynomially related for all total Boolean functions $f$. The reason is that the acceptance probability of a quantum algorithm need not approximate a total Boolean function.
The second piece of evidence for Conjecture 6 comes from a powerful result of Dinur, Friedgut, Kindler, and O'Donnell [13], which implies our conjecture, except with […]. Let us state the special case of their result that is relevant for us:

Theorem 9 (Dinur et al. 2006 [13]) Let $\varepsilon > 0$, and let $p$ : […]

Even though Theorem 9 has an exponential rather than polynomial dependence on $1/d$, we observe that it already has a nontrivial consequence for quantum computation. Namely, it implies that any $T$-query quantum algorithm can be simulated on most inputs using $2^{O(T)}$ classical queries. Recall that the gaps between classical and quantum query complexities can be superexponential (and even $N^{\Omega(1)}$ versus $O(1)$, as in the example of Aaronson [2]), so even an exponential upper bound is far from obvious.

Quantum Lower Bound for All Symmetric Problems
In this section we prove Theorem 5: that $R(f) = O(Q(f)^9 \,\mathrm{polylog}\, Q(f))$ for all permutation-symmetric $f$.
We start with a simple observation that is essential to everything that follows. Since f is symmetric, we can group the inputs X = (x 1 , . . . , x N ) into equivalence classes that we call types.
Definition 10 Given an input X = (x 1 , . . . , x N ) ∈ [M ] N , the type of X is a list of positive integers A = (a 1 , . . . , a u ) such that a 1 ≥ · · · ≥ a u and a 1 + · · · + a u = N , with each a i recording the multiplicity of some integer in X. For convenience, we adopt the convention that a i = 0 for all i > u.
In other words, a type is just a partition (or Young diagram) that records the multiplicities of the input elements. For example, a one-to-one input has type $a_1 = \cdots = a_N = 1$, while a two-to-one input has type $a_1 = \cdots = a_{N/2} = 2$. We write $X \in A$ if $X$ is of type $A$. Clearly $f(X)$ depends only on the type of $X$. Furthermore, given a quantum query algorithm $Q$, we can assume without loss of generality that $\Pr[Q \text{ accepts } X]$ depends only on the type of $X$, since we can "symmetrize" $Q$ (that is, randomly permute $X$'s inputs and outputs) prior to running $Q$.
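The definition of a type can be sketched in a few lines of Python. The function name `input_type` is a hypothetical choice for this illustration; the type is represented as a tuple of multiplicities sorted in descending order, with the trailing zeros of the convention $a_i = 0$ for $i > u$ left implicit.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def input_type(xs: Sequence[Hashable]) -> Tuple[int, ...]:
    """Return the type of X: the multiplicities of its elements, sorted descending."""
    return tuple(sorted(Counter(xs).values(), reverse=True))
```

For instance, `input_type([5, 3, 5, 7])` is `(2, 1, 1)`, a one-to-one input of length 4 has type `(1, 1, 1, 1)`, and a two-to-one input of length 4 has type `(2, 2)`. Since the type ignores both the order of positions and the identity of values, any permutation-invariant $f$ is a function of this tuple alone.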

Randomized Upper Bound
Then the first step is to give a classical randomized algorithm that estimates the $\kappa_j$'s. This algorithm, $S_T$, is an extremely straightforward sampling procedure (indeed, there is essentially nothing else that a randomized algorithm can do here). $S_T$ will make $O(T^{1+c} \log T)$ queries, where $T$ is a parameter and $c \in (0, 1]$ is a constant that we will choose later to optimize the final bound.

    Let $z_j$ be the number of occurrences of $j$ in $(x_{i_1}, \ldots, x_{i_U})$
    Output $\tilde{\kappa}_j := \frac{N}{U} z_j$ as the estimate for $\kappa_j$

We now analyze how well $S_T$ works.
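The sampling step can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure of the paper: the sample size `U` is passed in as a parameter, the indices are drawn uniformly with replacement (an assumption of this sketch), and the name `estimate_multiplicities` is hypothetical. Values that never appear in the sample implicitly receive the estimate 0.

```python
import random
from collections import Counter
from typing import Callable, Dict


def estimate_multiplicities(query: Callable[[int], int], n: int, u: int,
                            rng: random.Random) -> Dict[int, float]:
    """Estimate the multiplicity of each value j in X from u random queries.

    query(i) returns x_i for 0 <= i < n. The estimate for value j is
    (n / u) * z_j, where z_j counts occurrences of j among the sampled entries.
    """
    # Assumption: indices are sampled uniformly at random, with replacement.
    z = Counter(query(rng.randrange(n)) for _ in range(u))
    return {j: n * z_j / u for j, z_j in z.items()}
```

Because the counts $z_j$ sum to $U$, the estimates always sum to $N$, just as the true multiplicities do; what the analysis has to control is how far each individual estimate can stray from the true multiplicity.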
where the second line follows from a Chernoff bound and the third from where the second line follows from a Chernoff bound and the third from κ j < N/T 1−c .
Third, suppose $\kappa_j < N/T^5$. Then for all sufficiently large $T$, […] where the second line follows from $\kappa_j < N/T^5$, the third from the union bound, the fourth from $\kappa_j < N/T^5$ (again), and the fifth from $U \le 24T^2 \ln T$. Notice that there are at most $T^5$ values of $j$ such that $\kappa_j \ge N/T^5$. So putting all three cases together, […]

Now call $A$ a 1-type if $f(X) = 1$ for all $X \in A$, or a 0-type if $f(X) = 0$ for all $X \in A$. Consider the following randomized algorithm $R_T$ to compute $f(X)$:

    Run $S_T$ to find an estimate $\tilde{\kappa}_i$ for each $\kappa_i$
    Sort the $\tilde{\kappa}_i$'s in descending order, so that $\tilde{\kappa}_1 \ge \cdots \ge \tilde{\kappa}_M$
    If there exists a 1-type $A = (a_1, a_2, \ldots)$ such that […] for all $i$, then output $f(X) = 1$
    Otherwise output $f(X) = 0$

Clearly $R_T$ makes $O(T^{1+c} \log T)$ queries, just as $S_T$ does. We now give a sufficient condition for $R_T$ to succeed.

Lemma 12
Suppose that for all 1-types $A = (a_1, a_2, \ldots)$ and 0-types $B = (b_1, b_2, \ldots)$, there exists an $i$ such that […] Then $R_T$ computes $f$ with bounded probability of error, and hence $R(f) = O(T^{1+c} \log T)$.
Proof. First suppose X ∈ A where A = (a 1 , a 2 , . . .) is a 1-type. Then by Lemma 11, with T c for all i (it is easy to see that sorting the κ i 's can only decrease the maximum difference). Provided this occurs, R T finds some 1-type close to ( κ 1 , κ 2 , . . .) (possibly A itself) and outputs f (X) = 1. Second Provided this occurs, by the triangle inequality, for every 1-type A = (a 1 , a 2 , . . .) there exists an i such that Hence R T does not find a 1-type close to ( κ 1 , κ 2 , . . .), and it outputs f (X) = 0.
In particular, suppose we keep decreasing T until there exists a 1-type A * = (a 1 , a 2 , . . .) and a 0-type B * = (b 1 , b 2 , . . .) such that for all i, stopping as soon as that happens. Then Lemma 12 implies that we will still have R (f ) = O T 1+c log T . For the rest of the proof, we will fix that "almost as small as possible" value of T for which (1) holds, as well as the 1-type A * and the 0-type B * that R T "just barely distinguishes" from one another.

The Chopping Procedure
Given two sets of inputs $A$ and $B$ with $A \cap B = \emptyset$, let $Q(A, B)$ be the minimum number of queries made by any quantum algorithm that accepts every $X \in A$ with probability at least 2/3, and accepts every $Y \in B$ with probability at most 1/3. Also, let $Q_\varepsilon(A, B)$ be the minimum number of queries made by any quantum algorithm that accepts every $X \in A$ with at least some probability $p$, and that accepts every $Y \in B$ with probability at most $p - \varepsilon$. Then we have the following basic relation:

Proposition 13 $Q(A, B) = O\left(\frac{1}{\varepsilon} Q_\varepsilon(A, B)\right)$ for all $A, B$ and all $\varepsilon > 0$.
Proof. This follows from standard amplitude estimation techniques (see Brassard et al. [9] for example).

The rest of the proof is going to consist of lower-bounding $Q(A^*, B^*)$, the quantum query complexity of distinguishing inputs of type $A^*$ from inputs of type $B^*$. We do this via a hybrid argument. Let $L := \lceil \log_2 N \rceil$. At a high level, we will construct two sequences of types, $A_0, \ldots, A_L$ and $B_0, \ldots, B_L$, such that […]

Provided we can do this, it is not hard to see that we get the desired lower bound on $Q(A^*, B^*)$. For suppose a quantum algorithm distinguishes $A_0 = A^*$ from $B_0 = B^*$ with constant bias. Then by the triangle inequality, it must also distinguish some $A_\ell$ from $A_{\ell+1}$, or some $B_\ell$ from $B_{\ell+1}$, with reasonably large bias (say $\Omega(1/\log N)$). And by Proposition 13, any quantum algorithm that succeeds with bias $\varepsilon$ can be amplified, with $O(1/\varepsilon)$ overhead, to an algorithm that succeeds with constant bias.
We now describe the procedure for creating the intermediate types A ℓ and B ℓ . Intuitively, we want to form A ℓ from A ℓ−1 , and B ℓ from B ℓ−1 , by "chopping the rows" of their respective Young diagrams, whenever a row of A ℓ sticks out further than the corresponding row of B ℓ or vice versa. This way, we can gradually make A ℓ and B ℓ more similar to one another. To describe how this works, it will be convenient to relax the notion of a type slightly. Let a row-array be a list of 2N nonnegative integers (a 1 , . . . , a 2N ), not necessarily sorted, such that a 1 + · · · + a 2N = N . Note that every type (a 1 , . . . , a u ) is also a row-array, if we adopt the convention that a u+1 = · · · = a 2N = 0. Also, every row-array (a 1 , . . . , a 2N ) can be converted to a type A = type (a 1 , . . . , a 2N ) in a unique way, by simply sorting the a i 's in descending order.
    let $P$ be the first power of 2 greater than or equal to $N$
    for $\ell := 1$ to $L$
        for every $i \in [2N]$ such that $a_i - b_i \ge P/2^\ell$ and $a_i > P/2^\ell$
            set $a_i := a_i - P/2^\ell$
            find a $j \in [2N]$ such that $a_j = b_j = 0$, and set $a_j := P/2^\ell$
        set $A_\ell := \mathrm{type}(a_1, \ldots, a_{2N})$
    next $\ell$

The procedure to produce $B_1, B_2, \ldots$ is exactly the same, except with the roles of $a$ and $b$ reversed. The procedure is illustrated pictorially in Figure 1.
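The chopping procedure above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the row-arrays are initialized from the two starting types padded with zeros, the rows to chop in iteration $\ell$ are selected at the start of that iteration, and each run holds the other side's row-array fixed. The names `to_type` and `chop` are hypothetical.

```python
from typing import List, Sequence, Tuple


def to_type(rows: Sequence[int]) -> Tuple[int, ...]:
    """Convert a row-array to a type: drop zero rows, sort descending."""
    return tuple(sorted((r for r in rows if r > 0), reverse=True))


def chop(a_type: Sequence[int], b_type: Sequence[int], n: int) -> List[Tuple[int, ...]]:
    """Return [A_0, A_1, ..., A_L] obtained by chopping a against a fixed b.

    Both types must sum to n. To produce B_0, ..., B_L, call chop(b_type, a_type, n).
    """
    big_l = (n - 1).bit_length()      # L = ceil(log2 n) for n >= 1
    p = 1 << big_l                    # first power of 2 that is >= n
    a = list(a_type) + [0] * (2 * n - len(a_type))
    b = list(b_type) + [0] * (2 * n - len(b_type))
    types = [to_type(a)]
    for ell in range(1, big_l + 1):
        piece = p >> ell              # P / 2^ell
        # Rows sticking out by at least `piece`, selected before any chopping.
        rows = [i for i in range(2 * n) if a[i] - b[i] >= piece and a[i] > piece]
        for i in rows:
            a[i] -= piece
            # A slot that is empty in both arrays receives the chopped-off piece.
            j = next(k for k in range(2 * n) if a[k] == 0 and b[k] == 0)
            a[j] = piece
        types.append(to_type(a))
    return types
```

As a small check, with $N = 4$, starting from the types $(3, 1)$ and $(2, 2)$, both `chop((3, 1), (2, 2), 4)[-1]` and `chop((2, 2), (3, 1), 4)[-1]` equal `(2, 1, 1)`: each side keeps rows of length $\min\{a_i, b_i\}$ and turns the excess into singleton rows.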
We start with some simple observations. First, by construction, this procedure halts after $L = O(\log N)$ iterations. Second, within a given iteration $\ell$, no row $i$ is ever chopped more than once, for if it could be, then it would have been chopped in a previous iteration. (This is just saying that the integer $a_i - b_i$ can be written uniquely as a sum of powers of 2.) Third, initially […] Define […] Then it is not hard to see that $A_1, A_2, \ldots$ and $B_1, B_2, \ldots$ both evolve toward the same final configuration: namely, $\|A^* - B^*\|$ singleton rows, together with one row of length $\min\{a_i, b_i\}$ for each $i$ such that $\min\{a_i, b_i\} > 0$. We therefore have the key fact that $A_L = B_L$. Notice that $\|A_\ell - A_{\ell-1}\| = rP/2^\ell$, where $r$ is the number of rows that get chopped in the $\ell$th iteration. We now prove an upper bound on $\|A_\ell - A_{\ell-1}\|$ when $\ell$ is small, which will be useful later.
Proof. Let C be the set of rows that are chopped in going from A ℓ−1 to A ℓ ; then each row i ∈ C decreases in length by P/2 ℓ . It follows that, if we let j = j (i) be the "ancestral row" in A * = (a 1 , a 2 , . . .) that i came from, we must have Since ℓ ≤ (log 2 T ) − 2, the left inequality implies which combined with the right inequality yields Now let R := i∈C j (i) be the set of all ancestral rows. Then

Quantum Lower Bounds
Recall that we listed four properties that we needed the chopping procedure to satisfy. We have already seen that it satisfies properties (i)-(iii), so the remaining step is to show that it satisfies property (iv). That is, we need to lower-bound Q (A ℓ , A ℓ−1 ), the bounded-error quantum query complexity of distinguishing inputs of type A ℓ from inputs of type A ℓ−1 . (Lower-bounding Q (B ℓ , B ℓ−1 ) is exactly analogous.) To do this, it will be convenient to consider two cases: first, that forming A ℓ involved chopping few elements of A ℓ−1 , and second, that it involved chopping many elements. We will show that we "win either way," by a different quantum lower bound in each case. First consider the case that few elements were chopped. Here we prove a lower bound using Ambainis's quantum adversary method [5], in its "general" form (the one used, for example, to lower-bound the quantum query complexity of inverting a permutation). For completeness, we now state Ambainis's adversary theorem in the form we will need.
Theorem 15 (Ambainis [5]) Let A, B ⊆ [M ] N be two sets of inputs with A ∩ B = ∅. Let R ⊆ A × B be a relation on input pairs, such that for every X ∈ A there exists at least one Y ∈ B with (X, Y ) ∈ R and vice versa. Given inputs X = (x 1 , . . . , x N ) in A and Y = (y 1 , . . . , y N ) in B, let Suppose that q X,i q y,i ≤ α for every (X, Y ) ∈ R and every i ∈ [N ] such that Using Theorem 15, we can prove the following lower bound on Q (A ℓ , A ℓ−1 ).
Proof. Let $A_{\ell-1} = (a_1, a_2, \ldots)$, and let $i(1), \ldots, i(r)$ be the $r$ rows in $A_{\ell-1}$ that get chopped. Recall that in going from $A_{\ell-1}$ to $A_\ell$, we chop each row $i(j)$ into a row of length $a_{i(j)} - P/2^\ell$ and another row of length $P/2^\ell$, so that $d = rP/2^\ell$. Now, given inputs $X = (x_1, \ldots, x_N)$ in $A_{\ell-1}$ and $Y = (y_1, \ldots, y_N)$ in $A_\ell$, we set $(X, Y) \in R$ if and only if one can transform $X$ to $Y$ in the following way:
(1) […]
(2) For each $j \in [r]$, change exactly $P/2^\ell$ of the $x_i$'s that are equal to $h_j$ to something else.
(3) Swap the $d$ elements of $X$ that were changed in step (2) with any other $d$ elements of $X$.
The procedure is illustrated pictorially in Figure 2.

Figure 2: In this example, $N = 11$, $r = 2$, $P/2^\ell = 2$, and $a_1 = a_2 = 3$. So we transform $X$ to $Y$ by choosing $h_1 = 1$ and $h_2 = 2$, changing any two elements equal to $h_1$ and any two elements equal to $h_2$, and then swapping the four elements that we changed with four unchanged elements.

Note that we can reverse the procedure in a natural way to go from $Y$ back to $X$:
(1) […] satisfying $y_i = h_j$.
(2) For each $j \in [r]$, change all of the $y_i$'s that are equal to $h_j$ to something else.
(3) Swap the $d$ elements of $Y$ that were changed in step (2) with any other $d$ elements of $Y$.
Fix any $(X, Y) \in R$, and let $i \in [N]$ be any index such that $x_i \ne y_i$. Then applying Theorem 15, we claim that either […] To see this, observe that either $x_i$ is one of the "other $d$ elements" in step (3) of the $X \to Y$ conversion, in which case […] or else $y_i$ is one of the "other $d$ elements" in step (3) of the $Y \to X$ conversion, in which case […] Hence […] So by Theorem 15, […]

We now consider the case that many elements are chopped. Here we prove a lower bound by reduction from Set Equality. Given two sequences of integers $Y \in [M]^N$ and $Z \in [M]^N$, neither with any repeats, the Set Equality problem is to decide whether $Y$ and $Z$ are equal as sets or disjoint as sets, promised that one of these is the case. Set Equality is similar to the collision problem studied by Aaronson and Shi [3], but it lacks permutation symmetry, making it harder to prove a lower bound by the polynomial method. By combining the collision lower bound with Ambainis's adversary method, Midrijanis [18] was nevertheless able to show the following:

Theorem 17 (Midrijanis [18]) $Q(\text{Set Equality}) = \Omega\left((N/\log N)^{1/5}\right)$.
We now use Theorem 17 to prove another lower bound on Q (A ℓ , A ℓ−1 ).

Lemma 18
Suppose $A_\ell$ was formed from $A_{\ell-1}$ by chopping $r$ rows. Then $Q(A_\ell, A_{\ell-1}) = \Omega\left((r/\log r)^{1/5}\right)$.

Proof. We will show how to embed a Set Equality instance of size $r$ into the $A_\ell$ versus $A_{\ell-1}$ problem.
Let A ℓ−1 = (a 1 , . . . , a u ). Also, let i (1) , . . . , i (r) ∈ [u] be the r rows that are chopped in going from A ℓ−1 to A ℓ , and let j (1) , . . . , j (u − r) ∈ [u] be the u − r rows that are not chopped. Recall that, in going from A ℓ−1 to A ℓ , we chop each row i (k) into a row of length a i(k) − P/2 ℓ and another row of length P/2 ℓ . Now let Y = (y 1 , . . . , y r ) and Z = (z 1 , . . . , z r ) be an instance of Set Equality. Then we construct an input X ∈ [M ] N as follows. First, for each k ∈ [r], set a i(k) − P/2 ℓ of the x i 's equal to y k , and set P/2 ℓ of the x i 's equal to z k . Next, let w 1 , w 2 , . . . ∈ [M ] be a list of numbers that are guaranteed not to be in Y ∪ Z. Then for each k ∈ [u − r], set a j(k) of the x i 's equal to w k .
It is easy to see that, if Y and Z are equal as sets, then X will have type A ℓ−1 , while if Y and Z are disjoint as sets, then X will have type A ℓ . So in deciding whether X belongs to A ℓ or A ℓ−1 , we also decide whether Y and Z are equal or disjoint. The lemma now follows from Theorem 17.
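The embedding in this proof can be sketched in Python. The function names and argument layout below are hypothetical; `rows` plays the role of $A_{\ell-1}$, `chopped` lists the indices of the $r$ chopped rows, `piece` stands for $P/2^\ell$, and `fillers` stands for the values $w_1, w_2, \ldots$, which are assumed to be distinct and to avoid $Y \cup Z$. The constructed input is returned in grouped order, which is harmless since only its type matters.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def input_type(xs: Sequence[int]) -> Tuple[int, ...]:
    """Multiplicities of the elements of xs, sorted in descending order."""
    return tuple(sorted(Counter(xs).values(), reverse=True))


def embed_set_equality(rows: Sequence[int], chopped: Sequence[int], piece: int,
                       ys: Sequence[int], zs: Sequence[int],
                       fillers: Sequence[int]) -> List[int]:
    """Build X from a Set Equality instance (ys, zs) of size r = len(chopped)."""
    xs: List[int] = []
    # Each chopped row i(k): a_{i(k)} - piece copies of y_k, then piece copies of z_k.
    for k, i in enumerate(chopped):
        xs += [ys[k]] * (rows[i] - piece) + [zs[k]] * piece
    # Each unchopped row j(k): a_{j(k)} copies of a fresh filler value w_k.
    chopped_set = set(chopped)
    unchopped = [i for i in range(len(rows)) if i not in chopped_set]
    for k, i in enumerate(unchopped):
        xs += [fillers[k]] * rows[i]
    return xs
```

For example, with `rows = [3, 3, 1]`, `chopped = [0, 1]` and `piece = 1`, taking `ys = [10, 11]` and `zs = [11, 10]` (equal as sets) gives an input of type `(3, 3, 1)`, while `zs = [20, 21]` (disjoint from `ys`) gives type `(2, 2, 1, 1, 1)`.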

Putting Everything Together
Let $C$ be a quantum query algorithm that distinguishes $A_0 = A^*$ from $B_0 = B^*$, and assume $C$ is optimal: that is, it makes $Q(A^*, B^*) \le Q(f)$ queries. As mentioned earlier, we can assume that $\Pr[C \text{ accepts } X]$ depends only on the type of $X$. So let $p_\ell$ and $q_\ell$ denote the probabilities that $C$ accepts an input of type $A_\ell$ and of type $B_\ell$ respectively. Then by assumption, $|p_0 - q_0| \ge 1/3$. Since $p_L = q_L$, this implies that either $|p_0 - p_L| \ge 1/6$ or $|q_0 - q_L| \ge 1/6$. Assume the former without loss of generality. Now let $\beta_\ell := \frac{1}{10\ell^2}$, and observe that $\sum_{\ell=1}^{\infty} \beta_\ell < \frac{1}{6}$. By the triangle inequality, it follows that there exists an $\ell \in [L]$ such that $|p_\ell - p_{\ell-1}| \ge \beta_\ell$. In other words, we get a $Q(f)$-query quantum algorithm that distinguishes $A_\ell$ from $A_{\ell-1}$ with bias $\beta_\ell$. By Proposition 13, this immediately implies […]

Now let $d = \|A_\ell - A_{\ell-1}\|$, and suppose $A_\ell$ was produced from $A_{\ell-1}$ by chopping $r$ rows. Then $d = rP/2^\ell \le 2rN/2^\ell$. So combining Lemmas 16 and 18, we find that […] since the minimum occurs when $r \approx 2^{5\ell/7} \ell^{2/7}$. If $\ell \le (\log_2 T) - 2$, then combining Lemmas 16 and 14, we also have the lower bound […] Hence […] Let us now make the choice $c = 2/7$, so that we get a lower bound of […] This completes the proof of Theorem 5.
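The choice of the weights $\beta_\ell$ relies on the series $\sum_{\ell \ge 1} 1/(10\ell^2)$ staying below $1/6$. The following short Python check illustrates this using the standard identity $\sum_{\ell \ge 1} 1/\ell^2 = \pi^2/6$, so the full sum is $\pi^2/60$; the function names are hypothetical.

```python
import math


def beta(ell: int) -> float:
    """Bias threshold assigned to the ell-th hybrid step: 1 / (10 * ell^2)."""
    return 1.0 / (10 * ell ** 2)


def partial_sum(num_terms: int) -> float:
    """Sum of beta(1), ..., beta(num_terms)."""
    return sum(beta(ell) for ell in range(1, num_terms + 1))


# The full series equals pi^2 / 60, which is slightly less than 1/6.
SERIES_LIMIT = math.pi ** 2 / 60
```

Since every partial sum is below `SERIES_LIMIT`, and `SERIES_LIMIT` is below $1/6$, a total gap of at least $1/6$ between $p_0$ and $p_L$ cannot be split into steps that are all smaller than their thresholds $\beta_\ell$, regardless of how large $L$ is.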

Quantum Lower Bounds Under The Uniform Distribution
In this section, we consider the problems of $\mathsf{P} \stackrel{?}{=} \mathsf{BQP}$ relative to a random oracle, and of simulating a $T$-query quantum algorithm on most inputs using $T^{O(1)}$ classical queries. We show that these problems are connected to a fundamental conjecture about influences in low-degree polynomials.
Let $p : \{0, 1\}^N \to [0, 1]$ be a real polynomial. Given a string $X \in \{0, 1\}^N$, let $X^i$ denote $X$ with the $i$th bit flipped. The following notions will play important roles in this section: the $L_1$-variance $\mathrm{Vr}[p]$ of $p$, the influence $\mathrm{Inf}_i[p]$ of the $i$th variable $x_i$, the total influence $\mathrm{SumInf}[p]$, and the $L_2$-norm $\|p\|_2^2$.
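For small $N$, influences can be computed by brute force, as in the Python sketch below. The defining formulas are not reproduced in the text above, so the sketch assumes the $L_1$-style definition $\mathrm{Inf}_i[p] = \mathbb{E}_X\left[\,|p(X) - p(X^i)|\,\right]$ over uniform $X \in \{0,1\}^N$, with $\mathrm{SumInf}[p]$ the sum over all $i$; this definition and the function names are assumptions of the illustration.

```python
from itertools import product
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def l1_influence(p: Callable[[Bits], float], n: int, i: int) -> float:
    """Average of |p(X) - p(X^i)| over all X in {0,1}^n (assumed definition)."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        x_flipped = x[:i] + (1 - x[i],) + x[i + 1:]  # X with the i-th bit flipped
        total += abs(p(x) - p(x_flipped))
    return total / 2 ** n


def sum_influence(p: Callable[[Bits], float], n: int) -> float:
    """Total influence: the sum of the influences of all n variables."""
    return sum(l1_influence(p, n, i) for i in range(n))
```

Under this assumed definition, the dictator polynomial $p(X) = x_1$ has influence 1 in its first variable and 0 in every other, while the two-bit AND polynomial $p(X) = x_1 x_2$ has influence $1/2$ in each variable. The enumeration takes time $2^N$, so it is only meant for checking small examples.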
Recall Conjecture 6, which stated that bounded polynomials have influential variables: that is, for every degree-$d$ polynomial $p$ : […]. Assuming Conjecture 6, we will derive a number of consequences for quantum complexity theory.
To do so, we first need a lemma that slightly generalizes a result of Shi [21].

Lemma 19
Suppose a quantum algorithm makes T queries, and accepts the input X ∈ {0, 1} N with probability p (X). Then SumInf [p] = O (T ).
Proof. Let $|\psi_X\rangle$ be the final state of the quantum algorithm if the input is $X$, and let […] Then Lemma 4.3 of Shi [21] implies that $E \le 2T/N$. Hence […]

We also need the following lemma of Beals et al. [6].

Lemma 20 ([6]) Suppose a quantum algorithm $Q$ makes $T$ queries to a Boolean input $X \in \{0, 1\}^N$. Then $Q$'s acceptance probability is a real multilinear polynomial $p(X)$, of degree at most $2T$.
We now prove our first consequence of Conjecture 6: namely, that it implies the folklore Conjecture 4.
Theorem 21 Suppose Conjecture 6 holds, and let ε, δ > 0. Then given any quantum algorithm Q that makes T queries to a Boolean input X, there exists a deterministic classical algorithm that makes poly (T, 1/ε, 1/δ) queries, and that approximates Q's acceptance probability to within an additive constant ε on a 1 − δ fraction of inputs.
Proof. Let $p(X)$ be the probability that $Q$ accepts input $X = x_1 \ldots x_N$. Then Lemma 20 says that $p$ is a real polynomial of degree at most $2T$. Assume Conjecture 6: that for every such $p$, there exists a variable $i$ satisfying $\mathrm{Inf}_i[p] \ge q(\mathrm{Vr}[p]/T)$, for some fixed polynomial $q$. Under that assumption, we give a classical algorithm $C$ that makes $\mathrm{poly}(T, 1/\varepsilon, 1/\delta)$ queries to the $x_i$'s, and that approximates $p(X)$ on most inputs $X$.

    set $p_0 := p$
    for $j := 0, 1, 2, \ldots$:
        if […]

Proof. Let $Q$ be a quantum algorithm that evaluates $f(X)$, with bounded error, on a $1 - \varepsilon$ fraction of inputs $X \in \{0, 1\}^N$. Let $p(X) := \Pr[Q \text{ accepts } X]$. Now run the classical simulation algorithm $C$ from Theorem 21, to obtain an estimate $\tilde{p}(X)$ of $p(X)$ such that […] Output $f(X) = 1$ if $\tilde{p}(X) \ge \frac{1}{2}$ and $f(X) = 0$ otherwise. By the theorem, this requires $\mathrm{poly}(T, 1/\delta)$ queries to $X$, and by the union bound it successfully computes $f(X)$ on at least a $1 - \varepsilon - \delta$ fraction of inputs $X$.
We also get the following complexity consequence:
Theorem 23 Suppose Conjecture 6 holds. Then $\mathsf{P} = \mathsf{P}^{\#\mathsf{P}}$ implies $\mathsf{BQP}^A \subset \mathsf{AvgP}^A$ with probability 1 for a random oracle $A$.
Proof. Let $Q$ be a polynomial-time quantum Turing machine that queries an oracle $A$, and assume $Q$ decides some language $L \in \mathsf{BQP}^A$ with bounded error. Given an input $x \in \{0,1\}^n$, let $p_x(A) := \Pr[Q^A(x) \text{ accepts}]$. Then clearly $p_x(A)$ depends only on some finite prefix $B$ of $A$, of size $N = 2^{\mathrm{poly}(n)}$. Furthermore, Lemma 20 implies that $p_x$ is a polynomial in the bits of $B$, of degree at most $\mathrm{poly}(n)$. Assume Conjecture 6 as well as $\mathsf{P} = \mathsf{P}^{\#\mathsf{P}}$. Then we claim that there exists a deterministic polynomial-time algorithm $C$ such that for all $Q$ and $x \in \{0,1\}^n$, where $\tilde{p}_x(A)$ is the output of $C$ given input $x$ and oracle $A$. This $C$ is essentially just the algorithm from Theorem 21. The key point is that we can implement $C$ using not only $\mathrm{poly}(n)$ queries to $A$, but also $\mathrm{poly}(n)$ computation steps.
To prove the claim, let $M$ be any of the $2^{\mathrm{poly}(n)}$ monomials in the polynomial $p_j$ from Theorem 21, and let $\alpha_M$ be the coefficient of $M$. Then notice that $\alpha_M$ can be computed to $\mathrm{poly}(n)$ bits of precision in $\mathsf{P}^{\#\mathsf{P}}$, by the same techniques used to show $\mathsf{BQP} \subseteq \mathsf{P}^{\#\mathsf{P}}$ [8]. Therefore the expectation can be computed in $\mathsf{P}^{\#\mathsf{P}}$ as well. The other two quantities that arise in the algorithm, $\mathrm{Vr}[p_j]$ and $\mathrm{Inf}_i[p_j]$, can be computed in the second level of the counting hierarchy $\mathsf{CH} = \mathsf{P}^{\#\mathsf{P}} \cup \mathsf{P}^{\#\mathsf{P}^{\#\mathsf{P}}} \cup \cdots$ (since they involve an exponential sum inside of an absolute value sign, and another exponential sum outside of it). This means that finding an $i$ such that $\mathrm{Inf}_i[p_j] > q(\varepsilon^2/T)$ is in the third level of the counting hierarchy. But under the assumption $\mathsf{P} = \mathsf{P}^{\#\mathsf{P}}$, the entire counting hierarchy collapses to $\mathsf{P}$. Therefore all of the computations needed to implement $C$ take polynomial time.
Now let $\delta_n(A)$ be the fraction of inputs $x \in \{0,1\}^n$ such that $|\tilde{p}_x(A) - p_x(A)| > 1/10$. Then by (2) together with Markov's inequality, Since $\sum_{n=1}^{\infty} 1/n^2$ converges, it follows that $\delta_n(A) \le 1/n$ for all but finitely many values of $n$, with probability 1 over $A$. Assuming this occurs, we can simply hardwire the behavior of $Q$ on the remaining $n$'s into our classical simulation procedure $C$. Hence $L \in \mathsf{AvgP}^A$.
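To make the objects in this argument concrete, the following sketch computes the monomial coefficients $\alpha_M$ of a multilinear polynomial from its values on $\{0,1\}^n$, using Möbius inversion over subsets. It is a brute-force illustration for tiny $n$ only and is unrelated to the $\mathsf{P}^{\#\mathsf{P}}$ procedure invoked in the proof; the function names are hypothetical.

```python
from itertools import combinations

def multilinear_coefficients(p, n):
    """Return {S: alpha_S} with p(x) = sum_S alpha_S * prod_{i in S} x_i.

    Uses Mobius inversion: alpha_S = sum over T subset of S of
    (-1)^(|S| - |T|) * p(indicator vector of T).
    """
    coeffs = {}
    for size in range(n + 1):
        for S in combinations(range(n), size):
            total = 0.0
            for tsize in range(size + 1):
                for T in combinations(S, tsize):
                    x = tuple(1 if i in T else 0 for i in range(n))
                    total += (-1) ** (size - tsize) * p(x)
            if total != 0:
                coeffs[S] = total  # keep only nonzero coefficients
    return coeffs

def degree(coeffs):
    """Degree of the polynomial: the size of its largest monomial."""
    return max((len(S) for S in coeffs), default=0)

# Example: XOR of two bits equals x0 + x1 - 2*x0*x1.
xor_coeffs = multilinear_coefficients(lambda x: x[0] ^ x[1], 2)
xor_degree = degree(xor_coeffs)
```

For the two-bit XOR this recovers the coefficients $1$, $1$ and $-2$, giving degree 2.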
Since the number of $\mathsf{BQP}^A$ languages is countable, the above implies that $L \in \mathsf{AvgP}^A$ for every $L \in \mathsf{BQP}^A$ simultaneously (that is, $\mathsf{BQP}^A \subset \mathsf{AvgP}^A$) with probability 1 over $A$.
As a side note, suppose we had an extremely strong variant of Conjecture 6, one that implied something like in place of (2). Then we could eliminate the need for $\mathsf{AvgP}$ in Theorem 23, and show that $\mathsf{P} = \mathsf{P}^{\#\mathsf{P}}$ implies $\mathsf{P}^A = \mathsf{BQP}^A$ with probability 1 for a random oracle $A$.
We conclude this section with some unconditional results. These results will use Theorem 9 of Dinur et al. [13]: that for every degree-$d$ polynomial $p : \mathbb{R}^N \to \mathbb{R}$ such that $0 \le p(X) \le 1$ for all $X \in \{0,1\}^N$, there exists a polynomial $\tilde{p}$ depending on at most $2^{O(d)}/\varepsilon^2$ variables such that $\|\tilde{p} - p\|_2^2 \le \varepsilon$. Theorem 9 has the following simple corollary.
Corollary 24 Suppose a quantum algorithm $Q$ makes $T$ queries to a Boolean input $X \in \{0,1\}^N$. Then for all $\alpha, \delta > 0$, we can approximate $Q$'s acceptance probability to within an additive constant $\alpha$, on a $1 - \delta$ fraction of inputs, by making $2^{O(T)}/(\alpha^4 \delta^4)$ deterministic classical queries to $X$. (Indeed, the classical queries are nonadaptive.)
Proof. Let $p(X) := \Pr[Q \text{ accepts } X]$. Then $p$ is a degree-$2T$ real polynomial by Lemma 20. So by Theorem 9, there exists a polynomial $\tilde{p}$, depending on $K = 2^{O(T)}/(\alpha^4 \delta^4)$ variables $x_{i_1}, \ldots, x_{i_K}$, such that $\|\tilde{p} - p\|_2^2 \le \alpha^2 \delta^2$. By Cauchy-Schwarz, then, $\mathrm{E}_X[|\tilde{p}(X) - p(X)|] \le \alpha\delta$, so by Markov's inequality $\Pr_X[|\tilde{p}(X) - p(X)| > \alpha] \le \delta$. Thus, our algorithm is simply to query $x_{i_1}, \ldots, x_{i_K}$, and then output $\tilde{p}(X)$ as our estimate for $p(X)$.

Likewise: for all Boolean functions $f$ and all $\varepsilon, \delta > 0$.
Proof. Set $\alpha$ to any constant less than $1/6$, then use the algorithm of Corollary 24 to simulate the $\varepsilon$-approximate quantum algorithm for $f$. Output $f(X) = 1$ if $\tilde{p}(X) \ge 1/2$ and $f(X) = 0$ otherwise.
Given an oracle $A$, let $\mathsf{BQP}^{A[\log]}$ be the class of languages decidable by a $\mathsf{BQP}$ machine able to make $O(\log n)$ queries to $A$. Also, let $\mathsf{AvgP}^{A}_{\|}$ be the class of languages decidable, with probability $1 - o(1)$ over $x \in \{0,1\}^n$, by a $\mathsf{P}$ machine able to make $\mathrm{poly}(n)$ parallel (nonadaptive) queries to $A$. Then we get the following unconditional variant of Theorem 23.
Theorem 26 Suppose $\mathsf{P} = \mathsf{P}^{\#\mathsf{P}}$. Then $\mathsf{BQP}^{A[\log]} \subset \mathsf{AvgP}^{A}_{\|}$ with probability 1 for a random oracle $A$.
Proof. The proof is essentially the same as that of Theorem 23, except that we use Corollary 24 in place of Conjecture 6. In the proof of Corollary 24, observe that the condition as well, where $p_\mu(X)$ equals the mean of $p(Y)$ over all inputs $Y$ that agree with $X$ on $x_{i_1}, \ldots, x_{i_K}$. Thus, given a quantum algorithm that makes $T$ queries to an oracle string, the computational problem that we need to solve boils down to finding a subset of the oracle bits $x_{i_1}, \ldots, x_{i_K}$ such that $K = 2^{O(T)}/(\alpha^4 \delta^4)$ and (3) holds. Just like in Theorem 23, this problem is solvable in the counting hierarchy $\mathsf{CH} = \mathsf{P}^{\#\mathsf{P}} \cup \mathsf{P}^{\#\mathsf{P}^{\#\mathsf{P}}} \cup \cdots$. So if we assume $\mathsf{P} = \mathsf{P}^{\#\mathsf{P}}$, it is also solvable in $\mathsf{P}$.
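The nonadaptive estimator described here, which outputs the mean of $p(Y)$ over all $Y$ agreeing with $X$ on a fixed set of queried bits, can be illustrated by brute force. In the sketch below the queried set is supplied by hand rather than found by the procedure in the proof, and the function names and the example polynomial are hypothetical.

```python
from itertools import product

def conditional_mean(p, n, queried, x):
    """Mean of p(Y) over all Y in {0,1}^n agreeing with x on the bits in `queried`."""
    free = [i for i in range(n) if i not in queried]
    total = 0.0
    for bits in product((0, 1), repeat=len(free)):
        y = list(x)
        for i, b in zip(free, bits):
            y[i] = b  # overwrite only the unqueried coordinates
        total += p(tuple(y))
    return total / 2 ** len(free)

def bad_fraction(p, n, queried, alpha):
    """Fraction of inputs on which the estimate is off by more than alpha."""
    bad = sum(
        1
        for x in product((0, 1), repeat=n)
        if abs(conditional_mean(p, n, queried, x) - p(x)) > alpha
    )
    return bad / 2 ** n

def p(x):
    """Hypothetical acceptance probability dominated by its first variable."""
    return 0.25 + 0.5 * x[0] + 0.1 * x[1] * x[2]

# Query only bit 0 and measure how often the estimate misses by more than alpha.
miss_at_005 = bad_fraction(p, 3, {0}, alpha=0.05)
miss_at_010 = bad_fraction(p, 3, {0}, alpha=0.10)
```

Because the set of queried positions is fixed before any bit of $X$ is read, all the queries can be made in parallel, which is what allows the conclusion to be stated with nonadaptive queries.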
In Theorem 23, the conclusion we got was $\mathsf{BQP}^A \subset \mathsf{AvgP}^A$ with probability 1 for a random oracle $A$. In our case, the number of classical queries $K$ is exponential (rather than polynomial) in the number of quantum queries $T$, so we only get $\mathsf{BQP}^{A[\log]} \subset \mathsf{AvgP}^A$. On the other hand, since the classical queries are nonadaptive, we can strengthen the conclusion to $\mathsf{BQP}^{A[\log]} \subset \mathsf{AvgP}^{A}_{\|}$.

Open Problems
It would be nice to improve the $R(f) = O(Q(f)^9 \operatorname{polylog} Q(f))$ bound for all symmetric problems. As mentioned earlier, we conjecture that the right answer is $R(f) = O(Q(f)^2)$. Note that if one could tighten Midrijanis's quantum lower bound for Set Equality [18] from $\Omega((N/\log N)^{1/5})$ to $\Omega(N^{1/3})$, then an improvement to $R(f) = O(Q(f)^7 \operatorname{polylog} Q(f))$ would follow immediately. However, it seems better to avoid using Set Equality altogether. After all, it is a curious feature of our proof that, to get a lower bound for all symmetric problems, we need to reduce from the non-symmetric Set Equality problem, which in turn is lower-bounded by a reduction from the symmetric collision problem!
We also conjecture that $R(f) \le Q(f)^{O(1)}$ for all partial functions $f$ that are symmetric only under permuting the inputs (and not necessarily the outputs). Proving this seems to require a new approach.
It would be interesting to reprove the $R(f) \le Q(f)^{O(1)}$ bound using only the polynomial method, and not the adversary method. Or, to rephrase this as a purely classical question: for all Then is it the case that $R(f) \le \deg(f)^{O(1)}$ for all permutation-invariant functions $f$?
On the random oracle side, the obvious problem is to prove Conjecture 6, thereby establishing that $D_\varepsilon(f)$ and $Q_\delta(f)$ are polynomially related, along with all the other consequences shown in Section 3. Alternatively, one could look for some technique that was tailored to polynomials $p$ that arise as the acceptance probabilities of quantum algorithms. In this way, one could conceivably solve $D_\varepsilon(f)$ versus $Q_\delta(f)$ and the other quantum problems, without settling the general conjecture about bounded polynomials.

Acknowledgments
We thank Andy Drucker, Ryan O'Donnell, and Ronald de Wolf for helpful discussions.