The influence lower bound via query elimination

We give a simpler proof, via query elimination, of a result due to O'Donnell, Saks, Schramm and Servedio, which shows a lower bound on the zero-error randomized query complexity of a function f in terms of the maximum influence of any variable of f. Our lower bound also applies to the two-sided error distributional query complexity of f, and it allows an immediate extension which can be used to prove stronger lower bounds for some functions.


Introduction
Query complexity measures the hardness of computing a function $f$ by the minimum number of input variables one needs to read before knowing the function value. A $k$-query $\varepsilon$-error randomized query algorithm is one that, on every input, errs with probability at most $\varepsilon$ and makes at most $k$ queries for every outcome of its random coins. The $\varepsilon$-error randomized query complexity of $f$, denoted $R_\varepsilon(f)$, is the minimum number $k$ such that there exists a $k$-query $\varepsilon$-error randomized query algorithm for $f$. The influence of a variable is another important quantity, which measures the importance of the variable to the function value (on average over the other variables). More precisely, for a function $f : \mathcal{X}^n \to \mathcal{Z}$ and a distribution $\mu$ on $\mathcal{X}$, the influence of the $i$-th variable is defined as $\mathrm{inf}_i(f, \mu) = \Pr[f(X) \neq f(X^i)]$, where $X = X_1 \ldots X_n$ is drawn from $\mu^{\otimes n}$ and $X^i$ is obtained from $X$ by re-randomizing the $i$-th variable. Both query complexity and influence are well-studied subjects; see [BdW02] for a survey of the former (with many other complexity measures) and [O'D08] for a survey of the latter (and Fourier analysis on Boolean functions).
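To make the definition concrete, the following is a minimal sketch that computes $\Pr[f(X) \neq f(X^i)]$ exactly by enumeration, where $X^i$ is $X$ with its $i$-th coordinate replaced by a fresh independent sample from $\mu$. The helper names (`influence`, `max_influence`, `majority3`) are illustrative and not from the paper, coordinates are 0-indexed, and the approach is only feasible for small $n$ and a finite alphabet.

```python
from fractions import Fraction
from itertools import product


def influence(f, n, i, mu):
    """Exact Pr[f(X) != f(X^i)] for X drawn from the n-fold product of mu.

    mu maps each alphabet symbol to its probability (a Fraction).
    X^i is X with coordinate i (0-indexed) replaced by a fresh Y_i ~ mu.
    """
    total = Fraction(0)
    for x in product(mu, repeat=n):
        px = Fraction(1)
        for symbol in x:
            px *= mu[symbol]
        for y, py in mu.items():
            xi = x[:i] + (y,) + x[i + 1:]
            if f(x) != f(xi):
                total += px * py
    return total


def max_influence(f, n, mu):
    """Largest influence over all n coordinates."""
    return max(influence(f, n, i, mu) for i in range(n))


def majority3(x):
    """Majority of three +/-1 values (illustrative test function)."""
    return 1 if sum(x) > 0 else -1


uniform = {-1: Fraction(1, 2), +1: Fraction(1, 2)}
print(max_influence(majority3, 3, uniform))  # 1/4
```

For majority on three bits, a coordinate matters exactly when the other two disagree (probability 1/2), and the re-randomized value differs from the original with probability 1/2, which gives 1/4.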
Randomized query complexity can be lower bounded in terms of influence. In [OSSS05], O'Donnell, Saks, Schramm and Servedio proved that for all Boolean functions $f : \{-1,+1\}^n \to \{-1,+1\}$,
$$R_0(f) \ge \frac{\mathrm{Var}[f]}{\max_i \mathrm{inf}_i(f, \mu_p)}. \tag{1}$$
Above, $\mu_p$ is the distribution on $\{-1,+1\}$ with $-1$ picked with probability $p$; $\mathrm{Var}[f]$ is the variance of $f(X)$ with $X$ drawn from $\mu_p^{\otimes n}$; and $R_0(f)$ is the zero-error randomized query complexity of $f$, namely the minimum, over all randomized query algorithms with no error on any input, of the maximum over inputs of the expected (over the random coins) number of queries made by the algorithm. Recently Lee [Lee10] gave another proof of this fact. Together with another bound $R_0(f) \ge \left(\sum_i \mathrm{inf}_i(f, \mu_p)\right)^2 / (4p(1-p))$ for monotone functions [OS07], it gives a lower bound of $\Omega(n^{2/3})$ for all monotone functions invariant under a transitive group of permutations (on variables). This in particular reproduces the $\Omega(n^{4/3})$ lower bound for all monotone graph properties in [Haj91] (with $n$ there denoting the number of vertices), which is a factor of $O(\log^{1/3} n)$ short of the best known bound [CK01].
In this paper we give a new proof of Eq. (1), arguably shorter and simpler than both previous ones [OSSS05, Lee10]. In fact we prove a stronger statement that applies to the two-sided error case. The basic idea is query elimination: we can save one query without increasing the error by more than $\max_i \mathrm{inf}_i(f, \mu_p)$, and eventually eliminate all queries to obtain a zero-query algorithm, which must have a large error probability on a hard distribution. This lower bounds the number of queries of the original algorithm. The analysis of the increase in error due to eliminating one query is quite simple: it follows from the union bound (applied just once) and the observation that $X^i$ is identically distributed to $X$.
Since we lower bound distributional query complexity (defined in the next section), we get a smoothed version of the influence bound as an immediate consequence. As in the cases with the rectangle bound and the discrepancy bound in communication complexity and query complexity, where the smoothed versions can prove strong lower bounds [Kla07, She08, SZ09, LZ10, Kla10, JK10, CR11], this smoothed influence lower bound also gives stronger bounds for some functions than Eq. (1).

Main result and proof
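Definition 1 (Influence) Let $f : \mathcal{X}^n \to \mathcal{Z}$ be a function, and let the $X_i$'s and $Y_i$'s (for $i = 1, \ldots, n$) be random variables distributed i.i.d. according to $\mu$ on $\mathcal{X}$. Let $X = X_1 \ldots X_n$, and for each $i \in [n]$, let $X^i$ represent the random variable $X_1 \ldots X_{i-1} Y_i X_{i+1} \ldots X_n$. The influence of the $i$-th variable of $f$ with respect to $\mu$ is defined as
$$\mathrm{inf}_i(f, \mu) = \Pr[f(X) \neq f(X^i)].$$
The maximum influence of $f$ with respect to $\mu$ is defined as
$$\mathrm{inf}_{\max}(f, \mu) = \max_{i \in [n]} \mathrm{inf}_i(f, \mu).$$
For $\varepsilon > 0$, a deterministic $k$-query algorithm has $\lambda$-distributional error $\varepsilon$ if it makes at most $k$ queries on every input and, for a random input drawn from $\lambda$, its average error probability is at most $\varepsilon$. The $\varepsilon$-error $\lambda$-distributional query complexity of $f$, denoted $D^{\lambda}_{\varepsilon}(f)$, is the minimum number $k$ such that there exists a deterministic $k$-query algorithm which has $\lambda$-distributional error $\varepsilon$. We show the following.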
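Theorem 1 Let $f : \mathcal{X}^n \to \mathcal{Z}$ be a function, $\mu$ be a distribution on $\mathcal{X}$ and $\varepsilon > 0$. Let $X$ be drawn from $\mu^{\otimes n}$. Then,
$$D^{\mu^{\otimes n}}_{\varepsilon}(f) \ge \frac{1 - \max_{z \in \mathcal{Z}} \Pr[f(X) = z] - \varepsilon}{\mathrm{inf}_{\max}(f, \mu)}.$$
Proof: Let $P_k$ be a deterministic $k$-query algorithm for $f$ with $\mu^{\otimes n}$-distributional error at most $\delta$. We present a deterministic $(k-1)$-query algorithm $P_{k-1}$ for $f$ with $\mu^{\otimes n}$-distributional error at most $\delta + \mathrm{inf}_{\max}(f, \mu)$. This way, starting from an algorithm which makes $D^{\mu^{\otimes n}}_{\varepsilon}(f)$ queries and has average error at most $\varepsilon$, repeating the above procedure gives another algorithm $P_0$ which makes no queries and has average error at most $\varepsilon + D^{\mu^{\otimes n}}_{\varepsilon}(f) \cdot \mathrm{inf}_{\max}(f, \mu)$. Since $P_0$ outputs a fixed value regardless of the input, it must have error at least $1 - \max_{z \in \mathcal{Z}} \Pr[f(X) = z]$; rearranging gives the desired result.
Now we show how to obtain $P_{k-1}$ from $P_k$. We will show a randomized algorithm $P'_{k-1}$ with at most $k-1$ queries on any input and any random coins, and with average error under $\mu^{\otimes n}$ at most $\delta + \mathrm{inf}_{\max}(f, \mu)$. From $P'_{k-1}$, using an easy averaging argument (and fixing the coins of $P'_{k-1}$ appropriately), we can get a deterministic algorithm $P_{k-1}$ with at most $k-1$ queries on any input and the same average error bound as $P'_{k-1}$.
Let $X_i$ be the first query of $P_k$; without loss of generality we can assume that $P_k$ does not query $X_i$ again afterward. In $P'_{k-1}$ we do not make this query, but assume the answer to this query to be $Y_i$, where $Y_i$ is distributed according to $\mu$ and is independent of $X$. From here on $P'_{k-1}$ proceeds identically to $P_k$. By construction the maximum number of queries made by $P'_{k-1}$ is at most $k-1$. Let $\mathrm{ans}(P, X)$ represent the answer of algorithm $P$ on input $X$, so that $\mathrm{ans}(P'_{k-1}, X) = \mathrm{ans}(P_k, X^i)$. Since $\mathrm{ans}(P_k, X^i) \neq f(X)$ implies either $\mathrm{ans}(P_k, X^i) \neq f(X^i)$ or $f(X^i) \neq f(X)$, the union bound gives
$$\Pr[\mathrm{ans}(P'_{k-1}, X) \neq f(X)] \le \Pr[\mathrm{ans}(P_k, X^i) \neq f(X^i)] + \Pr[f(X^i) \neq f(X)] \le \delta + \mathrm{inf}_{\max}(f, \mu),$$
where the last inequality uses that $X^i$ is identically distributed to $X$.
It is easily argued that $R_0(f) = \Omega(R_\varepsilon(f)) = \Omega(D^{\mu^{\otimes n}}_{\varepsilon}(f))$ for $\varepsilon, \mu$ as above. Also $1 - \max_z \Pr[f(X) = z] = \Omega(\mathrm{Var}[f])$ for Boolean functions $f$ (if $q = \Pr[f(X) = -1]$, then $\mathrm{Var}[f] = 4q(1-q) \le 4\min\{q, 1-q\}$), therefore the above theorem implies Eq. (1).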
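The query elimination step can be checked by brute force on a toy instance. The sketch below is an illustration under stated assumptions, not part of the proof: it represents a deterministic algorithm as a decision tree (an assumed encoding), replaces the answer to the first query by an independent sample from the distribution, and verifies that the average error grows by at most the maximum influence. The function `majority3`, the tree `p2`, and all helper names are hypothetical choices made for the example.

```python
from fractions import Fraction
from itertools import product

# A deterministic query algorithm is a decision tree: either an output value,
# or a pair (i, children) meaning "query x[i], continue in children[x[i]]".


def run(tree, x):
    while isinstance(tree, tuple):
        i, children = tree
        tree = children[x[i]]
    return tree


def prob(x, mu):
    p = Fraction(1)
    for symbol in x:
        p *= mu[symbol]
    return p


def error(tree, f, n, mu):
    """Average error Pr[ans(P, X) != f(X)] with X from the n-fold product of mu."""
    return sum((prob(x, mu) for x in product(mu, repeat=n) if run(tree, x) != f(x)),
               Fraction(0))


def max_influence(f, n, mu):
    """max_i Pr[f(X) != f(X^i)], computed by enumeration."""
    best = Fraction(0)
    for i in range(n):
        total = Fraction(0)
        for x in product(mu, repeat=n):
            for y, py in mu.items():
                if f(x) != f(x[:i] + (y,) + x[i + 1:]):
                    total += prob(x, mu) * py
        best = max(best, total)
    return best


def eliminate_first_query(tree, f, n, mu):
    """Return (average error of the randomized P'_{k-1}, a derandomized P_{k-1}).

    Assumes the root is a query and that its subtrees never query it again.
    """
    i, children = tree
    # P'_{k-1} pretends the first answer is Y_i ~ mu, then follows P_k.
    branch_errors = {y: error(children[y], f, n, mu) for y in mu}
    randomized_error = sum(mu[y] * branch_errors[y] for y in mu)
    # Averaging: some fixed value of Y_i does at least as well as the average.
    best_y = min(mu, key=lambda y: branch_errors[y])
    return randomized_error, children[best_y]


def majority3(x):
    return 1 if sum(x) > 0 else -1


uniform = {-1: Fraction(1, 2), +1: Fraction(1, 2)}
# Query x[0], then x[1]; output the common value if they agree, else guess +1.
p2 = (0, {+1: (1, {+1: +1, -1: +1}), -1: (1, {+1: +1, -1: -1})})

inf_max = max_influence(majority3, 3, uniform)
tree, delta = p2, error(p2, majority3, 3, uniform)
while isinstance(tree, tuple):
    randomized_error, tree = eliminate_first_query(tree, majority3, 3, uniform)
    assert randomized_error <= delta + inf_max  # one-step guarantee
    delta = error(tree, majority3, 3, uniform)
    print("error after elimination:", randomized_error, "-> derandomized:", delta)
```

Choosing the best fixed value of $Y_i$ mirrors the averaging argument; the loop ends with a zero-query tree, whose error cannot be smaller than $1 - \max_z \Pr[f(X) = z]$.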
Next we improve the lower bound by going to a function $g$ which is close to $f$ but could potentially have smaller $\mathrm{inf}_{\max}$. Let $g : \mathcal{X}^n \to \mathcal{Z}$ be a function such that $\Pr[f(X) \neq g(X)] \le \delta$, where $X$ is drawn from $\mu^{\otimes n}$ as above and $\delta \ge 0$. An algorithm for $f$ with average error at most $\varepsilon$ under $\mu^{\otimes n}$ is also an algorithm for $g$ with average error at most $\varepsilon + \delta$ under $\mu^{\otimes n}$. Therefore $D^{\mu^{\otimes n}}_{\varepsilon}(f) \ge D^{\mu^{\otimes n}}_{\varepsilon+\delta}(g)$. Hence, as a corollary of Theorem 1, a smoothed version of the influence bound also applies as a lower bound on the distributional query complexity of $f$.
Corollary 2 Let $f : \mathcal{X}^n \to \mathcal{Z}$ be a function, $\mu$ be a distribution on $\mathcal{X}$ and $\varepsilon > 0$, $\delta \ge 0$. Let $X$ be drawn from $\mu^{\otimes n}$. Let $g : \mathcal{X}^n \to \mathcal{Z}$ be a function such that $\Pr[f(X) \neq g(X)] \le \delta$. Then
$$D^{\mu^{\otimes n}}_{\varepsilon}(f) \ge \frac{1 - \max_{z \in \mathcal{Z}} \Pr[g(X) = z] - \varepsilon - \delta}{\mathrm{inf}_{\max}(g, \mu)}.$$
Note that there are functions $f$ with large $\mathrm{inf}_{\max}$ that are close to some other function $g$ with small $\mathrm{inf}_{\max}$. For example, Tribes is the OR of $s \approx n / \log_2 n$ AND gates, each of fan-in $t \approx \log_2 n - \log_2 \log_2 n$. The parameters $s, t$ are set so that roughly half of the inputs evaluate to 1. It is well known that for this function all influences are $\mathrm{inf}_i = \Theta(\log n / n)$, where the distribution is uniform over all inputs. Let $g$ be Tribes, and obtain $f$ from $g$ by picking a $\delta$-fraction of inputs $x$ and changing their function values to $f(x) = x_1$ ($x_1$ is the first bit of $x$), keeping $f(x) = g(x)$ for the rest. Then the first variable of $f$ has influence $\Omega(\delta)$, so for constant $\delta$ the original bound Eq. (1) gives only a constant lower bound. But $g$ is $\delta$-close to $f$ with $\mathrm{inf}_{\max}(g) = \Theta(\log n / n)$, so for sufficiently small constants $\varepsilon$ and $\delta$ the above corollary gives a much better lower bound of $\Omega(n / \log n)$ for the distributional query complexity of $f$.
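The construction in this example can be explored on a toy instance. The sketch below is illustrative only: it uses a tiny Tribes function with 3 tribes of width 2 rather than the asymptotic parameters, and it modifies the inputs whose second tribe is satisfied, which is one hypothetical way to pick the changed inputs. At this size the gap between the two maximum influences is modest; the asymptotic separation does not show up for such small $n$.

```python
from fractions import Fraction
from itertools import product

S_TRIBES, T_WIDTH = 3, 2  # toy parameters, not the asymptotic choice
N = S_TRIBES * T_WIDTH


def tribes(x):
    """OR of S_TRIBES disjoint ANDs, each over T_WIDTH consecutive bits."""
    return int(any(all(x[j * T_WIDTH:(j + 1) * T_WIDTH]) for j in range(S_TRIBES)))


def perturbed(x):
    """Equals x[0] on the (hypothetical) modified set x[2] = x[3] = 1, else Tribes."""
    return x[0] if (x[2] == 1 and x[3] == 1) else tribes(x)


def influence(f, i):
    """Pr[f(X) != f(X^i)] under the uniform distribution on {0,1}^N."""
    changed = 0
    for x in product((0, 1), repeat=N):
        for y in (0, 1):
            xi = x[:i] + (y,) + x[i + 1:]
            changed += f(x) != f(xi)
    return Fraction(changed, 2 ** (N + 1))


def max_influence(f):
    return max(influence(f, i) for i in range(N))


def distance(f, g):
    """Pr[f(X) != g(X)] under the uniform distribution."""
    return Fraction(sum(f(x) != g(x) for x in product((0, 1), repeat=N)), 2 ** N)


# A Tribes variable matters iff the rest of its tribe is all ones and every
# other tribe is unsatisfied; re-randomizing it changes it with probability 1/2.
closed_form = (Fraction(1, 2) * Fraction(1, 2 ** (T_WIDTH - 1))
               * (1 - Fraction(1, 2 ** T_WIDTH)) ** (S_TRIBES - 1))

print("inf_max(g):", max_influence(tribes), "closed form:", closed_form)
print("inf_1(f):  ", influence(perturbed, 0))
print("Pr[f != g]:", distance(perturbed, tribes))
```

The first variable of the perturbed function picks up extra influence from the modified inputs, while the distance to Tribes stays equal to the fraction of inputs whose value was actually changed.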
A final comment is that our proof does not assume that the different variables have the same distribution. The proof goes through, and the bound applies analogously, as long as the variables are independent.