Fourier Sparsity and Dimension

We prove that the Fourier dimension of any Boolean function with Fourier sparsity $s$ is at most $O(\sqrt{s} \log s)$. This bound is tight up to a factor of $O(\log s)$ since the Fourier dimension and sparsity of the address function are quadratically related. We obtain our result by bounding the non-adaptive parity decision tree complexity, which is known to be equivalent to the Fourier dimension. A consequence of our result is that any XOR function has a protocol of complexity $O(\sqrt{r} \log r)$ in the simultaneous communication model, where $r$ is the rank of its communication matrix.


Introduction
The study of Boolean functions involves studying various complexity measures and their inter-relationships. Two such measures, which we investigate in this article, are the Fourier dimension and the Fourier sparsity. Let $f : \mathbb{F}_2^n \to \{1, -1\}$ be a Boolean function with Fourier expansion
\[ f(x) = \sum_{\gamma \in \mathbb{F}_2^n} \hat{f}(\gamma)\, \chi_\gamma(x), \]
where $\chi_\gamma(x) := (-1)^{\sum_{i=1}^n \gamma_i x_i}$ are the characters and the $\hat{f}(\gamma)$'s are real numbers, called the Fourier coefficients of $f$. The Fourier dimension and Fourier sparsity are defined as follows.
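A conference version of this paper appeared in the Proceedings of the 42nd International Colloquium on Automata, Languages, and Programming (ICALP 2015) [12].

SWAGATO SANYAL

Definition 1.1 (Fourier sparsity and dimension). For a Boolean function $f : \mathbb{F}_2^n \to \{1, -1\}$ with Fourier expansion
\[ f(x) = \sum_{\gamma \in \mathbb{F}_2^n} \hat{f}(\gamma)\, \chi_\gamma(x), \]
the Fourier support of $f$, denoted by $\operatorname{supp}(\hat{f})$, is defined as
\[ \operatorname{supp}(\hat{f}) := \{\gamma \in \mathbb{F}_2^n : \hat{f}(\gamma) \neq 0\}. \]
The Fourier sparsity of $f$, denoted by $\operatorname{sparsity}(f)$, is defined as the size of the Fourier support of $f$, i.e., $\operatorname{sparsity}(f) := |\operatorname{supp}(\hat{f})|$, while the Fourier dimension $\dim(f)$ of $f$ is defined as the dimension of the span of $\operatorname{supp}(\hat{f})$.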
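Definition 1.1 can be checked by brute force on small functions. The following is a minimal sketch, not part of the original article: it assumes vectors of $\mathbb{F}_2^n$ are encoded as integer bitmasks and a Boolean function is given as a list of $\pm 1$ values, and the helper names (`chi`, `fourier_support`, `f2_rank`) are hypothetical.

```python
def chi(gamma, x):
    """Character chi_gamma(x) = (-1)^<gamma, x>; vectors are encoded as bitmasks."""
    return -1 if bin(gamma & x).count("1") % 2 else 1


def fourier_support(f, n):
    """Map each gamma with a nonzero coefficient to sum_x f(x) * chi_gamma(x).

    f is a list of +1/-1 values indexed by x in range(2**n); the true
    Fourier coefficient is the stored integer divided by 2**n.
    """
    support = {}
    for gamma in range(2 ** n):
        total = sum(f[x] * chi(gamma, x) for x in range(2 ** n))
        if total != 0:
            support[gamma] = total
    return support


def f2_rank(vectors):
    """Rank over F_2 of bitmask vectors (incremental XOR-basis elimination)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears the leading bit of b from v if it is set
        if v:
            basis.append(v)
    return len(basis)


def sparsity(f, n):
    return len(fourier_support(f, n))


def dimension(f, n):
    return f2_rank(fourier_support(f, n))


# Example: f(x1, x2) = (-1)^(x1 AND x2) on n = 2 bits.
and2 = [-1 if x == 3 else 1 for x in range(4)]
print(sparsity(and2, 2), dimension(and2, 2))  # prints: 4 2
```

The brute-force transform takes time $4^n$, so the sketch is only meant for sanity checks on a handful of variables.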
Note that the Fourier expansion of $f$ is a multilinear polynomial in the variables $y_i := (-1)^{x_i}$. With respect to this view, the Fourier sparsity of a Boolean function $f$ is the number of monomials that appear in the Fourier expansion of $f$. Fourier dimension, on the other hand, is the smallest number of monomials, or equivalently parity functions, whose values always determine $f$. It is natural to investigate the power and limitation of polynomials with low Fourier sparsity or Fourier dimension. Fourier sparsity and Fourier dimension were studied by Gopalan et al. [4] in the context of property testing. More recently these quantities have been studied in the context of learning [7, 1]. An approximate analog of Fourier dimension has been shown to characterize the randomized one-way communication complexity of an important and well-studied subclass of functions called the XOR functions, over uniformly distributed inputs [8]. A Boolean function $F(x, y)$ on two $n$-bit inputs is an XOR function if there exists a Boolean function $f$ on $n$ bits such that $F(x, y) = f(x \oplus y)$.
Besides, the study of Fourier sparsity has attracted the attention of complexity theorists due to its intimate connection to the log-rank conjecture of communication complexity. Fourier dimension is related to a simultaneous communication game. This connection is discussed in more detail later in this section.
The following inequalities easily follow from the definition of Fourier sparsity and dimension:
\[ \log \operatorname{sparsity}(f) \le \dim(f) \le \operatorname{sparsity}(f). \]
There are functions (e.g., indicator functions of subspaces) for which the first inequality is tight (i.e., holds with equality). In this note we examine the tightness of the second inequality. Note that the Fourier transform of a Boolean function is a polynomial with a very special property: it evaluates to $1$ or $-1$ on all inputs from $\{1, -1\}^n$. It is thus natural to expect that there is always a good amount of dependency amongst its monomials. The gap between Fourier sparsity and Fourier dimension is one way of quantifying this dependency.
For the second inequality, the address function is one function whose Fourier dimension comes asymptotically closest to its Fourier sparsity. (Recently, Khalyavin, Lobanov and Tarannikov constructed a function for which the gap between Fourier dimension and Fourier sparsity is closer by a constant factor than in the address function [9].) Let $s$ be a power of 2. The address function $\mathrm{Add}_s$, on $\tfrac{1}{2} \log s + \sqrt{s}$ input bits, is defined as
\[ \mathrm{Add}_s(x, y_1, \ldots, y_{\sqrt{s}}) := y_x, \qquad x \in \{0, 1\}^{(1/2) \log s},\ y_i \in \{0, 1\}, \]
where $x$ is interpreted as an address in $\{1, \ldots, \sqrt{s}\}$. In other words, on every input $(x, y)$, $\mathrm{Add}_s(x, y)$ is the value of the addressed input bit $y_x$ indexed by the addressing variables $x$. The address function has Fourier sparsity $s$ and Fourier dimension at least $\sqrt{s}$. To see this, note that, in $\pm 1$ notation,
\[ \mathrm{Add}_s(x, y) = \sum_{x' \in \{0, 1\}^{(1/2) \log s}} \mathbb{1}_{x = x'} \cdot (-1)^{y_{x'}}. \]
Each indicator function $\mathbb{1}_{x = x'}$ has Fourier sparsity $\sqrt{s}$. Since the summation is over $\sqrt{s}$ terms, and since there is no cancellation due to the presence of a character corresponding to a fresh variable $y_{x'}$ in each term, the Fourier sparsity of the address function is equal to $\sqrt{s} \cdot \sqrt{s} = s$. To see that the dimension is at least $\sqrt{s}$, note that for each $x' \in \{0, 1\}^{(1/2) \log s}$, the character $(-1)^{y_{x'}}$ appears in the Fourier transform. We prove that for any Boolean function, this is the asymptotically highest possible value of $\dim(f)$ in terms of $\operatorname{sparsity}(f)$, up to a factor of $O(\log s)$. This was presented as an open problem at the Simons workshop on Real Analysis in Testing, Learning and Inapproximability, 2013, by John Wright.

THEORY OF COMPUTING, Volume 15 (11), 2019, pp. 1-13
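The sparsity and dimension of the address function can be confirmed numerically for a small instance. The sketch below is illustrative only: it assumes $s = 16$ (two address bits, four data bits), a zero-based address, and a bitmask layout in which the low bits hold the address; the function names are hypothetical and the function is taken in $\pm 1$ form, $(-1)^{y_x}$.

```python
def fourier_support(f, n):
    """Set of gamma (as bitmasks) whose Fourier coefficient of f is nonzero."""
    support = set()
    for gamma in range(2 ** n):
        total = sum(f[x] * (-1) ** bin(gamma & x).count("1")
                    for x in range(2 ** n))
        if total != 0:
            support.add(gamma)
    return support


def f2_rank(vectors):
    """Rank over F_2 of bitmask vectors (incremental XOR-basis elimination)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def address_function(k):
    """Truth table of the address function with k address bits (s = 4**k).

    Assumed layout: the low k bits of the input are the address x
    (zero-based), and bit k + j is the data bit y_j.
    """
    m = 2 ** k          # number of data bits, equal to sqrt(s)
    n = k + m
    table = []
    for z in range(2 ** n):
        x = z & (m - 1)
        y_x = (z >> (k + x)) & 1
        table.append(-1 if y_x else 1)   # +/-1 encoding of the addressed bit
    return table, n


table, n = address_function(2)           # s = 16, n = 2 + 4 = 6
supp = fourier_support(table, n)
print(len(supp), f2_rank(supp))          # prints: 16 6
```

Here the dimension is $6 = \tfrac{1}{2}\log s + \sqrt{s}$, which is consistent with the lower bound of $\sqrt{s} = 4$ stated in the text.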
Our main result is the following.
Theorem 1.2. Let $f$ be a Boolean function with $\operatorname{sparsity}(f) = s$. Then $\dim(f) = O(\sqrt{s} \log s)$.

In a preliminary version of this article, posted on arXiv and ECCC, we proved an upper bound of $O(s^{2/3})$ on $\dim(f)$. Avishay Tal observed that the analysis can be tightened to obtain the near-optimal upper bound of $O(\sqrt{s} \log s)$. Prior to this work, Gavinsky et al. [3] had proved that the dimension of any Boolean function with Fourier sparsity $s$ is $O(s / \log s)$.
Theorem 1.2 is proved using a lemma of Tsang et al. [14] bounding the codimension of an affine subspace restricted to which the function is constant, in terms of the Fourier sparsity of the function. The following result is a corollary to [14, Lemma 28].
Lemma 1.3 (Tsang, Wong, Xie, and Zhang). Let $f : \mathbb{F}_2^n \to \{1, -1\}$ be a Boolean function with Fourier sparsity $s$. Then there is an affine subspace $V$ of $\mathbb{F}_2^n$ of codimension $O(\sqrt{s})$ such that $f$ is constant on $V$.
Proof idea of Theorem 1.2. We begin with a simple but crucial observation made by Gopalan et al. [4], that the Fourier dimension of a Boolean function is equivalent to its non-adaptive parity decision tree complexity (see Proposition 2.7). This offers us a potential approach towards an upper bound on the Fourier dimension of a Boolean function: exhibiting a shallow non-adaptive parity decision tree of the function. We recall that the character functions essentially compute the parities of various subsets of the input bits. Thus a parity decision tree can be thought of as querying various character functions. Parity functions, in turn, are linear functions from $\mathbb{F}_2^n$ to $\mathbb{F}_2$; thus affine subspaces can be described by specifying a set of parities that are set to various values.
Towards this end, we first recall the construction of an (adaptive) parity decision tree for a Boolean function $f$ of Fourier sparsity $s$ by Tsang et al. [14], which in turn improves on an earlier construction due to Shpilka et al. [13, Theorem 1.1]. The broad idea of their construction is as follows. At any point in time, a partial tree is maintained whose leaves are functions which are restrictions of $f$ to different affine subspaces. Then a non-constant leaf $\tau$ is picked arbitrarily, and a small set of linear restrictions is obtained by invoking Lemma 1.3, such that the restricted function $f|_\tau$ at that leaf becomes constant. The next step is observing that if $f|_\tau$ is further restricted to all the affine subspaces obtained by setting the same set of parities in all possible ways, then the Fourier sparsity of each of the corresponding restricted functions is at most half of that of $f|_\tau$. This is because, in the former restriction, since the function becomes constant, the Fourier coefficients corresponding to non-constant characters must disappear in the restricted space. This can only happen if every non-constant character gets identified with at least one other character. This identification leads to halving of the Fourier support. Note that by Lemma 1.3 the number of queries we have spent to achieve this reduction in Fourier sparsity is $O(\sqrt{\operatorname{sparsity}(f|_\tau)})$. A calculation gives us that proceeding in this way a parity decision tree of depth $O(\sqrt{s})$ is obtained. Note that the choice of parities restricted at various steps depends on the leaf (function) chosen, and hence on the outcomes of the preceding queries. Thus the constructed tree is an adaptive one.

In this article, we make this tree non-adaptive, at the cost of a logarithmic increase in depth. At each step, we choose an appropriate function (leaf) of the partial tree constructed thus far, invoke Lemma 1.3, and obtain restrictions which make it constant. Then we query the same set of parities at every leaf. Note that doing that in each step results in a non-adaptive tree, as the set of parities queried at each step is the same for all the leaves of the current partial tree, and hence independent of the responses to the previous queries. Then we argue that this leads to a significant reduction of Fourier sparsity. Let $s^{(i)}$ be the Fourier sparsity of the function (leaf) chosen at the $i$-th step. It can be shown that, in the next step, the size $\ell^{(i)}$ of the union of the Fourier supports of all the leaves falls by at least $s^{(i)}/2$. From Lemma 1.3, the number of queries spent in the $i$-th step is $O(\sqrt{s^{(i)}})$. Using the Uncertainty Principle (Theorem 2.4) we show that $s^{(i)} \ge (\ell^{(i)})^2 / s$ (see Lemma 3.3). We combine all these facts to show that continuing in this fashion, in a small number of steps and making at most $O(\sqrt{s} \log s)$ queries, all the leaves become constant functions. The details of the construction of the non-adaptive parity decision tree, and its analysis, are given in Section 3.
Connections to communication complexity and the log-rank conjecture. The log-rank conjecture is a long-standing and important conjecture in communication complexity. The statement of the conjecture is that the deterministic communication complexity of a Boolean function is asymptotically bounded above by some fixed polylogarithm of the rank (over the real numbers) of its communication matrix. The best known upper bound on the deterministic communication complexity of a function in terms of the rank is $O(\sqrt{\mathrm{rank}} \log \mathrm{rank})$ due to Lovett [10]. The rank of the communication matrix of an XOR function $F(x, y) := f(x \oplus y)$ is known to be equal to the Fourier sparsity $s$ of $f$. For the special case of XOR functions the result by Lovett also follows from a result of Tsang et al. [14], which improves on a result by Shpilka et al. [13]. A consequence of Theorem 1.2 is that XOR functions admit a protocol of complexity $O(\sqrt{\mathrm{rank}} \log \mathrm{rank}) = O(\sqrt{s} \log s)$ in the simultaneous communication model. We note that both the earlier protocols [10, 14] are two-way.

The simultaneous protocol is as follows. Alice and Bob a priori agree on a set $S$ of $O(\sqrt{s} \log s)$ parities that span the Fourier support of $F(x, y) = f(x \oplus y)$. The existence of $S$ is guaranteed by Theorem 1.2. The character corresponding to each parity $P$ in $S$ is a product of a character $M_x$ in the variables in $x$ and a character $M_y$ in the variables in $y$. Upon receiving inputs $x$ and $y$, Alice and Bob compute the values of $M_x$ and $M_y$, respectively, for each $P \in S$, and send the evaluations to the referee. The referee can then evaluate the characters corresponding to each parity $P \in S$ on the input $x \oplus y$. Since every other character in the Fourier expansion of $F$ is determined by the characters in $S$, the referee can compute the value of all those characters. Finally, the referee evaluates $F(x, y)$ from its Fourier expansion and outputs it.

Some remarks about Lemma 1.3. Lemma 1.3 is not believed to be tight. Tsang et al. [14] investigated this question while studying the log-rank conjecture for XOR functions. They suggested a direction towards proving the log-rank conjecture for XOR functions. In particular, they proposed a protocol for an XOR function $F(x, y) = f(x \oplus y)$ based on a parity decision tree of $f$ and showed that the communication complexity of the proposed protocol is polylogarithmic in the rank of the communication matrix if the following related conjecture, stated as [14, Conjecture 27], is true.

Conjecture 1.4 (Tsang et al.). There exists a constant $c > 0$ such that for every Boolean function $f$ with Fourier sparsity $s$, there exists an affine subspace of codimension $O(\log^c s)$ on which $f$ is constant.
Even earlier, Montanaro and Osborne [11] had conjectured that the parity decision tree complexity of a function is polylogarithmic in its Fourier sparsity. Tsang et al. showed that this seemingly stronger conjecture follows from Conjecture 1.4.
Tsang et al. proved the above conjecture for certain classes of functions, which include functions with constant $\mathbb{F}_2$-degree, and proved Lemma 1.3 for general functions. It follows from a subsequent result by Hatami, Hosseini and Lovett [6] that Conjecture 1.4 is equivalent to the log-rank conjecture for XOR functions (and also to the conjecture by Montanaro and Osborne).
We remark that the bound of Theorem 1.2 is near-optimal; so any significant improvement to it assuming Conjecture 1.4 would be a contradiction, and hence would serve to refute Conjecture 1.4. We note that with our proof technique and analysis, any improvement to Lemma 1.3 (in particular a positive resolution of Conjecture 1.4) does not yield a better than logarithmic improvement to Theorem 1.2. Our result, thus, does not seem to throw any light on the truth of Conjecture 1.4. For further discussion on this topic, the reader is referred to Section 3.
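Every Boolean function $f : \mathbb{F}_2^n \to \{1, -1\}$ can be uniquely written as
\[ f(x) = \sum_{\gamma \in \mathbb{F}_2^n} \hat{f}(\gamma)\, \chi_\gamma(x). \tag{2.1} \]
The right-hand side of Equation (2.1) is called the Fourier expansion of $f$, and the real coefficients $\hat{f}(\gamma)$ are called the Fourier coefficients.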
We review some standard definitions and facts about the Fourier coefficients.
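Definition 2.1. Let $f(x) = \sum_{\gamma \in \mathbb{F}_2^n} \hat{f}(\gamma) \chi_\gamma(x)$ be a Boolean function, and $p \ge 1$. The $p$-th spectral norm $\|\hat{f}\|_p$ of $f$ is defined as
\[ \|\hat{f}\|_p := \Big( \sum_{\gamma \in \mathbb{F}_2^n} |\hat{f}(\gamma)|^p \Big)^{1/p}. \]
Lemma 2.2 (Parseval's identity). For every Boolean function $f$, $\sum_{\gamma \in \mathbb{F}_2^n} \hat{f}(\gamma)^2 = 1$.

The 1st spectral norm of a Boolean function can be bounded in terms of sparsity as follows.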
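Claim 2.3. For a Boolean function $f$ with Fourier sparsity $s$, $\|\hat{f}\|_1 \le \sqrt{s}$.
Proof.
\[ \|\hat{f}\|_1 = \sum_{\gamma \in \operatorname{supp}(\hat{f})} |\hat{f}(\gamma)| \le \sqrt{s} \cdot \sqrt{\sum_{\gamma \in \operatorname{supp}(\hat{f})} \hat{f}(\gamma)^2} = \sqrt{s}. \]
The first inequality is the Cauchy-Schwarz inequality while the second equality follows from Lemma 2.2.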
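As a sanity check of Claim 2.3 and of the identity it relies on, one can compute the spectral norms exactly for a small function. The sketch below is an illustration under assumed conventions (bitmask inputs, a $\pm 1$ truth table, exact rational arithmetic); the example function, the majority of three bits, is chosen here and does not appear in the article.

```python
from fractions import Fraction


def fourier_coefficients(f, n):
    """Exact nonzero Fourier coefficients of f, keyed by gamma (a bitmask)."""
    coeffs = {}
    for gamma in range(2 ** n):
        total = sum(f[x] * (-1) ** bin(gamma & x).count("1")
                    for x in range(2 ** n))
        if total:
            coeffs[gamma] = Fraction(total, 2 ** n)
    return coeffs


# Majority of three bits in +/-1 form: -1 when at least two inputs are 1.
maj3 = [-1 if bin(x).count("1") >= 2 else 1 for x in range(8)]

coeffs = fourier_coefficients(maj3, 3)
s = len(coeffs)                                 # Fourier sparsity
l1 = sum(abs(c) for c in coeffs.values())       # 1st spectral norm
l2_sq = sum(c * c for c in coeffs.values())     # squared 2nd spectral norm

print(s, l1, l2_sq)  # prints: 4 2 1
```

For this function the bound of Claim 2.3 holds with equality, since $\|\hat{f}\|_1 = 2 = \sqrt{4}$.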
In order to prove our results, we shall use the following version of the Uncertainty Principle. See, e.g., [5] for a proof.

Theorem 2.4 (Uncertainty Principle). Let $p : \mathbb{R}^n \to \mathbb{R}$ be a real multilinear non-zero $n$-variate polynomial with sparsity $s$ (i.e., it has $s$ monomials with non-zero coefficients). Let $U_n$ denote the uniform distribution on $\{1, -1\}^n$. Then
\[ \Pr_{y \sim U_n}[p(y) \neq 0] \ge \frac{1}{s}. \]
As mentioned in the Introduction, we need the following result by Tsang et al. [14, Lemma 28].

Theorem 2.5 (Tsang et al. [14]). Let $f : \mathbb{F}_2^n \to \{1, -1\}$ be a Boolean function with $\|\hat{f}\|_1 \le A$. Then there is an affine subspace $V$ of $\mathbb{F}_2^n$ of codimension $O(A)$ such that $f$ is constant on $V$.

Lemma 1.3 is a simple corollary of this theorem via Claim 2.3. In our proof we crucially use the observation made by Gopalan et al. [4] about the equivalence of the non-adaptive parity decision tree complexity of a function (defined below) and its Fourier dimension. We state it in Proposition 2.7.

Definition 2.6 (Non-adaptive parity decision tree complexity). Let $f$ be a Boolean function. The non-adaptive parity decision tree complexity of $f$ (denoted by $\mathrm{NADT}^{\oplus}(f)$) is defined as the minimum integer $t$ such that there exist $\gamma_1, \ldots, \gamma_t \in \mathbb{F}_2^n$ such that $f$ is determined by the evaluation of the characters $\chi_{\gamma_1}, \ldots, \chi_{\gamma_t}$.

Proposition 2.7 (Gopalan et al. [4]). For every Boolean function $f$, $\mathrm{NADT}^{\oplus}(f) = \dim(f)$.
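Proposition 2.7 is what makes the simultaneous protocol from the Introduction work, and both can be illustrated together on a toy instance. The following is a rough sketch under several assumptions: inputs are bitmasks, $f$ is a $\pm 1$ truth table, the agreed set of parities is an arbitrary basis of the span of the Fourier support (rather than the specific set produced in Section 3), and the referee uses a precomputed lookup table instead of evaluating the Fourier expansion term by term. All function names are hypothetical.

```python
def parity(g, x):
    """Value in {0, 1} of the parity (linear form) g on input x."""
    return bin(g & x).count("1") % 2


def support_basis(f, n):
    """A basis over F_2 of the span of the Fourier support of f."""
    basis = []
    for g in range(2 ** n):
        if sum(f[x] * (-1) ** parity(g, x) for x in range(2 ** n)) != 0:
            v = g
            for b in basis:
                v = min(v, v ^ b)   # reduce g modulo the current basis
            if v:
                basis.append(v)
    return basis


def referee_table(f, n, basis):
    """Map each vector of parity answers to the value of f that it forces."""
    table = {}
    for z in range(2 ** n):
        key = tuple(parity(g, z) for g in basis)
        # The answers determine f (Proposition 2.7), so a key never
        # sees two different values.
        assert table.setdefault(key, f[z]) == f[z]
    return table


def message(basis, w):
    """Bits a player sends: the agreed parities evaluated on its own input."""
    return [parity(g, w) for g in basis]


def referee(table, alice_bits, bob_bits):
    """Parities are linear, so XORing the messages gives them on x XOR y."""
    return table[tuple(a ^ b for a, b in zip(alice_bits, bob_bits))]


# Toy instance: f(z) = -1 iff the two low bits of z are both 1 (n = 3).
n = 3
f = [-1 if z & 3 == 3 else 1 for z in range(2 ** n)]
basis = support_basis(f, n)
table = referee_table(f, n, basis)
x, y = 0b101, 0b110
print(len(basis), referee(table, message(basis, x), message(basis, y)))
```

Each player sends one bit per agreed parity, so the message length equals the size of the agreed set, here $\dim(f) = 2$ bits per player.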

Upper bound on parity decision tree complexity
In this section, we prove an upper bound on the non-adaptive parity decision tree complexity of a Boolean function $f$ in terms of its Fourier sparsity $s$. For $B \subseteq \mathbb{F}_2^n$, let $f|_B$ denote the restriction of the function $f$ to the set $B$. We will often identify vectors $\gamma \in \mathbb{F}_2^n$ with the linear function that maps a vector $x \in \mathbb{F}_2^n$ to $\gamma(x) := \sum_{i=1}^n \gamma_i x_i \bmod 2 \in \mathbb{F}_2$. We first formally present a procedure NADT that constructs a non-adaptive parity decision tree of $f$. We then provide a description of the procedure in words (in particular, the roles of the various variables used) and a formal analysis of the same, leading up to the proof of Theorem 1.2.
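NADT($f$)

Output: A set $\Gamma$ of parities, whose evaluations determine the value of $f$.

1. Initialize $\Gamma \leftarrow \emptyset$ and $F \leftarrow \{f\}$.
2. While $F$ contains a non-constant function:
   (a) Let $g$ be a function in $F$ with the largest Fourier sparsity. Let $A$ be a largest affine subspace on which $g$ is constant (breaking ties arbitrarily), with codimension $n_g$. Let $\gamma_1, \ldots, \gamma_{n_g} \in \mathbb{F}_2^n$ and $b_1, \ldots, b_{n_g} \in \mathbb{F}_2$ be such that $A = \{x : \gamma_1(x) = b_1, \ldots, \gamma_{n_g}(x) = b_{n_g}\}$. Query $\gamma_1, \ldots, \gamma_{n_g}$, i.e., set $\Gamma \leftarrow \Gamma \cup \{\gamma_1, \ldots, \gamma_{n_g}\}$.
   (b) Update $F$ to be the set of all restrictions of $f$ to the affine subspaces obtained by different assignments to the parities in $\Gamma$.
3. Return $\Gamma$.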
Notation. After each iteration of the while loop in the procedure, $\Gamma$ is the set of parities that have been queried so far, $F$ is the set of all restrictions of $f$ to the affine subspaces obtained by different assignments to parities in $\Gamma$, and $S$ is the union of the Fourier supports of functions in $F$. Throughout this section, $i$ will stand for the index of an arbitrary iteration of NADT. Let $\Gamma^{(i)}$, $F^{(i)}$ and $S^{(i)}$ denote $\Gamma$, $F$ and $S$ respectively at the end of the $i$-th iteration of the while loop. Let $\Gamma^{(0)} = \emptyset$, $F^{(0)} = \{f\}$ and $S^{(0)} = \operatorname{supp}(\hat{f})$.
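The procedure can be simulated by brute force for functions on very few variables. The following is a sketch only, written under stated assumptions and not the implementation analysed in the article: a largest constant affine subspace is found by exhaustive search (feasible only for tiny $n$), the leaf $g$ is chosen among non-constant leaves, queried parities are stored in reduced form modulo the earlier ones (which leaves the induced partition unchanged), and all names are hypothetical.

```python
from itertools import combinations, product


def parity(g, x):
    return bin(g & x).count("1") % 2


def coefficients(f, n):
    """Nonzero unnormalised Fourier coefficients sum_x f(x) chi_gamma(x)."""
    out = {}
    for g in range(2 ** n):
        c = sum(f[x] * (-1) ** parity(g, x) for x in range(2 ** n))
        if c:
            out[g] = c
    return out


def reduce_mod(v, basis):
    """Canonical representative of v modulo span(basis), an XOR basis."""
    for b in basis:
        v = min(v, v ^ b)
    return v


def leaves(n, gammas):
    """Partition F_2^n into the subspaces V_b, one per answer vector b."""
    cells = {}
    for x in range(2 ** n):
        cells.setdefault(tuple(parity(g, x) for g in gammas), []).append(x)
    return list(cells.values())


def leaf_sparsity(cell, coeffs, gammas):
    """Number of cosets of span(Gamma) whose coefficients do not cancel on the leaf."""
    p = cell[0]
    sums = {}
    for g, c in coeffs.items():
        r = reduce_mod(g, gammas)
        sums[r] = sums.get(r, 0) + c * (-1) ** parity(g, p)
    return sum(1 for v in sums.values() if v != 0)


def best_constant_subspace(cell, f, n):
    """Fewest extra parity constraints that make f constant on a non-empty
    part of the leaf, i.e. a largest constant affine subspace inside it."""
    for k in range(n + 1):
        for gs in combinations(range(1, 2 ** n), k):
            for bs in product((0, 1), repeat=k):
                pts = [x for x in cell
                       if all(parity(g, x) == b for g, b in zip(gs, bs))]
                if pts and len({f[x] for x in pts}) == 1:
                    return gs
    raise AssertionError("unreachable: a single point is always constant")


def nadt(f, n):
    """Return a list of parities (bitmasks) whose values determine f."""
    coeffs = coefficients(f, n)
    gammas = []
    while True:
        open_cells = [c for c in leaves(n, gammas)
                      if len({f[x] for x in c}) > 1]
        if not open_cells:
            return gammas
        g_cell = max(open_cells,
                     key=lambda c: leaf_sparsity(c, coeffs, gammas))
        for g in best_constant_subspace(g_cell, f, n):
            r = reduce_mod(g, gammas)
            if r:                      # skip parities already determined
                gammas.append(r)
```

For example, on the address function with one address bit (three variables in total), `nadt` returns three parities, matching the Fourier dimension of that function.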
Let $b = (b_\gamma)_{\gamma \in \Gamma^{(i)}}$ be an assignment of values in $\mathbb{F}_2$ to the parities in $\Gamma^{(i)}$, and let $V_b$ be the affine subspace defined by the linear constraints $\{\gamma(x) = b_\gamma : \gamma \in \Gamma^{(i)}\}$. In $V_b$, more than one linear function of the original space may get identified, namely, the restrictions of these linear functions to the subdomain $V_b$ are either the same function or negations of each other. More specifically, $\delta_1$ and $\delta_2$ get identified in $V_b$ if and only if $\delta_1 + \delta_2 \in \operatorname{span} \Gamma^{(i)}$ (i.e., they belong to the same coset of the subspace $\operatorname{span} \Gamma^{(i)}$). This equivalence relation induces a partition of $\operatorname{supp}(\hat{f})$ into equivalence classes. Note that this partition is determined by $\Gamma^{(i)}$ alone. Further, for each equivalence class, for every assignment $b$, the linear functions belonging to that class get identified with one another in $V_b$.
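Let $\ell^{(i)}$ denote the number of such equivalence classes. For all $j \in [\ell^{(i)}]$, let $\beta^{(i)}_j$ be an arbitrarily picked representative element in the $j$-th equivalence class. Let $\beta^{(i)}_j + \alpha^{(i)}_{j,1}, \ldots, \beta^{(i)}_j + \alpha^{(i)}_{j,k_j}$ be the $k_j$ elements in the $j$-th equivalence class, where the $\alpha^{(i)}_{j,r}$ are in $\operatorname{span} \Gamma^{(i)}$. Now define functions
\[ P^{(i)}_j(x) := \sum_{r=1}^{k_j} \hat{f}\big(\beta^{(i)}_j + \alpha^{(i)}_{j,r}\big)\, \chi_{\alpha^{(i)}_{j,r}}(x). \]
Note that the functions $P^{(i)}_j$ are non-zero. As we will soon see, it will be helpful to think of the $P^{(i)}_j$'s as multilinear polynomials in the variables $y_i = (-1)^{x_i}$.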
Given this notation, we can then write the Fourier expansion of $f$ in the following form:
\[ f(x) = \sum_{j=1}^{\ell^{(i)}} P^{(i)}_j(x)\, \chi_{\beta^{(i)}_j}(x). \]
Note that the value of each $P^{(i)}_j$ is fixed once the evaluation of each linear form $\gamma \in \Gamma^{(i)}$ is specified. In other words, each $P^{(i)}_j$ is a constant function on each $V_b$.

Observation 3.1. The sizes of the $\ell^{(i)}$ equivalence classes sum to $s = \operatorname{sparsity}(f)$.

Proof. The size of the Fourier support of $f$ is the sum of the sizes of its equivalence classes defined above.

Proposition 3.2. $|S^{(i)}| = \ell^{(i)}$.

Proof. Clearly $|S^{(i)}| \le \ell^{(i)}$, as each term in the Fourier expansion of $f|_{V_b}$ corresponds to a distinct equivalence class (and this correspondence is independent of $b$). Now, for $j \in [\ell^{(i)}]$, since $P^{(i)}_j$ is a non-zero function, there exists an assignment $b$ to the parities in $\Gamma^{(i)}$ on which $P^{(i)}_j$ evaluates to a non-zero value. Thus the coefficient of $\beta^{(i)}_j$ is non-zero in the restriction of $f$ to the affine subspace obtained by assigning $b$ to the parities in $\Gamma^{(i)}$. Thus for all $j \in [\ell^{(i)}]$, $\beta^{(i)}_j \in S^{(i)}$, and hence $|S^{(i)}| \ge \ell^{(i)}$.

We now argue that after every iteration of the while loop, there exists a function $h \in F^{(i)}$ which has large Fourier support.

in the affine space $V_b$ obtained by restricting each $\gamma_j$ to $b_j$, every parity in $\operatorname{supp}(\hat{g}^{(i)})$ is identified with some other parity in $\operatorname{supp}(\hat{g}^{(i)})$. Since $\operatorname{supp}(\hat{g}^{(i)}) \subseteq S^{(i-1)}$, it follows that $|S^{(i)}|$ is at least $|\operatorname{supp}(\hat{g}^{(i)})|/2$ less than $|S^{(i-1)}|$. By Proposition 3.2 this implies $\Delta^{(i)} \ge s^{(i)}/2$.

Case 2. $s^{(i)} = 1$ (i.e., $g^{(i)}$ is either a parity function or the complement of a parity function). Since neither $f$ nor $1 - f$ is a character, we have that $s^{(1)} = \operatorname{sparsity}(f) \ge 2$. Hence we conclude that $i \ge 2$. Note that for $i \ge 1$, $\emptyset \in S^{(i)}$. Hence for $i \ge 2$, $\Delta^{(i)} \ge 1 > s^{(i)}/2$.

The lemma follows from the two cases.
Now we are ready to prove Theorem 1.2.
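Proof of Theorem 1.2. If either $f$ or $1 - f$ is a character, the theorem follows immediately. Hence assume that neither $f$ nor $1 - f$ is a character. We obtain a non-adaptive parity decision tree for $f$ by running NADT. Assume that the while loop runs for $t$ iterations. Let the number of queries made in step 2a in the $i$-th iteration of the procedure be $q^{(i)}$. By Lemma 1.3, $q^{(i)} = O(\sqrt{s^{(i)}})$. By Lemma 3.4, $\Delta^{(i)} \ge s^{(i)}/2$. Hence,
\[ q^{(i)} = O\Big(\sqrt{s^{(i)}}\Big) = O\bigg(\frac{s^{(i)}}{\sqrt{s^{(i)}}}\bigg) = O\bigg(\frac{\Delta^{(i)}}{\sqrt{s^{(i)}}}\bigg). \]
From Lemma 3.3 we have $s^{(i)} \ge (\ell^{(i-1)})^2 / s$. Thus the total number of queries made within the while loop of the procedure is
\[ \sum_{i=1}^{t} q^{(i)} = O\bigg(\sqrt{s} \sum_{i=1}^{t} \frac{\Delta^{(i)}}{\ell^{(i-1)}}\bigg) = O(\sqrt{s} \log s). \]
Here $\Delta^{(i)} = \ell^{(i-1)} - \ell^{(i)}$, and the last step holds because $(\ell^{(i-1)} - \ell^{(i)}) / \ell^{(i-1)} \le \sum_{k = \ell^{(i)} + 1}^{\ell^{(i-1)}} 1/k$, so that the sum over $i$ is at most $\sum_{k=1}^{s} 1/k = O(\log s)$, using $\ell^{(0)} = s$. From Proposition 2.7 it follows that $\dim(f) = O(\sqrt{s} \log s)$.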
Discussion. As mentioned in the introduction, a potential approach towards disproving Conjecture 1.4 is to assume it to be true, and prove that it implies an $o(\sqrt{s})$ upper bound on Fourier dimension. This would refute the conjecture, since, for the address function (see Section 1), $\dim(\mathrm{Add}_s) = \Theta\big(\sqrt{\operatorname{sparsity}(\mathrm{Add}_s)}\big)$.
However, we cannot disprove the conjecture by an analysis of NADT, assuming the conjecture. To see this, let us consider the execution of NADT on the address function. Recall that $\mathrm{Add}_s(x, y_1, y_2, \ldots, y_{\sqrt{s}}) = y_x$, $x \in \{0, 1\}^{(1/2) \log s}$, $y_i \in \{0, 1\}$. One easily sees that a largest affine subspace $V$ on which the function is constant is the one defined by the constraints $x = x'$, $y_{x'} = b$, where $x' \in \{0, 1\}^{(1/2) \log s}$ and $b \in \{0, 1\}$. The function takes the value $b$ everywhere in $V$. Also, if the address bits $x$ and the bit $y_{x'}$ are set to values other than $x'$ and $b$, then the restricted functions in the respective affine subspaces are all constants (if $x$ is set to $x'$) or dictators (on the addressed bits). The subsequent steps query different dictators on the addressed bits.
The address function clearly satisfies Conjecture 1.4, and all the intermediate functions that arise during the execution of NADT are dictators, which also trivially satisfy the conjecture. This rules out the possibility of refuting Conjecture 1.4 by analyzing NADT assuming the conjecture. We note, however, that if we assume the conjecture, then we can improve the upper bound by a factor of $O(\log s)$, to the optimal $O(\sqrt{s})$.
