The Complexity of Deciding Statistical Properties of Samplable Distributions

We consider the problems of deciding whether the joint distribution sampled by a given circuit has certain statistical properties such as being i. i. d., being exchangeable, being pairwise independent, having two coordinates with identical marginals, having two uncorrelated coordinates, and many other variants. We give a proof that simultaneously shows all these problems are C = P-complete, by showing that the following promise problem (which is a restriction of all the above problems) is C = P-complete: Given a circuit, distinguish the case where the output distribution is uniform and the case where every pair of coordinates is neither uncorrelated nor identically distributed. This completeness result holds even for samplers that are depth-3 circuits. We also consider circuits that are d-local, in the sense that each output bit depends on at most d input bits. We give linear-time algorithms for deciding whether a 2-local sampler's joint distribution is fully independent, and whether it is exchangeable. We also show that for general circuits, certain approximation versions of the problems of deciding full independence and exchangeability are SZK-complete. We also introduce a bounded-error version of C = P, which we call BC = P, and we investigate its structural properties.


Introduction
Testing for independence of random variables is a fundamental problem in statistics. Theoretical computer scientists have studied this and other analogous problems from two main viewpoints. The first viewpoint is property testing of distributions, which is a black-box model in which a tester is given samples and tries to distinguish between some statistical property being "close" or "far" from satisfied. Some important works giving upper and lower bounds for property testing of distributions include [6, 5, 4, 7, 29, 2, 28, 35, 30, 26, 14, 13].
The other viewpoint is the white-box model in which a tester is given a description of a distribution (from which it could generate its own samples). This could potentially make some problems easier, but there are complexity-theoretic results showing that several such problems are computationally hard, particularly when the input is a succinct description of a distribution. One of the most general and natural ways to succinctly specify a distribution is to give the code of an efficient algorithm that takes "pure" randomness and transforms it into a sample from the distribution. (This gives a polynomial-size specification of a distribution over a potentially exponential-size set.) For arbitrary circuit samplers, the papers [31, 20, 21, 40] contain completeness results for various approximation problems concerning statistical distance, Shannon entropy, and min-entropy. See [22] for a survey of both the black-box and the white-box viewpoints.
In this paper we consider a wide array of "exact" problems concerning statistical properties of the joint distribution produced by a given sampler. Such problems include deciding whether the joint distribution is i. i. d., exchangeable, pairwise independent, and many other variants. Exchangeability is a very important and useful concept with many different applications in pure and applied probability [25, 1], but it has been less often studied in the theoretical computer science community. A joint distribution over a finite domain is called exchangeable if it is invariant under permuting the coordinates. It is fairly straightforward to see that a finite distribution is exchangeable iff it is a mixture of distributions that arise from drawing a sequence of colored balls without replacement from an urn [16]. When each coordinate is a single bit, exchangeability is equivalent to the probability of a string only depending on the Hamming weight. We feel it is natural to pose complexity-theoretic questions about exchangeability.
We prove that the aforementioned wide array of problems, and more generally a single problem we call PANOPTIC-STATS which is at most as hard as any of those problems, are complete for the complexity class C = P. This class was introduced in [38] as part of the counting hierarchy, and it can be viewed as a class that captures "exact counting" of NP witnesses. The class C = P is at least as hard as the polynomial-time hierarchy, since PH ⊆ BP • C = P [33] and even PH ⊆ ZP • C = P [32]. It is at most as hard as "threshold counting," since C = P ⊆ PP, and it is not substantially easier, since PP ⊆ NP^{C = P}. The class C = P has been given several names and characterizations; it equals the classes coNQP [18] and ES [11].
In many areas of complexity theory, when arbitrary small-size circuits are too unwieldy to reason about, we restrict our attention to more stringent complexity measures that are combinatorially simple enough to reason about and obtain unconditional results. The two major categories of such complexity measures are parallel time, and space. One model of efficient parallel time computation is AC^0 (constant-depth unbounded fan-in circuits with AND, OR, and NOT gates). Papers that study AC^0 circuits that sample distributions include [36, 27, 37, 8]. Another (generally more restrictive) model of efficient parallel time computation is locally-computable functions, where each output bit depends on at most a bounded number of input bits. Papers that study locally-computable functions as samplers include [36, 17, 15, 37, 40] as well as a large collection of papers investigating the possibility of implementing pseudorandom generators locally. (See [15] for an extensive list of past work on the power of locally-computable functions, including whether they can implement PRGs, one-way functions, and extractors.) The most common model for logarithmic-space samplers is one with streaming/one-way access to the pure random input bits. Topics that have been studied concerning such logspace samplers include compression [34], extraction [24], and min-entropy estimation [40]. One more paper worth mentioning is [10], which considers Markov random fields as succinct descriptions of distributions (though these descriptions would not be considered "samplers").
We prove that our C = P-completeness results hold even when restricted to samplers that are AC^0-type circuits with depth 3 and top fan-in 2 (i.e., each output gate has fan-in at most 2). We also consider 2-local samplers (where each output bit depends on at most 2 of the pure random input bits) such that each coordinate of the sampled joint distribution is a single bit. We give polynomial-time (in fact, linear-time) algorithms for deciding whether such a sampler's distribution is fully independent, and whether it is exchangeable. These seem to be the first-of-a-kind algorithmic results on deciding statistical properties of succinctly described distributions.
We also consider approximate versions of the problems discussed above: deciding whether the joint distribution of a given sampler is statistically close to or far from satisfying a property. It was shown in [20] that for the property of being uniform, the problem is complete for the class NISZK (non-interactive statistical zero-knowledge). It was shown in [31] that the problem of deciding whether a pair of samplable distributions are statistically close or far is complete for the class SZK (statistical zero-knowledge). We prove that with suitable parameters, the approximate versions of the full independence and exchangeability problems (for general circuit samplers) are also SZK-complete.
In this paper we also consider a "bounded-error" version of C = P, which we call BC = P and which does not seem to have been defined or studied in the literature before. Although it does not appear to be directly relevant to statistical properties of samplable distributions, we take the opportunity to study this class and prove that it is closed under several operations (disjunction, conjunction, union, and intersection).

Results
If D is a joint distribution over ({0, 1}^k)^n, we let D_i (for i ∈ {1, . . ., n}) denote the i-th coordinate, which is marginally distributed over {0, 1}^k. For each of the computational problems we consider, the input is a circuit S : {0, 1}^r → ({0, 1}^k)^n (and we assume that the values of k and n are part of the description of the circuit). We call such a circuit a (k, n)-sampler, and if it has size ≤ s we also call it a (k, n, s)-sampler. Plugging a uniformly random string into S yields a joint output distribution, which we denote by S(U). We formulate computational problems using the framework of promise problems. Throughout this paper, when we talk about reductions and completeness, we are always referring to Karp reductions (polynomial-time many-one reductions). We refer to the texts [3, 19] for expositions of standard complexity classes and completeness.
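To make the notation S(U) concrete, here is a minimal sketch that computes the exact output distribution of a small sampler by enumerating all 2^r inputs. The names `output_distribution`, `marginal`, and `example_sampler` are illustrative assumptions, not part of the paper, and the brute-force enumeration is only feasible for tiny r (it is not one of the algorithms studied here).

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def output_distribution(sampler, r):
    """Exact distribution S(U) of a sampler on r input bits (enumerates all 2^r inputs)."""
    counts = Counter(sampler(bits) for bits in product((0, 1), repeat=r))
    return {out: Fraction(c, 2 ** r) for out, c in counts.items()}


def marginal(dist, i):
    """Marginal distribution of coordinate i (0-indexed) of a joint distribution."""
    m = {}
    for out, p in dist.items():
        m[out[i]] = m.get(out[i], Fraction(0)) + p
    return m


def example_sampler(bits):
    """Hypothetical (1, 2)-sampler on r = 2 input bits: outputs (x, x XOR y)."""
    x, y = bits
    return (x, x ^ y)


dist = output_distribution(example_sampler, 2)
```

In this toy example the joint distribution is uniform over the four pairs of bits, so both marginals are uniform as well.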
We state our completeness results for exact problems in Section 2.1 and prove them in Section 3. We state our algorithmic results for exact problems in Section 2.2 and prove them in Section 4. We state our completeness results for approximate problems in Section 2.3 and prove them in Section 5. We consider a new complexity class, BC = P, in Section 6, and we list some open problems in Section 7.

Exact completeness results
For a joint distribution D over ({0, 1}^k)^n, we say that D_i, D_j are uncorrelated if they have covariance 0, in other words E(D_i D_j) = E(D_i) E(D_j) (where D_i and D_j are interpreted as binary representations of integers from 0 to 2^k − 1). Uncorrelated is the same as independent if k = 1. We consider the following extreme notion of a distribution being nonuniform.
Definition 2.1. A joint distribution is discordant if there are ≥ 2 coordinates and every pair of coordinates is neither uncorrelated nor identically distributed.
Definition 2.2. PANOPTIC-STATS is the following promise problem.

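PANOPTIC-STATS_YES = {S : S is a sampler and S(U) is uniform},
PANOPTIC-STATS_NO = {S : S is a sampler and S(U) is discordant}.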
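The notion of a discordant distribution (Definition 2.1) can be checked directly on an explicitly given joint distribution. The following is a minimal sketch under the assumption that a distribution is a dictionary mapping tuples of integers (each coordinate already decoded from its k-bit binary representation) to positive probabilities; the function names are hypothetical and the check is brute force, not an efficient algorithm for circuit samplers.

```python
from fractions import Fraction
from itertools import combinations


def marginal(dist, i):
    """Marginal distribution of coordinate i (0-indexed)."""
    m = {}
    for out, p in dist.items():
        m[out[i]] = m.get(out[i], Fraction(0)) + p
    return m


def uncorrelated(dist, i, j):
    """Covariance zero, i.e. E(D_i D_j) == E(D_i) E(D_j)."""
    e_i = sum(p * out[i] for out, p in dist.items())
    e_j = sum(p * out[j] for out, p in dist.items())
    e_ij = sum(p * out[i] * out[j] for out, p in dist.items())
    return e_ij == e_i * e_j


def is_discordant(dist, n):
    """At least 2 coordinates, and every pair is neither uncorrelated nor identically distributed."""
    if n < 2:
        return False
    return all(
        not uncorrelated(dist, i, j) and marginal(dist, i) != marginal(dist, j)
        for i, j in combinations(range(n), 2)
    )


def is_uniform(dist, k, n):
    """Uniform over ({0,1}^k)^n, assuming dist lists only outcomes of positive probability."""
    size = 2 ** (k * n)
    return len(dist) == size and all(p == Fraction(1, size) for p in dist.values())


uniform_pair = {(a, b): Fraction(1, 4) for a in (0, 1) for b in (0, 1)}
skewed_pair = {(0, 0): Fraction(1, 2), (0, 1): Fraction(1, 4), (1, 1): Fraction(1, 4)}
```

Here `uniform_pair` is a YES-type distribution, while `skewed_pair` has different marginals and nonzero covariance, so it is discordant.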
We say that promise problem Π′ is a generalization of promise problem Π, or that Π is a restriction of Π′, if Π_YES ⊆ Π′_YES and Π_NO ⊆ Π′_NO.
Fact 2.3. PANOPTIC-STATS is generalized by all the following languages, which are defined in a natural way.

UNIFORM, IID, FULLY-INDEPENDENT, IDENTICALLY-DISTRIBUTED, EXCHANGEABLE,
For example, S ∈ UNIFORM ⇐⇒ S(U) is uniform. Also, K ≥ 2 is any constant (unrelated to k). Technical caveat: To ensure the K-WISE- and K-EXISTS- problems generalize PANOPTIC-STATS, they are defined in terms of a property holding for every or some (respectively) set of min(K, n) coordinates.
We prove that PANOPTIC-STATS and all the languages listed in Fact 2.3 are complete for the complexity class C = P (defined below). In fact, the C = P-hardness of each of the individual languages in Fact 2.3 is fairly simple to prove, but the C = P-hardness of PANOPTIC-STATS shows two things: (1) that this phenomenon is very robust, not dependent on some fragile aspects of the properties being decided, and (2) that only one proof is needed to show the C = P-hardness of all the languages in Fact 2.3.
To prove the C = P-hardness of PANOPTIC-STATS, it suffices to prove hardness for the case n = 2. However, hardness for n = 2 does not seem to directly imply hardness for a larger number of coordinates; it is desirable to prove hardness even when restricted to samplers that are small in terms of the number of coordinates n. We formalize this by introducing a new parameter m and viewing k, n, s as functions of m. Thus m can be thought of as indexing a family of parameter settings.
THEORY OF COMPUTING, Volume 11 (1), 2015, pp.
Definition 2.4. We say that a triple of functions κ(m), ν(m), σ(m) : N → N is polite if the functions are monotonically nondecreasing, polynomially bounded in m, computable in time polynomial in m, and σ(m) ≥ m.
Definition 2.5. PANOPTIC-STATS_{κ,ν,σ} is the restriction of PANOPTIC-STATS to (k, n, s)-samplers with k = κ(m), n = ν(m), and s ≤ σ(m) for some m, where κ, ν, σ is assumed to be polite.
We now state the definition of our central complexity class, C = P. We use a standard model of computation in which randomized algorithms have access to independent unbiased coin flips.
Definition 2.6. prC = P is the class of all promise problems for which there exists a polynomial-time randomized algorithm M that accepts with probability 1/2 on YES instances, and accepts with probability ≠ 1/2 on NO instances. Also, C = P is defined as the class of languages in prC = P.
Proposition 2.7. prC = P is the class of all promise problems Karp-reducible to the following promise problem, UNIFORM-BIT.
UNIFORM-BIT_YES = {S : S is a (1, 1)-sampler and S(U) is uniform},
UNIFORM-BIT_NO = {S : S is a (1, 1)-sampler and S(U) is nonuniform}.
Proof. Suppose Π ∈ prC = P is witnessed by M taking an input x and a uniformly random string y of some polynomial length. To reduce Π to UNIFORM-BIT, map x to S_x where S_x(y) = M(x, y). Conversely, suppose Π reduces to UNIFORM-BIT. Then Π ∈ prC = P is witnessed by M that takes x, runs the reduction to get a (1, 1)-sampler S_x, and runs S_x on a uniformly random input.
We mention that some problems concerning conditional independence are also C = P-complete. For example, deciding whether the first n − 1 coordinates of S(U) are fully independent conditioned on the last coordinate is at least as hard as the corresponding non-conditional problem. Another problem concerning conditional independence is whether S(U) forms a (time-inhomogeneous) Markov chain (assuming n ≥ 3). The construction in our proof of Theorem 2.8 also shows that this problem is C = P-hard. Both these problems are in C = P by the same techniques used in the proof of Theorem 2.10.

Exact algorithmic results
We say a (k, n, s)-sampler is d-local if each of the kn output bits depends on at most d of the uniformly random input bits. For d-local samplers, if dk ≤ O(log s) then some statistical properties, such as being pairwise independent or having identically distributed marginals, can be decided trivially in polynomial time. We now prove that some other properties, namely being fully independent or being exchangeable, can be decided in polynomial time when d = 2 and k = 1. (Admittedly, our algorithms are not very "algorithmic"; we prove combinatorial characterizations for which it is trivial to check whether a given sampler satisfies the characterization.)
Theorem 2.11. There exists a linear-time algorithm for deciding whether the joint distribution of a given 2-local (1, n)-sampler is fully independent.
Theorem 2.12. There exists a linear-time algorithm for deciding whether the joint distribution of a given 2-local (1, n)-sampler is exchangeable.
When d = 2 and k = 1, we can also improve the efficiency of the trivial quadratic-time algorithm for deciding pairwise independence.
Theorem 2.13. There exists a linear-time Karp reduction from the problem of deciding whether the joint distribution of a given 2-local (1, n)-sampler is pairwise independent, to the element distinctness problem. Hence the former problem can be solved in deterministic O(n log n) time and in zero-error randomized expected linear time.
One can also consider logspace samplers that have streaming/one-way access to their random input bits, and which are usually modeled as layered read-once branching programs representing a certain type of (time-inhomogeneous) Markov chain. For logspace samplers, some statistical properties, such as being pairwise independent or having identically distributed marginals, can be decided in polynomial time by straightforward dynamic programming algorithms; the complexities of deciding full independence and exchangeability remain open.

Approximate completeness results
We quantify approximation in terms of statistical distance (also known as total variation distance).
We say D^(1), D^(2) are c-close if ‖D^(1) − D^(2)‖ ≤ c, and f-far if ‖D^(1) − D^(2)‖ ≥ f. We prove that for appropriate parameters, approximate versions of the full independence and exchangeability problems are prSZK-complete (for arbitrary circuit samplers). We do not reproduce the original definition of prSZK, but we make use of the characterization of this class proved by Sahai and Vadhan [31]. The following is our general formulation of the approximate full independence problem.
Definition 2.15. For functions 0 ≤ c(k, n, s) < f(k, n, s) ≤ 1, FULLY-INDEPENDENT_{c, f} is the following promise problem.
(FULLY-INDEPENDENT_{c, f})_YES = {S : S is a (k, n, s)-sampler and S(U) is c(k, n, s)-close to some fully independent distribution over ({0, 1}^k)^n}
We have, for example, that FULLY-INDEPENDENT_{0.05/(n+1), 0.24} is prSZK-complete. The containment in prSZK follows from Theorem 2.17. Although the prSZK-hardness does not follow from the statement of Theorem 2.16, the proof indeed yields this; we stated Theorem 2.16 using constants for the sake of simplicity and clarity. (It is open to prove Theorem 2.17 with constant c.) Consequently, for example, EXCHANGEABLE_{0.12, 0.49} is prSZK-complete.
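The notions of c-close and f-far used in this subsection can be illustrated on explicitly given distributions. The sketch below uses the standard definition of statistical (total variation) distance, half the L1 distance between the probability vectors; the function names and the dictionary representation are assumptions made for illustration only.

```python
from fractions import Fraction


def statistical_distance(d1, d2):
    """Total variation distance: half the sum of |d1(x) - d2(x)| over all outcomes x."""
    support = set(d1) | set(d2)
    return sum(abs(d1.get(x, 0) - d2.get(x, 0)) for x in support) / 2


def is_close(d1, d2, c):
    """True iff the two distributions are c-close."""
    return statistical_distance(d1, d2) <= c


def is_far(d1, d2, f):
    """True iff the two distributions are f-far."""
    return statistical_distance(d1, d2) >= f


fair_bit = {0: Fraction(1, 2), 1: Fraction(1, 2)}
biased_bit = {0: Fraction(1, 4), 1: Fraction(3, 4)}
```

For these two single-bit distributions the distance is 1/4, so they are 1/4-close and also 1/4-far, but not 1/5-close.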

Proofs of exact completeness results
We prove a key lemma in Section 3.1. Then we use the key lemma to prove Theorem 2.8 and Theorem 2.9 in Section 3.2. Then we prove Theorem 2.10 in Section 3.3.

The key lemma
The following is the key lemma in the proof of Theorem 2.8. It can be interpreted qualitatively as a certain type of amplification.

Lemma 3.1.
There is an algorithm that takes as input a (1, 1, s)-sampler S and an integer n ≥ 2, runs in time O(n + s), and outputs a (1, n, O(n + s))-sampler T such that the following both hold.
Proof. Let T perform the following computation.
    run S and let b be its output
    choose bits a_1, a_2, . . ., a_n uniformly at random
    if there exists an ℓ < n such that a_ℓ = 0 then
        let ℓ* be the least such ℓ
        output a_1, . . ., a_{ℓ*}, b, a_{ℓ*+2}, . . ., a_n
    else
        output a_1, . . ., a_n
It is straightforward to see that if S(U) is uniform then T(U) is uniform. Now suppose S(U) is nonuniform, say Pr[S(U) = 1] = p ≠ 1/2. For brevity we define D = T(U). Consider any two coordinates D_i and D_j where i < j. For technical reasons in the analysis below, if ℓ* does not exist then we define ℓ* to be an arbitrary value > n.
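The computation performed by T can be simulated exactly for small n. The following sketch abstracts the sampler S to its acceptance probability p (an assumption made for illustration: only Pr[S(U) = 1] matters for the distribution of T(U)) and enumerates all choices of a_1, . . ., a_n. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def t_distribution(p, n):
    """Exact distribution of T(U) when S outputs 1 with probability p."""
    dist = {}
    for a in product((0, 1), repeat=n):
        # Least l in {1, ..., n-1} (1-indexed) with a_l = 0, if any.
        l_star = next((l for l in range(1, n) if a[l - 1] == 0), None)
        for b, p_b in ((0, 1 - p), (1, p)):
            out = list(a)
            if l_star is not None:
                out[l_star] = b  # 0-indexed slot l_star is coordinate l_star + 1
            key = tuple(out)
            dist[key] = dist.get(key, Fraction(0)) + p_b * Fraction(1, 2 ** n)
    return dist


def prob_one(dist, i):
    """Pr[D_i = 1] for the 0-indexed coordinate i."""
    return sum(q for out, q in dist.items() if out[i] == 1)


def covariance(dist, i, j):
    """Covariance of the 0-indexed bit coordinates i and j."""
    e_ij = sum(q for out, q in dist.items() if out[i] == 1 and out[j] == 1)
    return e_ij - prob_one(dist, i) * prob_one(dist, j)


uniform_case = t_distribution(Fraction(1, 2), 3)
skewed_case = t_distribution(Fraction(1, 4), 3)
```

With p = 1/2 the output is uniform over {0, 1}^3, while with p = 1/4 the three marginals are 1/2, 3/8, and 7/16 and every pair of coordinates has nonzero covariance, matching the two cases of the lemma on this small instance.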
We first show that D_i and D_j are not identically distributed. If i > 1 then
Pr[D_i = 1] = Pr[ℓ* = i − 1] · p + Pr[ℓ* ≠ i − 1] · (1/2) = p/2^{i−1} + (1 − 1/2^{i−1})/2.
Similarly,
Pr[D_j = 1] = p/2^{j−1} + (1 − 1/2^{j−1})/2.
Since p ≠ 1/2, and since Pr[D_i = 1] and Pr[D_j = 1] are different convex combinations of p and 1/2, that means they are not equal. More formally,
Pr[D_i = 1] − Pr[D_j = 1] = (p − 1/2) · (1/2^{i−1} − 1/2^{j−1}) ≠ 0.
On the other hand, suppose i = 1. Then Pr[D_i = 1] = 1/2, and Pr[D_j = 1] is a nontrivial convex combination of p and 1/2 and is thus not equal to Pr[D_i = 1]. In either case, D_i and D_j are not identically distributed.
Now we show that D_i and D_j are correlated. Suppose j = i + 1. Then Pr[D_j = 1 | D_i = 1] = 1/2, and 0 anyway so the final equation above still holds.) It follows that and

It follows that
Pr In either case, D_i and D_j are correlated since
Lemma 3.2. Lemma 3.1 holds even when T is required to be an AC^0-type circuit with depth 3 and top fan-in 2, except that the size of T and the running time of the algorithm both become O(n^2 + s).
Proof. The construction and analysis are the same as in the proof of Lemma 3.1, but we need more care in implementing T. First, we use a standard reduction to convert S into a 3-CNF F that accepts the same number of inputs as S (but has more input bits). Thus, for some polynomially large q, S accepts a uniformly random input with probability 1/2 iff F accepts a uniformly random input with probability 1/2^q. Let x_1, x_2, . . ., x_r denote the input bits of F. Construct a new CNF F′ with input bits x_0, x_1, . . ., x_r by taking F and including x_0 in each of the clauses (yielding a 4-CNF), then adding a new clause; it follows that F accepts with probability 1/2^q iff F′ accepts with probability 1/2. Now to implement T, we include a copy of F′ as well as the random input bits a_1, a_2, . . ., a_n. The 1st output bit of T is just a_1. For the i-th output bit when i > 1, we have a multiplexer that selects the output of F′ if (a_1 ∧ · · · ∧ a_{i−2} ∧ ¬a_{i−1}) is true, and selects a_i otherwise. Overall, T is an OR-AND-OR circuit (with negations pushed to the inputs) where each output gate has fan-in at most 2.

prC = P-hardness
We need one final corollary before we are ready to put the pieces together to prove Theorem 2.8 and Theorem 2.9.
Corollary 3.3. Lemma 3.1 and Lemma 3.2 also hold when the algorithm is additionally given an integer k ≥ 1 and is required to output a (k, n)-sampler T, except that the size of T and the running time of the algorithm both become O(kn + s) (for Lemma 3.1) or O(kn + n^2 + s) (for Lemma 3.2).
Proof. If T is the output of the algorithm from Lemma 3.1 or Lemma 3.2, we can trivially modify it into a sampler T′ that prepends independent uniformly random bit strings of length k − 1 to the n coordinates.
In the YES case, T′(U) is still uniform. Consider the NO case. The property that no two coordinates are identically distributed is inherited from T. To see that coordinates T′(U)_i, T′(U)_j are still correlated, abbreviate T(U) as D and T′(U) as D′, and let D′_i = D_i + I and D′_j = D_j + J where I, J are independent uniformly random even numbers in the range {0, . . ., 2^k − 2}, and note that
Proof of Theorem 2.8. We reduce UNIFORM-BIT to PANOPTIC-STATS_{κ,ν,σ}. Let c be the constant factor in the big O in Corollary 3.
The reduction's running time is polynomial since m, κ(m), ν(m), σ(m) are all polynomially bounded in s and computable in time polynomial in s, and since the algorithm from Corollary 3.3 runs in time O(kn + s).
Proof of Theorem 2.9. We reduce UNIFORM-BIT to PANOPTIC-STATS_{κ,ν,σ} restricted as in the statement of Theorem 2.9. Let c be the constant factor in the big O in Corollary 3.
NO .

Containment in C = P
In the proof of Theorem 2.10 we use the following lemma, which states that C = P is closed under exponential conjunctions and polynomial disjunctions. We supply a folklore proof of this lemma in Section A.1.
Lemma 3.4. If L ∈ C = P then both of the following hold.
Proof of Theorem 2.10. The arguments are very similar, so we just give three representative examples: FULLY-INDEPENDENT, K-WISE-EXCHANGEABLE, and 2-EXISTS-UNCORRELATED. First we mention a useful tool: If S_1, S_2 are (1, 1)-samplers, then we define Equ(S_1, S_2) to be a (1, 1)-sampler that picks i ∈ {1, 2} uniformly at random, runs S_i, and negates the where, if we view S as (say) a (k, n)-sampler, and y as (an appropriately encoded description of) an element of ({0, 1}^k)^n (so q is linear in the size of S), then Thus by Lemma 3.4 it suffices to show that L ∈ C = P. A reduction from L to UNIFORM-BIT just outputs Equ(S_1, S_2), where S_1 runs S and accepts iff the output is y, and S_2 runs S for n times and accepts iff for all i, the i-th coordinate of the output of the i-th run is y_i.
Now we prove that K-WISE-EXCHANGEABLE ∈ C = P. Note that K-WISE-EXCHANGEABLE = ∀ q L where, if we view S as (say) a (k, n)-sampler, and y = (I, π, w) as (an appropriately encoded description of) a subset I ⊆ {1, . . ., n} of size min(K, n), a permutation π on {1, . . ., min(K, n)}, and an element w ∈ ({0, 1}^k)^{min(K,n)} (so q is certainly polynomial in the size of S), then where S(U)_I is the restriction to coordinates indexed by I, and π(w) ∈ ({0, 1}^k)^{min(K,n)} is obtained by permuting the coordinates of w by π. Thus by Lemma 3.4 it suffices to show that L ∈ C = P. A reduction from L to UNIFORM-BIT just outputs Equ(S_1, S_2), where S_1 runs S and accepts iff the output restricted to I is w, and S_2 runs S and accepts iff the output restricted to I is π(w).
Now we prove that 2-EXISTS-UNCORRELATED ∈ C = P. Note that if we define the language L = {(S, i, j) : S(U)_i and S(U)_j are uncorrelated}, Thus by Lemma 3.4 it suffices to show that L ∈ C = P. A reduction from L to UNIFORM-BIT just outputs Equ(S_1, S_2), where S_1 runs S yielding some y ∈ ({0, 1}^k)^n and accepts with probability and S_2 runs S twice (independently) yielding some y^(1) and y^(2) and accepts with probability
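The Equ tool used in the reductions above can be sketched on small explicit samplers. The sketch assumes that Equ negates the output exactly when the second sampler is chosen (the surviving text does not say which branch is negated, so this is an assumption), in which case the output is 1 with probability 1/2 + (p_1 − p_2)/2, where p_1 and p_2 are the acceptance probabilities of S_1 and S_2. The names `equ`, `accept_probability`, and the three toy samplers are hypothetical.

```python
from fractions import Fraction
from itertools import product


def equ(s1, s2):
    """(1, 1)-sampler on bits (c, y...): runs s1 if c == 0, else runs s2 and negates (assumed branch)."""
    def sampler(bits):
        c, rest = bits[0], bits[1:]
        return s1(rest) if c == 0 else 1 - s2(rest)
    return sampler


def accept_probability(sampler, r):
    """Exact probability that a single-bit sampler on r input bits outputs 1."""
    return Fraction(sum(sampler(bits) for bits in product((0, 1), repeat=r)), 2 ** r)


# Hypothetical toy samplers on 3 shared input bits.
def and_first_two(y):
    return y[0] & y[1]   # accepts with probability 1/4


def and_last_two(y):
    return y[1] & y[2]   # accepts with probability 1/4


def first_bit(y):
    return y[0]          # accepts with probability 1/2
```

Under this assumption, Equ(S_1, S_2) is uniform exactly when the two acceptance probabilities coincide, which is what lets an equality of two probabilities be phrased as an instance of UNIFORM-BIT.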

Proofs of exact algorithmic results
We prove Theorem 2.11, Theorem 2.12, and Theorem 2.13 in Section 4.1, Section 4.2, and Section 4.3, respectively.
First we introduce some terminology to describe 2-local samplers. Each output bit depends on either zero, one, or two input bits. Output bits that depend on zero input bits are constants (0 or 1). The nonconstant output bits can be modeled with an undirected graph (multi-edges and self-loops allowed) as follows. The input bits are indexed by the nodes. Each output bit depending on one input bit is a self-loop, labeled with a function from {0, 1} to {0, 1} (either the identity or negation). Each output bit depending on two input bits is an edge between those two nodes, labeled with a function from {0, 1}^2 to {0, 1}. There are three types of such functions that depend on both bits: AND-type (accepting one of the four inputs), XOR-type (accepting two of the four inputs), and OR-type (accepting three of the four inputs).

Full independence for 2-local samplers
We prove Theorem 2.11. Consider a 2-local (1, n)-sampler S, and assume without loss of generality that S has no constant output bits. We claim that S(U) is fully independent iff both of the following conditions hold.
(i) The graph is a forest, ignoring self-loops.
(ii) Each connected component of the graph has at most one of the following: a self-loop, an AND-type edge, or an OR-type edge.
It is trivial to check in linear time whether these conditions hold. First we assume that (i) and (ii) both hold, and show that S(U) is fully independent. The different connected components of the graph are certainly fully independent of each other, so we can focus on showing that the coordinates of a single connected component are fully independent. If there is a self-loop, an AND-type edge, or an OR-type edge in the connected component, then let e be that edge. Otherwise, let e be any edge in the connected component. We show that conditioned on e evaluating to any particular bit, the joint distribution of the remaining edges in e's connected component is uniform. This implies that the whole joint distribution of the connected component is fully independent.
Suppose e is a self-loop at some node v, so we are conditioning on v being some particular bit. Ignoring e itself, we can view e's connected component as a tree rooted at v with only XOR-type edges. After the conditioning, there is a bijection between the set of all assignments of values to the edges (excluding e) and the set of all assignments of values to the nodes (excluding v) in e's connected component: An assignment to nodes (together with the conditioned value of v) determines an assignment to edges. Furthermore, every assignment to edges arises from some assignment to nodes, because for any assignment to edges, we can start at v and work our way downward to the leaves, uniquely specifying the value of each node in terms of the values of its parent and the edge to its parent. Since the sets have the same size, we have exhibited a bijection between them. This means that conditioned on either value of e, the joint distribution of all the other edges in e's connected component is uniform.
Now suppose e = {u, v} is not a self-loop. We show that, in fact, conditioned on any one of the four assignments of values to the pair u, v, the joint distribution of all the other edges in e's connected component is uniform. Removing e results in two new connected components, each of which is a tree of XOR-type edges, one rooted at u and the other rooted at v. Let U denote the set of nodes in u's new connected component excluding u itself, and let V denote the set of nodes in v's new connected component excluding v itself. By the argument from the previous paragraph (when e was a self-loop), a uniformly random assignment to U induces a uniformly random assignment to the edges in u's new connected component, and similarly for V. Since assignments to U and V are chosen independently of each other, this means that the values of all the edges in e's original connected component (except e itself) are jointly uniformly distributed (conditioned on any particular assignment to u, v, and hence conditioned on any particular assignment to e).
Now we prove the converse by assuming that (i) and (ii) do not both hold, and showing that S(U) is not fully independent. Let us refer to self-loops, AND-type edges, and OR-type edges as non-XOR-type edges. If (i) and (ii) do not both hold, then at least one of the following conditions holds.
(A) There is a cycle consisting entirely of XOR-type edges.
(B) There is a cycle with exactly one AND-type edge or OR-type edge.
(C) There is a path between two non-XOR-type edges.
Suppose (A) holds. Let e be an edge on the cycle. Then e's marginal distribution is uniform, but conditioning on any particular values of the other edges on the cycle determines whether or not e's endpoints are the same bit as each other, and thus fixes the value of e. Hence S(U) is not fully independent.
Suppose (B) holds. Let ℓ denote the number of nodes on the cycle. Then the probability that all edges on the cycle evaluate to 1 must be an integer multiple of 1/2^ℓ (since they only depend on ℓ input bits), but the product of the marginal probabilities that each edge on the cycle evaluates to 1 must be either 1/2^{ℓ+1} (if there is an AND-type edge) or 3/2^{ℓ+1} (if there is an OR-type edge). Hence S(U) is not fully independent.
Suppose (C) holds. Without loss of generality, all intermediate edges on the path are XOR-type. Let e_1 and e_2 be the two non-XOR-type edges, which we consider to be part of the path. Let ℓ denote the number of nodes on the path. Then the probability that all edges on the path evaluate to 1 must be an integer multiple of 1/2^ℓ (since they only depend on ℓ input bits), but the product of the marginal probabilities that each edge on the path evaluates to 1 must be either 1/2^{ℓ+1} (if neither e_1 nor e_2 is OR-type) or 3/2^{ℓ+1} (if exactly one of e_1, e_2 is OR-type) or 9/2^{ℓ+1} (if both e_1 and e_2 are OR-type). Hence S(U) is not fully independent.
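The characterization by conditions (i) and (ii) can be turned into a short checker. The sketch below is one possible implementation, not the paper's own: it assumes the sampler has no constant output bits and is given as a list of edges (u, v, kind) with kind in {"xor", "and", "or", "loop"}, and it uses union-find, which is near-linear rather than strictly linear (a depth-first search would give linear time). The brute-force function is included only to cross-check small instances, and its `GATES` table fixes one representative function per edge type.

```python
from fractions import Fraction
from itertools import product


def is_fully_independent_2local(num_inputs, edges):
    """Check conditions (i) and (ii); edges are (u, v, kind), with u == v for kind "loop"."""
    parent = list(range(num_inputs))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Condition (i): ignoring self-loops, the graph is a forest.
    for u, v, kind in edges:
        if kind == "loop":
            continue
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge closes a cycle (multi-edges included)
        parent[ru] = rv

    # Condition (ii): at most one non-XOR-type edge per connected component.
    non_xor = [0] * num_inputs
    for u, v, kind in edges:
        if kind != "xor":
            root = find(u)
            non_xor[root] += 1
            if non_xor[root] > 1:
                return False
    return True


# One representative function per edge type (an illustrative choice).
GATES = {
    "xor": lambda a, b: a ^ b,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "loop": lambda a, b: a,
}


def brute_force_fully_independent(num_inputs, edges):
    """Compare the exact joint distribution with the product of its marginals."""
    n = len(edges)
    joint = {}
    for x in product((0, 1), repeat=num_inputs):
        out = tuple(GATES[kind](x[u], x[v]) for u, v, kind in edges)
        joint[out] = joint.get(out, 0) + Fraction(1, 2 ** num_inputs)
    p_one = [sum(p for out, p in joint.items() if out[i] == 1) for i in range(n)]
    for out in product((0, 1), repeat=n):
        expected = Fraction(1)
        for i, bit in enumerate(out):
            expected *= p_one[i] if bit else 1 - p_one[i]
        if joint.get(out, 0) != expected:
            return False
    return True


EXAMPLES = [
    (3, [(0, 1, "xor"), (1, 2, "and")]),                  # tree with one non-XOR edge
    (3, [(0, 1, "xor"), (1, 2, "xor"), (0, 2, "xor")]),   # XOR cycle, case (A)
    (3, [(0, 1, "and"), (1, 2, "or")]),                   # two non-XOR edges, case (C)
    (2, [(0, 0, "loop"), (0, 1, "xor")]),                 # self-loop plus XOR edge
    (3, [(0, 0, "loop"), (1, 2, "and")]),                 # two separate components
    (1, [(0, 0, "loop"), (0, 0, "loop")]),                # two self-loops at one node
]
```

On each of these small examples the structural check agrees with the brute-force comparison against the product of marginals.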

Exchangeability for 2-local samplers
We prove Theorem 2.12. We begin with a lemma.
Lemma 4.1. A joint distribution D over ({0, 1}^1)^n is exchangeable iff both of the following conditions hold.
(1) The marginals D i are all identically distributed.
(2) For all i ≠ j, if Pr[D_i ≠ D_j] > 0 then the joint distribution of the other n − 2 coordinates is the same when conditioned on (D_i = 1, D_j = 0) as it is when conditioned on (D_i = 0, D_j = 1).
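Proof of Lemma 4.1. Suppose D is exchangeable. Then (1) holds trivially. To see that (2) holds, consider i ≠ j such that Pr[D_i ≠ D_j] > 0. First note that since D_i, D_j are identically distributed,
Pr[D_i = 1, D_j = 0] = Pr[D_i = 0, D_j = 1] = Pr[D_i ≠ D_j]/2 > 0.
For some arbitrary particular bits b_h (for h ∉ {i, j}), let E denote the event that D_h = b_h for all h ∉ {i, j}. Then we have
Pr[E | D_i = 1, D_j = 0] = Pr[E and D_i = 1, D_j = 0] / Pr[D_i = 1, D_j = 0] = Pr[E and D_i = 0, D_j = 1] / Pr[D_i = 0, D_j = 1] = Pr[E | D_i = 0, D_j = 1]
where Pr[E and D_i = 1, D_j = 0] = Pr[E and D_i = 0, D_j = 1] holds by exchangeability. This shows that (2) holds.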
For the converse, suppose (1) and (2) both hold. Since every permutation is a composition of transpositions, it suffices to show that the joint distribution is invariant under transposing coordinates. Let D′ be obtained from D by transposing some coordinates i ≠ j. For some arbitrary particular bits b_h (for h ∈ {1, . . ., n}), let E denote the event that D_h = b_h for all h ∈ {1, . . ., n}, and let E′ denote the event that This finishes the proof of Lemma 4.1.
Now we present the proof of Theorem 2.12. Consider a 2-local (1, n)-sampler S. If condition (1) from Lemma 4.1 does not hold for S(U), then we can reject outright. Otherwise, there are five cases corresponding to the marginal probability that any particular coordinate is 1.
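Lemma 4.1 can be sanity-checked by brute force on small explicit distributions over bit strings. The sketch below is an illustration under stated assumptions: a distribution is a dictionary from bit tuples to positive probabilities, exchangeability is tested via the Hamming-weight criterion, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def is_exchangeable(dist, n):
    """For bit coordinates: the probability of a string depends only on its Hamming weight."""
    return all(
        dist.get(x, 0) == dist.get(tuple(sorted(x)), 0)
        for x in product((0, 1), repeat=n)
    )


def conditional_rest(dist, i, j, b_i, b_j):
    """Distribution of the other coordinates given (D_i, D_j) = (b_i, b_j); {} if that event has probability 0."""
    sel = {}
    for x, p in dist.items():
        if p > 0 and x[i] == b_i and x[j] == b_j:
            rest = tuple(b for h, b in enumerate(x) if h not in (i, j))
            sel[rest] = sel.get(rest, 0) + p
    total = sum(sel.values())
    return {rest: p / total for rest, p in sel.items()}


def lemma_conditions(dist, n):
    """Conditions (1) and (2) of Lemma 4.1."""
    ones = {sum(p for x, p in dist.items() if x[i] == 1) for i in range(n)}
    if len(ones) > 1:
        return False  # condition (1) fails: marginals differ
    return all(
        conditional_rest(dist, i, j, 1, 0) == conditional_rest(dist, i, j, 0, 1)
        for i, j in combinations(range(n), 2)
    )


def uniform_on(support):
    """Uniform distribution on a nonempty collection of bit tuples."""
    return {x: Fraction(1, len(support)) for x in support}


weight_one = uniform_on([(1, 0, 0), (0, 1, 0), (0, 0, 1)])
lopsided = uniform_on([(1, 1, 0), (0, 0, 1)])
```

The distribution `weight_one` is exchangeable and satisfies both conditions, while `lopsided` has identical marginals but violates condition (2) and is not exchangeable.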
Case 0: If all output bits of S are constant 0 then S(U) is trivially exchangeable.
Case 1/4: In this case, each edge of the graph is AND-type. When two AND-type edges share an endpoint, we say that they agree on the endpoint if the unique assignments that make the two edges evaluate to 1 agree on the value of the node. We assume without loss of generality that the graph has no nodes of degree 0. We claim that S(U) is exchangeable iff at least one of the following conditions holds.
(i) The edges are all disjoint.
(ii) The graph is a star, and all edges agree on the central node.
(iii) The graph is a triangle, and there is agreement at all nodes.
(iv) The graph is a triangle, and there is disagreement at all nodes.
(v) There are only three nodes u, v, w, and there are no {u, w} edges, there are at most two {u, v} edges and they agree on v and disagree on u, there are at most two {v, w} edges and they agree on v and disagree on w, and the {u, v} edges disagree with the {v, w} edges on v.
(vi) There are only two nodes, and no two edges agree on both nodes.
(vii) There are only two nodes, and all edges agree on both nodes.
It is trivial to check in linear time whether at least one of these conditions holds.
First we assume at least one of the conditions holds, and argue that S(U) is exchangeable. If (i) holds then S(U) is i. i. d. where each coordinate has 1/4 probability of being 1, and this is exchangeable. If (ii) holds then S(U) is a uniform mixture of the uniform distribution and the constant all 0's distribution, so S(U) is exchangeable since it is a mixture of i. i. d.'s. If (iii) holds then outputs of Hamming weight 1 each have probability 1/8, and outputs of Hamming weight 2 each have probability 0, so S(U) is exchangeable since the probability of an output only depends on the Hamming weight. If (iv) or (v) or (vi) holds then an edge evaluating to 1 forces all other edges to evaluate to 0, so S(U) is all 0's with probability 1 − (n/4) and is otherwise uniformly distributed on strings of Hamming weight 1, so S(U) is exchangeable. If (vii) holds then S(U) is all 1's with probability 1/4 and all 0's with probability 3/4, which is exchangeable.
We prove the converse with two lemmas, which show that the following conditions are the only obstacles to exchangeability. We write $\exists e_1, e_2, e_3$ with the tacit assumption that these are three distinct edges.
(B) $\exists e_1, e_2, e_3$ all sharing an endpoint on which $e_1, e_2$ disagree, and such that $e_3$ does not share its other endpoint with $e_1$ or with $e_2$.
(C) $\exists e_1, e_2, e_3$ such that $e_1, e_3$ share both endpoints, and $e_2$ shares exactly one endpoint with them, and they all agree on the common node.
(E) $\exists e_1, e_2, e_3$ such that $e_1, e_3$ share and agree on both endpoints, and $e_2$ either does not share both endpoints or does not agree on both endpoints.
Lemma 4.2. If none of (i)-(vii) holds, then at least one of (A)-(E) holds.

Proof. Assume none of (i)-(vii) hold. Suppose the graph is not connected. Then since (i) fails, there are two edges that are not disjoint, and there is also another edge not in their connected component, so (A) holds. Henceforth suppose the graph is connected.
Suppose there are at least four nodes. If there is a simple path of length three, then (A) holds, so suppose there is no such path. Then the graph is "star-like," meaning it would be a star if multi-edges were replaced with single edges. If there is not complete agreement on the central node then (B) holds; otherwise, since (ii) fails, there must be multi-edges and so (C) holds.

Now suppose there are exactly three nodes and there is a triangle. If there are no multi-edges, then since (iii) and (iv) fail, (D) holds. If there is a multi-edge pair, then either these two edges disagree on some endpoint, in which case (D) holds, or they agree on both endpoints, in which case (E) holds.

Now suppose there are exactly three nodes and there is no triangle. Since the graph is connected, there is a length-2 path, say {u, v}, {v, w}. Since (v) fails, either there are two {u, v} edges that disagree on v, in which case (B) holds, or there are two {u, v} edges that agree on both u and v, in which case (E) holds, or analogous situations happen with {v, w}, or all edges agree on v and there are either two {u, v} edges or two {v, w} edges, in which case (C) holds. Note that there cannot be just one {u, v} edge and just one {v, w} edge, since if they agreed then (ii) would hold, and if they disagreed then (v) would hold.
Finally, if there are exactly two nodes then since (vi) and (vii) fail, (E) holds.
Lemma 4.3. If at least one of (A)-(E) holds, then S(U) is not exchangeable.

Proof. Assuming at least one of (A)-(E) holds, we use condition (2) from Lemma 4.1 to refute exchangeability of S(U) by showing that the marginal probability that $e_3 = 1$ (more precisely, the random variable indexed by $e_3$ evaluates to 1) changes when we go from conditioning on $(e_1 = 1, e_2 = 0)$ to conditioning on $(e_1 = 0, e_2 = 1)$. If (A) holds and $e_1, e_3$ share both endpoints then it goes either from 1 to 0 (if $e_1, e_3$ agree on both endpoints) or from 0 to 1/3. If (A) holds and $e_1, e_3$ share only one endpoint and $e_1, e_2$ are disjoint then it goes either from 1/2 to 1/6 (if $e_1, e_3$ agree) or from 0 to 1/3. If (A) holds and $e_3, e_1, e_2$ form a simple path then it goes either from 1/2 to 0 (if $e_1, e_3$ agree and $e_1, e_2$ agree) or from 1/2 to 1/4 (if $e_1, e_3$ agree and $e_1, e_2$ disagree) or from 0 to 1/2 (if $e_1, e_3$ disagree and $e_1, e_2$ agree) or from 0 to 1/4.

If (B) holds then it goes either from 1/2 to 0 (if $e_1, e_3$ agree) or from 0 to 1/2. If (C) holds then it goes either from 1 to 0 (if $e_1, e_3$ agree on both endpoints) or from 0 to 1. If (D) holds then it goes either from 1 to 0 (if $e_1, e_2$ agree) or from 1/2 to 0. If (E) holds then it goes from 1 to 0.

Interestingly, the above analysis shows that in Case 1/4, S(U) is "globally exchangeable" iff it is "locally exchangeable" in the sense that every set of three coordinates is exchangeable.
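To make Case 1/4 concrete, the sketch below enumerates all inputs of a small 2-local sampler with AND-type edges and tests exchangeability directly. It is an illustration only: the edge encoding `((u, a), (v, b))` (the output bit is 1 iff input u equals a and input v equals b), the function names, and the three example graphs are assumptions made here. For bit strings, exchangeability is equivalent to the probability depending only on the Hamming weight, which is what the second function checks.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def output_distribution(num_inputs, edges):
    """Distribution of the output bits when the input bits are uniform.

    Each edge ((u, a), (v, b)) is an AND-type output bit that evaluates to 1
    iff input bit u equals a and input bit v equals b.
    """
    dist = defaultdict(Fraction)
    weight = Fraction(1, 2 ** num_inputs)
    for x in product((0, 1), repeat=num_inputs):
        out = tuple(int(x[u] == a and x[v] == b) for (u, a), (v, b) in edges)
        dist[out] += weight
    return dict(dist)


def depends_only_on_hamming_weight(dist, n):
    """For n-bit strings this is equivalent to exchangeability."""
    prob_by_weight = {}
    for w in product((0, 1), repeat=n):
        p = dist.get(w, Fraction(0))
        if prob_by_weight.setdefault(sum(w), p) != p:
            return False
    return True


# Condition (ii): a star whose edges all agree on the central node 0.
star_agree = [((0, 1), (1, 1)), ((0, 1), (2, 1)), ((0, 1), (3, 1))]
# Condition (iii): a triangle with agreement at all nodes.
triangle_agree = [((0, 1), (1, 1)), ((1, 1), (2, 1)), ((0, 1), (2, 1))]
# Obstacle (B): three edges at node 0, where the first two disagree on node 0.
obstacle_b = [((0, 1), (1, 1)), ((0, 0), (2, 1)), ((0, 1), (3, 1))]
```

In the obstacle (B) example the output (1, 0, 1) has probability 1/8 while (0, 1, 1) has probability 0, so two strings of the same Hamming weight receive different probabilities.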
Case 1/2: In this case, each edge of the graph is either XOR-type or a self-loop. We say that XOR-type multi-edges agree if they compute the same function, and similarly we say that multi-self-loops agree if they compute the same function. We assume without loss of generality that the graph has no nodes of degree 0. We claim that S(U) is exchangeable iff at least one of the following conditions holds.
(i) The graph is a forest ignoring self-loops, and there is at most one self-loop per connected component.
(ii) The graph is a simple cycle.
(iii) The graph is a simple path but with two self-loops, one at each end.
(iv) There are only two nodes, no self-loops, and all edges agree.
(v) There are only two nodes, no self-loops, and only two edges, which disagree.
(vi) There is only one node, with agreeing self-loops.
(vii) There is only one node, with only two self-loops, which disagree.
It is trivial to check in linear time whether at least one of these conditions holds.
First we assume at least one of the conditions holds, and argue that S(U) is exchangeable. If (i) holds then S(U) is uniform by the characterization in the proof of Theorem 2.11. If (ii) or (iii) holds then S(U) is the same as conditioning the uniform distribution on having a particular parity, so S(U) is invariant under permuting coordinates (by commutativity and associativity of addition over GF(2)). If (iv) or (vi) holds then S(U) is all 1's with probability 1/2 and all 0's with probability 1/2, which is exchangeable. If (v) or (vii) holds then S(U) is uniform over the two possibilities 01 and 10, which is exchangeable.
We prove the converse with two lemmas, which show that the following conditions are the only obstacles to exchangeability.
(A) ∃ a cycle C of XOR-type edges, and an edge e that is either a self-loop or has at least one endpoint not on C.
(B) ∃ a path P (of zero or more XOR-type edges), two self-loops with one at each end of P, and an edge e at least one of whose nodes is not on P.
(C) ∃ three XOR-type edges all sharing both endpoints, such that some but not all of these edges agree.
(D) ∃ three self-loops all sharing the same node, such that some but not all of these edges agree.
Lemma 4.4. If none of (i)-(vii) holds, then at least one of (A)-(D) holds.

Proof. Assume none of (i)-(vii) hold. If there is only one node, then since (vi) and (vii) fail, (D) holds.
Henceforth suppose there are at least two nodes. Since (i) fails, the graph has either a cycle of XOR-type edges, or two self-loops in the same connected component.

If it has a cycle of XOR-type edges, then let C be a shortest such cycle. Since (ii) fails, there exists another edge. If there exists another edge e that is either a self-loop or has at least one endpoint not on C, then (A) holds. Otherwise, all other edges are XOR-type with both endpoints on C. If C had length at least three, this would contradict the minimal nature of C. Hence there are only two nodes, with no self-loops and with at least three XOR-type edges (two of them forming C). Then since (iv) and (v) fail, (C) holds.

On the other hand, if the graph is a forest ignoring self-loops but has two self-loops in the same connected component, then consider two such self-loops that are closest, and let P be the unique path between them. If P has length zero (so the self-loops are at the same node), then (B) holds since we are assuming there are at least two nodes (and the other node is incident to some edge e). Otherwise, since (iii) fails, there exists another edge e. If e were XOR-type with both endpoints on P, then there would be a cycle of XOR-type edges (contradicting the assumption that the graph is a forest ignoring self-loops), and if e were another self-loop on P then this would contradict the minimal nature of P (since P has length ≥ 1). Thus at least one of e's nodes is not on P, so (B) holds.

Lemma 4.5. If at least one of (A)-(D) holds, then S(U) is not exchangeable.
Proof. Assuming at least one of (A)-(D) holds, we use condition (2) from Lemma 4.1 to refute exchangeability of S(U) by exhibiting edges $e_1, e_2$ for which the joint distribution of the evaluations of the other edges changes when we go from conditioning on $(e_1 = 1, e_2 = 0)$ to conditioning on $(e_1 = 0, e_2 = 1)$.
If (A) holds then let $e_1 = e$ and $e_2$ be any edge on C. The joint distribution of the other edges on C (besides $e_2$) goes from being uniform conditioned on having a particular parity to being uniform conditioned on having the opposite parity.
If (B) holds then let $e_1 = e$ and $e_2$ be one of the two self-loops at the ends of P. The joint distribution of the other edges on P, together with the self-loop at the other end of P, goes from being uniform conditioned on having a particular parity to being uniform conditioned on having the opposite parity.
If (C) or (D) holds then call the edges $e_1, e_2, e_3$ where $e_1, e_2$ disagree and $e_2, e_3$ agree. Then the marginal probability that $e_3$ evaluates to 1 goes from 0 to 1.
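The parity structure behind Case 1/2 can also be observed by enumeration. This is a small sketch under assumptions made here: an edge is encoded as `(u, v, c)` and outputs x_u XOR x_v XOR c, a self-loop is encoded as `(u, u, c)` and outputs x_u XOR c, and the example graphs and function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def xor_output_distribution(num_inputs, edges):
    """Output distribution of XOR-type edges and self-loops on uniform inputs."""
    dist = defaultdict(Fraction)
    weight = Fraction(1, 2 ** num_inputs)
    for x in product((0, 1), repeat=num_inputs):
        out = tuple((x[u] ^ c) if u == v else (x[u] ^ x[v] ^ c) for u, v, c in edges)
        dist[out] += weight
    return dict(dist)


def depends_only_on_hamming_weight(dist, n):
    """For n-bit strings this is equivalent to exchangeability."""
    prob_by_weight = {}
    for w in product((0, 1), repeat=n):
        p = dist.get(w, Fraction(0))
        if prob_by_weight.setdefault(sum(w), p) != p:
            return False
    return True


# Condition (ii): a simple cycle on nodes 0, 1, 2.
cycle = [(0, 1, 0), (1, 2, 0), (0, 2, 0)]
# Condition (iii): a simple path 0 - 1 with a self-loop at each end.
path_with_loops = [(0, 0, 0), (0, 1, 0), (1, 1, 0)]
# Obstacle (A): the same cycle plus an edge with an endpoint (node 3) off the cycle.
obstacle_a = cycle + [(2, 3, 0)]
```

For the cycle and the path with self-loops, the outputs are uniform over strings of even parity. In the obstacle (A) example the first three output bits must still have even parity while the fourth is free, so (1, 1, 0, 0) has probability 1/8 and (1, 0, 0, 1) has probability 0.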
Case 3/4: In this case, each edge of the graph is OR-type. Let $S'$ denote the circuit obtained from S by negating every output bit. Then S(U) is exchangeable iff $S'(U)$ is exchangeable. Every edge of the graph for $S'$ is AND-type, so we can use the characterization from Case 1/4 to decide in linear time whether $S'(U)$ is exchangeable.
Case 1: If all output bits of S are constant 1 then S(U) is trivially exchangeable.

Pairwise independence for 2-local samplers
We prove Theorem 2.13. Consider a 2-local (1, n)-sampler S, and assume without loss of generality that S has no constant output bits. We claim that S(U) is pairwise independent iff both of the following conditions hold.
(i) The graph has no multi-edges.
(ii) For each node v of the graph, there is at most one of the following among the edges incident to v: a self-loop, an AND-type edge, or an OR-type edge.
It is trivial to check in linear time whether condition (ii) holds. Condition (i) is an instance of the element distinctness problem, which is the problem of deciding whether a list of numbers (encoding pairs of nodes, in our situation) has no duplicates. The element distinctness problem can be solved in deterministic O(n log n) time by sorting, and it can be solved in zero-error randomized expected linear time. We supply a folklore proof of the following lemma in Section A.2.

Lemma 4.6. The element distinctness problem has a zero-error randomized expected linear-time algorithm.
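As a rough illustration of how condition (i) reduces to element distinctness, the sketch below uses Python's built-in hash set. This is not the folklore algorithm from Section A.2: it relies on the usual assumption that set operations take expected constant time, and the function names are hypothetical. The answer is always correct; only the running time depends on the hashing.

```python
def all_distinct(items):
    """Return True iff no value occurs twice (single pass over a hash set)."""
    seen = set()
    for item in items:
        if item in seen:
            return False
        seen.add(item)
    return True


def has_no_multi_edges(edges):
    """Condition (i): no two edges join the same unordered pair of nodes.

    Each edge is a pair (u, v); a self-loop is written (u, u).
    """
    return all_distinct((min(u, v), max(u, v)) for u, v in edges)
```

Normalizing each edge to `(min, max)` makes the check insensitive to the order in which the two endpoints are listed.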
We now verify that (i) and (ii) characterize pairwise independence. First we assume that (i) and (ii) both hold, and show that the evaluations of two arbitrary edges $e_1, e_2$ are independent. If $e_1, e_2$ are disjoint then this is immediate; otherwise they share a node v. Since (i) and (ii) hold, the characterization in the proof of Theorem 2.11 implies that the edges incident to v are fully independent of each other; in particular $e_1, e_2$ are independent. Conversely, suppose (i) and (ii) do not both hold. A simple case analysis shows that if two edges form a multi-edge pair, or if they share a node and neither is XOR-type, then they cannot be independent, and so S(U) is not pairwise independent.
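The characterization by (i) and (ii) can be compared against a brute-force test on small examples. The sketch below is illustrative only and makes simplifying assumptions not found in the text: edges are triples `(u, v, kind)` with no negations, `kind` is one of four hypothetical labels, and a self-loop is written `(u, u, "LOOP")`.

```python
from fractions import Fraction
from itertools import combinations, product

# Assumed gate set without negations; a self-loop (u, u, "LOOP") outputs x_u.
GATES = {
    "XOR": lambda a, b: a ^ b,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "LOOP": lambda a, b: a,
}


def satisfies_conditions(edges):
    """Conditions (i) and (ii) for edges given as (u, v, kind)."""
    pairs = [(min(u, v), max(u, v)) for u, v, _ in edges]
    if len(set(pairs)) != len(pairs):  # (i) fails: some multi-edge exists
        return False
    non_xor_count = {}
    for u, v, kind in edges:
        if kind != "XOR":
            for node in {u, v}:
                non_xor_count[node] = non_xor_count.get(node, 0) + 1
    # (ii): at most one self-loop / AND-type / OR-type edge at each node
    return all(count <= 1 for count in non_xor_count.values())


def pairwise_independent(num_inputs, edges):
    """Brute force over all inputs: every pair of output bits is independent."""
    inputs = list(product((0, 1), repeat=num_inputs))
    outputs = [tuple(GATES[k](x[u], x[v]) for u, v, k in edges) for x in inputs]
    total = len(inputs)
    for i, j in combinations(range(len(edges)), 2):
        for a, b in product((0, 1), repeat=2):
            joint = Fraction(sum(o[i] == a and o[j] == b for o in outputs), total)
            p_i = Fraction(sum(o[i] == a for o in outputs), total)
            p_j = Fraction(sum(o[j] == b for o in outputs), total)
            if joint != p_i * p_j:
                return False
    return True


xor_triangle = [(0, 1, "XOR"), (1, 2, "XOR"), (0, 2, "XOR")]
shared_ands = [(0, 1, "AND"), (1, 2, "AND")]
mixed = [(0, 1, "AND"), (1, 2, "XOR"), (2, 2, "LOOP")]
```

The XOR triangle is a useful reminder that pairwise independence is weaker than full independence: its three output bits are pairwise independent although they always have even parity.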

Proofs of approximate completeness results
We prove Theorem 2.16 and Theorem 2.17 in Section 5.1, and we prove Theorem 2.19 and Theorem 2.20 in Section 5. Without loss of generality, S(U) is independent.
More generally, Sahai and Vadhan proved that for all functions c, f computable in time polynomial in s, the problem STATISTICAL 1) , and is in prSZK 1) . This can be used to improve the parameters (as functions of s) in our theorems. For example, in Theorem 2.16, c can be $2^{-s^{o(1)}}$ and f can be $(1/4) - 2^{-s^{o(1)}}$. We chose to state our theorems using constants (except Theorem 2.17, where the 1/(n + 1) factor is needed) for simplicity and clarity. It is awkward to handle reductions when the c, f functions depend on the size of one circuit for one problem but on the size of a different circuit for the other problem.

Approximate full independence
We now prove Theorem 2.16 and Theorem 2.17.
First we show that if S ∈ STATISTICAL-DISTANCE$^{2c,4f}_{\mathrm{YES}}$ then S ∈ FULLY-INDEPENDENT$^{c,f}_{\mathrm{YES}}$. Suppose $\|D_1 - D_2\| \le 2c$. Let $D^*$ be the independent distribution whose first coordinate is uniform over {1, 2} and whose second coordinate is $D_1$. For any event $E \subseteq \{1, 2\} \times \{0, 1\}^k$ and $b \in \{1, 2\}$, let This implies that where the third line follows by independence and by the assumption that Pr[D . Hence by the second equality in Definition 2.14, $\|D_1 - D_2\| < 4f$.

The proof of Theorem 2.17 uses the following lemma.
Lemma 5.4. Suppose D is a distribution over $(\{0, 1\}^k)^n$. If D is c-close to some fully independent distribution $D^*$, then D is (n + 1)c-close to the distribution $D'$ that is fully independent and has the same marginals as D.
Lemma 5.4 can be proven using a simple hybrid argument.The case n = 2 was proven in [5], but the same argument works for general n; we omit the details.
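The distribution in Lemma 5.4 that is fully independent with the same marginals as D is the product of the marginals of D. The sketch below computes it and the statistical distance for a small example. It is only a numerical illustration under assumptions made here: distributions are dictionaries from tuples to exact `Fraction` probabilities, and the function names and the example are hypothetical.

```python
from fractions import Fraction
from itertools import product


def statistical_distance(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return sum(abs(p.get(w, 0) - q.get(w, 0)) for w in keys) / 2


def product_of_marginals(dist, n):
    """The fully independent distribution with the same marginals as dist."""
    marginals = []
    for i in range(n):
        m = {}
        for w, p in dist.items():
            m[w[i]] = m.get(w[i], 0) + p
        marginals.append(m)
    result = {}
    for w in product(*(m.keys() for m in marginals)):
        prob = Fraction(1)
        for i, value in enumerate(w):
            prob *= marginals[i][value]
        result[w] = prob
    return result


# Hypothetical example with n = 2 and k = 1: a slightly correlated pair of bits.
example = {
    (0, 0): Fraction(3, 10), (0, 1): Fraction(1, 5),
    (1, 0): Fraction(1, 5), (1, 1): Fraction(3, 10),
}
uniform = {w: Fraction(1, 4) for w in product((0, 1), repeat=2)}
```

Here both marginals are uniform, so the product of marginals is the uniform distribution and the distance from `example` to it is 1/10, comfortably within the (n + 1)c bound with c = 1/10.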
Proof of Theorem 2.17. We reduce FULLY-INDEPENDENT c, f to STATISTICAL-DISTANCE c , f (which is in prSZK by Theorem 5.3). Given a (k, n)-sampler S, construct a (k, n)-sampler $S'$ that runs S independently n times and outputs $(w_1, \dots, w_n)$ where $w_i$ is the $i$th coordinate of the output of the $i$th run. Let D = S(U) and $D' = S'(U)$, and note that $D'$ is as in the statement of Lemma 5.4.
The reduction outputs a (kn, 2)-sampler Ŝ whose first coordinate is a sample from D and whose second coordinate is a sample from $D'$. If S ∈ FULLY-INDEPENDENT c, f YES then by Lemma 5.4,

Approximate exchangeability
We now prove Theorem 2.19 and Theorem 2.20.
Proof of Theorem 2.19. We reduce STATISTICAL-DISTANCE$^{c,2f}$ (which is prSZK-hard by Theorem 5.2) to EXCHANGEABLE$^{c,f}$. Given a (k, 2)-sampler S and letting D = S(U) where, without loss of generality, $D_1, D_2$ are independent, the reduction is the identity map. First we show that if S ∈ STATISTICAL-DISTANCE$^{c,2f}_{\mathrm{YES}}$ then S ∈ EXCHANGEABLE$^{c,f}_{\mathrm{YES}}$. Suppose $\|D_1 - D_2\| \le c$. Let $D^*$ be the independent distribution over $(\{0, 1\}^k)^2$ both of whose marginals are

THEORY OF COMPUTING, Volume 11 (1), 2015, pp. 1-34

where the second inequality follows by

Now fix some W. Note that since $D^*$ is exchangeable, all elements of Ord(W) have the same probability under $D^*$; call this probability $p^*_W$. If w is an element of Ord(W) then permuting the coordinates of w uniformly at random yields a uniformly random element of Ord(W). Thus all elements of Ord(W) have the same probability under $D'$, namely

Proof of Theorem 2.20. For any constant $c'$ such that $2c < c' < f^2$, we reduce EXCHANGEABLE$^{c,f}$ to STATISTICAL-DISTANCE$^{c',f}$ (which is in prSZK by Theorem 5.3). Given a (k, n)-sampler S, construct a (k, n)-sampler $S'$ that performs the following computation.
        interpret π as a permutation on {1, ..., n}
        run S to get w = (w_1, ..., w_n)
        halt and output (w_{π(1)}, ..., w_{π(n)})
    end
end
halt and output the all 0's element of ({0, 1}^k)^n

We now investigate the structural properties of BC = P. We begin with the following amplification result for BC = P, which is somewhat less trivial than usual amplification results. Then we apply this lemma to obtain closure properties of BC = P.

Lemma 6.2. For all languages L, the following are equivalent.
(1) For some polynomial q, there is a Karp reduction that takes x and outputs a (1, 1)-sampler S such that the following both hold.9 x ∈ L =⇒ S(U) is uniform . .
(3) For every polynomial Q, there is a Karp reduction that takes x and outputs a (1, 1)-sampler S such that the following both hold.
Proof. Clearly (3) ⇒ (2) ⇒ (1), so we just need to demonstrate (1) ⇒ (3). Assume (1). By the standard trick for making C = P have "1-sided error" (see Section A.1), it follows that there is a similar reduction mapping x to some (different) (

Hence if x ∈ L then $S'$ accepts with probability 1/2. If S accepts with probability then by a standard concentration bound, the probability that ≥ t of m runs of S accept is Also by a standard concentration bound, Hence if x ∉ L then $S'$ accepts with probability The reduction for (3) just outputs $S'$.

Note that coRP ⊆ BC = P. The 1-sided error property in part (3) of Lemma 6.2 implies that BC = P ⊆ BPP. Thus BC = P is presumably closed under complement (since presumably P = BC = P = BPP [23]), but proving this seems out of reach. We now apply Lemma 6.2 to prove that BC = P is closed under union, intersection, disjunction (i.e., ∨L ∈ BC = P if L ∈ BC = P, where $\vee L = \{(x_1, \dots, x_\ell) : x_i \in L \text{ for some } i\}$), and conjunction (i.e., ∧L ∈ BC = P if L ∈ BC = P, where $\wedge L = \{(x_1, \dots, x_\ell) : x_i \in L \text{ for all } i\}$). The proof of Lemma 3.4 in Section A.1 showing that C = P is closed under disjunction does not work to show that BC = P is closed under disjunction.

Theorem 6.3. BC = P is closed under disjunction.
Proof. Assuming L ∈ BC = P, we exhibit a reduction witnessing ∨L ∈ BC = P. Given $(x_1, \dots, x_\ell)$, by padding we may assume without loss of generality that the $x_i$'s all have the same length $n \ge \ell$. By (2) ⇒ (3) in Lemma 6.2, there is a reduction that takes $x_i$ and outputs $C_i$ such that the following both hold.
Our reduction witnessing ∨L ∈ BC = P runs the above reduction for L on each $x_i$ to obtain the circuits $C_i$, then outputs a circuit S that runs each $C_i$ independently and combines their results with a parity gate. If $(x_1, \dots, x_\ell) \in \vee L$ then S is taking the parity of independent bits at least one of which is uniform, so S(U)

Proof. Assuming L ∈ BC = P, we exhibit a reduction witnessing ∧L ∈ BC = P. We are given $(x_1, \dots, x_\ell)$. Lemma 6.2 implies that BC = P can have 1-sided error, so there is a reduction that takes $x_i$ and outputs $C_i$ such that the following both hold.
to drawing balls from an urn with replacement, and when changed to without replacement the problem becomes equivalent to exchangeability, which is decidable in C = P. Are there other interesting structural properties or applications of BC = P?

Theorem 2.10.
All the languages listed in Fact 2.3 are in C = P.

3 .
Given a (1, 1, s)-sampler S, we first find the smallest m such that c • κ(m)ν(m) + s ≤ σ (m).Such an m exists and is O(s) because κν ≤ o(σ ) and σ (m) ≥ m for all m.Then we run the algorithm from Corollary 3.3 (based on Lemma 3.1) with k = κ(m) and n = ν(m) to get T of size at most c • κ(m)ν(m) + s ≤ σ (m).Thus the following both hold.
$D'_h = b_h$ for all $h \in \{1, \dots, n\}$. To show that D and $D'$ are equal as distributions, we just need to show that $\Pr[E] = \Pr[E']$. If $b_i = b_j$ then this certainly holds since E and $E'$ are the same event. If $b_i \ne b_j$ and $\Pr[D_i \ne D_j] = 0$ then $\Pr[E] = \Pr[E'] = 0$. Finally, suppose $b_i \ne b_j$ and $\Pr[D_i \ne D_j] > 0$. Assume $b_i = 1$ and $b_j = 0$; the other case is symmetric. Since $D_i, D_j$ are identically distributed by (1), it follows that
$$\Pr[D_i = 1, D_j = 0] = \Pr[D_i = 0, D_j = 1] = \Pr[D'_i = 1, D'_j = 0] > 0.$$
Also note that $\Pr[E \mid D_i = 1, D_j = 0] = \Pr[E' \mid D'_i = 1, D'_j = 0]$ by (2) and the definition of $D'$. Putting the pieces together, we have
$$\Pr[E] = \Pr[E \mid D_i = 1, D_j = 0] \cdot \Pr[D_i = 1, D_j = 0] = \Pr[E' \mid D'_i = 1, D'_j = 0] \cdot \Pr[D'_i = 1, D'_j = 0] = \Pr[E'].$$

Lemma 5 . 5 . 2 . ( 5 . 1 )
for some distribution $D^*$ (not necessarily the same as above) that is exchangeable. In particular, $D^*_1, D^*_2$ are identically distributed. We trivially have $\|D_1 - D^*_1\| \le \|D - D^*\|$ and $\|D_2 - D^*_2\| \le \|D - D^*\|$. Thus by the triangle inequality, $\|D_1 - D_2\| \le \|D_1 - D^*_1\| + \|D_2 - D^*_2\| < 2f$.

The proof of Theorem 2.20 uses the following lemma.

Lemma 5.5. Suppose D is a distribution over $(\{0, 1\}^k)^n$. If D is c-close to some exchangeable distribution $D^*$, then D is 2c-close to the distribution $D'$ obtained by drawing a sample from D then permuting the coordinates according to a uniformly random permutation.

Proof of Lemma 5.5. For a multiset $W \subseteq \{0, 1\}^k$ of size n, we say that $w \in (\{0, 1\}^k)^n$ is an ordering of W if the multiset $\{w_i : i \in \{1, \dots, n\}\}$ equals W. Let Ord(W) denote the set of all orderings of W. Let $d^{*+}_W$ be the sum of $\Pr[D = w] - \Pr[D^* = w]$ over all w ∈ Ord(W) such that $\Pr[D = w] - \Pr[D^* = w] > 0$, and let $d^{*-}_W$ be the sum of $\Pr[D^* = w] - \Pr[D = w]$ over all w ∈ Ord(W) such that $\Pr[D^* = w] - \Pr[D = w] > 0$. Then by the third equality in Definition 2.14, we have
$$\|D - D^*\| = \frac{1}{2} \sum_{\text{multisets } W \subseteq \{0,1\}^k \text{ of size } n} \left(d^{*+}_W + d^{*-}_W\right).$$
Letting $d^{+}_W$ and $d^{-}_W$ be the analogous quantities with $D'$ instead of $D^*$, we have

Since this holds for all W, we get $\|D - D'\| \le 2 \cdot \|D - D^*\|$ by (5.1) and (5.2).

We mention that the constant factor of 2 in Lemma 5.5 is tight, by the following example. Suppose k = 1, and suppose D is uniformly distributed over a set of n strings, one of which has Hamming weight 1 and the other n − 1 of which have Hamming weight n − 1. Let $D^*$ be uniformly distributed over the strings of Hamming weight n − 1. Note that $D^*$ is exchangeable, and $\|D - D^*\| = 1/n$. However, $D'$ has probability $1/n^2$ on each string of Hamming weight 1, and probability $(n - 1)/n^2$ on each string of Hamming weight n − 1, and thus $\|D - D'\| = 2(n - 1)/n^2 = (2 - 2/n) \cdot \|D - D^*\|$.
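The tightness example for Lemma 5.5 can be checked numerically for small n. The following sketch is an illustration under assumptions made here: the function names are hypothetical, the particular choice of which n − 1 strings of Hamming weight n − 1 lie in the support is arbitrary, and n ≥ 3 is assumed so that weight 1 and weight n − 1 differ.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def statistical_distance(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return sum(abs(p.get(w, 0) - q.get(w, 0)) for w in keys) / 2


def symmetrize(dist, n):
    """Draw a sample from dist, then permute its coordinates uniformly at random."""
    result = {}
    share = Fraction(1, factorial(n))
    for w, p in dist.items():
        for perm in permutations(range(n)):
            permuted = tuple(w[perm[h]] for h in range(n))
            result[permuted] = result.get(permuted, 0) + p * share
    return result


def tightness_example(n):
    """Return (D, D_star) from the example, for n >= 3 and k = 1."""
    weight_one = tuple(1 if h == 0 else 0 for h in range(n))
    heavy = [tuple(0 if h == z else 1 for h in range(n)) for z in range(n)]
    d = {w: Fraction(1, n) for w in [weight_one] + heavy[1:]}
    d_star = {w: Fraction(1, n) for w in heavy}
    return d, d_star
```

For n = 4 this gives a distance of 1/4 from D to the exchangeable distribution and 3/8 from D to its symmetrization, so the ratio is 3/2, and the ratio 2 − 2/n approaches 2 as n grows.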

Definition 6.1.
Let D = S(U), let $D'$ be as in the statement of Lemma 5.5, and let $D'' = S'(U)$. Conditioned on halting inside the for loop, $D''$ has the same distribution as $D'$. In each iteration, there is > 1/2 probability that π < n! and the computation of $S'$ halts. Hence the probability the computation halts on the last line (after failing to halt inside the for loop) is $< c' - 2c$. This implies that $\|D' - D''\| < c' - 2c$. The reduction outputs a (kn, 2)-sampler Ŝ whose first coordinate is a sample from D and whose second coordinate is a sample from $D''$. If S ∈ EXCHANGEABLE$^{c,f}_{\mathrm{YES}}$ then by Lemma 5.5, $\|D - D'\| \le 2c$, so by the triangle inequality $\|D - D''\| \le \|D - D'\| + \|D' - D''\| < 2c + (c' - 2c) = c'$ and hence Ŝ ∈ STATISTICAL-DISTANCE$^{c',f}_{\mathrm{YES}}$. If S ∈ EXCHANGEABLE$^{c,f}_{\mathrm{NO}}$ then $\|D - D''\| \ge f$ since $D''$ is exchangeable, and hence Ŝ ∈ STATISTICAL-DISTANCE$^{c',f}_{\mathrm{NO}}$.

BC = P

We now consider the bounded-error version of C = P, which does not seem to have been defined or studied in the literature before. prBC = P is the class of all promise problems Karp-reducible to the following promise problem, BOUNDED-UNIFORM-BIT.

BOUNDED-UNIFORM-BIT$_{\mathrm{YES}}$ = { S : S is a (1, 1)-sampler and S(U) is uniform },
BOUNDED-UNIFORM-BIT$_{\mathrm{NO}}$ = { S : S is a (1, 1)-sampler and $|\Pr[S(U) = 1] - (1/2)| \ge 1/4$ }.

BC = P is defined as the class of languages in prBC = P.
(1, 1)-sampler S that achieves $\Pr[S(U) = 1] \le \frac{1}{2} - \frac{1}{q(|x|)^2}$ in the case x ∉ L. Construct a new circuit $S'$ that performs the following computation.

let $m = 2q(|x|)^4 Q(|x|)$ and $t = \left(\frac{1}{2} - \frac{1}{2q(|x|)^2}\right) \cdot m$
choose a uniformly random bit b
if b = 0 then run S independently m times, and accept iff ≥ t of these runs accept
else accept with probability 1

Note that the values of m, t, and $\sum_{i=t}^{m} \binom{m}{i}$ can be precomputed in polynomial time and hard-wired into $S'$. If S accepts with probability 1/2 then the probability that ≥ t of m runs of S accept is
The reduction's running time is polynomial since m, κ(m), ν(m), σ (m) are all polynomially bounded in s and computable in time polynomial in s, and since the algorithm from Corollary 3.3 runs in time O(kn + n 2 + s).