%% ToC v002/a006 (2006) pp 121--135
%% Ran Raz: Separation of Multilinear Circuit and Formula Size

\documentclass{toc}

\newcommand{\remove}[1]{}

\def\F{{\mathrm F}}
\def\G{{\mathrm G}}

%%\newcommand{\Rank}{\mbox{Rank}}
\DeclareMathOperator{\Rank}{{\mathrm Rank}}

\tocdetails{%
  volume=2, number=6, year=2006, firstpage=121,
  received={September 30, 2005},
  revised={March 4, 2006},
  published={May 2, 2006}
}

\bibliographystyle{tocplain}

\begin{document}

\runningauthor{Ran~Raz}
\runningtitle{Separation of Multilinear Circuit and Formula Size}

\copyrightauthor{Ran Raz}

\begin{frontmatter}[classification=text]
\title{Separation of Multilinear Circuit and Formula Size}

\author[ran]{Ran Raz\thanks{Research supported by the Israel Science
    Foundation (ISF) and by the Minerva Foundation.}}
%
%  {\tt ran.raz@weizmann.ac.il} }

\tocams{68Q15, 68Q17, 68Q25, 94C05}

\tocacm{F.2.2, F.1.3, F.1.2, G.2.0}

\tockeywords{computational complexity, separation of complexity classes,
       lower bounds, circuit complexity, arithmetic circuits,
       arithmetic formulas, log-depth, multilinear polynomials}

\tocpdftitle{Separation of Multilinear Circuit and Formula Size}
\tocpdfauthor{Ran Raz}

\begin{abstract}
  An arithmetic circuit or formula is multilinear
  if the polynomial computed at each of its wires is multilinear.
  We give an explicit polynomial $f(x_1,\dots,x_n)$
  with coefficients in $\{0,1\}$ such that over any field:\\
  \parbox{\textwidth}{
  \begin{enumerate}
   \item $f$ can be computed by a polynomial-size multilinear circuit
    of depth $O(\log^2 n)$.
  \item Any multilinear formula for $f$ is of size
    $n^{\Omega(\log n)}$.
  \end{enumerate}}
  \noindent
  This gives a superpolynomial gap between multilinear circuit
  and formula size, and separates multilinear $NC_1$ circuits
  from multilinear $NC_2$ circuits.
\end{abstract}

\end{frontmatter}

\section{Introduction}

An outstanding open problem in arithmetic
circuit complexity is to understand the relative power of
circuits and formulas. Surprisingly, any arithmetic
circuit of size $s$ for a polynomial of degree $d$ can be
translated into an arithmetic formula of size quasi-polynomial
in $s$ and~$d$~\cite{H,VSBR}.\footnote{Moreover,
if $s,d$ are both polynomial in the number of input variables $n$,
then the circuit can be translated into a polynomial-size
circuit of depth $O(\log^2n)$, that is, an
$NC_2$ circuit~\cite{VSBR}.}
Can such a circuit be translated
into a formula of size {\it polynomial} in $s$ and~$d$~?

In this paper, we answer this question for
{\it multilinear} circuits and formulas.
An arithmetic circuit (or formula) is {\it multilinear}
if the polynomial computed at each of its
wires is multilinear (as a formal polynomial), that is, in
each of its monomials the exponent  
of every input variable is at most one.

A preliminary version of this
paper appeared in FOCS 2004. The title was
``Multilinear-$NC_1$ $\neq$ Multilinear-$NC_2$''~\cite{R2}.


\subsection{Multilinear Circuits} \label{s:def}

Let $\F$ be a field, and let $\{x_1,\dots,x_n\}$
be a set of input variables.
An {\it arithmetic circuit} is a directed acyclic graph with
nodes of in-degree 0 or 2.
We refer to the in-neighbors of a node as its ``children.''
Every {\it leaf} of the graph (i.\,e., a node of in-degree 0)
is labelled with either an input variable or a field element.
Every other node of the graph is labelled with either $+$ or $\times$
(in the first case the node is a {\it sum gate}
and in the second case a {\it product gate}).
We assume that
there is only one node of out-degree zero, called the {\it root}.
The circuit is a {\it formula} if its underlying graph
is a (binary) tree (with edges directed from the leaves to the root).

An arithmetic circuit computes a polynomial in the 
ring $\F[x_1,\dots,x_n]$ in the following way.
A leaf just computes the input variable or field element
that labels it.   A sum gate  computes the sum of the two 
polynomials computed by its children.
A product gate computes the product of the two polynomials
computed by its children.   The {\it output} of the circuit is the
polynomial computed by the root.   For a circuit $\Phi$,
we denote by $\hat{\Phi}$ the output of the circuit,
that is, the polynomial computed by the circuit.
The {\it size} of a circuit $\Phi$ is defined to be the number of
nodes in the graph, and is denoted by $|\Phi|$.
The {\it depth} of a circuit is defined to be the maximal distance
between the root and a leaf in the graph.

A polynomial in the ring $\F[x_1,\dots,x_n]$ is {\it multilinear} if
in each of its monomials the exponent 
of every input variable is at most one.
An arithmetic circuit (or formula) is {\it multilinear}
if the polynomial computed by each gate of the circuit is multilinear.


\subsection{Background}

Multilinear circuits (and formulas) were formally defined
by Nisan and Wigderson in~\cite{NW}.
Obviously, multilinear circuits can only compute multilinear
functions.
Moreover, multilinear circuits are restricted, as they do not allow the
intermediate use of higher powers of variables in order to finally
compute a certain multilinear function.
Note, however, that for many multilinear functions, circuits that are not
multilinear are very counter-intuitive, as they require
a ``magical'' cancellation of all high powers of variables.
For many multilinear functions, it seems ``obvious'' that the smallest
circuits and formulas should be multilinear.
Moreover, for most multilinear functions, no gain is known
to come from permitting higher powers.

For example, the (first entry of the)
product of $n$ matrices of size $n \times n$
is a multilinear function, and
the smallest known circuits for this function are multilinear.
It seems intuitively clear that the smallest circuits for this function
should be multilinear.
On the other hand, for some multilinear functions
the smallest known circuits are not multilinear.
For example, the determinant of an $n\times n$ matrix
is a multilinear function (of the $n^2$ entries)
that has polynomial size
arithmetic circuits but doesn't have known subexponential
size multilinear circuits.

Super-polynomial lower bounds for the size of multilinear formulas
were recently proved~\cite{R1}.
In particular, it was proved that
over any field, any multilinear
formula for the permanent or the determinant of an 
$n \times n$ matrix is of size $n^{\Omega(\log n)}$.
Note, however, that all known multilinear circuits for the permanent
or the determinant are of exponential size, and hence these bounds
don't give any separation between multilinear circuit and formula
size.

For more background and motivation for the study of multilinear
circuits and formulas see~\cite{NW,R1,A}.
For general background on algebraic complexity theory
see~\cite{G,BCS}.


\subsection{Our results}

We construct an explicit polynomial with the properties
specified in the following theorem.

\begin{theorem} \label{ttt1}
There exists an explicit multilinear
polynomial $f(x_1,\dots,x_n)$,
with coefficients in $\{0,1\}$, such that over any field:
\begin{itemize}
\item[(a)] $f$ can be computed by a polynomial-size multilinear circuit
of depth $O(\log^2 n)$;
\item[(b)] any multilinear formula for $f$ is of size
$n^{\Omega(\log n)}$.
\end{itemize}
\end{theorem}

Item (a) means that $f$ has a multilinear $NC_2$ circuit.
Item (b) implies that $f$ cannot be computed by a
polynomial-size multilinear circuit of depth $O(\log n)$, that is,
by a multilinear $NC_1$ circuit.\footnote{Note that
any (multilinear) circuit
of depth $O(\log n)$ can trivially be translated into a
polynomial size (multilinear) formula (of depth $O(\log n)$).}
This gives a super-polynomial gap between multilinear circuit
and formula size, and separates multilinear $NC_1$ circuits
from multilinear $NC_2$ circuits.

For the proof of our lower bound on the multilinear formula size of
$f$, we use methods from~\cite{R1}.
The main contribution of this paper is
the construction of a polynomial $f$ that can be computed
by small multilinear circuits, and to which these methods
can be applied.
%In particular, this shows that the proof method of~\cite{R1}
%is not strong enough to prove circuit lower bounds.


\section{Syntactic multilinear formulas}

Let $\Phi$ be an arithmetic circuit over the set of variables
$\{x_1,\dots,x_n\}$.
For every node $v$ in the circuit, denote by $\Phi_v$ the sub-circuit
with root $v$, and denote by
$X_v$ the set of variables
that occur in the circuit $\Phi_v$.
We say that an arithmetic circuit $\Phi$ is {\it syntactic multilinear}
if for every product gate $v$ of $\Phi$, with children $v_1,v_2$, the sets
of variables $X_{v_1}$ and $X_{v_2}$ are disjoint.

Note that any syntactic multilinear circuit is clearly multilinear.
At the other hand, a multilinear circuit is not necessarily syntactic
multilinear. Nevertheless, the following proposition shows that
without loss of generality we can assume that a multilinear formula
is syntactic multilinear.

\begin{proposition}[\cite{R1}] \label{p1} 
For any multilinear formula, there exists a
syntactic multilinear formula of the same size that computes the
same polynomial.
\end{proposition}
\begin{proof}
Let $\Phi$ be a multilinear formula. Let $v$ be a product gate in $\Phi$,
with children $v_1,v_2$, and assume that $X_{v_1}$ and $X_{v_2}$ both contain
the same variable $x_i$. Since $\Phi$ is multilinear, $\hat{\Phi}_v$ is
a multilinear polynomial and hence in at least one of the polynomials
$\hat{\Phi}_{v_1},\hat{\Phi}_{v_2}$ the variable $x_i$ doesn't occur.
W.l.o.g. assume that in the polynomial $\hat{\Phi}_{v_1}$ the variable
$x_i$ doesn't occur.
Then every occurrence of $x_i$ in $\Phi_{v_1}$ can be replaced
by the constant 0.
By repeating this for every product gate in the formula, as many times as
needed, we obtain a syntactic multilinear formula that computes the
same polynomial.
\end{proof}


\section{Lower bounds for multilinear formulas}

In this section, we prove general lower bounds for the size
of multiliear formulas. To prove these bounds we follow very closely the
techniques from~\cite{R1}.
As in~\cite{R1}, our starting point is
the partial derivatives method
of Nisan and Wigderson~\cite{N,NW}.   As in~\cite{R1},
to handle sets of partial derivatives, we make use of the
{\em partial derivatives matrix}  (first used in~\cite{N}).


\subsection{The partial-derivatives matrix} \label{mat}

Let $f$ be a multilinear polynomial over the
set of variables
$\{y_1,\dots,y_m\}\; \dot\cup \; \{z_1,\dots,z_m\}$.
For a multilinear monomial $p$ in the set of variables
$\{ y_1,\dots,y_m \}$ and a multilinear monomial $q$
in the set of variables $\{ z_1,\dots,z_m \}$,
denote by $M_f(p,q)$ the
coefficient of the monomial $pq$ in the polynomial $f$.
Since the number of multilinear monomials
in a set of $m$ variables\footnote{
We only consider monomials with coefficient 1
(such as $x_1x_3x_4$, as
          opposed to, say, $3x_1x_3x_4$).}
is $2^m$, we can think of $M_f$ as a
$2^m \times 2^m$ matrix, with entries in the field $\F$.
We call $M_f$ the {\em partial derivatives matrix} of $f$.
We will be interested in the rank of the matrix $M_f$ over the
field $\F$.

The following two propositions give some basic facts about the
partial derivatives matrix.

\begin{proposition} \label{p3}
  \begin{sloppypar}
    Let $f,f_1,f_2$ be three multilinear
    polynomials over the set of variables $\{y_1,\dots,y_m\}
    \;\dot\cup\; \{z_1,\dots,z_m\}$, such that $f=f_1+f_2$. Then $M_f =
    M_{f_1} + M_{f_2}.$
  \end{sloppypar}
\end{proposition}
\begin{proof}
Immediate from the definition of the partial derivatives matrix.
\end{proof}

\begin{proposition} \label{p4}
\begin{sloppypar}
Let $f,f_1,f_2$ be three multilinear polynomials over
the set of variables 
$\{y_1,\dots,y_m\} \; \dot\cup\;  \{z_1,\dots,z_m\}$,
such that $f=f_1 \cdot f_2$, and such that the set of variables that
$f_1$ depends on and the set of variables that $f_2$ depends on are
disjoint. Then, $\Rank(M_f) = \Rank(M_{f_1}) \cdot \Rank(M_{f_2}).$
\end{sloppypar}
\end{proposition}
\begin{proof}
Note that the matrix $M_f$ is the tensor product of $M_{f_1}$
and $M_{f_2}$
(where all matrices are restricted to rows and columns
that are non-zero).
Hence, the rank of $M_f$ is the product of the rank of $M_{f_1}$
and the rank of $M_{f_2}$.
\end{proof}

Let $\Phi$ be a multilinear formula over
the set of variables $\{y_1,\dots,y_m\} \;\dot\cup\;  \{z_1,\dots,z_m\}$.
Recall that the output $\hat{\Phi}$ of the formula $\Phi$ is
a multilinear polynomial over $\{y_1,\dots,y_m\} \;\dot\cup\;  \{z_1,\dots,z_m\}$.
For simplicity, we denote the matrix $M_{\hat{\Phi}}$ also by
$M_\Phi$.
We will be interested in bounding the rank of the matrix
$M_\Phi$ over the
field $\F$.
(Note, however, that the rank of $M_\Phi$ may be as large as $2^m$
(i.\,e., full rank),
even if the formula $\Phi$ is of linear size.)

%For a node $v$ in the formula,
%we denote the matrix $M_{\Phi_v}$ also by $M_v$.
%We will be interested in bounding the rank of the matrix $M_v$ over the
%field $\F$.
%(Note, however, that the rank of $M_v$ may be as large as $2^m$
%(i.\,e., full rank),
%even if the formula $\Phi$ is of linear size).


\subsection{Unbalanced nodes} \label{Sunb}

Let $\Phi$ be a syntactic multilinear formula over
the set of variables $\{y_1,\dots,y_m\} \;\dot\cup\;  \{z_1,\dots,z_m\}$.
For every node $v$ in the formula,
denote by
$Y_v$ the set of variables in $\{y_1,\dots,y_m\}$
that occur in the formula $\Phi_v$, and denote by
$Z_v$ the set of variables in $\{z_1,\dots,z_m\}$
that occur in the formula $\Phi_v$.

Denote by $b(v)$ the average of $|Y_v|$ and $|Z_v|$ and denote by
$a(v)$ their minimum.  Let  $d(v) = b(v) - a(v)$.
We say that a node $v$ is {\it $k$-unbalanced} if $d(v) \geq k$.

Let $\gamma$ be a simple path from a leaf $w$ to a node $v$
of the formula $\Phi$.
We say that $\gamma$ is {\it $k$-unbalanced}
if it contains at least one $k$-unbalanced node.
We say that $\gamma$ is {\it central} if for every $u,u_1$ on the path
$\gamma$ such that $u_1$ is a child of $u$,
we have $b(u) \leq 2 b(u_1)$.
Note that for every node $u$ in the formula,
with children $u_1,u_2$, we have
$b(u) \leq b(u_1) + b(u_2)$. Hence, by induction,
for every node $u$ in the formula,
there exists at least one central path that reaches $u$.
In particular, at least one central path reaches the root.

We say that the formula $\Phi$ is {\it $k$-weak} if
every central path
that reaches the root of the formula
contains at least one $k$-unbalanced node.
The following lemma from~\cite{R1} shows
that if the formula $\Phi$ is $k$-weak then
the rank of the matrix $M_\Phi$ can be bounded.

%We say that a node $v$ of the formula is {\it $k$-weak} if
%every central path that reaches $v$ is $k$-unbalanced.
%We say that the formula $\Phi$ is {\it $k$-weak} if the root
%of the formula is $k$-weak, that is, every central path
%that reaches the root contains at least one $k$-unbalanced node.
%The following lemma from~\cite{R1} shows
%that if a node $v$ is $k$-weak then
%the rank of the matrix $M_v$ can be bounded.

\begin{lemma}[\cite{R1}]  \label{l1}
\begin{sloppypar}
Let $\Phi$ be a syntactic multilinear formula over
the set of variables $\{y_1,\dots,y_m\} \;\dot\cup\;  \{z_1,\dots,z_m\}$,
and assume that $\Phi$ is $k$-weak.
Then,
\[
\Rank(M_\Phi) \leq |\Phi| \cdot 2^{m-k/2}\enspace.
\]
\end{sloppypar}
\end{lemma}


\subsection{Random partition} \label{rand}

Let $n=2m$.
Let $\Phi$ be a syntactic multilinear formula over
the set of variables
$X = \{x_1,\dots,x_n\}$.
Let $A$ be a random partition of the variables in $X$ into
$\{y_1,\dots,y_m\} \;\dot\cup\;  \{z_1,\dots,z_m\}$.
Formally, $A$ is
a (randomly chosen) one-to-one function from the set of variables
$X$ to the set of variables
$\{y_1,\dots,y_m\} \;\dot\cup\;  \{z_1,\dots,z_m\}$.

Denote by $\Phi_A$ the formula $\Phi$ after replacing
every variable of $X$ by the
variable assigned to it by $A$.
Obviously,
$\Phi_A$ is a syntactic multilinear formula over
the set of variables
$\{y_1,\dots,y_m\} \;\dot\cup\;  \{z_1,\dots,z_m\}$.

The following lemma shows that if $|\Phi|$ is small
then with high probability $\Phi_A$ is $k$-weak for $k=n^{1/8}$.
We will give the proof of the lemma in the next section.

\begin{lemma} \label{l2}
Let $n=2m$.
Let $\Phi$ be a syntactic multilinear formula over
the set of variables
$X = \{x_1,\dots,x_n\}$, such that every variable in $X$
occurs in $\Phi$, and such that
$|\Phi| \leq n^{\epsilon \log n},$
where $\epsilon$ is a sufficiently small universal constant
(e.g., $\epsilon = 10^{-6}$).
Let $A$ be a random partition of the variables in $X$ into
$\{y_1,\dots,y_m\} \;\dot\cup\;  \{z_1,\dots,z_m\}$.
Then, with probability of at least $1 - n^{-\Omega(\log n)}$
the formula $\Phi_A$ is $k$-weak, for
$k=n^{1/8}$.
\end{lemma}


\subsection{The lower bounds}

Lower bounds for the size of multilinear formulas can
be proved as a corollary to \lemref{l1} and
\lemref{l2}. We will prove lower bounds for functions that
satisfy the following {\it high rank} property.\footnote{Note
that the functions $f$ used in this paper
will actually satisfy a much stronger property. Namely,
for any partition $A$,
we will have $\Rank(M_{f_A}) = 2^{m}$
(where all notation is as in \defref{d1}).}

\begin{definition}[High rank]  \label{d1}
Let $n=2m$. Let $f$ be a multilinear polynomial
(over a field~$\F$) over the
set of variables $X = \{x_1,\dots,x_n\}$.
We say that $f$ is of\/ {\bf high rank} over $\F$
if the following is satisfied:
Let $A$ be a random partition of the variables in $X$ into
$\{y_1,\dots,y_m\} \;\dot\cup\;  \{z_1,\dots,z_m\}$.
Then, with probability of at least
$n^{-o(\log n)}$,
\[
\Rank(M_{f_A}) \geq 2^{m- m^{1/8}/2}\enspace,
\]
where the rank is over $\F$, and
$f_A$ denotes the polynomial $f$ after replacing
every variable in $X$ by the variable assigned to it by $A$.
\end{definition}

The following corollary is our basic lower bound.

\begin{corollary} \label{c1}
Let $n=2m$.
Let $f$ be a multilinear polynomial
(over a field $\F$)
over the
set of variables $X = \{x_1,\dots,x_n\}$.
If $f$ is of high rank over $\F$ (see \defref{d1})
then for any  multilinear formula  $\Phi$ for~$f$,
\[
|\Phi| \geq  n^{\Omega(\log n)}\enspace.
\]
\end{corollary}

\begin{proof}
By \propref{p1}, we can assume w.l.o.g. that
$\Phi$ is syntactic multilinear.
Note also that we can assume w.l.o.g.
that all the variables in $X$ occur in $\Phi$, as we can always
add variables multiplied by 0.
Assume for a contradiction that
$|\Phi| \leq n^{\epsilon \log n},$
where $\epsilon$ is the universal constant
from \lemref{l2}.
Let $A$ be a random partition of the variables in $X$ into
$\{y_1,\dots,y_m\} \;\dot\cup\;  \{z_1,\dots,z_m\}$.
Then, by \lemref{l2},
with probability of at least $1 - n^{-\Omega(\log n)}$
the formula $\Phi_A$ is $k$-weak, for
$k=n^{1/8}$.

Hence, by \lemref{l1},
with probability of at least $1 - n^{-\Omega(\log n)}$,
$$
\Rank(M_{\Phi_A}) < 2^{m - m^{1/8}/2}\enspace.
$$
Thus $\Phi$ cannot be a formula for the high rank function $f$.
\end{proof}

We will now consider multilinear polynomials $f$
(over a field $\F$)
over two sets of variables:
$X = \{x_1,\dots,x_n\}$ and $X' = \{x'_1,\dots,x'_l\}$.
We think of the variables in $X'$ as auxiliary variables.
Let $A': X' \to \F$ be an assignment of values in $\F$
to all the variables in $X'$.
We denote by $f_{A'}$ the polynomial $f$, after substituting
in every variable in $X'$ the value assigned to it by $A'$.
Note that $f_{A'}$ is a multilinear polynomial
over the set of variables $X$.

\begin{corollary} \label{c2}
Let $n=2m$.
Let $f$ be a multilinear polynomial
(over a field $\F$)
over the
sets of variables $X = \{x_1,\dots,x_n\}$ and
$X' = \{x'_1,\dots,x'_l\}$.
If for some assignment $A': X' \to \F$ the polynomial
$f_{A'}$ is of high rank over $\F$ (see \defref{d1})
then for any  multilinear formula  $\Phi$ for~$f$,
\[
|\Phi| \geq  n^{\Omega(\log n)}\enspace.
\]
\end{corollary}

\begin{proof}
Denote by $\Phi_{A'}$ the formula $\Phi$ after replacing
every variable of $X'$ by the
value assigned to it by $A'$. Then, $\Phi_{A'}$ is a formula
for $f_{A'}$, and $|\Phi_{A'}|  =|\Phi|$.
Hence,
by \corref{c1},
$|\Phi| = |\Phi_{A'}| \geq n^{\Omega(\log n)}.$
\end{proof}

In some cases, in order to find an assignment $A'$
such that the polynomial $f_{A'}$ is of high rank, we will need
to consider extensions $\G$ of the field $\F$. Note that
any polynomial $f$ over $\F$ is also a polynomial over any
field extending $\F$.

\begin{corollary} \label{c3}
Let $n=2m$.
Let $f$ be a multilinear polynomial
(over a field $\F$)
over the
sets of variables $X = \{x_1,\dots,x_n\}$ and
$X' = \{x'_1,\dots,x'_l\}$.
If for some field $\G \supset \F$ there exists an
assignment $A': X' \to \G$, such that the polynomial
$f_{A'}$ is of high rank over $\G$ (see \defref{d1})
then for any  multilinear formula  $\Phi$ for~$f$
(over the field $\F$),
\[
|\Phi| \geq  n^{\Omega(\log n)}\enspace.
\]
\end{corollary}

\begin{proof}
Any multilinear formula for $f$ over the field~$\F$ is also
a multilinear formula for $f$ over the field~$\G$.
The proof hence follows by \corref{c2}.
\end{proof}




\section{Proof of \lemref{l2}}


Let us first give a brief sketch of the proof.
Note that the intuition and the basic structure
of the proof are the same as in~\cite{R1},
but the details here are much simpler.

Intuitively, since $A$ is random,
every node $v$ with large enough $X_v$ will be $k$-unbalanced
with high probability.
The probability that such $v$ is not $k$-unbalanced
is smaller than $O(n^{-\delta})$, for some constant $\delta$.
This may not be enough since the number of central paths
is possibly as large as $n^{\epsilon \log n}$.
Nevertheless, each central path contains $\Omega(\log n)$ nodes
so we can hope to prove that the probability that none of them
is $k$-unbalanced is as small as $n^{-\Omega(\log n)}$.

This, however, is not trivial since there are dependencies
between the different nodes. We will
identify $\Omega(\log n)$
nodes, $v_1,\dots,v_l$, on the path (that will be ``far enough''
from each other).
We will show that for every $v_i$,
the probability that $v_i$ is not $k$-unbalanced
is smaller than $O(n^{-\delta})$, even when conditioning on the event
that $v_1,\dots,v_{i-1}$ are not $k$-unbalanced.



\subsection{Notation}

For any integer $n$,
denote $[n] = \{1,\dots,n\}$.

To simplify notation, we denote in this section
the formula $\Phi_A$ by $\Psi$.
There is a one-to-one correspondence between the nodes of $\Phi$
and the nodes of $\Psi$.
For every node $v$ in $\Phi$, there is a corresponding
node in $\Psi$ and vice versa. For simplicity, we denote both these nodes
by~$v$, and we think of them as the same node.
Hence, $X_v$ denotes the set of variables in $X$
that occur in the formula $\Phi_v$, while
$Y_v$ denotes the set of variables in $\{y_1,\dots,y_m\}$
that occur in the formula $\Psi_v$, and
$Z_v$ denotes the set of variables in $\{z_1,\dots,z_m\}$
that occur in $\Psi_v$.   Let   
$$
\alpha(v) = |X_v|/n\enspace.
$$

For three integers $M_1,M_2 \leq N$,
denote by ${\cal H}(N,M_1,M_2)$ the hypergeometric distribution with
parameters $N,M_1,M_2$, that is,
the distribution of the size
of the intersection of a random set of size $M_2$ and a set of
size $M_1$ in a universe of size $N$.

\begin{proposition} \label{ph}
Let $\chi$ be a random variable that has the hypergeometric
distribution ${\cal H}(N,M_1,M_2)$,  where
 $N/4 \leq M_2 \leq 3N/4$, and
$N^{1/2} \leq M_1 \leq N/2$.
Then, $\chi$ takes
any specific value with probability of at most $O(N^{-1/4})$.
That is, for any number $a$,
\[
\Pr [ \chi = a] \leq O(N^{-1/4})\enspace.
\]
\end{proposition}
\begin{proof}
Follows by the definition of
the hypergeometric distribution
and standard bounds on binomial coefficients.
\end{proof}


\subsection{Central paths are unbalanced}


Let $\gamma$ be a simple path from a leaf to a node
in $\Phi$.
Note that $\gamma$ is central in $\Psi$
iff for every $u,u_1$ on the path
$\gamma$, such that $u_1$ is a  child of $u$,
we have $\alpha(u) \leq 2 \alpha(u_1)$. Since this property doesn't
depend on the partition $A$, we say in this case that
$\gamma$ is central in $\Phi$.
We will show that if $\gamma$ is central
then with high probability $\gamma$ is unbalanced
in the formula~$\Psi$.

\begin{claim} \label{unb}
Let $\gamma$ be a central
path from a leaf to the root of $\Phi$.
Then,
\[
\Pr[\gamma \mbox{ is not } k \mbox{-unbalanced in }
\Psi \mbox{  } ] \leq
n^{- \Omega( \log n)}\enspace.
\]
\end{claim}

\begin{proof}
Recall that the first node of $\gamma$ is a leaf and hence
$\alpha(v)$ for that node is at most $1/n$, and the last node
of $\gamma$ is the root and hence
$\alpha(v)$ for that node is 1.
Note that $\alpha(v)$ is monotonously increasing along $\gamma$.
Let $v_1,\dots,v_l$ be nodes on $\gamma$, chosen by the following process:
Let $v_1$ be the first node on $\gamma$, such that
$\alpha(v_1) \geq n^{-1/2}.$
For every $i$, let $v_{i+1}$ be the first node on
$\gamma$, such that
$\alpha(v_{i+1}) \geq 2 \cdot \alpha(v_{i})$.
Stop when
%such $v_{i+1}$ doesn't exist or
$\alpha(v_{i+1}) > 1/4$.
Denote by $l$ the index~$i$ of the last $v_i$ in this process.

Since $\gamma$ is central, for every $u,u'$ on $\gamma$, such that
$u'$ is a  child of $u$,  we have $\alpha(u) \leq 2 \alpha(u')$.
Hence, for every $i \in [l-1]$,
we have $\alpha(v_{i+1}) < 4 \cdot \alpha(v_{i})$.
Hence, the process above continues for $\Omega(\log n)$ steps. To
summarize, we have $l = \Omega(\log n)$ and nodes $v_1,\dots,v_l$ on
$\gamma$, such that
for every $i \in \{2,\dots,l\}$,
$$
1/4 \geq \alpha(v_i) \geq 2 \cdot \alpha(v_{i-1})
\geq  n^{-1/2}\enspace.
$$


Denote by ${\cal E}$ the event that $\gamma$ is
not $k$-unbalanced in the formula $\Psi$.
For every $i \in [l]$, denote by ${\cal E}_i$
the event that the node $v_i$ is not $k$-unbalanced in the
formula $\Psi$. Since ${\cal E} \subset
\cap_{i \in [l]} {\cal E}_i$,
$$\Pr[{\cal E}] \leq
\Pr \left[ \bigcap_{i \in [l]} {\cal E}_i \right]
 = \prod_{i \in [l]}
\Pr \left[ {\cal E}_i \;
\left| \;  \bigcap_{i' \in [i-1]} {\cal E}_{i'}  \right] \right.  \enspace.$$
We will bound for every $i>1$ the conditional probability
$\Pr [ {\cal E}_i \;
| \;  \cap_{i' \in [i-1]} {\cal E}_{i'}  ]$.

Fix $i \in \{2,\dots,l\}$. Note that
$X_{v_{i-1}} \subset X_{v_i}$.
Given the
set $Y_{v_{i-1}}$, we can write,
$$|Y_{v_i}| = |Y_{v_{i-1}}| + \chi\enspace,$$ where $\chi$ has the
distribution ${\cal H}(N,M_1,M_2)$, with
$N = n - |X_{v_{i-1}}|$, $M_1 = |X_{v_i}| -|X_{v_{i-1}}|$,
$M_2 = m - |Y_{v_{i-1}}|$.

Hence, by \propref{ph},
$|Y_{v_i}|$ does not take any specific value with probability larger
than $O(n^{-1/4})$,
even when conditioning
on (the content of) the set $Y_{v_{i-1}}$.

Note that the event
$\bigcap_{i' \in [i-1]} {\cal E}_{i'}$ depends only on
the content of the set $Y_{v_{i-1}}$.
Therefore, $|Y_{v_i}|$,
and hence also $d(v_i)$,
do not take any specific value with probability larger
than $O(n^{-1/4})$,
even when conditioning
on the event $\bigcap_{i' \in [i-1]} {\cal E}_{i'}$.
Recall that $v_{i}$ is not $k$-unbalanced iff $d(v_i) < k$.
Since $d(v_i)$ is integer, the probability for that is
at most $O(k \cdot n^{-1/4}) = O(n^{-1/8})$,
even when conditioning
on the event $\bigcap_{i' \in [i-1]} {\cal E}_{i'}$.
That is,
$$\Pr \left[ {\cal E}_i \; \left| \;
\bigcap_{i' \in [i-1]} {\cal E}_{i'}  \right]
\leq O(n^{-1/8}) \right. \enspace.$$

We can now bound
$$\Pr[{\cal E}] \leq
\prod_{i \in [l]}
\Pr \left[ {\cal E}_i \;
\left| \;  \bigcap_{i' \in [i-1]} {\cal E}_{i'}  \right] =
n^{- \Omega( \log n)}
 \right . \enspace.$$
\end{proof}

We can now complete the proof of \lemref{l2}.
By \claimref{unb},
if  $\gamma$ is a central path from a leaf to
the root of $\Phi$, then $\gamma$ is not $k$-unbalanced
(in $\Psi$)
with probability of at most $n^{- \Omega( \log n)}$.
The number of paths from a leaf to the root
of $\Phi$ is the same as the
number of leaves in $\Phi$, which is smaller than
$n^{\epsilon \log n}$ (and we assumed that $\epsilon$ is small enough).
Hence, by the union bound, with probability
of at least $1 - n^{-\Omega(\log n)}$
all central paths from a leaf to the root of $\Psi$ are
$k$-unbalanced, that is,
the formula $\Psi$ is $k$-weak.


\section{Multilinear-$NC_1$ $\neq$ Multilinear-$NC_2$}

In this section, we present our construction for a multilinear
polynomial
$f$ that has polynomial-size multilinear circuits and
doesn't have polynomial-size multilinear formulas.
Let us start with some notation and
concepts needed to define $f$.

Let  $[n] = \{1,\dots,n\}$.
For every $i,j \in [n]$ such that $i \leq j$,
denote by $[i,j]$ the interval of $[n]$
starting at $i$ and ending at $j$, that is,
$[i,j] = \{i,i+1,\dots,j\}$.
Denote by ${\cal S}$ the set of all such intervals, including the empty
interval (which is denoted by $\emptyset$).
For $s_1, s_2 \in {\cal S}$, such that $s_1, s_2$
are disjoint and $s_2$ is
consecutive\footnote{We think of the empty interval as consecutive
to every interval, and every interval is consecutive to it.}
to $s_1$,
denote by $s_1 \circ s_2$ their concatenation,
that is, if $s_1 = [i,j]$, and $s_2 =[j+1,j']$ then
$s_1 \circ s_2 = [i,j']$.

Denote by ${\cal T}$ the set of (ordered)
pairs of disjoint intervals in ${\cal S}$,
that is,
$${\cal T} = \{(s_1,s_2) \in {\cal S} \times {\cal S} \; :
\; s_1 \cap s_2 = \emptyset \}\enspace.$$
For $t_1,t_2 \in {\cal T}$, such that,
$t_1=(s_{1,1},s_{1,2}), t_2=(s_{2,1},s_{2,2})$, and
such that $s_{1,1},s_{1,2}, s_{2,1},s_{2,2}$ are all disjoint
and  $s_{2,1}$ is consecutive to $s_{1,1}$
and  $s_{2,2}$ is consecutive to $s_{1,2}$,
denote by $t_1 \circ t_2$ their pairwise concatenation,
that is,
$t_1 \circ t_2 = (s_{1,1} \circ s_{2,1}, s_{1,2} \circ s_{2,2})
\in {\cal T}$.

For every $s \in {\cal S}$, denote by $l(s)$ its length
(i.\,e., the number of elements in it).
For $t=(s_1,s_2) \in {\cal T}$, denote
$l(t) = l(s_1) + l(s_2)$.
For $t=(s_1,s_2) \in {\cal T}$, define
$L(t) = l(t)$
if both $s_1, s_2$ are non-empty, and
$L(t) = 0.75 \cdot l(t)$ if either
$s_1$ or $s_2$ is empty.
(We will use $L(t)$ as a measure for the ``size'' of $t$.
For technical reasons we want $t$ to be considered smaller
if either
$s_1$ or $s_2$ is empty).

For $t_1,t_2,t_3,t \in {\cal T}$, such that,
$t_1=(s_{1,1},s_{1,2}), t_2=(s_{2,1},s_{2,2}),
 t_3=(s_{3,1},s_{3,2}), t=(s_1,s_2)$,
we say that $\{t_1,t_2\}$ is a {\it partition} of $t$
if $\{s_{1,1},s_{1,2}, s_{2,1},s_{2,2} \}$ is a partition
of $s_1 \cup s_2$ as sets.
We say that the partition is {\it proper} if
$t = t_1 \circ t_2$, and $l(t_1),l(t_2) > 0$.
In the same way,
$\{t_1,t_2,t_3\}$ is a partition of $t$
if $\{s_{1,1},s_{1,2}, s_{2,1},s_{2,2}, s_{3,1},s_{3,2}\}$
is a partition of $s_1 \cup s_2$ as sets.
The partition is proper if
$t = t_1 \circ t_2 \circ t_3$, and $l(t_1),l(t_2),l(t_3) > 0$.

For a function $A: [n] \to \{1,-1\}$ and for $s \in {\cal S}$,
denote by $A(s)$ the sum of $A$ on the elements in $s$.
In the same way,
for $t \in {\cal T}$, denote by $A(t)$ the sum of $A$ on
the elements in the union of the two intervals in $t$.
We say that
$A$ is balanced on $s \in {\cal S}$ if $A(s) =0$, and
in the same way,
$A$ is balanced on $t \in {\cal T}$
if $A(t) =0$.
Denote by ${\cal B}_A$ the set of all $t \in {\cal T}$
on which $A$ is balanced, that is,
$${\cal B}_A = \{ t \in {\cal T} \; : \; A(t)=0 \}\enspace.$$
Obviously, the length $l(t)$ of every $t \in {\cal B}_A$
is even.

For our proof, we will need the following technical lemma.
Roughly speaking, the lemma states that any $t \in {\cal B}_A$
can be partitioned into three significantly smaller
$t_1,t_2,t_3 \in {\cal B}_A$.
We defer the proof of the lemma to \secref{sssl}.

\begin{lemma} \label{l:part}
Let $A$ be a function $A: [n] \to \{1,-1\}$.
Let $t \in {\cal B}_A$ be such that $l(t) > 2$.
Then, there exist
$t_1,t_2,t_3 \in {\cal B}_A$, such that
$\{t_1,t_2,t_3\}$
is a partition of $t$, and
$L(t_1), L(t_2), L(t_3) \leq 0.75 \cdot L(t)$.
\end{lemma}

For any $t \in {\cal T}$, such that $l(t)$ is even,
denote by ${\cal P}(t)$ the set of all
$\{t_1,t_2,t_3\}$, such that: $t_1,t_2,t_3 \in {\cal T}$, and
$\{t_1,t_2,t_3\}$
is a partition of $t$, and
$l(t_1),l(t_2),l(t_3)$ are even, and
$L(t_1), L(t_2), L(t_3) \leq 0.75 \cdot L(t)$.



\subsection{The construction}

We will now define our multilinear polynomial $f$
(with coefficients in $\{0,1\}$), such that over any field,
$f$ can be computed by a polynomial-size multilinear circuit
and cannot be computed by a polynomial-size multilinear formula.
$f$ will be defined
over the set of variables $X = \{x_1,\dots,x_n\}$
(where $n$ is even)
and
a set of (auxiliary) variables
$$X' = \left\{ x'_{t,t_1,t_2,t_3} \right\}_{t,t_1,t_2,t_3
\in {\cal T}} \enspace.$$
That is, for every $t,t_1,t_2,t_3 \in {\cal T}$ we have
an (auxiliary) variable $x'_{t,t_1,t_2,t_3}$.
Note that the total number of auxiliary
variables is polynomial in $n$.

$f$ will be defined in the following way.
For every $t \in {\cal T}$, such that $l(t)$ is even,
we will define a multilinear polynomial
$f_t$. We then define
$$ f = f_{([n],\emptyset)}\enspace.$$
We define the polynomials $f_t$ by induction on $L(t)$:

\begin{description}
\item[Case 1] $L(t) = l(t) =0$.
We define in this case, $f_t = 1$.

\item[Case 2] $0 < L(t) \leq 2$.
Since $l(t)$ is even, $l(t) =2$.
Hence, the union of the two intervals in~$t$ contains two
indices. Denote these indices by $i_t,j_t$.
We define in this case,
$$f_t = x_{i_t} \cdot x_{j_t} + 1\enspace.$$
Note that for the two possible partitions of
$\{ x_{i_t}, x_{j_t} \}$ into
$\{ y_1 \} \;\dot\cup\; \{ z_1 \}$, the partial derivatives matrix
of $f_t$ is the identity matrix of size $2 \times 2$
and is hence of rank 2 (i.\,e., full rank).

\item[Case 3] $L(t) > 2$.
Since $l(t)$ is even, $l(t)$ is at least 4.
We define in this case,
$$f_t = \sum_{\{t_1,t_2,t_3\} \in {\cal P}(t)}
x'_{t,t_1,t_2,t_3} \cdot f_{t_1} \cdot f_{t_2} \cdot f_{t_3}\enspace.$$
Observe that (by the inductive definition)
for any $\{t_1,t_2,t_3\}$ that give
a partition of $t$,
the polynomials $f_{t_1}, f_{t_2}, f_{t_3}$ depend on disjoint
sets of variables.
Hence, since we only sum over $\{t_1,t_2,t_3\}$ that give
partitions of $t$,
it follows by induction that
the polynomial $f_t$ is multilinear.
\end{description}


\subsection{Upper bound}

The inductive definition of $f$ gives a syntactic multilinear
circuit for $f$. Note that since we defined an arithmetic circuit
to be of fan-in (i.\,e., in-degree)~2
(see \secref{s:def}), we need to replace
the sum in the definition of each $f_t$ by a tree of
depth $O(\log n)$ of sum gates (of in-degree 2).

The final circuit is of size polynomial in $n$, since the size
of ${\cal T}$ (and hence also the size of~$X'$
and the size of ${\cal P}(t)$ for every $t \in {\cal T}$)
is polynomial in $n$.

The circuit is of depth $O(\log^2n)$, since
in the definition of $f_t$
we only sum over $\{t_1,t_2,t_3\}$
with $L(t_1), L(t_2), L(t_3) \leq 0.75 \cdot L(t)$
and since $L(([n], \emptyset)) < n$.
(Note that this gives a depth of $O(\log n)$,
but since we replace every sum
by a tree of depth $O(\log n)$ of sum gates we get another
factor of $O(\log n)$).

\begin{corollary} \label{ccc1}
Over any field $\F$,
the polynomial
$f$ (as defined above)
can be computed by a polynomial-size syntactic multilinear circuit
of depth $O(\log^2 n)$.
\end{corollary}


\subsection{Lower bound}

We will now show that any multilinear formula for $f$,
over any field $\F$, is of size $n^{\Omega(\log n)}$.
For the proof, we use
\corref{c3}.

Let   $n=2m$.  Let $\G$ be a field extending $\F$,
such that the transcendental dimension of $\G$ over $\F$ is infinite,
that is, $\G$ contains
an infinite number of elements that are algebraically independent
over $\F$.
Define $A': X' \to \G$ to be such that the variables
in $X'$ are mapped to elements that are algebraically independent
over $\F$.

Let $A$ be any partition of the variables in $X$ into
$\{y_1,\dots,y_m\} \;\dot\cup\;  \{z_1,\dots,z_m\}$.
Denote by $f_{A',A}$ the polynomial $f$
after substituting
in every variable in $X'$ the value assigned to it by $A'$
and after replacing
every variable in $X$ by the variable assigned to it by $A$.

\begin{claim} \label{c:rank}
Over the field $\G$,
$$\Rank(M_{f_{A',A}}) = 2^m\enspace.$$
\end{claim}

\begin{proof}
In this proof,
the $\Rank$ function is always taken over the field $\G$.
For simplicity, we denote in this proof by $g$ the polynomial
$f_{A',A}$, and for every $t$ we denote by $g_t$
the polynomial $f_{t,A',A}$ (i.\,e., the polynomial
$f_{t}$ after substituting
in every variable in $X'$ the value assigned to it by $A'$
and after replacing
every variable in $X$ by the variable assigned to it by $A$).

Define the function $\tilde{A}: [n] \to \{1,-1\}$
by $\tilde{A}(i) = 1$ if $A(x_i) \in \{y_1,\dots,y_m\}$ and
$\tilde{A}(i) = -1$ if $A(x_i) \in \{z_1,\dots,z_m\}$.
For simplicity, we denote the set ${\cal B}_{\tilde{A}}$ also
by ${\cal B}_A$.
We will prove by induction on $L(t)$ that for every
$t \in {\cal B}_A$,
$$\Rank(M_{g_t}) \geq 2^{l(t)/2}\enspace.$$

For $L(t) = l(t) = 0$,
we defined
$f_t = 1$.
Hence,
$M_{g_t}$ is the $1 \times 1$ identity matrix and its rank is 1.
For $0 < L(t) \leq 2$, we know that $l(t) =2$, and
we defined
$f_t = x_{i_t} \cdot x_{j_t} + 1.$
Since $t \in {\cal B}_A$, the matrix
$M_{g_t}$ is the $2 \times 2$ identity matrix and its rank is 2.

For $L(t) > 2$,
$$f_t = \sum_{\{t_1,t_2,t_3\} \in {\cal P}(t)}
x'_{t,t_1,t_2,t_3} \cdot f_{t_1} \cdot f_{t_2} \cdot f_{t_3}\enspace.$$
Hence,
$$g_t = \sum_{\{t_1,t_2,t_3\} \in {\cal P}(t)}
A'(x'_{t,t_1,t_2,t_3}) \cdot g_{t_1} \cdot g_{t_2} \cdot g_{t_3}\enspace,$$
and by \propref{p3},
$$M_{g_t} = \sum_{\{t_1,t_2,t_3\} \in {\cal P}(t)}
A'(x'_{t,t_1,t_2,t_3}) \cdot M_{g_{t_1} \cdot g_{t_2} \cdot g_{t_3}}\enspace.$$
Therefore,
since every $A'(x'_{t,t_1,t_2,t_3})$ is algebraically independent
(over $\F$)
of all the other elements in the domain of $A'$
and
all the coefficients that occur in any of the
matrices in the sum\footnote{More precisely,
$A'(x'_{t,t_1,t_2,t_3})$ is transcendental over the field $\F$
extended by every other element in the domain of $A'$.
That field obviously contains any coefficient that occurs
in any of the matrices in the sum.},
$$\Rank(M_{g_t}) \geq \max_{\{t_1,t_2,t_3\} \in {\cal P}(t)}
\Rank( M_{g_{t_1} \cdot g_{t_2} \cdot g_{t_3}})\enspace.$$
By \lemref{l:part},
there exist $\hat{t_1},\hat{t_2},\hat{t_3} \in {\cal B}_A$,
such that ${\{\hat{t_1},\hat{t_2},\hat{t_3}\} \in {\cal P}(t)}$.
Thus, by \propref{p4} and by the inductive hypothesis
for $\hat{t_1},\hat{t_2},\hat{t_3}$,
\begin{align*}
\Rank(M_{g_t}) \geq
\Rank( M_{g_{\hat{t}_1} \cdot g_{\hat{t}_2} \cdot g_{\hat{t}_3}})
&=
\Rank( M_{g_{\hat{t}_1}}) \cdot
\Rank( M_{g_{\hat{t}_2}}) \cdot
\Rank( M_{g_{\hat{t}_3}})\\
& \geq
2^{l(\hat{t}_1)/2} \cdot
2^{l(\hat{t}_2)/2} \cdot
2^{l(\hat{t}_3)/2} = 2^{l(t)/2}\enspace .
\end{align*}

Since this is true for every $t \in {\cal B}_A$, we can apply it
to $t = ([n],\emptyset) \in {\cal B}_A$ and get
$$ \Rank(M_g) \geq 2^m\enspace. $$
Since $M_g$ is a matrix of size $2^m \times 2^m$, we actually have
an equality in the last formula.
\end{proof}

\begin{corollary} \label{ccc2}
Over any field $\F$,
any multilinear formula for the polynomial $f$
(as defined above)
is of size $n^{\Omega(\log n)}$.
\end{corollary}
\begin{proof}
Follows
immediately from \corref{c3} and \claimref{c:rank}.
\end{proof}

\subsection{Proof of \thmref{ttt1}}

\thmref{ttt1} follows immediately from
\corref{ccc1} and \corref{ccc2}.

\subsection{Proof of \lemref{l:part}} \label{sssl}

Before giving the proof of \lemref{l:part},
we will need to prove two other lemmas.

\begin{lemma} \label{l:pair}
Let $A$ be a function $A: [n] \to \{1,-1\}$.
Let $t =(s_1,s_2) \in {\cal B}_A$ be such that $l(t) > 2$
and $l(s_1),l(s_2) > 0$.
Then, there exist
$t_1,t_2 \in {\cal B}_A$, such that
$\{t_1,t_2\}$
is a proper partition of $t$.
\end{lemma}

\begin{proof}
Let    $s_1 = [i_1,j_1]$, $s_2 = [i_2,j_2]$.
Since $t \in {\cal B}_A$, we have $A(s_1) + A(s_2) =0$.
If $A(s_1) = A(s_2) =0$ then we can define
$t_1 = (s_1, \emptyset)$, $t_2= (\emptyset, s_2)$.
Otherwise, we can assume w.l.o.g. that $A(s_1)$ is negative
and $A(s_2)$ is positive.

If $A(i_1) \neq A(i_2)$, we can define
$t_1 = ([i_1,i_1], [i_2,i_2])$,
$t_2 = ([i_1+1,j_1], [i_2+1,j_2])$.
Otherwise, we can assume w.l.o.g. that $A(i_1) = A(i_2) =1$.

Since $A(s_1)$ is negative and $A(i_1) = 1$, there must exist
$j' \in s_1$, such that $A([i_1,j']) =0$.
We can then define
$t_1 = ([i_1,j'], \emptyset)$,
$t_2 = ([j'+1,j_1], s_2)$.
Since we required $l(t) > 2$
and $l(s_1),l(s_2) > 0$, we have in all cases
$l(t_1),l(t_2) > 0$, and hence
$\{t_1,t_2\}$ is a proper partition of $t$.
\end{proof}

\begin{lemma} \label{l:trip}
Let $A$ be a function $A: [n] \to \{1,-1\}$.
Let $t = (s_1,s_2) \in {\cal B}_A$ be such that $l(t) > 2$
and $l(s_1),l(s_2) > 0$.
Then, there exist
$t_1,t_2,t_3 \in {\cal B}_A$, such that
$\{t_1,t_2,t_3\}$
is a partition of $t$, and
\begin{enumerate}
\item $L(t_1), L(t_3) \leq 0.5 \cdot L(t)$.
\item $L(t_2) \leq 0.75 \cdot L(t)$.
\item $l(t_2) \leq \max(l(s_1),l(s_2))$.
\end{enumerate}
\end{lemma}

\begin{proof}
First note that since $l(s_1),l(s_2) > 0$, we have
$L(t)=l(t)$, and since
$l(t) > 2$ and is even, $L(t)=l(t) \geq 4$.
We will describe a procedure for finding $t_1,t_2,t_3$
with the required properties.

We start with $\hat{t}_1 =(\emptyset, \emptyset)$,
$\hat{t}_2 =(s_1, s_2)$ and
$\hat{t}_3 =(\emptyset, \emptyset)$.
Note that $t = \hat{t}_1 \circ \hat{t}_2 \circ \hat{t}_3$.

\begin{claim} \label{c:rec}
Let $t'_1,t'_2,t'_3 \in {\cal B}_A$ be such that
$t = t'_1 \circ t'_2 \circ t'_3$.
Assume that $l(t'_1),l(t'_3) \leq 0.5 \cdot l(t)$
and that both intervals in $t'_2$ are non-empty, and
$l(t'_2) > 2$.
Then, there exist
$t''_1,t''_2,t''_3 \in {\cal B}_A$, such that
$t = t''_1 \circ t''_2 \circ t''_3$, and
$l(t''_1),l(t''_3) \leq 0.5 \cdot l(t)$, and
$l(t''_2) < l(t'_2)$.
\end{claim}

\begin{proof}
By \lemref{l:pair} (applied to $t'_2$),
there exist
$\tilde{t}_1,\tilde{t}_3 \in {\cal B}_A$, such that
$\{\tilde{t}_1,\tilde{t}_3\}$
is a proper partition of $t'_2$.
Since $t = t'_1 \circ t'_2 \circ t'_3$ and since
$t'_2 = \tilde{t}_1 \circ \tilde{t}_3$,
we have $t = t'_1 \circ \tilde{t}_1 \circ
\tilde{t}_3 \circ t'_3$.

If $l(t'_1) + l(\tilde{t}_1) \leq 0.5 \cdot l(t)$ then
we can define
$t''_1 = t'_1 \circ \tilde{t}_1$,
$t''_2 = \tilde{t}_3$,
$t''_3 = t'_3$.
Otherwise, $l(\tilde{t}_3) + l(t'_3) \leq 0.5 \cdot l(t)$,
and we can define
$t''_1 = t'_1$,
$t''_2 = \tilde{t}_1$,
$t''_3 = \tilde{t}_3 \circ t'_3$.

Since $\{\tilde{t}_1,\tilde{t}_3\}$
is a proper partition of $t'_2$,
in both cases $l(t''_2) < l(t'_2)$.
\end{proof}

We now continue with the proof of \lemref{l:trip}.
We apply \lemref{c:rec} on
$t'_1= \hat{t}_1$, $t'_2=\hat{t}_2$, $t'_3=\hat{t}_3$,
and we substitute (i.\,e., redefine)
$\hat{t}_1 \doteq t''_1$,
$\hat{t}_2 \doteq t''_2$,
$\hat{t}_3 \doteq t''_3$.
We keep applying \claimref{c:rec} and substituting
in  $\hat{t}_1, \hat{t}_2, \hat{t}_3$, until
the conditions of \claimref{c:rec} are not satisfied
by $\hat{t}_1, \hat{t}_2, \hat{t}_3$,
namely, either $l(\hat{t}_2) \leq 2$ or
one of the intervals in $\hat{t}_2$ is empty.
(Note that the process must stop because $l(\hat{t}_2)$
keeps decreasing.)
At this point we can define
$t_1 = \hat{t}_1$,
$t_2 = \hat{t}_2$,
$t_3 = \hat{t}_3$.

Since $t_1,t_2,t_3$ are the output of \claimref{c:rec},
$t_1,t_2,t_3 \in {\cal B}_A$, and
$\{t_1,t_2,t_3\}$
is a partition of~$t$, and
$L(t_1), L(t_3) \leq 0.5 \cdot l(t) = 0.5 \cdot L(t)$.
It remains to prove that
$L(t_2) \leq 0.75 \cdot L(t)$,
and $l(t_2) \leq \max(l(s_1),l(s_2))$.
Recall that there were two possibilities:
either $l(t_2) \leq 2$ or
one of the intervals in $t_2$ is empty.

In the first case, $L(t_2) \leq l(t_2) \leq 2$.
Since $L(t)=l(t) \geq 4$,
we have in the first case, $L(t_2) \leq  0.5 \cdot L(t)$,
and $l(t_2) \leq 0.5 \cdot l(t) \leq \max(l(s_1),l(s_2))$.

In the second case,
$L(t_2) = 0.75 \cdot l(t_2) \leq  0.75 \cdot l(t) = 0.75 \cdot L(t)$.
Since the non-empty interval of $t_2$ is a sub-interval of either
$s_1$ or $s_2$, we have
$l(t_2) \leq \max(l(s_1),l(s_2))$.
\end{proof}

\begin{proof}[Proof of \lemref{l:part}]
Let   $t = (s_1,s_2)$.
If $l(s_1),l(s_2) > 0$, then the proof follows by
\lemref{l:trip}. Otherwise, one of the intervals
$s_1,s_2$ is empty. W.\,l.\,o.\,g. assume that $s_1$ is empty.
Then, since $t \in {\cal B}_A$, we know that $l(s_2)= l(t)$
is even.
Partition $s_2$ into two intervals $\{s'_1,s'_2\}$ with
$l(s'_1)=l(s'_2)= 0.5 \cdot l(s_2)$. The proof now follows by
applying \lemref{l:trip} on $t'=(s'_1,s'_2)$ as follows.

Note that $L(t) = 0.75 \cdot l(t) =
0.75 \cdot l(t') = 0.75 \cdot L(t')$.
By \lemref{l:trip}
there exist
$t_1,t_2,t_3 \in {\cal B}_A$, such that
$\{t_1,t_2,t_3\}$
is a partition of $t'$ (and hence also of $t$), and
$L(t_1), L(t_3) \leq 0.5 \cdot L(t') < 0.75 \cdot L(t)$, and
$L(t_2) \leq l(t_2) \leq \max(l(s'_1),l(s'_2)) =
0.5 \cdot l(t) < 0.75 \cdot L(t)$.
\end{proof}



\subsection*{Acknowledgment}
I would like to thank Amir Shpilka
for very helpful conversations.


\bibliography{a006}

\begin{tocauthors}
  \begin{tocinfo}[ran]
    Ran Raz\\
    Professor \\
    Department of Computer Science and Applied Mathematics \\
    Weizmann Institute of Science\\
    Rehovot 76100, Israel \\
    ran.raz\tocat weizmann\tocdot ac\tocdot il\\
    \url{http://www.wisdom.weizmann.ac.il/~ranraz/}
  \end{tocinfo}
\end{tocauthors}

\begin{tocaboutauthors}
  \begin{tocabout}[ran]
    \textsc{Ran Raz} received his Ph.\,D. in 1992 from Hebrew University 
under the supervision of Michael Ben-Or and Avi Wigderson.
Since 1994, he has been a faculty member in the Faculty of
Mathematics and Computer Science at the 
\href{http://www.weizmann.ac.il/}{Weizmann Institute}. 
His main research area is complexity theory, with emphasis
on proving lower bounds for computational models.
More specifically, he is interested in Boolean circuit complexity,
arithmetic circuit complexity, communication complexity,
propositional proof theory, probabilistically checkable proofs,
quantum computation and communication, and
randomness and derandomization.
  \end{tocabout}
\end{tocaboutauthors}
\end{document}


