Theory of Computing
-------------------
Title : Learning $k$-Modal Distributions via Testing
Authors : Constantinos Daskalakis, Ilias Diakonikolas, and Rocco A. Servedio
Volume : 10
Number : 20
Pages : 535-570
URL : https://theoryofcomputing.org/articles/v010a020
Abstract
--------
A $k$-modal probability distribution over the discrete domain
$\{1,...,n\}$ is one whose histogram has at most $k$ "peaks" and
"valleys." Such distributions are natural generalizations of monotone
($k=0$) and unimodal ($k=1$) probability distributions, which have
been intensively studied in probability theory and statistics.
In this paper we consider the problem of _learning_ (i.e., performing
density estimation of) an unknown $k$-modal distribution with respect
to the $L_1$ distance. The learning algorithm is given access to
independent samples drawn from an unknown $k$-modal distribution $p$,
and it must output a hypothesis distribution $\widehat{p}$ such that
with high probability the total variation distance between $p$ and
$\widehat{p}$ is at most $\eps.$ Our main goal is to obtain
_computationally efficient_ algorithms for this problem that use
(close to) an information-theoretically optimal number of samples.
We give an efficient algorithm for this problem that runs in time
$\poly(k,\log(n),1/\eps)$. For $k \leq \tilde{O}( {\log n})$, the
number of samples used by our algorithm is very close (within an
$\tilde{O}(\log(1/\eps))$ factor) to being information-theoretically
optimal. Prior to this work computationally efficient algorithms were
known only for the cases $k=0,1$ (Birg\'e 1987, 1997).
A novel feature of our approach is that our learning algorithm
crucially uses a new algorithm for _property testing of probability
distributions_ as a key subroutine. The learning algorithm uses the
property tester to efficiently decompose the $k$-modal distribution
into $k$ (near-)monotone distributions, which are easier to learn.
A preliminary version of this work appeared in the Proceedings of
the Twenty-Third Annual ACM-SIAM Symposium on Discrete
Algorithms (SODA 2012).