Paper ID: 2201.04315

On the Statistical Complexity of Sample Amplification

Brian Axelrod, Shivam Garg, Yanjun Han, Vatsal Sharan, Gregory Valiant

The ``sample amplification'' problem formalizes the following question: Given $n$ i.i.d. samples drawn from an unknown distribution $P$, when is it possible to produce a larger set of $n+m$ samples which cannot be distinguished from $n+m$ i.i.d. samples drawn from $P$? In this work, we provide a firm statistical foundation for this problem by deriving generally applicable amplification procedures, lower bound techniques and connections to existing statistical notions. Our techniques apply to a large class of distributions including the exponential family, and establish a rigorous connection between sample amplification and distribution learning.

Submitted: Jan 12, 2022