We present a new and simple methodology for deriving information-theoretic lower bounds for problems in distributed statistics. In our setting, data samples are distributed across multiple players, who are only allowed to share “limited information” about them with a center. We set up a general formulation that captures information constraints, such as communication constraints or local privacy constraints, as special cases, and we allow interactive protocols. Using our approach, we can derive tight lower bounds for many problems, such as distributed high-dimensional mean estimation, discrete distribution learning, and goodness-of-fit testing. Perhaps more importantly, our tools are general enough to apply to many problems in Federated Learning, a topic that has received great interest from industry and researchers over the past couple of years. In the spirit of the CSSP seminar, we will present the main technical components of our general approach.
This work is part of an ongoing long-term collaboration with Jayadev Acharya and Clément Canonne. The specific results I will present in this talk were obtained in joint work with Yuhan Liu and Ziteng Sun as well.
Thanks to Himanshu for allowing us to record the talk!
Some papers that were suggested during the talk:
- Which Distribution Distances are Sublinearly Testable? - Constantinos Daskalakis, Gautam Kamath, John Wright (2017)
- Generalized Error Exponents for Small Sample Universal Hypothesis Testing - Dayu Huang, Sean Meyn (2013)