Welcome to the Princeton Theory Lunch page! The Princeton TCS Seminar, fondly known as Theory Lunch, is a weekly seminar organized by the Princeton TCS group every Friday from 1:00 pm to 2:00 pm, with food usually served at 12:45 pm. It is open to all members of the Princeton community.

Talks are also simultaneously broadcasted via Zoom. The Zoom links are shared every week by email. All talks are recorded and uploaded on our YouTube channel, Princeton TCS.

This semester theory lunch is organized by Elena Gribelyuk, Eric Xue, Jingyi Liu, and Stephen Newman.

If you'd like to be notified about future Theory Lunch talks please subscribe to the theory-read mailing list.

This Week on Theory Lunch

May 03, 2024 | Computer Science 105

Peter Manohar from Carnegie Mellon University

An Exponential Lower Bound for Linear 3-Query Locally Correctable Codes

A locally correctable code (LCC) is an error correcting code where one can recover any symbol of the original codeword by querying only a small number of randomly chosen symbols from the received corrupted codeword.

In this talk, we present a proof that the blocklength n of a linear 3-query LCC C: {0,1}^k -> {0,1}^n must be at least \exp(k^{1/8}), i.e., it grows exponentially with k. This improves on the prior best lower bound of k^3 shown in our recent work, which holds even for the weaker setting of 3-query locally decodable codes (LDCs), and comes close to matching the best-known construction of 3-query LCCs based on Reed—Muller codes, which achieve a blocklength of n = \exp(\sqrt{k}). Because there is a 3-query LDC with a strictly subexponential blocklength, as a corollary we obtain the first separation between q-query LCCs and LDCs for any constant q.

We subsequently show that any “design 3-LCC”, an LCC with a parity check matrix that is a block design, must have blocklength at least 2^{(1-o(1)) sqrt{k}}. As Reed—Muller codes are design 3-LCCs with blocklength n = 2^{2 sqrt{2k}}, this is tight up to a factor of 2 sqrt{2} in the exponent, and as a corollary resolves the Hamada conjecture for 4-designs up to this constant factor. For nonlinear LCCs, we give a generalization of our approach that proves a superpolynomial lower bound of k^{log k}, improving on the prior best lower bound of k^3.

Spring 2024 Schedule

Robust sublinear expanders

Matija Bucic from Princeton University

February 06, 2024 | Friend 004

Expander graphs are perhaps one of the most widely useful classes of graphs ever considered. In this talk, we will focus on a fairly weak notion of expanders called sublinear expanders, first introduced by Komlós and Szemerédi around 30 years ago. They have found many remarkable applications ever since. In particular, we will focus on certain robustness conditions one may impose on sublinear expanders and some applications of this idea, which include:

- recent progress on the classical Erdős-Gallai cycle decomposition conjecture,

- essentially tight answer to the classical Erdős unit distance problem for "most" real normed spaces and

- an asymptotic solution to the rainbow Turan problem for cycles, raised by Keevash, Mubayi, Sudakov and Verstraete, with an interesting corollary in additive number theory.

Based on joint works with Montgomery; Alon and Sauermann; and Alon, Sauermann, Zakharov and Zamir.

Testing Distributional Shift

Arsen Vasilyan from MIT

February 16, 2024 | Computer Science 201

In learning theory it is commonly assumed that the training dataset comes from the same distribution as the data we see during deployment. Yet, due to a distribution shift, this assumption might not hold, which can lead to incorrect predictions. Dealing with distribution shift is a big challenge in machine learning. For example, classifiers trained on data from one hospital often fail to generalize to data from other hospitals.

In our new work, joint with Adam Klivans and Konstantinos Stavropoulos, we introduce a framework called Testable Learning with Distribution Shift (TDS learning) to address this issue. The goal of our work is developing learning algorithms that allow their user to make sure that the performance of a learning algorithm is not deteriorating due to a distribution shift.

We consider the following setting. We are given a training dataset S_1, consisting of e.g. Gaussian examples which are labelled by an unknown function f in the function class. We are also given a new dataset S_2 of unlabelled data points which we need to classify. The concern is that even if a learning algorithm produces a classifier that performs very well on the distribution from which S_1 is drawn, it might perform terribly on S_2 due to a distribution shift.

Whenever distribution shift makes it unsafe to use the classifier given by the learning algorithm on the unlabeled dataset S_2, the TDS learning framework requires the learning algorithm to alert the user. In contrast, if no distribution shift has taken place, then the TDS learning framework requires the learning algorithm not to raise such an alert. In this case, the user can be confident that the classifier given by the learning algorithm is indeed safe to deploy on the dataset S_2.

Although domain-adaptation has been studied previously, most previous work bounds the error on S_2 in terms of some notion of distance between the distributions giving rise to S_1 and S_2. These distances seem computationally difficult to evaluate, as they involve enumeration of all elements of the function class.

In our work we give TDS learning algorithms that are not only sample-efficient but also computationally efficient. We handle a number of high-dimensional function classes, including linear classifiers, intersections of linear classifiers, decision trees and low-depth formulas.

To accomplish this, we develop a novel method based on showing that functions in a given function class can be sandwiched between pairs of low degree polynomials that are close in L_2-distance. We prove the existence of such pairs of L_2-sandwiching polynomials for a wide range of function classes, by building on techniques from pseudorandomness. We then show that the existence of these L_2-sandwiching polynomials can be used to achieve efficient TDS-learning algorithms by introducing our Transfer Lemma, which allows us to ensure a good performance on the set S_2 by checking that the low-degree moments of the set S2 approximately match those of S_1.

Furthermore, for TDS learning of linear classifiers under the Gaussian distribution, we give an algorithm that does provably better than any algorithm based on L_2-sandwiching polynomials. To accomplish this, we combine the moment-matching approach with a method inspired by literature on active learning.

Randomness Extraction from Adversarial Sources

Eshan Chattopadhyay from Cornell University

February 23, 2023 | Computer Science 105

The area of randomness extraction is concerned with designing efficient algorithms (extractors) that produce almost truly random bits from defective sources of randomness. A very well-studied model in this area is the independent source model, where the extractor has access to multiple independent sources, each of which contains some amount of entropy. In this talk, I will discuss a recent line of work that explores randomness extraction beyond the independence model, allowing some sources to be adversarial or faulty - such faulty sources may contain no entropy or have unwanted correlations. Our extractors are based on recent advances in the extractor machinery along with interesting use of tools from extremal graph theory and lower bound results in communication complexity.

Based on joint works with Jesse Goodman, Vipul Goyal, Xin Li and David Zuckerman.

Optimal Online Discrepancy Minimization

Victor Reis from Institute for Advanced Study

March 01, 2024 | Computer Science 105

We prove that there exists an online algorithm that for any sequence of vectors v_1, ..., v_T in R^n with Euclidean norm at most 1, arriving one at a time, decides random signs x_1, ..., x_T so that for every t <= T, the prefix sum x_1 v_1 + ... + x_t v_t is 10-subgaussian.

This improves over the work of Alweiss, Liu and Sawhney who kept prefix sums O(sqrt(log(nT)))-subgaussian, and gives a O(sqrt(log T)) bound on the prefix discrepancy. Our proof combines a generalization of Banaszczyk's prefix balancing result to trees with a cloning argument to find distributions rather than single colorings. We also show a matching Omega(sqrt(log T)) strategy for an oblivious adversary. Joint work with Janardhan Kulkarni and Thomas Rothvoss to appear in STOC 2024.

Can open decentralized ledgers be economically secure?

Jacob Leshno from Chicago Booth School of Business

March 08, 2024 | Computer Science 105

Traditional payment processors are the subject of antitrust concerns and regulations. Open decentralized ledgers (e.g., Bitcoin) provide an alternative. They do not rely on a central authority, avoiding antitrust and monopoly concerns. However, the open nature of these systems gives rise to many challenges, including fundamental questions about their security.

Using a framework that combines economic theory and distributed systems theory, we define economic security for general open decentralized ledgers.

Limitations of incentives in environments with many anonymous participants bring to question the security Bitcoin's Nakamoto protocol obtains from block rewards.

We present an alternative protocol showing that an open decentralized ledger can be economically secure.

(Joint with Rafael Pass and Elaine Shi)

Decoding Codes via Proofs

Madhur Tulsiani from TTIC

March 22, 2024 | Computer Science 105

In this talk, we will discuss a general framework for obtaining list decoding algorithms for codes, using relaxations based on the Sum-of-Squares (SoS) hierarchy of semidefinite programs. This framework is an adaptation of the well-known “proofs to algorithms” paradigm to the setting of codes. If the proof of the fact that all pairs of codewords have large distance can be expressed in a proof system corresponding to the SoS hierarchy, then one can use it to obtain a list decoding algorithm for the corresponding code. We will discuss a few examples of this phenomenon, including expander-based LDPC codes, for which no list decoding algorithms were previously known.

Based on joint work with Fernando Granha Jeronimo, Tushant Mittal, and Shashank Srivastava.

Strong spatial mixing for colorings on trees and its algorithmic applications

Kuikui Liu from MIT

March 29, 2024 | Computer Science 105

Correlation decay is a fundamental and important property of distributions arising in statistical physics and theoretical computer science. A longstanding conjecture is that the uniform distribution over proper q-colorings on any tree of maximum degree Δ exhibits a strong form of correlation decay whenever q≥Δ+1. It is surprising that such a basic question is still open, but then again it also highlights how much we still have to learn about random colorings. In this talk, I will discuss a near-resolution of this conjecture, as well as its algorithmic implications for sampling colorings on bounded-degree graphs via Glauber dynamics. Based on joint work with Zongchen Chen, Nitya Mani, and Ankur Moitra.

Data-Dependent LSH for the Earth Mover's Distance

Rajesh Jayaram from Google Research

April 05, 2024 | Computer Science 201

In this talk, we discuss the first improvement in approximation for nearest neighbor search under the Earth Mover's Distance (EMD) in over a decade. Given two sets of $s$ vectors $A,B$ in high dimensional space $(R^d)$, the EMD between $A$ and $B$ is the minimum cost of a perfect matching between the vectors in $A$ and $B$ where the edge weights are given by the distances in Euclidean space. EMD is a classic metric in computer science (dating back over 100 years to the Hungarian algorithm of Jacobi), and a standard distance between two sets of points. In nearest neighbor search, one has a collection of $n$ such sets $A_1,\ldots,A_n$, which one pre-processes so that given a query set $Q$, one can quickly return a set $A_i$ whose distance (under EMD) is within a $C$-factor of the nearest neighbor to $Q$.

To date, the only algorithm with sublinear $O(n^{\epsilon})$ query time was given by Andoni, Indyk, and Krauthgamer ([AIK, SODA '08]), who designed a (data-independent) locality sensitive hash function (LSH) for EMD with approximation $O(\log^2 s)$. In this work, we improve this approximation to $\tilde{O}(\log s)$ in the same runtime by designing the first data-dependent LSH functions for EMD. The talk will discuss the main techniques behind sublinear algorithms for EMD, including the tree embeddings techniques of [AIK’08] and [Chen, Jayaram, Levi, Waingarten STOC '22], as well as the key insights needed to glue them together into a improved LSH for EMD.

Joint work with Erik Waingarten and Tian Zhang (STOC '24, https://arxiv.org/abs/2403.05041).

Gaussian Polytope Approximators

Anindya De from University of Pennsylvania

April 12, 2024 | Computer Science 105

Abstract: We study the approximability of general convex sets in R^n

by intersections of halfspaces, where the approximation quality is

measured with respect to the standard Gaussian distribution and the

complexity of an approximation is the number of halfspaces used. While

a large body of research has considered the approximation of convex

sets by intersections of halfspaces under distance metrics such as the

Lebesgue measure and Hausdorff distance, prior to our work there has

not been a systematic study of convex approximation under the Gaussian


We establish a range of upper and lower bounds, both for general

convex sets and for specific natural convex sets that are of

particular interest. Our results demonstrate that the landscape of

approximation is intriguingly different under the Gaussian

distribution versus previously studied distance measures. We establish

our bounds using techniques from many different areas, including

classical results from convex geometry, Cramér-type bounds from

probability theory, and—perhaps surprisingly—a range of topics from

computational complexity, including computational learning theory,

unconditional pseudorandomness, and the study of influences and noise

sensitivity in the analysis of Boolean functions.

Based on joint work with Shivam Nadimpalli and Rocco Servedio. Preprint available here: https://arxiv.org/abs/2311.08575.

Efficiently Finding Planted Cliques in Semirandom Graphs

Pravesh Kothari from Princeton University

April 19, 2024 | Computer Science 105

In this talk, I will present a new polynomial time algorithm for recovering planted cliques in the semi-random graph model of Feige and Kilian from 2001. In the Feige-Kilian model, a graph G on vertex set V is chosen by a combination of benign random and adaptive "adversarial" choices and can be thought of as a "robust" counterpart of the well-studied "fully-random" planted clique problem:

1) choose a planted clique on a set of k vertices S,

2) "benign random choice": include each edge between S and V\S independently at random,

3) "adaptive worst-case choice": deleting any set of edges between S and V\S and adding any set of edges with both endpoints in V\S.

The previous best algorithms for this model succeed if the planted clique has a size of at least n^{2/3} in a graph with n vertices. Our algorithms work for planted-clique sizes approaching n^{1/2} -- the information-theoretic threshold in the semi-random model and a conjectured computational threshold even in the easier fully-random model. This result comes close to resolving an open question of Feige (2019) and Steinhardt (2017). Our algorithm is based on high constant degree sum-of-squares relaxation of the clique SDP and the analysis relies on a new connection between finding planted cliques and efficient certificates of upper bounds on clique number of unbalanced bicliques in bipartite random graphs.

Based on joint work with Rares Buhai and David Steurer.

Theoretical Foundations of Real-World Cryptograph

Vinod Vaikuntanathan from MIT

April 26, 2024 | Computer Science 105

Block ciphers (aka Pseudorandom Permutations) are the workhorses of modern cryptography. An overwhelming fraction of Internet traffic is secured using the Advanced Encryption Standard (AES), a block cipher; the same goes for encrypted file systems, encrypted messaging, and several other applications we use all the time. Yet, despite the fact that the constructions are natural and elegant, we have woefully little mathematical rationale for their security. In this talk, I will describe a research program, together with a collection of recent results and techniques, aimed at achieving a better mathematical understanding of the security of real-world block ciphers including AES.

This talk is based on joint work with Angelos Pelecanos, Tianren Liu and Stefano Tessaro.

An Exponential Lower Bound for Linear 3-Query Locally Correctable Codes

Peter Manohar from Carnegie Mellon University

May 03, 2024 | Computer Science 105

A locally correctable code (LCC) is an error correcting code where one can recover any symbol of the original codeword by querying only a small number of randomly chosen symbols from the received corrupted codeword.

In this talk, we present a proof that the blocklength n of a linear 3-query LCC C: {0,1}^k -> {0,1}^n must be at least \exp(k^{1/8}), i.e., it grows exponentially with k. This improves on the prior best lower bound of k^3 shown in our recent work, which holds even for the weaker setting of 3-query locally decodable codes (LDCs), and comes close to matching the best-known construction of 3-query LCCs based on Reed—Muller codes, which achieve a blocklength of n = \exp(\sqrt{k}). Because there is a 3-query LDC with a strictly subexponential blocklength, as a corollary we obtain the first separation between q-query LCCs and LDCs for any constant q.

We subsequently show that any “design 3-LCC”, an LCC with a parity check matrix that is a block design, must have blocklength at least 2^{(1-o(1)) sqrt{k}}. As Reed—Muller codes are design 3-LCCs with blocklength n = 2^{2 sqrt{2k}}, this is tight up to a factor of 2 sqrt{2} in the exponent, and as a corollary resolves the Hamada conjecture for 4-designs up to this constant factor. For nonlinear LCCs, we give a generalization of our approach that proves a superpolynomial lower bound of k^{log k}, improving on the prior best lower bound of k^3.

Fall 2023 Schedule

Sparsifying Generalized Linear Models

Yang Liu from Institute for Advanced Study

September 15, 2023 | Computer Science 105

We study the sparsification of objectives $f(x) = \sum_{i=1}^m f_i()$, for vectors $a_1, ..., a_m, x \in R^n$. Informally, we show that if all the functions $f_i$ are nearly symmetric, and grow subquadratically, then $f$ has be sparsified down to $\tilde{O}(n * \epsilon^{-2})$ terms with $(1+\epsilon)$ multiplicative error. Our result generalizes the classical case of $l_p$ norm sparsification, where the functions $f_i(x) = |x|^p$ for $0 < p \le 2$. As an application, we obtain nearly-optimal algorithms to solve $l_p$-norm regression to high accuracy for $1 < p \le 2$.

Cryptographic Non-Locality

Alex Lombardi from Princeton University

September 22, 2023 | Computer Science 105

Non-local games are an important concept in quantum information theory that allow two or more provers to demonstrate, to a classical verifier, the ability to prepare certain kinds of quantum systems. Two important applications include (multi-prover) protocols for (1) the demonstration of quantum advantage and (2) verifiable BQP computation. Key to these protocols is the physical assumption that the "far apart" provers cannot communicate.

In this talk, we will discuss how to emulate the effects of non-local games in an interactive, single-prover protocol, replacing the physical non-locality assumption with a cryptographic hardness assumption. These ideas have led to new interactive protocols for quantum advantage as well as for BQP verification.

This talk is based on joint works with Yael Kalai, Vinod Vaikuntanathan, Lisa Yang, and Fermi Ma.

2-dimensional spectral expansion of random geometric graphs

Siqi Liu from Institute for Advanced Study

September 29, 2023 | Computer Science 105

A graph is said to be a 1-dimensional expander if the second eigenvalue of its adjacency matrix is bounded away from 1, or if it has no sparse vertex cuts. There are several natural ways to generalize the notion of expansion to hypergraphs/simplicial complexes, but one such way is 2-dimensional spectral expansion, in which the local expansion of vertex links/neighborhoods witnesses global expansion. While 1-dimensional expansion is known to be achieved by, e.g., random regular graphs, very few examples of sparse 2-dimensional expanders are known, and at present all are algebraic. It is an open question whether sparse 2-dimensional expanders are natural and abundant.

In this talk, we give some evidence towards abundance: we show that the set of triangles in a random geometric graph on a high-dimensional sphere yields an expanding simplicial complex of arbitrarily small polynomial degree.

Based on joint work with Sidhanth Mohanty, Tselil Schramm, and Elizabeth Yang.

Thin Trees for Laminar Families

Nathan Klein from Institute for Advanced Study

October 06, 2023 | Computer Science 105

In the strong form of the thin tree conjecture, formulated by Goddyn in 2004, we are given a k-edge-connected graph and wish to find a tree containing at most an O(1/k) fraction of the edges across every cut. This beautiful conjecture, if true, has implications in a number of areas such as graph theory and approximation algorithms. A natural relaxation of this problem is to find a tree with at most O(1/k) edges across a fixed collection of cuts, instead of all cuts. Via a simple reduction technique followed by iterative relaxation, we show that the thin tree conjecture is true for laminar families of cuts. This resolves an open question of Olver and Zenklusen from 2013, a work that showed thin trees exist for chains of cuts. As a byproduct, we also obtain the first constant factor approximation algorithm for the laminar-constrained spanning tree problem. Based on joint work with Neil Olver.

Theoretical foundations for diffusion models

Sitan Chen from Harvard University

October 27, 2023 | Computer Science 105

I will describe recent progress on providing rigorous convergence guarantees for score-based generative models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which constitute the backbone of large-scale real-world generative models such as DALL⋅E. In the first part of the talk, I will show that such SGMs can efficiently sample from essentially any realistic data distribution, even ones which are highly non-log-concave. In the second part of the talk, I will show how to extend these guarantees to deterministic samplers based on discretizing the so-called probability flow ODE, which ultimately leads to better dependence on the dimension. All of these results assume access to an oracle for score estimation; time permitting, at the end I will briefly touch upon how to provably implement this oracle for interesting classes of distributions like Gaussian mixtures.

Streaming Zero-Knowledge Proofs

Marcel Dall'Agnol from Princeton University

November 03, 2023 | Computer Science 105

We define the notion of zero-knowledge in the streaming setting and construct zero-knowledge SIPs for the two main building blocks in the streaming interactive proofs literature: the sumcheck and polynomial evaluation protocols. To the best of our knowledge all known streaming interactive proofs are based on either of these tools, and indeed, this allows us to obtain zero-knowledge SIPs for central and well-studied streaming problems, such as index, frequency moments, and inner product. Our protocols are efficient both in terms of time and space, as well as communication: the space complexity is polylog(n) and, after a non-interactive setup that uses a random string of near-linear length, the remaining parameters are sub-polynomial.

En route, we develop a toolkit for designing zero knowledge data stream protocols that may be of independent interest, consisting of an algebraic streaming commitment protocol and a temporal commitment protocol. The analysis of our protocols relies on delicate algebraic and information-theoretic arguments and reductions from average-case communication complexity.

Samplers from high dimensional expanders and agreement testing

Yotam Dikstein from Institute for Advanced Study

November 10, 2023 | Computer Science 105

Sampler graphs are sparse graphs that "derandomize" Chernoff’s inequality. These graphs are equivalent to randomness extractors and are widely used in other areas of TCS. In this talk we will see that high dimensional expanders (HDXs) give rise to nearly optimal samplers. These are hypergraph analogues of expander graphs. Then we will see some applications such as reverse hypercontractivity inequalities on HDXs and new agreement theorems. This is a joint work with Max Hopkins (UCSD).

Testing Convex Truncation

Rocco Servedio from Columbia University

November 17, 2023 | Computer Science 105

We study the basic statistical problem of testing whether normally distributed -dimensional data has been truncated, i.e. altered by only retaining points that lie in some unknown truncation set . As our main algorithmic results,

We give a computationally efficient -sample algorithm that can distinguish the standard normal distribution from conditioned on an unknown and arbitrary convex set .

We give a different computationally efficient -sample algorithm that can distinguish from conditioned on an unknown and arbitrary mixture of symmetric convex sets.

These results stand in sharp contrast with known results for learning or testing convex bodies with respect to the normal distribution or learning convex-truncated normal distributions, where state-of-the-art algorithms require essentially samples. An easy argument shows that no finite number of samples suffices to distinguish from an unknown and arbitrary mixture of general (not necessarily symmetric) convex sets, so no common generalization of results (1) and (2) above is possible.

We also prove lower bounds on the sample complexity of distinguishing algorithms (computationally efficient or otherwise) that match our algorithms up to constant factors.

Joint with Anindya De and Shivam Nadimpalli.

Communication Complexity of Multiparty Set Disjointness Under Product Distributions

Rotem Oshman from Tel-Aviv University

December 01, 2023 | Computer Science 105

In the multiparty set disjointness problem, we have k players, with private inputs X1,...,Xk \subseteq [n]. The players’ goal is to check whether their sets intersect, or sometimes to compute the full intersection. It is known that in the shared blackboard model of communication, set disjointness requires Omega(n log k + k) bits of communication, and in the coordinator model, it requires Omega(nk) bits. However, these two lower bounds require that the players’ inputs can be highly correlated.

We study the communication complexity of multiparty set disjointness under product distributions, and ask whether the problem becomes significantly easier, as it is known to become in the two party case. Our main result is a nearly-tight bound of tilde-Theta(n^{1-1/k}) bits for both the shared blackboard model and the coordinator model. This shows that in the shared blackboard model, as the number of players grows, having independent inputs helps less and less; but

in the coordinator model, when k is very large, having independent inputs makes the problem much easier. Both our upper and our lower bounds use new ideas, as the original techniques developed for the two-party case do not scale to more than two players.

A new Information Complexity measure for multi-pass streaming algorithms

Sumegha Garg from Rutgers University

December 08, 2023 | Computer Science 105

In this talk, we will introduce a new notion of information complexity for multi-pass streaming problems, and use it to prove multi-pass lower bounds for the coin problem. In the coin problem, one sees a stream of 𝑛 i.i.d. uniform bits and one would like to compute the majority with constant advantage. We show that any constant pass algorithm must use Ω(log 𝑛) bits of memory, significantly extending an earlier Ω(log 𝑛) bit lower bound for single-pass algorithms of [BGW'20]. In the paper, we also use this information complexity notion to prove tight space complexity for the needle problem, which in turn implies tight bounds for the problem of approximating higher frequency moments in a data stream.

Joint work with Mark Braverman, Qian Li, Shuo Wang, David P. Woodruff and Jiapeng Zhang.

Tight Bounds for Active lp Regression

David Woodruff from Carnegie Mellon University

December 15, 2023 | Computer Science 105

In active learning, one seeks to learn the parameters of a model given examples, while minimizing the number of labels read. We give tight bounds up to logarithmic factors on the number of labels that need to be read for active l_p regression for every value of p, as a function of the number of variables and accuracy desired. Previously tight bounds were known only for p = 1 or p =2. We extend these ideas to a large class of loss functions with certain nice conditions, obtaining the first non-trivial bounds for such functions.

Based on joint works, one with Musco, Musco, and Yasuda (FOCS '22) and another with Yasuda (STOC '23).

Spring 2023 Schedule

Faster Walsh-Hadamard Transform from Matrix Non-Rigidity

Josh Alman from IAS

February 10, 2023 | Computer Science 105

The Walsh-Hadamard transform is a simple recursively-defined linear transformation with many applications throughout algorithm design and complexity theory. The folklore "fast Walsh-Hadamard transform" algorithm computes it using only N log N arithmetic operations, which is believed to be optimal up to constant factors. In this talk, I'll give two improved constructions:

- A new algorithm which uses $(23/24) N log N + O(N)$ arithmetic operations. This is, to my knowledge, the first improvement of the leading constant. (Based on joint work with Kevin Rao.)

- A new depth-2 linear circuit of size $O(N^{1.45})$, improving on size $O(N^{1.5})$ which one gets from the fast Walsh-Hadamard transform approach. (Based on joint work with Ashwin Padaki and Yunfeng Guan.)

A key idea behind both constructions is to take advantage of the matrix non-rigidty of the Walsh-Hadamard transform. A matrix is called "rigid" if it's impossible to change a "small" number of its entries to make it have "low" rank. L. Valiant showed that if a matrix is rigid, then this implies lower bounds for computing its linear transformation. However, a recent line of work has shown that many matrices of interest, including the Walsh-Hadamard transform, are actually not rigid. Although a formal converse to Valiant's result is not known, we'll make use of non-rigidity in our new constructions.

A Subpolynomial Approximation Algorithm for Graph Crossing Number in Low-Degree Graphs

Zihan Tan from Rutgers / DIMACS

February 17, 2023 | Computer Science 105

Graph Crossing Number is a fundamental and extensively studied problem with wide ranging applications. In this problem, the goal is to draw an input graph G in the plane so as to minimize the number of crossings between the images of its edges. The problem is notoriously difficult, and despite extensive work, non-trivial approximation algorithms are only known for bounded-degree graphs. Even for this special case, the best current algorithm achieves a O(n)-approximation, while the best current negative results do not rule out constant-factor approximation. All current approximation algorithms for the problem build on the same paradigm, which is also used in practice: compute a set E’of edges (called a planarizing set) such that G \ E’ is planar; compute a planar drawing of G \ E’; then add the drawings of the edges of E to the resulting drawing. Unfortunately, there are examples of graphs G, in which any implementation of this method must incur Ω(OPT2) crossings, where OPT is the value of the optimal solution. This barrier seems to doom the only currently known approach to designing approximation algorithms for the problem, and to prevent it from yielding a better than O(n)-approximation.

We propose a new paradigm that allows us to overcome this barrier. Using the new paradigm, we reduce the Crossing Number problem to Crossing Number with Rotation System – a variant of the Crossing Number problem, in which the ordering of the edges incident to every vertex is fixed as part of input. We then show a randomized algorithm for this new problem, that allows us to obtain a subpolynomial approximation for Graph Crossing Number on low-degree graphs.

This talk is based on joint work with Julia Chuzhoy and Sepideh Mahabadi.

Almost linear time algorithms for all flows

Sushant Sachdeva from University of Toronto

February 24, 2023 | Computer Science 105

Over the last decade or so, combining methods from continuous optimization and analysis with graph theoretic-insights has led to a revolution in algorithms for classic problems on graphs such as maximum flow.

In this talk, I will present some of our key ideas behind our recent work that gives almost-linear time algorithms for solving all convex flow problems on graphs. This implies almost linear time algorithms for max-flow, minimum-cost flow, bipartite matching, optimal transport, matrix scaling, isotonic regression, and several other well-studied problems.

Our algorithm is designed using a new Interior Point Method (IPM) that builds the flow as a sequence of almost-linear number of approximate undirected minimum-ratio cycles, each of which is computed and processed very efficiently using a new dynamic data structure.

Joint work with Li Chen, Rasmus Kyng, Yang Liu, Richard Peng, and Maximilian Probst Gutenberg.

Robust and Equitable Uncertainty Estimation

Aaron Roth from University of Pennsylvania

March 03, 2023 | Friend Center 004

Machine learning provides us with an amazing set of tools to make predictions, but how much should we trust particular predictions? To answer this, we need a way of estimating the confidence we should have in particular predictions of black-box models. Standard tools for doing this give guarantees that are averages over predictions. For instance, in a medical application, such tools might paper over poor performance on one medically relevant demographic group if it is made up for by higher performance on another group. Standard methods also depend on the data distribution being static --- in other words, the future should be like the past.

In this talk, I will describe new techniques to address both these problems: a way to produce prediction sets for arbitrary black-box prediction methods that have correct empirical coverage even when the data distribution might change in arbitrary, unanticipated ways and such that we have correct coverage even when we zoom in to focus on demographic groups that can be arbitrary and intersecting. When we just want correct group-wise coverage and are willing to assume that the future will look like the past, our algorithms are especially simple. Our approach is based on a quantile analogue of (mean) multi-calibration, a recent solution concept arising from the algorithmic fairness literature. We close by asking which distributional properties (beyond means and quantiles) we can achieve multicalibration with respect to, and provide a complete characterization via a connection to property elicitation.

This talk is based on two papers, that are joint work with Osbert Bastani, Varun Gupta, Chris Jung, Georgy Noarov, and Ramya Ramalingam.

Last-Iterate Convergence in Min-Max Optimization: SOS to the Rescue

Yang Cai from Yale University

March 10, 2023 | Computer Science 105

Min-max optimization is a classical problem in optimization and game theory and has recently played a central role in modern machine learning (ML) applications. These applications range from generative adversarial networks, adversarial examples, robust optimization, to multi-agent reinforcement learning. These ML applications present novel challenges to min-max optimization, necessitating that algorithms (i) demonstrate last-iterate convergence, in contrast to the more traditional average-iterate convergence, and (ii) be robust in nonconvex-nonconcave environments.

In this talk, we first focus on the last-iterate convergence rates of two classical and widely-used algorithms for solving convex-concave min-max optimization – the extragradient (EG) method by Korpelevich (1976) and the optimistic gradient (OG) method by Popov (1980). Despite the considerable attention these algorithms have received from both the optimization and machine learning communities over the years, the last-iterate convergence rates for both algorithms remained open. We resolve this open problem by showing both EG and OG exhibit a tight O(1/\sqrt{T}) last-iterate convergence rate under arbitrary convex constraints. In the second part of the talk, we discuss some recent progress on designing optimal first-order methods for structured nonconvex-nonconcave min-max optimization. Our results are obtained via a new sum-of-squares programming based approach, which may be useful in analyzing other iterative methods. The talk is based on joint work with Argyris Oikonomou and Weiqiang Zheng.

1) On the Structural Complexities of Matching Mechanisms; 2) A Computational Separation Between Quantum No-cloning and No-teleportation

Clay Thomas and Barak Nehoran from Princeton University

March 24, 2023 | Computer Science 105

Talk 1: School districts in many cities employ school choice mechanisms, i.e., algorithms matching students to schools on the basis of students’ reported preferences, and the priorities students have for attending each school. Since simplicity and ease of participation are paramount to these mechanisms, an exact understanding of their structural complexities—e.g., how the matching can be described, and how one student’s input can affect the outputs—should be a first-order concern.

In this talk, I will give one example of this type of complexity: that of describing the outcome matching. Prior work has observed that for one popular real-world mechanism (DA), the outcome can be described in terms of a single "priority cutoff" per school, even when there are many more students than there are schools. We prove that, for another popular mechanism (TTC), the outcome cannot be described using less than a bit for each pair of schools. This result clarifies and formalizes one way in which, as often informally argued in both the literature and in practice, TTC may be more complex than DA.

Depending on time, I may also mention connections to verification (in the sense of nondeterministic communication), and how we formalize the question "how sensitive is the mechanism to one student’s input" in this setting.

Based on joint work with Yannai Gonczarowski.

Talk 2: Two of the fundamental no-go theorems of quantum information are the no-cloning theorem (that it is impossible to make copies of general quantum states) and the no-teleportation theorem (the prohibition on sending quantum states over classical channels without pre-shared entanglement). They are known to be equivalent, in the sense that a collection of quantum states is clonable if and only if it is teleportable.

Our main result suggests that this is not the case when computational efficiency is considered. We give a collection of quantum states and oracles relative to which these states are efficiently clonable but not efficiently teleportable. Given that the opposite scenario is impossible (states that can be teleported can always trivially be cloned), this gives the most complete oracle separation possible between these two important no-go properties.

In doing so, we introduce a related quantum no-go property, reconstructibility, which refers to the ability to construct a quantum state from a uniquely identifying classical description. We show the stronger result of a collection of quantum states that are efficiently clonable but not efficiently reconstructible. This novel no-go property only exists in relation to computational efficiency, as it is trivial for unbounded computation. It thus opens up the possibility of further computational no-go properties that have not yet been studied because they do not exist outside the computational context.

Joint work with Mark Zhandry.

New algorithms and lower bounds for decision tree learning

Li-Yang Tan from Stanford University

April 07, 2023 | Computer Science 105

Constructing decision tree representations of data is a basic problem of computer science, sitting right at the boundary of our understanding of efficient computation. I will discuss two recent results about this problem, a new algorithm and a new lower bound:

1. An almost-polynomial time membership query algorithm for the uniform-distribution setting. The previous fastest algorithm ran in quasipolynomial time, a consequence of a classic algorithm of Ehrenfeucht and Haussler from 1989. (https://dl.acm.org/doi/abs/10.1145/3561047)

2. A conditional quasipolynomial lower bound for the distribution-free setting, matching the upper bound of Ehrenfeucht and Haussler. This builds on and improves upon prior work by Alekhnovich, Braverman, Feldman, Klivans, and Pitassi from 2004. (https://arxiv.org/abs/2210.06375)

Joint work with Guy Blanc, Caleb Koch, Jane Lange, Mingda Qiao, and Carmen Strassle.

The Kikuchi Matrix Method

Pravesh Kothari from Carnegie Mellon University

April 14, 2023 | Friend Center 004

In this talk, I will present a new method that reduces understanding an appropriate notion of girth in hypergraphs to unraveling the spectrum of a "Kikuchi" matrix associated with the hypergraph.

I will discuss three applications of this technique:

1) Finding a refutation algorithm for smoothed instances of constraint satisfaction problems (obtained by randomly perturbing the literal patterns in a worst-case instance with a small probability) that matches the best running-time vs constraint-density trade-offs for the significantly special and easier case of random CSPs,

2) Confirming Feige's 2008 Conjecture that postulated an extremal girth vs density trade-off (a.k.a. Moore bounds) for k-uniform hypergraphs that generalizes the Alon-Hoory-Linial Moore bound for graphs,

3) Proving a cubic lower bound on the block length of 3 query locally decodable codes improving on the prior best quadratic lower bound from the early 2000s.

Based on joint works with Omar Alrabiyah (Berkeley), Tim Hsieh (CMU), Peter Manohar (CMU), Sidhanth Mohanty (Berkeley), and Venkat Guruswami (Berkeley).

On Regularity Lemma and Barriers in Streaming and Dynamic Matching

Sanjeev Khanna from University of Pennsylvania

April 21, 2023 | Computer Science 105

We present a new approach for finding matchings in dense graphs by building on Szemeredi's celebrated Regularity Lemma. This allows us to obtain non-trivial albeit (very) slight improvements over long-standing bounds for computing near-optimal matchings in streaming and dynamic graphs. The key idea behind our improved bounds is to maintain a matching cover which is a “sparse” subgraph that approximately preserves matchings in each induced subgraph of the input graph. However, given the use of regularity lemma, the improvement obtained by our algorithms over trivial bounds is only by some function of $(log* n)$. Nevertheless, these results show that the “right” answer to these problems is not what is dictated by the previous bounds.

This is joint work with Sepehr Assadi, Soheil Behnezhad, and Huan Li.

Unitary Complexity and the Uhlmann Transformation Problem

Henry Yuen from Columbia University

April 28, 2023 | Computer Science 105

Three decades of quantum complexity theory have largely focused on quantum algorithms for classical tasks -- those with classical inputs and outputs. However there has been increasing interest in studying the computational difficulty of tasks with quantum inputs and/or outputs. Examples include preparing ground states of Hamiltonians or breaking quantum cryptographic protocols. Many techniques and approaches from traditional complexity theory are inadequate for reasoning about such inherently quantum tasks, suggesting a need for a "fully quantum" complexity theory.

In this talk I discuss some facets and themes of a "fully quantum" complexity theory. In particular I show that seemingly-unrelated quantum tasks such as achieving capacity over a noisy quantum channel, breaking a quantum bit-commitment scheme, and decoding the radiation of an evaporating black hole are all polynomial-time equivalent to something called the Uhlmann Transformation Problem. I also show that the Uhlmann Transformation Problem is complete for a "unitary complexity" analogue of SZK. Along the way I describe our framework for classifying the complexity of unitary synthesis problems.

Joint work with John Bostanci, Yuval Efron, Tony Metger, Alexander Poremba, and Luowen Qian.

Estimating the Longest Increasing/Common Subsequences

Alexandr Andoni from Columbia University

May 05, 2023 | Computer Science 105

The decades-old Patience Sorting algorithm computes the Longest Increasing Subsequence (LIS) of a sequence of length n in O(n log n) time. Yet, it is unknown how well one can estimate the (length of the) LIS in *sublinear* time, say, when it is >sqrt{n}?

We present an algorithm that, given a sequence with LIS > n/k, approximates the LIS up to a factor of k^c in k*n^c time for any arbitrarily small c>0. Our run-time essentially matches the trivial sample complexity lower bound of ~k. All the past work had polynomial factors in both approximation and extra time slowdown.

An important corollary is the first algorithm to estimate the Longest *Common* Subsequence in O(n) time up to a sub-polynomial, n^o(1), factor.

Our algorithm is based on a van Emde Boas style recursion. To accomplish such a recursion for a sublinear algorithm, we develop a generic sampling scheme that allows efficient composition of recursive sampling algorithms.

Joint work with Negev Shekel Nosatzki, Sandip Sinha, and Cliff Stein.

Fall 2022 Schedule

Random restrictions on boolean functions with small influences

Pei Wu from IAS

September 16, 2022 | Friend Center Convocation Room | Talk video

In the talk, we discuss the probability of Boolean functions with small max influence to become constant under random restrictions. Let f be a Boolean function such that the variance of f is $\Omega(1)$ and all its individual influences are bounded by $\tau$. We show that when restricting all but a $\tilde{\Omega}((\log1/\tau)^{-1})$ fraction of the coordinates, the restricted function remains nonconstant with overwhelming probability. This bound is essentially optimal, as witnessed by the tribes function $AND_{n/C\log n} \circ OR_{C\log n}$.

We extend it to an anti-concentration result, showing that the restricted function has nontrivial variance with probability $1-o(1)$. This gives a sharp version of the "it ain't over till it's over" theorem due to Mossel, O'Donnell, and Oleszkiewicz.

Negative-Weight Single-Source Shortest Paths in Near-linear Time

Aaron Bernstein from Rutgers University

September 23, 2022 | Friend Center 004

We present a randomized algorithm that computes single-source shortest paths (SSSP) in $O(m\log^8(n)\log W)$ time when edge weights are integral, can be negative, and are $\geq -W$. This essentially resolves the classic negative-weight SSSP problem. The previous bounds are $\~{O}((m+n^{1.5})\log W)$ and $m^{4/3+o(1)}\log W$. Near-linear time algorithms were known previously only for the special case of planar directed graphs. Also, independently of our work, the recent breakthrough on min-cost flow [Chen, Kyng, Liu, Peng, Probst-Gutenberg and Sachdeva] implies an algorithm for negative-weight SSSP with running time $m^{1+o(1)}$.

In contrast to all recent developments that rely on sophisticated continuous optimization methods and dynamic algorithms, our algorithm is simple: it requires only a simple graph decomposition and elementary combinatorial tools. In fact, ours is the first combinatorial algorithm for negative-weight SSSP to break through the classic $O(m\sqrt{n}\log nW)$ bound from over three decades ago [Gabow and Tarjan SICOMP'89].

Kolmogorov Complexity Characterizes Statistical Zero Knowledge

Eric Allender from Rutgers University

September 30, 2022 | Friend Center 006

We show that a decidable promise problem has a non-interactive statistical zero-knowledge proof system if and only if it is randomly reducible to a promise problem for Kolmogorov-random strings, with a superlogarithmic additive approximation term. This extends recent work by Saks and Santhanam (CCC 2022). We build on this to give new characterizations of Statistical Zero Knowledge (SZK), as well as the related classes NISZK_L and SZK_L.

This is joint work with Shuichi Hirahara and Harsha Tirumala. A paper is available as ECCC TR22-127.

New Directions in Derandomization: Non-Black-Box Techniques, Superfast Algorithms

Roei Tell from IAS

October 07, 2022 | Friend Center 004

This talk will present two new directions in the study of derandomization, which are related to each other. The talk is based on a sequence of recent works with Lijie Chen (some of the works are upcoming).

The first direction is avoiding classical PRGs in favor of *non-black-box techniques*, which extract pseudorandomness for a probabilistic machine from the code of that specific machine. This approach allows constructing derandomization algorithms from weaker hypotheses (compared to classical results), and also constructing very fast derandomization algorithms that cannot be obtained using classical PRGs. The second direction is the fine-grained study of *superfast derandomization*, asking what is the smallest possible overhead that we need to pay when eliminating randomness (and whether we can get away with paying nothing at all).

As examples, we will mention four results:

1. Connecting the BPP = P conjecture to new types of lower bounds.

2. "Free lunch" derandomization with essentially no time overhead.

3. Superfast derandomization of interactive proof systems.

4. Average-case derandomization from weak hardness assumptions. (Joint work with Ron Rothblum.)

Almost Ramanujan Expanders from Arbitrary Expanders via Operator Amplification

Fernando Granha Jeronimo from IAS

October 14, 2022 | Friend Center 004

Expander graphs are fundamental objects in theoretical computer science and mathematics. They have numerous applications in diverse fields such as algorithm design, complexity theory, coding theory, pseudorandomness, group theory, etc.

In this talk, we will describe an efficient algorithm that transforms any bounded degree expander graph into another that achieves almost optimal (namely, near-quadratic, $d \leq 1/\lambda^{2+o(1)}$) trade-off between (any desired) spectral expansion $\lambda$ and degree $d$. The optimal quadratic trade-off is known as the Ramanujan bound, so our construction gives almost Ramanujan expanders from arbitrary expanders.

This transformation preserves structural properties of the original graph, and thus has many consequences. Applied to Cayley graphs, our transformation shows that any expanding finite group has almost Ramanujan expanding generators. Similarly, one can obtain almost optimal explicit constructions of quantum expanders, dimension expanders, monotone expanders, etc., from existing (suboptimal) constructions of such objects.

Our results generalize Ta-Shma's technique in his breakthrough paper [STOC 2017] used to obtain explicit almost optimal binary codes. Specifically, our spectral amplification extends Ta-Shma's analysis of bias amplification from scalars to matrices of arbitrary dimension in a very natural way.

Joint work with: Tushant Mittal, Sourya Roy and Avi Wigderson

Convex Spherical Cubes

Oded Regev from NYU

October 28, 2022 | Friend Center 004

We show that there exists a convex body that tiles $\mathbb{R}^n$ under translations by $\mathbb{Z}^n$ whose surface area is $\tilde{O}(\sqrt{n})$. No asymptotic improvement over the trivial $O(n)$ (achieved by the cube $[0,1]^n$) was previously known. A non-convex body achieving surface area $O(\sqrt{n})$ was constructed by Kindler, O’Donnell, Rao, and Wigderson [FOCS 2008].

For some fun background on this question, see Kindler et al.'s CACM article: https://cacm.acm.org/magazines/2012/10/155546-spherical-cubes/fulltext and Klarreich's broader review: https://www.jstor.org/stable/27857995.

Joint work with Assaf Naor.

Maintaining Privacy of the Minority without Ignoring It: Differentially Private Oversampling for Imbalanced Learning

Rachel Cummings from Columbia University

November 04, 2022 | Friend Center 004

The problem of learning from imbalanced datasets, where the label classes are not equally represented, arises often in practice. A widely used method to combat this problem is preprocessing techniques to amplify the minority class, e.g., by reusing or resampling the existing minority instances. However, when the training data are sensitive and privacy is a concern, this will also amplify privacy leakage from members of the minority class. Therefore this oversampling pre-processing step must be designed with privacy in mind. In this work, we present an algorithm for differentially private synthetic minority oversampling technique (DP-SMOTE) that guarantees formal privacy for the minority class. We show empirically that there is no statistically significant loss in performance relative to existing non-private SMOTE tools. We also show that even when non-private SMOTE is used as a part of a DP learning pipeline as a pre-processing step before downstream private classification, the upstream non-private oversampling increases sensitivity exponentially in the dimensionality of the data, which harms classification accuracy at a corresponding rate. The algorithmic structure also yields insights about the implications for pre-processing before the use of DP algorithms, which may be of independent interest.

(joint work with Yuliia Lut, Eitan Turok, and Marco Avella Medina)

Algorithmic Applications of Hypergraph and Partition Containers

Or Zamir from IAS and Princeton University

November 11, 2022 | Friend Center 004

The relatively new Hypergraph Container Method (Saxton and Thomason 2015, Balogh, Morris and Samotij 2015), generalizing an earlier version for graphs (Sapozhenko 2001), has been used extensively in recent years in extremal combinatroics. In essence, it shows that independent sets in regular-enough uniform hypergraphs must be clustered in some fashion. We show how to use this method algorithmically to convert algorithms for arbitrary inputs into faster algorithms for almost-regular input instances. Informally, an almost-regular input is an input in which the maximum degree is larger than the average degree by at most a constant factor. This vastly generalizes several families of inputs for which we commonly have improved algorithms.

An important component of our work is the generalization of (hyper-)graph containers to Partition Containers. While standard containers apply to a single independent set, partition containers apply to several independent sets at the same time.

Our first main application is a resolution of a major open problem about Graph Coloring algorithms, for almost-regular graphs. The chromatic number of a graph can be computed in $2^n$. Better algorithms are known only for $k<7$. We solve $k$-coloring in $(2-\epsilon)^n$ time for graphs in which the maximum degree is at most $C$ times the average degree, where $\epsilon > 0$ for any $k, C$. This includes, for example, regular graphs of any degree (with $C = 1$), and graphs with at least $\epsilon n^2$ edges for any fixed constant $\epsilon$ (with $C = 1 / 2 \epsilon$).

Our second main application is the first improved $k$-SAT algorithm for dense formulas. The celebrated sparsification lemma (Impagliazzo, Paturi and Zane 2001) implies that solving $k$-SAT can be reduced to solving $k$-SAT on sparse formulas. We show a complementing statement; For every $k, b$ there exists a $C$ such that if $k$-SAT can be solved in $b^n$ time, then $k$-SAT on formulas with at least $C \cdot n$ clauses that are well-spread can be solved in $(b-\epsilon)^n$ time.

We give additional applications including faster algorithms for unweighted and weighted Maximum Independent Set in almost-regular graphs.

Linear Probing Revisited: Tombstones Mark the Demise of Primary Clustering

William Kuszmaul from MIT

November 18, 2022 | Friend Center 004

The linear-probing hash table is one of the oldest and most widely used data structures in computer science. However, linear probing also famously comes with a major drawback: as soon as the hash table reaches a high memory utilization, elements within the hash table begin to cluster together, causing insertions to become slow. This phenomenon, now known as "primary clustering", was first captured by Donald Knuth in 1963; at a load factor of $1 - 1/x$, the expected time per insertion becomes $\Theta(x^2)$, rather than the more desirable $\Theta(x)$.

We show that there is more to the story than the classic analysis would seem to suggest. It turns out that small design decisions in how deletions are implemented have dramatic effects on the asymptotic performance of insertions. If these design decisions are made correctly, then even a hash table that is continuously at a load factor $1 - \Theta(1/x)$ can achieve average insertion time $\tilde{O}(x)$. A key insight is that the tombstones left behind by deletions cause a surprisingly strong "anti-clustering" effect, and that when insertions and deletions are one-for-one, the anti-clustering effects of deletions actually overpower the clustering effects of insertions.

We also present a new variant of linear probing, which we call "graveyard hashing", that completely eliminates primary clustering on any sequence of operations. If, when an operation is performed, the current load factor is $1 - 1/x$ for some $x$, then the expected cost of the operation is $O(x)$.

Based on joint work with Michael A. Bender and Bradley C. Kuszmaul. FOCS '21. Available at https://arxiv.org/abs/2107.01250.

1) On the Hardness of Dominant Strategy Mechanism Design; 2) Memory-Sample Lower Bounds for Learning with Classical-Quantum Hybrid Memory

Shiri Ron and Wei Zhan from Weitzmann Institute and Princeton University

December 02, 2022 | Friend 004

Talk 1: We study the communication complexity of dominant strategy implementations of combinatorial auctions. We start with two domains that are generally considered “easy”: multi-unit auctions with decreasing marginal values and combinatorial auctions with gross substitutes valuations. For both domains we have fast algorithms that find the welfare-maximizing allocation with communication complexity that is poly-logarithmic in the input size. This immediately implies that welfare maximization can be achieved in an ex-post equilibrium with no significant communication cost, by using VCG payments. In contrast, we show that in both domains the communication complexity of any dominant strategy implementation that achieves the optimal welfare is polynomial in the input size. We then study the approximation ratios achievable by dominant strategy mechanisms. For combinatorial auctions with general valuations, we show that no dominant-strategy mechanism achieves an approximation ratio of $m^{1−\eps}$, where m is the number of items. In contrast, a randomized dominant strategy mechanism that achieves an O(√m) approximation. This proves the first gap between computationally efficient deterministic dominant strategy mechanisms and randomized ones. Joint work with Shahar Dobzinski and Jan Vondrak.

Talk 2: In this talk, I will present our study on classical learning tasks in the presence of quantum memory. We prove that any quantum algorithm with both, classical memory and quantum memory, for parity learning on n bits, requires either Ω(n^2) bits of classical memory or $\Omega(n)$ bits of quantum memory or an exponential number of samples. The work by Raz showed that any classical algorithm requires either Ω(n^2) bits of memory or an exponential number of samples, and our results refute the possibility that a small amount of quantum memory significantly reduces the size of classical memory needed for efficient learning. I will briefly review the techniques used in the proof for classical memory-sample lower bound, discuss what prevents us from directly applying these techniques to the quantum case, and give a sketch of the actual proof for our results. This is a joint work with Qipeng Liu and Ran Raz.

Optimal Weak to Strong Learning

Kasper Green Larsen from Aarhus University

December 09, 2022 | CS 105

The classic algorithm AdaBoost allows to convert a weak learner, that is an algorithm that produces a hypothesis which is slightly better than chance, into a strong learner, achieving arbitrarily high accuracy when given enough training data. We present a new algorithm that constructs a strong learner from a weak learner but uses less training data than AdaBoost and all other weak to strong learners to achieve the same generalization bounds. A sample complexity lower bound shows that our new algorithm uses the minimum possible amount of training data and is thus optimal. Hence, this work settles the sample complexity of the classic problem of constructing a strong learner from a weak learner.

This work was accepted at NeurIPS’22 and is joint work with Martin Ritzert from Aarhus University

Pseudorandom rotations: explicit approximate designs for some Lie groups

Pedro Paredes from Princeton University

December 16, 2022 | CS 105

Since the introduction of randomness as an algorithmic tool, researchers have been trying to study how to reduce the amount of randomness that these probabilistic algorithms need. This is achieved through lots of different ways, but a common one is to develop algorithms that produce objects (like sequences of bits) that "look random" enough that they fool probabilistic algorithms. These algorithms that produce random looking objects are known as pseudorandom generators. In addition to the intrinsic motivation of understanding how randomness and computation interact, there is a broader appeal to using these tools to find explicit constructions of objects that are known to exist through probabilistic methods.

One of the many successes of this area is the observation that very often we only need bounded independence. Broadly speaking, sampling a truly uniform random $N$-dimensional object may require $\Omega(N)$ bits of randomness, but for many applications it is sufficient to use objects that only look truly random when inspecting $k \ll N$ dimensions, known as "$k$-wise independent". One basic example of this is $k$-wise independent permutations: a distribution over of permutations on $N$ elements is $k$-wise independent if a permutation sampled from it maps any distinct $k$ elements to any distinct $k$ elements equally likely (i.e. any subset of $k$ elements looks uniform). Thanks to Kassabov [Kas07] and Kaplan–Naor–Reingold [KNR09], it turns out that such a distribution can be created using only $O(k\log N)$ random bits to sample from (with a small error), much less than in the fully independent case.

In this talk, we present a framework that generalizes these works to obtain $k$-wise independent approximate orthogonal and unitary matrices (formally known as approximate orthogonal/unitary designs). Given a classical group of Lie type with certain properties, our framework uses the properties of the group to approximate to "$k$-locally" approximate the Haar (uniform) measure over elements of the group. We do so by first showing how to obtain a pseudorandom generator with a non-trivial yet big error, and then showing how to reduce this error using a generalization of expander random walks to operators.

Joint work with Ryan O'Donnell and Rocco Servedio

TCS @ Princeton

Copyright © 2022-24