Theory Lunch @ Princeton

Welcome to the Princeton Theory Lunch page! The Princeton TCS Seminar, fondly known as Theory Lunch, is a weekly seminar organized by the Princeton TCS group every Friday from 12:15 pm to 13:15 pm, with food usually served at 11:50 pm. It is open to all members of the Princeton community.

Talks are also simultaneously broadcasted via Zoom. The Zoom links are shared every week by email. All talks are recorded and uploaded on our YouTube channel, Princeton TCS.

This semester theory lunch is organized by Elena Gribelyuk, Pachara Sawettamalya , and Sebastian Simon.

If you'd like to be notified about future Theory Lunch talks please subscribe to the theory-read mailing list.

This Week on Theory Lunch

October 31, 2025 | CS 105

Tomer Ezra from Harvard University

Algorithmic Contract Design

We explore the framework of contract design through a computational perspective. Contract design is a fundamental pillar of microeconomics, addressing the essential question of how to incentivize individuals effectively. The significance of contract design was acknowledged by the Nobel Prize awarded to Hart and Holmström, and it applies to various real-life scenarios, such as determining bonuses for employees, setting commission structures for sales representatives, and designing payment schemes for influencers promoting products.

While contract design has been extensively studied from an economic perspective, this talk will focus on examining it from a computational viewpoint. Specifically, we introduce combinatorial extensions of classical contract design models, where a principal delegates tasks to one or multiple agents. The agents have sets of potential actions they can take to complete the task, and the success of the task is stochastically determined by the actions chosen by the agents. We analyze the structure and computational aspects of these models, and present algorithms that provide (approximately) optimal guarantees.

Fall 2025 Schedule

Additive Approximation Schemes for Low-Dimensional Embeddings

Ainesh Bakshi from New York University

September 12, 2025 | CS 201 (Tea Room)

See abstract

In this talk, I will discuss algorithms for fitting low-dimensional embeddings to high-dimensional data. In particular, I will focus on the Euclidean Metric Violation problem (EMV), where the input is an arbitrary non-negative vector in $n^2$-dimensional space and the goal is to find the closest $k$-dimensional Euclidean metric on $n$ points to this input vector. This problem was shown to be NP-Hard by Dayton and Dasgupta in 2006, even when $k=1$. Dhamdhere gave a $log(n)$-approximation when $k=1$ in 2004, and left obtaining a PTAS as an open question. Independently, the same problem has been studied for over 70 years in the statistics community, where it goes by the name of multi-dimensional scaling, and there are no approximation algorithms known for any $k>1$.

I will describe the first additive approximation scheme for EMV, which I believe is a crucial first step towards obtaining a PTAS. The key technical contribution of this work is a new analysis of correlation rounding for Sherali-Adams / Sum-of-Squares relaxations, tailored to low-dimensional embeddings. The techniques I describe are broadly applicable, and yield additive approximation schemes for weighted variants of EMV as well as entry-wise $L_p$ low-rank approximation.

Based on joint work with Prashanti Anderson and Sam Hopkins.

Differentially Private Linear Algebra

Shay Moran from Technion

September 19, 2025 | CS 105

See abstract

Differential privacy (DP) has emerged as a powerful framework for designing algorithms that protect sensitive data. In this talk, I will present our work at the intersection of differential privacy and linear algebra, introducing efficient DP algorithms for fundamental algebraic tasks: solving systems of linear equations over arbitrary fields, solving linear inequalities over the reals, and computing affine spans and convex hulls.Our algorithms for equalities are strongly polynomial, while those for inequalities are only weakly polynomial—and this gap is provably inherent.

As applications, we obtain the first efficient DP algorithms for learning halfspaces and affine subspaces. The talk will not assume prior familiarity with differential privacy; I will begin with a review of the definition.

Locally Sampleable (Uniform) Symmetric Distributions

Kewen Wu from Institute for Advanced Study

September 26, 2025 | CS 105

See abstract

We characterize the power of constant-depth Boolean circuits in generating uniform symmetric distributions. Let f be a Boolean function where each output bit of f depends only on O(1) input bits. Assume the output distribution of f on uniform input bits is close to a uniform distribution D with a symmetric support. We show that D is essentially one of the following six possibilities: (1) point distribution on the all-zero string, (2) point distribution on the all-one string, (3) uniform over all-zero or all-one, (4) uniform over strings with even Hamming weights, (5) uniform over strings with odd Hamming weights, and (6) uniform over all strings. This confirms a conjecture of Filmus, Leigh, Riazanov, and Sokolov (RANDOM 2023). Going beyond, we also obtain near-tight classification results for locally sampleable symmetric distribution and propose a conjectured full classification.

Joint works with Daniel Kane and Anthony Ostuni

Codes on high dimensional expanders

Irit Dinur from Institute for Advanced Study

October 03, 2025 | CS 105

See abstract

I will talk about constructions of error correcting codes that “sit” on high dimensional expanders, and that share some of the cool properties of Reed Muller codes, such as local testability and multiplication. These are important for secret sharing and potentially for PCPs, and I will discuss some related open questions.

Shor's Quantum Algorithms Fail in the Presence of Noise

Jin-Yi Cai from University of Wisconsin - Madison

October 10, 2025 | CS 105

See abstract

We consider Shor's quantum factoring algorithm in the setting of noisy quantum gates. We prove that the algorithm does not factor integers of the form N=pq when the noise exceeds a vanishingly small level in terms of n --- the number of bits of the integer to be factored,

where p and q are either a pair of random primes or from a well-defined set of primes of positive density. Links to the papers proving these claims are given below. The second paper is joint with Ben Young and proves the same result for Discrete Log. I will also present some experimental results supporting the hypothesis that rotations of physical qubits are not infinitely accurate, which forms the basis of the noise model for the proof. I will speculate whether quantum error correction can rescue the situation.

Vizing's Theorem in Near-Linear Time

Soheil Behnezhad from Northeastern University

October 24, 2025 | CS 105

See abstract

In his classic result, Vizing (1964) proved that any graph of maximum degree ∆ can be edge colored using at most ∆+1 different colors. Vizing’s original proof is algorithmic and takes O(mn) time where m and n denote the number of edges and vertices respectively. This was subsequently improved to O~(m \sqrt{n}) independently by [Arjomandi ’82; Gabow et al. ’85]. This bound remained state-of-the-art for four decades, and only recently got improved to O~(min{n^2, mn^{1/4}}) [Assadi ’24; Bhattacharya et al. ’24]. In this talk, I will present a completely new approach to edge coloring that leads to the first near-linear—in fact O(m log ∆)—time algorithm for Vizing’s theorem.

Joint work with Assadi, Bhattacharya, Costa, Solomon, and Zhang.

Bio:

Soheil Behnezhad is an Assistant Professor of Computer Science at Northeastern University. Before joining Northeastern, he was a Motwani Postdoctoral Fellow at Stanford University. He received his PhD from the University of Maryland and his BSc from Sharif University. His research focuses on theoretical computer science, particularly the foundations of big data algorithms---including sublinear-time, parallel, streaming, and dynamic algorithms---as well as graph algorithms. He has received Best Paper Awards at SODA 2023 and STOC 2025, and his work is supported by an NSF CAREER Award and a Google Research Award.

Algorithmic Contract Design

Tomer Ezra from Harvard University

October 31, 2025 | CS 105

See abstract

TBD

Karthik C. S. from Rutgers University

November 07, 2025 | TBD

See abstract

TBD

Erik Waingarten from University of Pennsylvania

November 14, 2025 | TBD

See abstract

TBD

Hanlin Ren from Institute for Advanced Study

November 21, 2025 | TBD

See abstract

TBA

TBD

Rahul Ilango from Institute for Advanced Study

December 05, 2025 | TBD

See abstract

TBD

James Bartusek from Columbia University

December 12, 2025 | TBD

See abstract

TBD

Past Semesters Archive

Spring 2025 Schedule

“Sharp” selector processes, random LPs and integrality gap

Huy Pham from Institute for Advanced Study

February 07, 2025 | CS 105

See abstract

Positive selector processes are natural stochastic processes driven by sparse Bernoulli random variables. They play an important role in the study of suprema of general stochastic processes, and in particular, Talagrand posed the selector process conjecture as a last missing piece in this endeavor. Interestingly Talagrand also drew a rich picture of connections between his conjecture and the threshold phenomenon. Owing to his picture, ideas behind the resolution of the selector process conjecture have played a key role in the proof of the Kahn-Kalai conjecture on the location of thresholds.

In this talk, I will discuss a quantitative sharp version of Talagrand’s selector process conjecture, which strengthens both the selector process conjecture and the Kahn-Kalai conjecture. This can be viewed as a description of the behavior of positive linear programs or fractionally subadditive functions under random sparsification. As an application, I will discuss progress towards another conjecture of Talagrand on the “integrality gap” of a wide class of integer linear programs involving “covers” of general set systems, relating the so-called expectation threshold and its fractional relaxation.

Correlation Clustering and (De)Sparsification: Graph Sketches Can Match Classical Algorithms

Sanjeev Khanna from University of Pennsylvania

February 14, 2025 | CS 105

See abstract

Correlation clustering is a widely-used approach for clustering large data sets based only on pairwise similarity information. In recent years, there has been a steady stream of better and better classical algorithms for approximating this problem. Meanwhile, another line of research has focused on porting the classical advances to various sublinear algorithm models, including semi-streaming, Massively Parallel Computation (MPC), and distributed computing. Yet, these latter works typically rely on ad-hoc approaches that do not necessarily keep up with advances in improved approximation ratios achieved by classical algorithms. This raises the following natural question: can the gains made by classical algorithms for correlation clustering be ported over to sublinear algorithms in a black-box manner? We answer this question in the affirmative by introducing the paradigm of graph de-sparsification that may be of independent interest.

This is joint work with Sepehr Assadi and Aaron (Louie) Putterman.

Conditional Calibration for Task Specific Uncertainty Quantification

Aaron Roth from University of Pennsylvania

February 21, 2025 | CS 105

See abstract

For many tasks, optimal solutions would be simple exercises if only we had direct access to some "true distribution on outcomes", conditional on all of our observed covariates. Unfortunately "true probabilities" are fundamentally impossible to come by, without making very strong modelling assumptions about the environment. On the other hand, there are various non-parametric methods for uncertainty quantification --- such as calibration and conformal prediction --- that can be used without making essentially any assumptions at all --- but their guarantees are marginal, and it is unclear what kinds of tasks they are good for. A recent line of work has given algorithms that can produce calibration guarantees that hold conditionally on any bounded set of conditioning events. This turns out to form a useful design framework for making predictions that can be used as if they were probabilities for many downstream tasks, so long as one selects the conditioning events appropriately with the downstream task in mind. We'll see three applications --- giving a method to make predictions that can be usefully consumed by many downstream decision makers, a method to make predictions that can be used to form prediction sets that are conditionally valid subject to any collection of conditioning events, and a method of making forecasts that can be used to interact with another forecaster and quickly reach agreement, recovering tractable versions of Aumann's agreement theorem.

A Theory of Generalized Boosting

Nataly Brukhim from Institute for Advanced Study

February 28, 2025 | CS 105

See abstract

Boosting is a fundamental and widely used method in machine learning, which determines that weak learnability of binary functions implies strong learnability. Traditionally, boosting theory has primarily focused on symmetric 0-1 loss functions, aligning with classical PAC learning theory, from which boosting emerged. However, its extension to more general settings, such as multiclass classification and alternative loss notions, is not straightforward. In this talk, I will discuss recent works in which we develop a comprehensive theory of boosting for a broad class of cost-sensitive and multi-objective loss functions in both the binary and multiclass setting. By leveraging a game-theoretic perspective, we uncover a more nuanced understanding of the boosting landscape. In particular, we provide a taxonomy of learning guarantees, distinguishing those that are trivial (always achievable), boostable (implying strong learning), and intermediate (implying non-trivial yet not arbitrarily accurate learning).

Beating Bellman-Ford: Faster Single-Source Shortest Paths with Negative Weights

Jeremy Fineman from Georgetown University

March 07, 2025 | CS 105

See abstract

This talk considers the most general version of the single-source shortest paths problem, where the weights on edges can be arbitrary real numbers, including both positive and negative numbers. The classical solution for general weights, often called the Bellman-Ford algorithm, was first published in the 1950s. The Bellman-Ford algorithm is simple and easy to implement, but it is slow. There has thus been significant effort in producing faster algorithms, but those results all involve restricting the problem in some way. For example, there are faster algorithms for acyclic graphs, planar graphs, nonnegative weights, and integer weights.

The general case of real weights on arbitrary directed graphs remains a challenge; despite the fact that Bellman-Ford dates back to the 1950s, no asymptotically faster algorithms had been discovered. This talk presents the first asymptotically faster SSSP algorithm for real edge weights.

How to Keep the Majority on Your Side

Kangning Wang from Rutgers University

March 21, 2025 | CS 105

See abstract

Condorcet’s paradox is a cornerstone in social choice theory. It points out that in some elections, no matter which candidate wins, a majority of voters will prefer a different candidate. In these scenarios, what can we do to guarantee the support of a majority?

Elkind, Lang, and Saffidine proposed the idea of electing multiple candidates so that a voter only prefers a different candidate if the voter prefers this candidate to every elected candidate. Under this view, we hope that the “committee” of elected candidates wins a majority support against every candidate. We can clearly guarantee this property if all the candidates are elected, but what is the minimum number of candidates that need to be elected?

This talk is based on joint work with Moses Charikar, Alexandra Lassota, Prasanna Ramakrishnan, and Adrian Vetta.

The Invisible Hand in Markets with Indivisibilities

Alexander Teytelboym from University of Oxford

March 28, 2025 | CS 105

See abstract

When goods are indivisible, markets do not always clear. Surveying recent work without assuming much background, I will give necessary and sufficient conditions for market-clearing in transferrable utility economies, economies with income effects and pseudomarkets. Equilibrium existence in all these economies is linked by surprising dualities.

Learning from Missing and Imperfect Data

Manolis Zampetakis from Yale University

April 04, 2025 | CS 105

See abstract

Positive-Unlabeled (PU) learning requires classification when only positive and unlabeled data are available, a scenario common in Bioinformatics, medical studies, and fraud detection. Its significance lies in learning from datasets where negative samples are difficult or costly to obtain. In this talk, we generalize PU learning to Positive and Imperfect Unlabeled (PIU) Learning which is a generalization of PU learning that accounts for poor-quality unlabeled data due to biases and adversarial corruption. This issue can arise due to many reasons including when we rely on public and crowdsourced sources to collect the unlabeled data.

This change in the formulation of PU learning leads to some new theoretical implications. We show how it connects to fundamental problems, such as learning from smoothed distributions, detecting data truncation, and estimation under truncation—each central to statistics and learning theory. If time permits, we will also explore how these ideas provide a new perspective on causal inference – enabling estimation in settings where standard assumptions, like overlap and unconfoundedness, break down.

Based on joined works with Jane Lee and Anay Mehrotra.

Tensor Isomorphism: complexity, algorithms, and cryptography

Youming Qiao from University of Technology Sydney

April 11, 2025 | CS 105

See abstract

Two matrices are called equivalent if one can be transformed into the other by multiplying with invertible matrices on the left and right. Extending this idea to 3-tensors, it is natural to define two 3-tensors as isomorphic if they can be transformed into one another by multiplication with three invertible matrices along the three directions.

In this talk we explore tensor isomorphism through the lenses of complexity, algorithms, and cryptography. Starting from complexity, we will examine how tensor isomorphism relates to well-known problems such as graph isomorphism, code equivalence, and group isomorphism. Unlike graph isomorphism, however, tensor isomorphism presents a significantly harder algorithmic challenge, despite some recent advances. These support the use of tensor isomorphism in cryptography, such as the security basis of some digital signature schemes.

A Combinatorial Approach to Proving Chernoff Bounds

William Kuszmaul from Carnegie Mellon University

April 18, 2025 | CS 105

See abstract

The standard proofs of Chernoff bounds are beautiful but in some ways not very intuitive. In this talk, I'll show you a different proof that has four features: (1) the proof offers a strong intuition for why Chernoff bounds look the way that they do; (2) the proof is user-friendly and (almost) algebra-free; (3) the proof comes with matching lower bounds, up to constant factors in the exponent; and (4) the proof extends to establish generalizations of Chernoff bounds in other settings.

Adventures with Sparse Noisy Linear Algebraic Assumptions

Aayush Jain from Carnegie Mellon University

April 25, 2025 | CS 105

See abstract

In this talk, we will survey recent and earlier progress on several related noisy linear algebraic assumptions with sparse coefficient vectors, including sparse Learning Parity with Noise (Alekhnovich, 2003) and sparse Learning with Errors (Jain, Lin, Saha, 2024). While sparse LPN has been studied extensively in the domain of statistical-inference and average-case complexity, surprisingly, sparse Learning with Errors was only recently introduced.

We will discuss the algorithms and complexity of these problems and their exciting cryptographic applications. The talk will also highlight important open problems in this area.

The talk is based on a number of prior works and a couple of recent works of the speaker with collaborators Quang Dao, Yuval Ishai, Huijia Lin and Sagnik Saha.

Detecting and Correcting Computationally Bounded Errors using Cryptography

Daniel Wichs from Northeastern University

May 02, 2025 | CS 105

See abstract

We study the problems of error detection and correction with the additional restriction that the channel is computationally bounded. To take advantage of this restriction, we consider seeded codes, where the polynomial-time encoding and decoding procedures share a public random seed, but are otherwise deterministic. An adversarial channel gets this seed and can perform arbitrary polynomial-time computation to adaptively select both the message to be encoded and a bounded number of errors to be added to the resulting codeword. The goal is to detect or correct such errors with overwhelming probability, while achieving better trade-offs between rate and error tolerance than those possible for computationally unbounded channels.

We give simple and optimal constructions of such codes over a large constant-sized alphabet under minimal cryptographic assumptions. We also give constructions of such codes over the binary alphabet under standard cryptographic assumptions, such as the learning with errors (LWE) assumption.

Based on joint works with Jad Silbak and George Lu

Beyond lightcone arguments in quantum circuit lower bounds

Henry Yuen from Columbia University

May 09, 2025 | CS 105

See abstract

Progress in proving quantum circuit lower bounds remains limited compared to classical circuit complexity theory. The main technique used in the quantum setting is the so-called “lightcone argument”, which leverages the fact that in a low-depth circuit involving only local gates, each output qubit can only be affected by a small number of input qubits (i.e., the qubits in its “lightcone”). However, this argument becomes ineffective when the circuit depth is large, or when there the circuit has gates involving many qubits. Developing techniques beyond the lightcone argument is an important challenge in quantum complexity theory.

In this talk, I will discuss several motivations for proving quantum circuit lower bounds coming from computer science and condensed matter physics. I will talk about progress on the question of whether the PARITY function can be computed by QAC0 circuits, which is the frontier of quantum circuit lower bounds. Finally I will talk about new directions and interesting questions.

Fall 2024 Schedule

Common Knowledge, Regained

Yannai Gonczarowski from Harvard University

September 13, 2024 | CS 201

See abstract

For common knowledge to arise in dynamic settings, all players must simultaneously come to know it has arisen. Consequently, common knowledge cannot arise in many realistic settings with timing frictions. This counterintuitive observation of Halpern and Moses (1990) was discussed by Arrow et al. (1987) and Aumann (1989), was called a paradox by Morris (2014), and has evaded satisfactory resolution for four decades. We resolve this paradox by proposing a new definition for common knowledge, which coincides with the traditional one in static settings but is more permissive in dynamic settings. Under our definition, common knowledge can arise without simultaneity, particularly in canonical examples of the Haplern-Moses paradox. We demonstrate its usefulness by deriving for it an agreement theorem à la Aumann (1976), showing it arises in the setting of Geanakoplos and Polemarchakis (1982) with timing frictions added, and applying it to characterize equilibrium behavior in a dynamic coordination game.

Two theoretical directions in AI

Elad Hazan from Princeton University

September 20, 2024 | CS 105

See abstract

I'll survey a few recent results that apply tools from theoretical computer science and game theory to AI.

First we'll describe applying tools from online learning and control theory to sequence prediction. This gives rise to the neural architecture of spectral transformers that attain SOTA performance for certain sequence prediction problems.

Next we'll discuss the AI alignment problem, and a recent formulation called the "hidden game problem". We'll discuss a few regret minimization results in this setting as well as open directions for research.

Tensor Concentration Inequalities: A Geometric Approach

Haotian Jiang from University of Chicago

September 27, 2024 | CS 105

See abstract

Matrix Concentration inequalities, commonly used in the forms of Matrix Chernoff Bounds or the Non-Commutative Khintchine Inequality, are central to a wide range of applications in computer science and mathematics. However, they fall short in many applications where tensor versions of these inequalities are required.

In this work, we study concentration inequalities for the $\ell_p$-injective norms of sums of independent tensors. We obtain the first such inequalities beyond Rudelson's classical work on rank-1 tensors, and our tensor concentration inequalities are tight in certain regimes of $p$ and the order of the tensors. Our results are obtained via a geometric argument based on estimating the covering numbers for the natural stochastic processes corresponding to tensor injective norms.

We also discuss applications and connections of our inequalities to various other problems, e.g. tensor PCA, locally-decodable codes, and natural models for random tensors and their tensor extensions.

Online Load Balancing with Samples

Anupam Gupta from New York University

October 04, 2024 | CS 105

See abstract

While analyzing algorithms in the worst-case has long served us well, recent years have seen an exciting surge in analyzing algorithms in models that go beyond the worst case. We consider the classical problem of load-balancing, where jobs arrive online and must be assigned to collections of machines to minimize the maximum load. Can we get results better than the adversarial setting if we have a small sample of the upcoming data? The results in this talk are based on work with C.J. Argue, Alan Frieze, and Chris Seiler.

Chernoff Bounds on HDX and their Applications

Max Hopkins from Princeton University

October 11, 2024 | CS 105

See abstract

Recent years have seen the emergence of high dimensional expanders (HDX) as a powerful tool in theoretical computer science, with breakthrough applications in approximate sampling and property testing. A central force behind the success of HDX in application is their concentration of measure. Consider the following basic question: given a k-uniform hypergraph X and a subset A of its vertices, how likely is a random hyperedge to `see’ A in approximately the right proportion? When X is the complete hypergraph, this question was classically resolved by Chernoff and Hoeffding, who showed X has a subgaussian tail. In 2017, Dinur and Kaufman introduced spectral HDX and proved they satisfy Chebyshev-type (inverse-polynomial) concentration not only for this basic setting, but critically for higher degree functions that sit on i-faces of X. This led to a breakthrough line of work constructing sparse agreement testers and new PCPs, but left an exponential gap with known bounds for the complete hypergraph.

In this talk, we revisit the basic notion of high dimensional expanders and prove they satisfy optimal concentration of measure, matching Chernoff-Hoeffding for the complete hypergraph. Leveraging this fact, we further prove HDX are reverse hypercontractive, a powerful analytic inequality from boolean function analysis used in the construction of (combinatorial) PCPs with optimal soundness. Time willing, we overview several applications of concentration and reverse hypercontractivity on HDX, including new agreement tests in the list-decoding regime, the construction of explicit sparse hypergraphs with optimal geometric overlap, and new degree lower bounds for HDX.

Based on joint work with Yotam Dikstein https://arxiv.org/abs/2404.10961.

Recent Developments in PCPs

Dor Minzer from Massachusetts Institute of Technology

October 25, 2024 | CS 105

See abstract

The PCP Theorem is a cornerstone of computer science, with applications in hardness of approximation, verification, interactive protocols and more.

It asserts a witness for the satisfiability of a given 3CNF formula can be encoded in a robust way that allows local checking.

In this talk, we discuss recent developments in PCPs, and in particular:

(1) constructions that achieve optimal tradeoff between the alphabet size and the soundness error, and

(2) constructions that are size efficient.

If time allows, we will also discuss connections with distributed protocols, high-dimensional expanders and discrete Fourier analysis.

Based on joint works with Kai Zhe Zheng, Mitali Bafna, Noam Lifshitz, Nikhil Vyas

Improved List Size for Folded Reed-Solomon Codes

Shashank Srivastava from DIMACS and IAS

November 01, 2024 | CS 105

See abstract

List decoding for error correcting codes was originally conceived out of a desire to correct more errors, but has since found numerous connections to other areas of theoretical CS such as pseudorandomness and complexity theory.

Folded Reed-Solomon (FRS) codes were the first explicit codes to achieve the so-called list decoding capacity. That is, they have polynomial sized lists up to the information theoretically optimal radius. However, this list size was far from optimal. Over the last 15 years, numerous attempts have been made to improve this list size, such as the use of subspace evasive sets, algebraic-geometric codes and combinatorics of Hamming balls intersecting affine subspaces.

In this work, we first give a simple proof that recovers the best known list size among explicit capacity achieving codes. Next, we use folded Wronskian determinants to prove a list size for FRS codes that is only a quadratic factor away from optimal. In contrast, all existing list size bounds were exponentially away from optimal.

Based on paper available at https://arxiv.org/abs/2410.09031 , to appear in SODA 2025.

Wake up babe, new probability just dropped

Ryan O'Donnell from Carnegie Mellon University

November 08, 2024 | CS 105

See abstract

Do you love probability? But are you sad that all the good stuff was done 100+ years ago? Like, you'll never get to be Boole (1847) inventing the Union Bound, or Chebyshev (1867) inventing Chebyshev's Inequality, or Markov (1906) inventing Markov Chains?

Well, good news! Turns out what you think of as The Laws Of Probability are just a subset of the *actual* laws that govern how random things happen in our universe. So Boole and Chebyshev and Markov were just doing special cases; the *real* Union Bound and Chebyshev Inequality and theory of Markov Chains was worked out just over the last couple of decades. I'll tell you about it, and maybe you can be the first to work out the general version of *your* favorite probability result.

Yes, this talk is about quantum computing.

Based on joint works with: Ramgopal Venkateswaran; Robin Kothari; and, Scott Aaronson, Mohammad Bavarian, Toby Cubitt, Sabee Grewal, Giulio Gueltrini, and Marien Raat.

Some lower bounds on submodular function minimization

Deeparnab Chakrabarti from Dartmouth College

November 15, 2024 | CS 105

See abstract

Submodular functions are set-functions which appear in many areas; for example, the graph cut function is a submodular function as a function of subset of vertices, and so is the “log-determinant” function as a function of rows/columns selected. It is quite a remarkable fact that one can find the unconditional minimizer of a general submodular function (called the SFM problem) in polynomially many queries, and this generalizes many fundamental combinatorial optimization problems like global minimum cut, s,t-cut, and even matroid intersection. What this polynomial dependence exactly is, is still not nailed down – it’s still between ~ n to ~n^2 (which is a big or small gap, depending on who one asks)

In this talk, I’d like to describe a few lower bounds. For most of the talk, I’ll focus on the “parallel complexity” of the problem – how many rounds-of-queries one needs to minimize a submodular function given the total number of queries is at most a polynomial? We will see that the answer is ~ n rounds! In doing so, we will also encounter the so-far best known example of query lower bound. Time permitting (perhaps not), I’d also like to talk about a few “upper bounds” for approximate SFM, and also some special cases of submodular functions (like graph cuts).

All this is based on multiple works with collaborators whose union is Yu Chen, Andrei Graur, Haotian Jiang, Sanjeev Khanna, Hang Liao and Aaron Sidford.

Grid-norm regularity for somewhat dense graphs, and some applications.

Zander Kelley from Institute for Advanced Study

November 22, 2024 | CS 105

See abstract

In 2023, Raghu Meka and I proved a substantially improved bound on the size of the smallest set of integers in [N] which contains no 3-term arithmetic progression. Since then, it has become clear that the main new ideas from that work are in fact best thought of as generic graph-theoretic tools (rather than being limited to the study of arithmetic structures). There have since been a number of new applications (for example, in communication complexity and graph algorithms) which rely on and refine these graph-theoretic tools.

In this talk, I will present a summary of these graph-theoretic tools.

In particular, I will discuss the grid-norm, a key definition which is useful for measuring the pseudorandomness of a bipartite graph.

Based on joint works with Amir Abboud, Nick Fischer, Shachar Lovett and Raghu Meka.

SoS Certificates for Sparse Singular Values and Their Applications: Robust Statistics, Subspace Distortion, and More

Sam Hopkins from Massachusetts Institute of Technology

December 06, 2024 | CS 105

See abstract

What can efficient algorithms say about the maximum sparse singular value of a random matrix? (That is, the maximum of ||Mx|| where x is a sparse vector and M is a random matrix.) This problem turns out to be at the heart of numerous well-studied algorithmic problems: sparse principal component analysis, robust mean and covariance estimation, certification of subspace distortion, and more. I will discuss a new family of polynomial-time algorithms which can certify tighter bounds than any prior algorithm on this quantity, and their applications to improve the state of the art for the aforementioned problems.

Joint work with Ilias Diakonikolas, Ankit Pensia, and Stefan Tiegel

Twist and Shout: A New Approach to Memory-Checking

Justin Thaler from Georgetown University

December 13, 2024 | CS 105

See abstract

zkVMs are SNARKs that allow an untrusted prover to prove it correctly ran a specified computer program on a witness. They promise to "democratize zero-knowledge" by allowing developers to write witness-checking programs in any high-level language, with no understanding of how the program's execution is actually proven.

However, zkVM provers today remain hundreds of thousands of times slower than native program execution, significantly limiting their applicability. A critical bottleneck lies in "memory-checking arguments": tailored SNARKs for proving correct processing of all memory reads and writes. This talk will present a new approach to memory-checking that achieves substantially better performance. I will also aim to give a broad overview of key issues and recent developments in SNARK design.

Spring 2024 Schedule

Robust sublinear expanders

Matija Bucic from Princeton University

February 06, 2024 | Friend 004

See abstract

Expander graphs are perhaps one of the most widely useful classes of graphs ever considered. In this talk, we will focus on a fairly weak notion of expanders called sublinear expanders, first introduced by Komlós and Szemerédi around 30 years ago. They have found many remarkable applications ever since. In particular, we will focus on certain robustness conditions one may impose on sublinear expanders and some applications of this idea, which include:

- recent progress on the classical Erdős-Gallai cycle decomposition conjecture,

- essentially tight answer to the classical Erdős unit distance problem for "most" real normed spaces and

- an asymptotic solution to the rainbow Turan problem for cycles, raised by Keevash, Mubayi, Sudakov and Verstraete, with an interesting corollary in additive number theory.

Based on joint works with Montgomery; Alon and Sauermann; and Alon, Sauermann, Zakharov and Zamir.

Testing Distributional Shift

Arsen Vasilyan from MIT

February 16, 2024 | Computer Science 201

See abstract

In learning theory it is commonly assumed that the training dataset comes from the same distribution as the data we see during deployment. Yet, due to a distribution shift, this assumption might not hold, which can lead to incorrect predictions. Dealing with distribution shift is a big challenge in machine learning. For example, classifiers trained on data from one hospital often fail to generalize to data from other hospitals.

In our new work, joint with Adam Klivans and Konstantinos Stavropoulos, we introduce a framework called Testable Learning with Distribution Shift (TDS learning) to address this issue. The goal of our work is developing learning algorithms that allow their user to make sure that the performance of a learning algorithm is not deteriorating due to a distribution shift.

We consider the following setting. We are given a training dataset S_1, consisting of e.g. Gaussian examples which are labelled by an unknown function f in the function class. We are also given a new dataset S_2 of unlabelled data points which we need to classify. The concern is that even if a learning algorithm produces a classifier that performs very well on the distribution from which S_1 is drawn, it might perform terribly on S_2 due to a distribution shift.

Whenever distribution shift makes it unsafe to use the classifier given by the learning algorithm on the unlabeled dataset S_2, the TDS learning framework requires the learning algorithm to alert the user. In contrast, if no distribution shift has taken place, then the TDS learning framework requires the learning algorithm not to raise such an alert. In this case, the user can be confident that the classifier given by the learning algorithm is indeed safe to deploy on the dataset S_2.

Although domain-adaptation has been studied previously, most previous work bounds the error on S_2 in terms of some notion of distance between the distributions giving rise to S_1 and S_2. These distances seem computationally difficult to evaluate, as they involve enumeration of all elements of the function class.

In our work we give TDS learning algorithms that are not only sample-efficient but also computationally efficient. We handle a number of high-dimensional function classes, including linear classifiers, intersections of linear classifiers, decision trees and low-depth formulas.

To accomplish this, we develop a novel method based on showing that functions in a given function class can be sandwiched between pairs of low degree polynomials that are close in L_2-distance. We prove the existence of such pairs of L_2-sandwiching polynomials for a wide range of function classes, by building on techniques from pseudorandomness. We then show that the existence of these L_2-sandwiching polynomials can be used to achieve efficient TDS-learning algorithms by introducing our Transfer Lemma, which allows us to ensure a good performance on the set S_2 by checking that the low-degree moments of the set S2 approximately match those of S_1.

Furthermore, for TDS learning of linear classifiers under the Gaussian distribution, we give an algorithm that does provably better than any algorithm based on L_2-sandwiching polynomials. To accomplish this, we combine the moment-matching approach with a method inspired by literature on active learning.

Randomness Extraction from Adversarial Sources

Eshan Chattopadhyay from Cornell University

February 23, 2023 | Computer Science 105

See abstract

The area of randomness extraction is concerned with designing efficient algorithms (extractors) that produce almost truly random bits from defective sources of randomness. A very well-studied model in this area is the independent source model, where the extractor has access to multiple independent sources, each of which contains some amount of entropy. In this talk, I will discuss a recent line of work that explores randomness extraction beyond the independence model, allowing some sources to be adversarial or faulty - such faulty sources may contain no entropy or have unwanted correlations. Our extractors are based on recent advances in the extractor machinery along with interesting use of tools from extremal graph theory and lower bound results in communication complexity.

Based on joint works with Jesse Goodman, Vipul Goyal, Xin Li and David Zuckerman.

Optimal Online Discrepancy Minimization

Victor Reis from Institute for Advanced Study

March 01, 2024 | Computer Science 105

See abstract

We prove that there exists an online algorithm that for any sequence of vectors v_1, ..., v_T in R^n with Euclidean norm at most 1, arriving one at a time, decides random signs x_1, ..., x_T so that for every t <= T, the prefix sum x_1 v_1 + ... + x_t v_t is 10-subgaussian.

This improves over the work of Alweiss, Liu and Sawhney who kept prefix sums O(sqrt(log(nT)))-subgaussian, and gives a O(sqrt(log T)) bound on the prefix discrepancy. Our proof combines a generalization of Banaszczyk's prefix balancing result to trees with a cloning argument to find distributions rather than single colorings. We also show a matching Omega(sqrt(log T)) strategy for an oblivious adversary. Joint work with Janardhan Kulkarni and Thomas Rothvoss to appear in STOC 2024.

Can open decentralized ledgers be economically secure?

Jacob Leshno from Chicago Booth School of Business

March 08, 2024 | Computer Science 105

See abstract

Traditional payment processors are the subject of antitrust concerns and regulations. Open decentralized ledgers (e.g., Bitcoin) provide an alternative. They do not rely on a central authority, avoiding antitrust and monopoly concerns. However, the open nature of these systems gives rise to many challenges, including fundamental questions about their security.

Using a framework that combines economic theory and distributed systems theory, we define economic security for general open decentralized ledgers.

Limitations of incentives in environments with many anonymous participants bring to question the security Bitcoin's Nakamoto protocol obtains from block rewards.

We present an alternative protocol showing that an open decentralized ledger can be economically secure.

(Joint with Rafael Pass and Elaine Shi)

Decoding Codes via Proofs

Madhur Tulsiani from TTIC

March 22, 2024 | Computer Science 105

See abstract

In this talk, we will discuss a general framework for obtaining list decoding algorithms for codes, using relaxations based on the Sum-of-Squares (SoS) hierarchy of semidefinite programs. This framework is an adaptation of the well-known “proofs to algorithms” paradigm to the setting of codes. If the proof of the fact that all pairs of codewords have large distance can be expressed in a proof system corresponding to the SoS hierarchy, then one can use it to obtain a list decoding algorithm for the corresponding code. We will discuss a few examples of this phenomenon, including expander-based LDPC codes, for which no list decoding algorithms were previously known.

Based on joint work with Fernando Granha Jeronimo, Tushant Mittal, and Shashank Srivastava.

Strong spatial mixing for colorings on trees and its algorithmic applications

Kuikui Liu from MIT

March 29, 2024 | Computer Science 105

See abstract

Correlation decay is a fundamental and important property of distributions arising in statistical physics and theoretical computer science. A longstanding conjecture is that the uniform distribution over proper q-colorings on any tree of maximum degree Δ exhibits a strong form of correlation decay whenever q≥Δ+1. It is surprising that such a basic question is still open, but then again it also highlights how much we still have to learn about random colorings. In this talk, I will discuss a near-resolution of this conjecture, as well as its algorithmic implications for sampling colorings on bounded-degree graphs via Glauber dynamics. Based on joint work with Zongchen Chen, Nitya Mani, and Ankur Moitra.

Data-Dependent LSH for the Earth Mover's Distance

Rajesh Jayaram from Google Research

April 05, 2024 | Computer Science 201

See abstract

In this talk, we discuss the first improvement in approximation for nearest neighbor search under the Earth Mover's Distance (EMD) in over a decade. Given two sets of $s$ vectors $A,B$ in high dimensional space $(R^d)$, the EMD between $A$ and $B$ is the minimum cost of a perfect matching between the vectors in $A$ and $B$ where the edge weights are given by the distances in Euclidean space. EMD is a classic metric in computer science (dating back over 100 years to the Hungarian algorithm of Jacobi), and a standard distance between two sets of points. In nearest neighbor search, one has a collection of $n$ such sets $A_1,\ldots,A_n$, which one pre-processes so that given a query set $Q$, one can quickly return a set $A_i$ whose distance (under EMD) is within a $C$-factor of the nearest neighbor to $Q$.

To date, the only algorithm with sublinear $O(n^{\epsilon})$ query time was given by Andoni, Indyk, and Krauthgamer ([AIK, SODA '08]), who designed a (data-independent) locality sensitive hash function (LSH) for EMD with approximation $O(\log^2 s)$. In this work, we improve this approximation to $\tilde{O}(\log s)$ in the same runtime by designing the first data-dependent LSH functions for EMD. The talk will discuss the main techniques behind sublinear algorithms for EMD, including the tree embeddings techniques of [AIK’08] and [Chen, Jayaram, Levi, Waingarten STOC '22], as well as the key insights needed to glue them together into a improved LSH for EMD.

Joint work with Erik Waingarten and Tian Zhang (STOC '24, https://arxiv.org/abs/2403.05041).

Gaussian Polytope Approximators

Anindya De from University of Pennsylvania

April 12, 2024 | Computer Science 105

See abstract

Abstract: We study the approximability of general convex sets in R^n

by intersections of halfspaces, where the approximation quality is

measured with respect to the standard Gaussian distribution and the

complexity of an approximation is the number of halfspaces used. While

a large body of research has considered the approximation of convex

sets by intersections of halfspaces under distance metrics such as the

Lebesgue measure and Hausdorff distance, prior to our work there has

not been a systematic study of convex approximation under the Gaussian

distribution.

We establish a range of upper and lower bounds, both for general

convex sets and for specific natural convex sets that are of

particular interest. Our results demonstrate that the landscape of

approximation is intriguingly different under the Gaussian

distribution versus previously studied distance measures. We establish

our bounds using techniques from many different areas, including

classical results from convex geometry, Cramér-type bounds from

probability theory, and—perhaps surprisingly—a range of topics from

computational complexity, including computational learning theory,

unconditional pseudorandomness, and the study of influences and noise

sensitivity in the analysis of Boolean functions.

Based on joint work with Shivam Nadimpalli and Rocco Servedio. Preprint available here: https://arxiv.org/abs/2311.08575.

Efficiently Finding Planted Cliques in Semirandom Graphs

Pravesh Kothari from Princeton University

April 19, 2024 | Computer Science 105

See abstract

In this talk, I will present a new polynomial time algorithm for recovering planted cliques in the semi-random graph model of Feige and Kilian from 2001. In the Feige-Kilian model, a graph G on vertex set V is chosen by a combination of benign random and adaptive "adversarial" choices and can be thought of as a "robust" counterpart of the well-studied "fully-random" planted clique problem:

1) choose a planted clique on a set of k vertices S,

2) "benign random choice": include each edge between S and V\S independently at random,

3) "adaptive worst-case choice": deleting any set of edges between S and V\S and adding any set of edges with both endpoints in V\S.

The previous best algorithms for this model succeed if the planted clique has a size of at least n^{2/3} in a graph with n vertices. Our algorithms work for planted-clique sizes approaching n^{1/2} -- the information-theoretic threshold in the semi-random model and a conjectured computational threshold even in the easier fully-random model. This result comes close to resolving an open question of Feige (2019) and Steinhardt (2017). Our algorithm is based on high constant degree sum-of-squares relaxation of the clique SDP and the analysis relies on a new connection between finding planted cliques and efficient certificates of upper bounds on clique number of unbalanced bicliques in bipartite random graphs.

Based on joint work with Rares Buhai and David Steurer.

Theoretical Foundations of Real-World Cryptograph

Vinod Vaikuntanathan from MIT

April 26, 2024 | Computer Science 105

See abstract

Block ciphers (aka Pseudorandom Permutations) are the workhorses of modern cryptography. An overwhelming fraction of Internet traffic is secured using the Advanced Encryption Standard (AES), a block cipher; the same goes for encrypted file systems, encrypted messaging, and several other applications we use all the time. Yet, despite the fact that the constructions are natural and elegant, we have woefully little mathematical rationale for their security. In this talk, I will describe a research program, together with a collection of recent results and techniques, aimed at achieving a better mathematical understanding of the security of real-world block ciphers including AES.

This talk is based on joint work with Angelos Pelecanos, Tianren Liu and Stefano Tessaro.

An Exponential Lower Bound for Linear 3-Query Locally Correctable Codes

Peter Manohar from Carnegie Mellon University

May 03, 2024 | Computer Science 105

See abstract

A locally correctable code (LCC) is an error correcting code where one can recover any symbol of the original codeword by querying only a small number of randomly chosen symbols from the received corrupted codeword.

In this talk, we present a proof that the blocklength n of a linear 3-query LCC C: {0,1}^k -> {0,1}^n must be at least \exp(k^{1/8}), i.e., it grows exponentially with k. This improves on the prior best lower bound of k^3 shown in our recent work, which holds even for the weaker setting of 3-query locally decodable codes (LDCs), and comes close to matching the best-known construction of 3-query LCCs based on Reed—Muller codes, which achieve a blocklength of n = \exp(\sqrt{k}). Because there is a 3-query LDC with a strictly subexponential blocklength, as a corollary we obtain the first separation between q-query LCCs and LDCs for any constant q.

We subsequently show that any “design 3-LCC”, an LCC with a parity check matrix that is a block design, must have blocklength at least 2^{(1-o(1)) sqrt{k}}. As Reed—Muller codes are design 3-LCCs with blocklength n = 2^{2 sqrt{2k}}, this is tight up to a factor of 2 sqrt{2} in the exponent, and as a corollary resolves the Hamada conjecture for 4-designs up to this constant factor. For nonlinear LCCs, we give a generalization of our approach that proves a superpolynomial lower bound of k^{log k}, improving on the prior best lower bound of k^3.

Fall 2023 Schedule

Sparsifying Generalized Linear Models

Yang Liu from Institute for Advanced Study

September 15, 2023 | Computer Science 105

See abstract

We study the sparsification of objectives $f(x) = \sum_{i=1}^m f_i()$, for vectors $a_1, ..., a_m, x \in R^n$. Informally, we show that if all the functions $f_i$ are nearly symmetric, and grow subquadratically, then $f$ has be sparsified down to $\tilde{O}(n * \epsilon^{-2})$ terms with $(1+\epsilon)$ multiplicative error. Our result generalizes the classical case of $l_p$ norm sparsification, where the functions $f_i(x) = |x|^p$ for $0 < p \le 2$. As an application, we obtain nearly-optimal algorithms to solve $l_p$-norm regression to high accuracy for $1 < p \le 2$.

Cryptographic Non-Locality

Alex Lombardi from Princeton University

September 22, 2023 | Computer Science 105

See abstract

Non-local games are an important concept in quantum information theory that allow two or more provers to demonstrate, to a classical verifier, the ability to prepare certain kinds of quantum systems. Two important applications include (multi-prover) protocols for (1) the demonstration of quantum advantage and (2) verifiable BQP computation. Key to these protocols is the physical assumption that the "far apart" provers cannot communicate.

In this talk, we will discuss how to emulate the effects of non-local games in an interactive, single-prover protocol, replacing the physical non-locality assumption with a cryptographic hardness assumption. These ideas have led to new interactive protocols for quantum advantage as well as for BQP verification.

This talk is based on joint works with Yael Kalai, Vinod Vaikuntanathan, Lisa Yang, and Fermi Ma.

2-dimensional spectral expansion of random geometric graphs

Siqi Liu from Institute for Advanced Study

September 29, 2023 | Computer Science 105

See abstract

A graph is said to be a 1-dimensional expander if the second eigenvalue of its adjacency matrix is bounded away from 1, or if it has no sparse vertex cuts. There are several natural ways to generalize the notion of expansion to hypergraphs/simplicial complexes, but one such way is 2-dimensional spectral expansion, in which the local expansion of vertex links/neighborhoods witnesses global expansion. While 1-dimensional expansion is known to be achieved by, e.g., random regular graphs, very few examples of sparse 2-dimensional expanders are known, and at present all are algebraic. It is an open question whether sparse 2-dimensional expanders are natural and abundant.

In this talk, we give some evidence towards abundance: we show that the set of triangles in a random geometric graph on a high-dimensional sphere yields an expanding simplicial complex of arbitrarily small polynomial degree.

Based on joint work with Sidhanth Mohanty, Tselil Schramm, and Elizabeth Yang.

Thin Trees for Laminar Families

Nathan Klein from Institute for Advanced Study

October 06, 2023 | Computer Science 105

See abstract

In the strong form of the thin tree conjecture, formulated by Goddyn in 2004, we are given a k-edge-connected graph and wish to find a tree containing at most an O(1/k) fraction of the edges across every cut. This beautiful conjecture, if true, has implications in a number of areas such as graph theory and approximation algorithms. A natural relaxation of this problem is to find a tree with at most O(1/k) edges across a fixed collection of cuts, instead of all cuts. Via a simple reduction technique followed by iterative relaxation, we show that the thin tree conjecture is true for laminar families of cuts. This resolves an open question of Olver and Zenklusen from 2013, a work that showed thin trees exist for chains of cuts. As a byproduct, we also obtain the first constant factor approximation algorithm for the laminar-constrained spanning tree problem. Based on joint work with Neil Olver.

Theoretical foundations for diffusion models

Sitan Chen from Harvard University

October 27, 2023 | Computer Science 105

See abstract

I will describe recent progress on providing rigorous convergence guarantees for score-based generative models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which constitute the backbone of large-scale real-world generative models such as DALL⋅E. In the first part of the talk, I will show that such SGMs can efficiently sample from essentially any realistic data distribution, even ones which are highly non-log-concave. In the second part of the talk, I will show how to extend these guarantees to deterministic samplers based on discretizing the so-called probability flow ODE, which ultimately leads to better dependence on the dimension. All of these results assume access to an oracle for score estimation; time permitting, at the end I will briefly touch upon how to provably implement this oracle for interesting classes of distributions like Gaussian mixtures.

Streaming Zero-Knowledge Proofs

Marcel Dall'Agnol from Princeton University

November 03, 2023 | Computer Science 105

See abstract

We define the notion of zero-knowledge in the streaming setting and construct zero-knowledge SIPs for the two main building blocks in the streaming interactive proofs literature: the sumcheck and polynomial evaluation protocols. To the best of our knowledge all known streaming interactive proofs are based on either of these tools, and indeed, this allows us to obtain zero-knowledge SIPs for central and well-studied streaming problems, such as index, frequency moments, and inner product. Our protocols are efficient both in terms of time and space, as well as communication: the space complexity is polylog(n) and, after a non-interactive setup that uses a random string of near-linear length, the remaining parameters are sub-polynomial.

En route, we develop a toolkit for designing zero knowledge data stream protocols that may be of independent interest, consisting of an algebraic streaming commitment protocol and a temporal commitment protocol. The analysis of our protocols relies on delicate algebraic and information-theoretic arguments and reductions from average-case communication complexity.

Samplers from high dimensional expanders and agreement testing

Yotam Dikstein from Institute for Advanced Study

November 10, 2023 | Computer Science 105

See abstract

Sampler graphs are sparse graphs that "derandomize" Chernoff’s inequality. These graphs are equivalent to randomness extractors and are widely used in other areas of TCS. In this talk we will see that high dimensional expanders (HDXs) give rise to nearly optimal samplers. These are hypergraph analogues of expander graphs. Then we will see some applications such as reverse hypercontractivity inequalities on HDXs and new agreement theorems. This is a joint work with Max Hopkins (UCSD).

Testing Convex Truncation

Rocco Servedio from Columbia University

November 17, 2023 | Computer Science 105

See abstract

We study the basic statistical problem of testing whether normally distributed -dimensional data has been truncated, i.e. altered by only retaining points that lie in some unknown truncation set . As our main algorithmic results,

We give a computationally efficient -sample algorithm that can distinguish the standard normal distribution from conditioned on an unknown and arbitrary convex set .

We give a different computationally efficient -sample algorithm that can distinguish from conditioned on an unknown and arbitrary mixture of symmetric convex sets.

These results stand in sharp contrast with known results for learning or testing convex bodies with respect to the normal distribution or learning convex-truncated normal distributions, where state-of-the-art algorithms require essentially samples. An easy argument shows that no finite number of samples suffices to distinguish from an unknown and arbitrary mixture of general (not necessarily symmetric) convex sets, so no common generalization of results (1) and (2) above is possible.

We also prove lower bounds on the sample complexity of distinguishing algorithms (computationally efficient or otherwise) that match our algorithms up to constant factors.

Joint with Anindya De and Shivam Nadimpalli.

Communication Complexity of Multiparty Set Disjointness Under Product Distributions

Rotem Oshman from Tel-Aviv University

December 01, 2023 | Computer Science 105

See abstract

In the multiparty set disjointness problem, we have k players, with private inputs X1,...,Xk \subseteq [n]. The players’ goal is to check whether their sets intersect, or sometimes to compute the full intersection. It is known that in the shared blackboard model of communication, set disjointness requires Omega(n log k + k) bits of communication, and in the coordinator model, it requires Omega(nk) bits. However, these two lower bounds require that the players’ inputs can be highly correlated.

We study the communication complexity of multiparty set disjointness under product distributions, and ask whether the problem becomes significantly easier, as it is known to become in the two party case. Our main result is a nearly-tight bound of tilde-Theta(n^{1-1/k}) bits for both the shared blackboard model and the coordinator model. This shows that in the shared blackboard model, as the number of players grows, having independent inputs helps less and less; but

in the coordinator model, when k is very large, having independent inputs makes the problem much easier. Both our upper and our lower bounds use new ideas, as the original techniques developed for the two-party case do not scale to more than two players.

A new Information Complexity measure for multi-pass streaming algorithms

Sumegha Garg from Rutgers University

December 08, 2023 | Computer Science 105

See abstract

In this talk, we will introduce a new notion of information complexity for multi-pass streaming problems, and use it to prove multi-pass lower bounds for the coin problem. In the coin problem, one sees a stream of 𝑛 i.i.d. uniform bits and one would like to compute the majority with constant advantage. We show that any constant pass algorithm must use Ω(log 𝑛) bits of memory, significantly extending an earlier Ω(log 𝑛) bit lower bound for single-pass algorithms of [BGW'20]. In the paper, we also use this information complexity notion to prove tight space complexity for the needle problem, which in turn implies tight bounds for the problem of approximating higher frequency moments in a data stream.

Joint work with Mark Braverman, Qian Li, Shuo Wang, David P. Woodruff and Jiapeng Zhang.

Tight Bounds for Active lp Regression

David Woodruff from Carnegie Mellon University

December 15, 2023 | Computer Science 105

See abstract

In active learning, one seeks to learn the parameters of a model given examples, while minimizing the number of labels read. We give tight bounds up to logarithmic factors on the number of labels that need to be read for active l_p regression for every value of p, as a function of the number of variables and accuracy desired. Previously tight bounds were known only for p = 1 or p =2. We extend these ideas to a large class of loss functions with certain nice conditions, obtaining the first non-trivial bounds for such functions.

Based on joint works, one with Musco, Musco, and Yasuda (FOCS '22) and another with Yasuda (STOC '23).

Spring 2023 Schedule

Faster Walsh-Hadamard Transform from Matrix Non-Rigidity

Josh Alman from IAS

February 10, 2023 | Computer Science 105

See abstract

The Walsh-Hadamard transform is a simple recursively-defined linear transformation with many applications throughout algorithm design and complexity theory. The folklore "fast Walsh-Hadamard transform" algorithm computes it using only N log N arithmetic operations, which is believed to be optimal up to constant factors. In this talk, I'll give two improved constructions:

- A new algorithm which uses $(23/24) N log N + O(N)$ arithmetic operations. This is, to my knowledge, the first improvement of the leading constant. (Based on joint work with Kevin Rao.)

- A new depth-2 linear circuit of size $O(N^{1.45})$, improving on size $O(N^{1.5})$ which one gets from the fast Walsh-Hadamard transform approach. (Based on joint work with Ashwin Padaki and Yunfeng Guan.)

A key idea behind both constructions is to take advantage of the matrix non-rigidty of the Walsh-Hadamard transform. A matrix is called "rigid" if it's impossible to change a "small" number of its entries to make it have "low" rank. L. Valiant showed that if a matrix is rigid, then this implies lower bounds for computing its linear transformation. However, a recent line of work has shown that many matrices of interest, including the Walsh-Hadamard transform, are actually not rigid. Although a formal converse to Valiant's result is not known, we'll make use of non-rigidity in our new constructions.

A Subpolynomial Approximation Algorithm for Graph Crossing Number in Low-Degree Graphs

Zihan Tan from Rutgers / DIMACS

February 17, 2023 | Computer Science 105

See abstract

Graph Crossing Number is a fundamental and extensively studied problem with wide ranging applications. In this problem, the goal is to draw an input graph G in the plane so as to minimize the number of crossings between the images of its edges. The problem is notoriously difficult, and despite extensive work, non-trivial approximation algorithms are only known for bounded-degree graphs. Even for this special case, the best current algorithm achieves a O(n)-approximation, while the best current negative results do not rule out constant-factor approximation. All current approximation algorithms for the problem build on the same paradigm, which is also used in practice: compute a set E’of edges (called a planarizing set) such that G \ E’ is planar; compute a planar drawing of G \ E’; then add the drawings of the edges of E to the resulting drawing. Unfortunately, there are examples of graphs G, in which any implementation of this method must incur Ω(OPT2) crossings, where OPT is the value of the optimal solution. This barrier seems to doom the only currently known approach to designing approximation algorithms for the problem, and to prevent it from yielding a better than O(n)-approximation.

We propose a new paradigm that allows us to overcome this barrier. Using the new paradigm, we reduce the Crossing Number problem to Crossing Number with Rotation System – a variant of the Crossing Number problem, in which the ordering of the edges incident to every vertex is fixed as part of input. We then show a randomized algorithm for this new problem, that allows us to obtain a subpolynomial approximation for Graph Crossing Number on low-degree graphs.

This talk is based on joint work with Julia Chuzhoy and Sepideh Mahabadi.

Almost linear time algorithms for all flows

Sushant Sachdeva from University of Toronto

February 24, 2023 | Computer Science 105

See abstract

Over the last decade or so, combining methods from continuous optimization and analysis with graph theoretic-insights has led to a revolution in algorithms for classic problems on graphs such as maximum flow.

In this talk, I will present some of our key ideas behind our recent work that gives almost-linear time algorithms for solving all convex flow problems on graphs. This implies almost linear time algorithms for max-flow, minimum-cost flow, bipartite matching, optimal transport, matrix scaling, isotonic regression, and several other well-studied problems.

Our algorithm is designed using a new Interior Point Method (IPM) that builds the flow as a sequence of almost-linear number of approximate undirected minimum-ratio cycles, each of which is computed and processed very efficiently using a new dynamic data structure.

Joint work with Li Chen, Rasmus Kyng, Yang Liu, Richard Peng, and Maximilian Probst Gutenberg.

Robust and Equitable Uncertainty Estimation

Aaron Roth from University of Pennsylvania

March 03, 2023 | Friend Center 004

See abstract

Machine learning provides us with an amazing set of tools to make predictions, but how much should we trust particular predictions? To answer this, we need a way of estimating the confidence we should have in particular predictions of black-box models. Standard tools for doing this give guarantees that are averages over predictions. For instance, in a medical application, such tools might paper over poor performance on one medically relevant demographic group if it is made up for by higher performance on another group. Standard methods also depend on the data distribution being static --- in other words, the future should be like the past.

In this talk, I will describe new techniques to address both these problems: a way to produce prediction sets for arbitrary black-box prediction methods that have correct empirical coverage even when the data distribution might change in arbitrary, unanticipated ways and such that we have correct coverage even when we zoom in to focus on demographic groups that can be arbitrary and intersecting. When we just want correct group-wise coverage and are willing to assume that the future will look like the past, our algorithms are especially simple. Our approach is based on a quantile analogue of (mean) multi-calibration, a recent solution concept arising from the algorithmic fairness literature. We close by asking which distributional properties (beyond means and quantiles) we can achieve multicalibration with respect to, and provide a complete characterization via a connection to property elicitation.

This talk is based on two papers, that are joint work with Osbert Bastani, Varun Gupta, Chris Jung, Georgy Noarov, and Ramya Ramalingam.

Last-Iterate Convergence in Min-Max Optimization: SOS to the Rescue

Yang Cai from Yale University

March 10, 2023 | Computer Science 105

See abstract

Min-max optimization is a classical problem in optimization and game theory and has recently played a central role in modern machine learning (ML) applications. These applications range from generative adversarial networks, adversarial examples, robust optimization, to multi-agent reinforcement learning. These ML applications present novel challenges to min-max optimization, necessitating that algorithms (i) demonstrate last-iterate convergence, in contrast to the more traditional average-iterate convergence, and (ii) be robust in nonconvex-nonconcave environments.

In this talk, we first focus on the last-iterate convergence rates of two classical and widely-used algorithms for solving convex-concave min-max optimization – the extragradient (EG) method by Korpelevich (1976) and the optimistic gradient (OG) method by Popov (1980). Despite the considerable attention these algorithms have received from both the optimization and machine learning communities over the years, the last-iterate convergence rates for both algorithms remained open. We resolve this open problem by showing both EG and OG exhibit a tight O(1/\sqrt{T}) last-iterate convergence rate under arbitrary convex constraints. In the second part of the talk, we discuss some recent progress on designing optimal first-order methods for structured nonconvex-nonconcave min-max optimization. Our results are obtained via a new sum-of-squares programming based approach, which may be useful in analyzing other iterative methods. The talk is based on joint work with Argyris Oikonomou and Weiqiang Zheng.

1) On the Structural Complexities of Matching Mechanisms; 2) A Computational Separation Between Quantum No-cloning and No-teleportation

Clay Thomas and Barak Nehoran from Princeton University

March 24, 2023 | Computer Science 105

See abstract

Talk 1: School districts in many cities employ school choice mechanisms, i.e., algorithms matching students to schools on the basis of students’ reported preferences, and the priorities students have for attending each school. Since simplicity and ease of participation are paramount to these mechanisms, an exact understanding of their structural complexities—e.g., how the matching can be described, and how one student’s input can affect the outputs—should be a first-order concern.

In this talk, I will give one example of this type of complexity: that of describing the outcome matching. Prior work has observed that for one popular real-world mechanism (DA), the outcome can be described in terms of a single "priority cutoff" per school, even when there are many more students than there are schools. We prove that, for another popular mechanism (TTC), the outcome cannot be described using less than a bit for each pair of schools. This result clarifies and formalizes one way in which, as often informally argued in both the literature and in practice, TTC may be more complex than DA.

Depending on time, I may also mention connections to verification (in the sense of nondeterministic communication), and how we formalize the question "how sensitive is the mechanism to one student’s input" in this setting.

Based on joint work with Yannai Gonczarowski.

Talk 2: Two of the fundamental no-go theorems of quantum information are the no-cloning theorem (that it is impossible to make copies of general quantum states) and the no-teleportation theorem (the prohibition on sending quantum states over classical channels without pre-shared entanglement). They are known to be equivalent, in the sense that a collection of quantum states is clonable if and only if it is teleportable.

Our main result suggests that this is not the case when computational efficiency is considered. We give a collection of quantum states and oracles relative to which these states are efficiently clonable but not efficiently teleportable. Given that the opposite scenario is impossible (states that can be teleported can always trivially be cloned), this gives the most complete oracle separation possible between these two important no-go properties.

In doing so, we introduce a related quantum no-go property, reconstructibility, which refers to the ability to construct a quantum state from a uniquely identifying classical description. We show the stronger result of a collection of quantum states that are efficiently clonable but not efficiently reconstructible. This novel no-go property only exists in relation to computational efficiency, as it is trivial for unbounded computation. It thus opens up the possibility of further computational no-go properties that have not yet been studied because they do not exist outside the computational context.

Joint work with Mark Zhandry.

New algorithms and lower bounds for decision tree learning

Li-Yang Tan from Stanford University

April 07, 2023 | Computer Science 105

See abstract

Constructing decision tree representations of data is a basic problem of computer science, sitting right at the boundary of our understanding of efficient computation. I will discuss two recent results about this problem, a new algorithm and a new lower bound:

1. An almost-polynomial time membership query algorithm for the uniform-distribution setting. The previous fastest algorithm ran in quasipolynomial time, a consequence of a classic algorithm of Ehrenfeucht and Haussler from 1989. (https://dl.acm.org/doi/abs/10.1145/3561047)

2. A conditional quasipolynomial lower bound for the distribution-free setting, matching the upper bound of Ehrenfeucht and Haussler. This builds on and improves upon prior work by Alekhnovich, Braverman, Feldman, Klivans, and Pitassi from 2004. (https://arxiv.org/abs/2210.06375)

Joint work with Guy Blanc, Caleb Koch, Jane Lange, Mingda Qiao, and Carmen Strassle.

The Kikuchi Matrix Method

Pravesh Kothari from Carnegie Mellon University

April 14, 2023 | Friend Center 004

See abstract

In this talk, I will present a new method that reduces understanding an appropriate notion of girth in hypergraphs to unraveling the spectrum of a "Kikuchi" matrix associated with the hypergraph.

I will discuss three applications of this technique:

1) Finding a refutation algorithm for smoothed instances of constraint satisfaction problems (obtained by randomly perturbing the literal patterns in a worst-case instance with a small probability) that matches the best running-time vs constraint-density trade-offs for the significantly special and easier case of random CSPs,

2) Confirming Feige's 2008 Conjecture that postulated an extremal girth vs density trade-off (a.k.a. Moore bounds) for k-uniform hypergraphs that generalizes the Alon-Hoory-Linial Moore bound for graphs,

3) Proving a cubic lower bound on the block length of 3 query locally decodable codes improving on the prior best quadratic lower bound from the early 2000s.

Based on joint works with Omar Alrabiyah (Berkeley), Tim Hsieh (CMU), Peter Manohar (CMU), Sidhanth Mohanty (Berkeley), and Venkat Guruswami (Berkeley).

On Regularity Lemma and Barriers in Streaming and Dynamic Matching

Sanjeev Khanna from University of Pennsylvania

April 21, 2023 | Computer Science 105

See abstract

We present a new approach for finding matchings in dense graphs by building on Szemeredi's celebrated Regularity Lemma. This allows us to obtain non-trivial albeit (very) slight improvements over long-standing bounds for computing near-optimal matchings in streaming and dynamic graphs. The key idea behind our improved bounds is to maintain a matching cover which is a “sparse” subgraph that approximately preserves matchings in each induced subgraph of the input graph. However, given the use of regularity lemma, the improvement obtained by our algorithms over trivial bounds is only by some function of $(log* n)$. Nevertheless, these results show that the “right” answer to these problems is not what is dictated by the previous bounds.

This is joint work with Sepehr Assadi, Soheil Behnezhad, and Huan Li.

Unitary Complexity and the Uhlmann Transformation Problem

Henry Yuen from Columbia University

April 28, 2023 | Computer Science 105

See abstract

Three decades of quantum complexity theory have largely focused on quantum algorithms for classical tasks -- those with classical inputs and outputs. However there has been increasing interest in studying the computational difficulty of tasks with quantum inputs and/or outputs. Examples include preparing ground states of Hamiltonians or breaking quantum cryptographic protocols. Many techniques and approaches from traditional complexity theory are inadequate for reasoning about such inherently quantum tasks, suggesting a need for a "fully quantum" complexity theory.

In this talk I discuss some facets and themes of a "fully quantum" complexity theory. In particular I show that seemingly-unrelated quantum tasks such as achieving capacity over a noisy quantum channel, breaking a quantum bit-commitment scheme, and decoding the radiation of an evaporating black hole are all polynomial-time equivalent to something called the Uhlmann Transformation Problem. I also show that the Uhlmann Transformation Problem is complete for a "unitary complexity" analogue of SZK. Along the way I describe our framework for classifying the complexity of unitary synthesis problems.

Joint work with John Bostanci, Yuval Efron, Tony Metger, Alexander Poremba, and Luowen Qian.

Estimating the Longest Increasing/Common Subsequences

Alexandr Andoni from Columbia University

May 05, 2023 | Computer Science 105

See abstract

The decades-old Patience Sorting algorithm computes the Longest Increasing Subsequence (LIS) of a sequence of length n in O(n log n) time. Yet, it is unknown how well one can estimate the (length of the) LIS in *sublinear* time, say, when it is >sqrt{n}?

We present an algorithm that, given a sequence with LIS > n/k, approximates the LIS up to a factor of k^c in k*n^c time for any arbitrarily small c>0. Our run-time essentially matches the trivial sample complexity lower bound of ~k. All the past work had polynomial factors in both approximation and extra time slowdown.

An important corollary is the first algorithm to estimate the Longest *Common* Subsequence in O(n) time up to a sub-polynomial, n^o(1), factor.

Our algorithm is based on a van Emde Boas style recursion. To accomplish such a recursion for a sublinear algorithm, we develop a generic sampling scheme that allows efficient composition of recursive sampling algorithms.

Joint work with Negev Shekel Nosatzki, Sandip Sinha, and Cliff Stein.

Fall 2022 Schedule

Random restrictions on boolean functions with small influences

Pei Wu from IAS

September 16, 2022 | Friend Center Convocation Room | Talk video

See abstract

In the talk, we discuss the probability of Boolean functions with small max influence to become constant under random restrictions. Let f be a Boolean function such that the variance of f is $\Omega(1)$ and all its individual influences are bounded by $\tau$. We show that when restricting all but a $\tilde{\Omega}((\log1/\tau)^{-1})$ fraction of the coordinates, the restricted function remains nonconstant with overwhelming probability. This bound is essentially optimal, as witnessed by the tribes function $AND_{n/C\log n} \circ OR_{C\log n}$.

We extend it to an anti-concentration result, showing that the restricted function has nontrivial variance with probability $1-o(1)$. This gives a sharp version of the "it ain't over till it's over" theorem due to Mossel, O'Donnell, and Oleszkiewicz.

Negative-Weight Single-Source Shortest Paths in Near-linear Time

Aaron Bernstein from Rutgers University

September 23, 2022 | Friend Center 004

See abstract

We present a randomized algorithm that computes single-source shortest paths (SSSP) in $O(m\log^8(n)\log W)$ time when edge weights are integral, can be negative, and are $\geq -W$. This essentially resolves the classic negative-weight SSSP problem. The previous bounds are $\~{O}((m+n^{1.5})\log W)$ and $m^{4/3+o(1)}\log W$. Near-linear time algorithms were known previously only for the special case of planar directed graphs. Also, independently of our work, the recent breakthrough on min-cost flow [Chen, Kyng, Liu, Peng, Probst-Gutenberg and Sachdeva] implies an algorithm for negative-weight SSSP with running time $m^{1+o(1)}$.

In contrast to all recent developments that rely on sophisticated continuous optimization methods and dynamic algorithms, our algorithm is simple: it requires only a simple graph decomposition and elementary combinatorial tools. In fact, ours is the first combinatorial algorithm for negative-weight SSSP to break through the classic $O(m\sqrt{n}\log nW)$ bound from over three decades ago [Gabow and Tarjan SICOMP'89].

Kolmogorov Complexity Characterizes Statistical Zero Knowledge

Eric Allender from Rutgers University

September 30, 2022 | Friend Center 006

See abstract

We show that a decidable promise problem has a non-interactive statistical zero-knowledge proof system if and only if it is randomly reducible to a promise problem for Kolmogorov-random strings, with a superlogarithmic additive approximation term. This extends recent work by Saks and Santhanam (CCC 2022). We build on this to give new characterizations of Statistical Zero Knowledge (SZK), as well as the related classes NISZK_L and SZK_L.

This is joint work with Shuichi Hirahara and Harsha Tirumala. A paper is available as ECCC TR22-127.

New Directions in Derandomization: Non-Black-Box Techniques, Superfast Algorithms

Roei Tell from IAS

October 07, 2022 | Friend Center 004

See abstract

This talk will present two new directions in the study of derandomization, which are related to each other. The talk is based on a sequence of recent works with Lijie Chen (some of the works are upcoming).

The first direction is avoiding classical PRGs in favor of *non-black-box techniques*, which extract pseudorandomness for a probabilistic machine from the code of that specific machine. This approach allows constructing derandomization algorithms from weaker hypotheses (compared to classical results), and also constructing very fast derandomization algorithms that cannot be obtained using classical PRGs. The second direction is the fine-grained study of *superfast derandomization*, asking what is the smallest possible overhead that we need to pay when eliminating randomness (and whether we can get away with paying nothing at all).

As examples, we will mention four results:

1. Connecting the BPP = P conjecture to new types of lower bounds.

2. "Free lunch" derandomization with essentially no time overhead.

3. Superfast derandomization of interactive proof systems.

4. Average-case derandomization from weak hardness assumptions. (Joint work with Ron Rothblum.)

Almost Ramanujan Expanders from Arbitrary Expanders via Operator Amplification

Fernando Granha Jeronimo from IAS

October 14, 2022 | Friend Center 004

See abstract

Expander graphs are fundamental objects in theoretical computer science and mathematics. They have numerous applications in diverse fields such as algorithm design, complexity theory, coding theory, pseudorandomness, group theory, etc.

In this talk, we will describe an efficient algorithm that transforms any bounded degree expander graph into another that achieves almost optimal (namely, near-quadratic, $d \leq 1/\lambda^{2+o(1)}$) trade-off between (any desired) spectral expansion $\lambda$ and degree $d$. The optimal quadratic trade-off is known as the Ramanujan bound, so our construction gives almost Ramanujan expanders from arbitrary expanders.

This transformation preserves structural properties of the original graph, and thus has many consequences. Applied to Cayley graphs, our transformation shows that any expanding finite group has almost Ramanujan expanding generators. Similarly, one can obtain almost optimal explicit constructions of quantum expanders, dimension expanders, monotone expanders, etc., from existing (suboptimal) constructions of such objects.

Our results generalize Ta-Shma's technique in his breakthrough paper [STOC 2017] used to obtain explicit almost optimal binary codes. Specifically, our spectral amplification extends Ta-Shma's analysis of bias amplification from scalars to matrices of arbitrary dimension in a very natural way.

Joint work with: Tushant Mittal, Sourya Roy and Avi Wigderson

Convex Spherical Cubes

Oded Regev from NYU

October 28, 2022 | Friend Center 004

See abstract

We show that there exists a convex body that tiles $\mathbb{R}^n$ under translations by $\mathbb{Z}^n$ whose surface area is $\tilde{O}(\sqrt{n})$. No asymptotic improvement over the trivial $O(n)$ (achieved by the cube $[0,1]^n$) was previously known. A non-convex body achieving surface area $O(\sqrt{n})$ was constructed by Kindler, O’Donnell, Rao, and Wigderson [FOCS 2008].

For some fun background on this question, see Kindler et al.'s CACM article: https://cacm.acm.org/magazines/2012/10/155546-spherical-cubes/fulltext and Klarreich's broader review: https://www.jstor.org/stable/27857995.

Joint work with Assaf Naor.

Maintaining Privacy of the Minority without Ignoring It: Differentially Private Oversampling for Imbalanced Learning

Rachel Cummings from Columbia University

November 04, 2022 | Friend Center 004

See abstract

The problem of learning from imbalanced datasets, where the label classes are not equally represented, arises often in practice. A widely used method to combat this problem is preprocessing techniques to amplify the minority class, e.g., by reusing or resampling the existing minority instances. However, when the training data are sensitive and privacy is a concern, this will also amplify privacy leakage from members of the minority class. Therefore this oversampling pre-processing step must be designed with privacy in mind. In this work, we present an algorithm for differentially private synthetic minority oversampling technique (DP-SMOTE) that guarantees formal privacy for the minority class. We show empirically that there is no statistically significant loss in performance relative to existing non-private SMOTE tools. We also show that even when non-private SMOTE is used as a part of a DP learning pipeline as a pre-processing step before downstream private classification, the upstream non-private oversampling increases sensitivity exponentially in the dimensionality of the data, which harms classification accuracy at a corresponding rate. The algorithmic structure also yields insights about the implications for pre-processing before the use of DP algorithms, which may be of independent interest.

(joint work with Yuliia Lut, Eitan Turok, and Marco Avella Medina)

Algorithmic Applications of Hypergraph and Partition Containers

Or Zamir from IAS and Princeton University

November 11, 2022 | Friend Center 004

See abstract

The relatively new Hypergraph Container Method (Saxton and Thomason 2015, Balogh, Morris and Samotij 2015), generalizing an earlier version for graphs (Sapozhenko 2001), has been used extensively in recent years in extremal combinatroics. In essence, it shows that independent sets in regular-enough uniform hypergraphs must be clustered in some fashion. We show how to use this method algorithmically to convert algorithms for arbitrary inputs into faster algorithms for almost-regular input instances. Informally, an almost-regular input is an input in which the maximum degree is larger than the average degree by at most a constant factor. This vastly generalizes several families of inputs for which we commonly have improved algorithms.

An important component of our work is the generalization of (hyper-)graph containers to Partition Containers. While standard containers apply to a single independent set, partition containers apply to several independent sets at the same time.

Our first main application is a resolution of a major open problem about Graph Coloring algorithms, for almost-regular graphs. The chromatic number of a graph can be computed in $2^n$. Better algorithms are known only for $k<7$. We solve $k$-coloring in $(2-\epsilon)^n$ time for graphs in which the maximum degree is at most $C$ times the average degree, where $\epsilon> 0$ for any $k, C$. This includes, for example, regular graphs of any degree (with $C = 1$), and graphs with at least $\epsilon n^2$ edges for any fixed constant $\epsilon$ (with $C = 1 / 2 \epsilon$).

Our second main application is the first improved $k$-SAT algorithm for dense formulas. The celebrated sparsification lemma (Impagliazzo, Paturi and Zane 2001) implies that solving $k$-SAT can be reduced to solving $k$-SAT on sparse formulas. We show a complementing statement; For every $k, b$ there exists a $C$ such that if $k$-SAT can be solved in $b^n$ time, then $k$-SAT on formulas with at least $C \cdot n$ clauses that are well-spread can be solved in $(b-\epsilon)^n$ time.

We give additional applications including faster algorithms for unweighted and weighted Maximum Independent Set in almost-regular graphs.

Linear Probing Revisited: Tombstones Mark the Demise of Primary Clustering

William Kuszmaul from MIT

November 18, 2022 | Friend Center 004

See abstract

The linear-probing hash table is one of the oldest and most widely used data structures in computer science. However, linear probing also famously comes with a major drawback: as soon as the hash table reaches a high memory utilization, elements within the hash table begin to cluster together, causing insertions to become slow. This phenomenon, now known as "primary clustering", was first captured by Donald Knuth in 1963; at a load factor of $1 - 1/x$, the expected time per insertion becomes $\Theta(x^2)$, rather than the more desirable $\Theta(x)$.

We show that there is more to the story than the classic analysis would seem to suggest. It turns out that small design decisions in how deletions are implemented have dramatic effects on the asymptotic performance of insertions. If these design decisions are made correctly, then even a hash table that is continuously at a load factor $1 - \Theta(1/x)$ can achieve average insertion time $\tilde{O}(x)$. A key insight is that the tombstones left behind by deletions cause a surprisingly strong "anti-clustering" effect, and that when insertions and deletions are one-for-one, the anti-clustering effects of deletions actually overpower the clustering effects of insertions.

We also present a new variant of linear probing, which we call "graveyard hashing", that completely eliminates primary clustering on any sequence of operations. If, when an operation is performed, the current load factor is $1 - 1/x$ for some $x$, then the expected cost of the operation is $O(x)$.

Based on joint work with Michael A. Bender and Bradley C. Kuszmaul. FOCS '21. Available at https://arxiv.org/abs/2107.01250.

1) On the Hardness of Dominant Strategy Mechanism Design; 2) Memory-Sample Lower Bounds for Learning with Classical-Quantum Hybrid Memory

Shiri Ron and Wei Zhan from Weitzmann Institute and Princeton University

December 02, 2022 | Friend 004

See abstract

Talk 1: We study the communication complexity of dominant strategy implementations of combinatorial auctions. We start with two domains that are generally considered “easy”: multi-unit auctions with decreasing marginal values and combinatorial auctions with gross substitutes valuations. For both domains we have fast algorithms that find the welfare-maximizing allocation with communication complexity that is poly-logarithmic in the input size. This immediately implies that welfare maximization can be achieved in an ex-post equilibrium with no significant communication cost, by using VCG payments. In contrast, we show that in both domains the communication complexity of any dominant strategy implementation that achieves the optimal welfare is polynomial in the input size. We then study the approximation ratios achievable by dominant strategy mechanisms. For combinatorial auctions with general valuations, we show that no dominant-strategy mechanism achieves an approximation ratio of $m^{1−\eps}$, where m is the number of items. In contrast, a randomized dominant strategy mechanism that achieves an O(√m) approximation. This proves the first gap between computationally efficient deterministic dominant strategy mechanisms and randomized ones. Joint work with Shahar Dobzinski and Jan Vondrak.

Talk 2: In this talk, I will present our study on classical learning tasks in the presence of quantum memory. We prove that any quantum algorithm with both, classical memory and quantum memory, for parity learning on n bits, requires either Ω(n^2) bits of classical memory or $\Omega(n)$ bits of quantum memory or an exponential number of samples. The work by Raz showed that any classical algorithm requires either Ω(n^2) bits of memory or an exponential number of samples, and our results refute the possibility that a small amount of quantum memory significantly reduces the size of classical memory needed for efficient learning. I will briefly review the techniques used in the proof for classical memory-sample lower bound, discuss what prevents us from directly applying these techniques to the quantum case, and give a sketch of the actual proof for our results. This is a joint work with Qipeng Liu and Ran Raz.

Optimal Weak to Strong Learning

Kasper Green Larsen from Aarhus University

December 09, 2022 | CS 105

See abstract

The classic algorithm AdaBoost allows to convert a weak learner, that is an algorithm that produces a hypothesis which is slightly better than chance, into a strong learner, achieving arbitrarily high accuracy when given enough training data. We present a new algorithm that constructs a strong learner from a weak learner but uses less training data than AdaBoost and all other weak to strong learners to achieve the same generalization bounds. A sample complexity lower bound shows that our new algorithm uses the minimum possible amount of training data and is thus optimal. Hence, this work settles the sample complexity of the classic problem of constructing a strong learner from a weak learner.

This work was accepted at NeurIPS’22 and is joint work with Martin Ritzert from Aarhus University

Pseudorandom rotations: explicit approximate designs for some Lie groups

Pedro Paredes from Princeton University

December 16, 2022 | CS 105

See abstract

Since the introduction of randomness as an algorithmic tool, researchers have been trying to study how to reduce the amount of randomness that these probabilistic algorithms need. This is achieved through lots of different ways, but a common one is to develop algorithms that produce objects (like sequences of bits) that "look random" enough that they fool probabilistic algorithms. These algorithms that produce random looking objects are known as pseudorandom generators. In addition to the intrinsic motivation of understanding how randomness and computation interact, there is a broader appeal to using these tools to find explicit constructions of objects that are known to exist through probabilistic methods.

One of the many successes of this area is the observation that very often we only need bounded independence. Broadly speaking, sampling a truly uniform random $N$-dimensional object may require $\Omega(N)$ bits of randomness, but for many applications it is sufficient to use objects that only look truly random when inspecting $k \ll N$ dimensions, known as "$k$-wise independent". One basic example of this is $k$-wise independent permutations: a distribution over of permutations on $N$ elements is $k$-wise independent if a permutation sampled from it maps any distinct $k$ elements to any distinct $k$ elements equally likely (i.e. any subset of $k$ elements looks uniform). Thanks to Kassabov [Kas07] and Kaplan–Naor–Reingold [KNR09], it turns out that such a distribution can be created using only $O(k\log N)$ random bits to sample from (with a small error), much less than in the fully independent case.

In this talk, we present a framework that generalizes these works to obtain $k$-wise independent approximate orthogonal and unitary matrices (formally known as approximate orthogonal/unitary designs). Given a classical group of Lie type with certain properties, our framework uses the properties of the group to approximate to "$k$-locally" approximate the Haar (uniform) measure over elements of the group. We do so by first showing how to obtain a pseudorandom generator with a non-trivial yet big error, and then showing how to reduce this error using a generalization of expander random walks to operators.

Joint work with Ryan O'Donnell and Rocco Servedio

This Week on Theory Lunch

October 31, 2025 | CS 105

Tomer Ezra from Harvard University

Algorithmic Contract Design

Fall 2025 Schedule

Past Semesters Archive

Spring 2025 Schedule

Fall 2024 Schedule

Spring 2024 Schedule

Fall 2023 Schedule

Spring 2023 Schedule

Fall 2022 Schedule

TCS @ Princeton