Welcome to the Princeton Theory Lunch page! The Princeton TCS Seminar, fondly known as Theory Lunch, is a weekly seminar organized by the Princeton TCS group every Friday from 12:15 pm to 1:15 pm, with food usually served at 11:50 am. It is open to all members of the Princeton community.
Talks are also broadcast simultaneously via Zoom. The Zoom links are shared every week by email. All talks are recorded and uploaded to our YouTube channel, Princeton TCS.
This semester, Theory Lunch is organized by Kaya Alpturer, Zhongtian He, and Yaxin Tu.
If you'd like to be notified about future Theory Lunch talks, please subscribe to the theory-read mailing list.
This Week on Theory Lunch
December 13, 2024 | CS 105
Justin Thaler from Georgetown University
Twist and Shout: A New Approach to Memory-Checking
zkVMs are SNARKs that allow an untrusted prover to prove it correctly ran a specified computer program on a witness. They promise to "democratize zero-knowledge" by allowing developers to write witness-checking programs in any high-level language, with no understanding of how the program's execution is actually proven.
However, zkVM provers today remain hundreds of thousands of times slower than native program execution, significantly limiting their applicability. A critical bottleneck lies in "memory-checking arguments": tailored SNARKs for proving correct processing of all memory reads and writes. This talk will present a new approach to memory-checking that achieves substantially better performance. I will also aim to give a broad overview of key issues and recent developments in SNARK design.
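For background (and only as a point of contrast with the new approach, not a description of it), many existing memory-checking arguments reduce the consistency of reads and writes to an equality of two multisets of (address, value, timestamp) tuples, which can be checked succinctly with randomized fingerprinting: two multisets $A$ and $B$ are equal iff $\prod_{e \in A} (\gamma - h(e)) = \prod_{e \in B} (\gamma - h(e))$ except with small probability over a uniformly random field element $\gamma$, where $h$ packs a tuple into a field element.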
Fall 2024 Schedule
Common Knowledge, Regained
Yannai Gonczarowski from Harvard University
September 13, 2024 | CS 201
For common knowledge to arise in dynamic settings, all players must simultaneously come to know it has arisen. Consequently, common knowledge cannot arise in many realistic settings with timing frictions. This counterintuitive observation of Halpern and Moses (1990) was discussed by Arrow et al. (1987) and Aumann (1989), was called a paradox by Morris (2014), and has evaded satisfactory resolution for four decades. We resolve this paradox by proposing a new definition for common knowledge, which coincides with the traditional one in static settings but is more permissive in dynamic settings. Under our definition, common knowledge can arise without simultaneity, particularly in canonical examples of the Halpern-Moses paradox. We demonstrate its usefulness by deriving for it an agreement theorem à la Aumann (1976), showing it arises in the setting of Geanakoplos and Polemarchakis (1982) with timing frictions added, and applying it to characterize equilibrium behavior in a dynamic coordination game.
Two theoretical directions in AI
Elad Hazan from Princeton University
September 20, 2024 | CS 105
I'll survey a few recent results that apply tools from theoretical computer science and game theory to AI.
First, we'll describe applying tools from online learning and control theory to sequence prediction. This gives rise to the neural architecture of spectral transformers, which attains SOTA performance for certain sequence prediction problems.
Next, we'll discuss the AI alignment problem and a recent formulation called the "hidden game problem". We'll discuss a few regret minimization results in this setting, as well as open directions for research.
Tensor Concentration Inequalities: A Geometric Approach
Haotian Jiang from University of Chicago
September 27, 2024 | CS 105
Matrix concentration inequalities, commonly used in the form of matrix Chernoff bounds or the non-commutative Khintchine inequality, are central to a wide range of applications in computer science and mathematics. However, they fall short in many applications where tensor versions of these inequalities are required.
In this work, we study concentration inequalities for the $\ell_p$-injective norms of sums of independent tensors. We obtain the first such inequalities beyond Rudelson's classical work on rank-1 tensors, and our tensor concentration inequalities are tight in certain regimes of $p$ and the order of the tensors. Our results are obtained via a geometric argument based on estimating the covering numbers for the natural stochastic processes corresponding to tensor injective norms.
We also discuss applications and connections of our inequalities to various other problems, e.g. tensor PCA, locally-decodable codes, and natural models for random tensors and their tensor extensions.
Online Load Balancing with Samples
Anupam Gupta from New York University
October 04, 2024 | CS 105
While analyzing algorithms in the worst case has long served us well, recent years have seen an exciting surge in analyzing algorithms in models that go beyond the worst case. We consider the classical problem of load balancing, where jobs arrive online and must be assigned to machines so as to minimize the maximum load. Can we get results better than in the adversarial setting if we have a small sample of the upcoming data? The results in this talk are based on work with C.J. Argue, Alan Frieze, and Chris Seiler.
Chernoff Bounds on HDX and their Applications
Max Hopkins from Princeton University
October 11, 2024 | CS 105
Recent years have seen the emergence of high dimensional expanders (HDX) as a powerful tool in theoretical computer science, with breakthrough applications in approximate sampling and property testing. A central force behind the success of HDX in application is their concentration of measure. Consider the following basic question: given a k-uniform hypergraph X and a subset A of its vertices, how likely is a random hyperedge to "see" A in approximately the right proportion? When X is the complete hypergraph, this question was classically resolved by Chernoff and Hoeffding, who showed X has a subgaussian tail. In 2017, Dinur and Kaufman introduced spectral HDX and proved they satisfy Chebyshev-type (inverse-polynomial) concentration not only for this basic setting, but critically for higher degree functions that sit on i-faces of X. This led to a breakthrough line of work constructing sparse agreement testers and new PCPs, but left an exponential gap with known bounds for the complete hypergraph.
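For concreteness, the complete-hypergraph baseline can be stated as follows: if $A$ has density $\mu$ among the vertices, then a uniformly random hyperedge $e$ of size $k$ satisfies $\Pr[\,||e \cap A|/k - \mu| > \varepsilon\,] \leq 2e^{-2\varepsilon^2 k}$ (Hoeffding's inequality, which holds even when sampling without replacement); Chebyshev-type concentration replaces this exponentially small right-hand side with an inverse-polynomial one.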
In this talk, we revisit the basic notion of high dimensional expanders and prove they satisfy optimal concentration of measure, matching Chernoff-Hoeffding for the complete hypergraph. Leveraging this fact, we further prove HDX are reverse hypercontractive, a powerful analytic inequality from boolean function analysis used in the construction of (combinatorial) PCPs with optimal soundness. Time permitting, we will overview several applications of concentration and reverse hypercontractivity on HDX, including new agreement tests in the list-decoding regime, the construction of explicit sparse hypergraphs with optimal geometric overlap, and new degree lower bounds for HDX.
Based on joint work with Yotam Dikstein (https://arxiv.org/abs/2404.10961).
Recent Developments in PCPs
Dor Minzer from Massachusetts Institute of Technology
October 25, 2024 | CS 105
The PCP Theorem is a cornerstone of computer science, with applications in hardness of approximation, verification, interactive protocols and more.
It asserts that a witness for the satisfiability of a given 3CNF formula can be encoded in a robust way that allows local checking.
In this talk, we discuss recent developments in PCPs, and in particular:
(1) constructions that achieve optimal tradeoff between the alphabet size and the soundness error, and
(2) constructions that are size efficient.
If time allows, we will also discuss connections with distributed protocols, high-dimensional expanders and discrete Fourier analysis.
Based on joint works with Kai Zhe Zheng, Mitali Bafna, Noam Lifshitz, and Nikhil Vyas.
Improved List Size for Folded Reed-Solomon Codes
Shashank Srivastava from DIMACS and IAS
November 01, 2024 | CS 105
List decoding for error correcting codes was originally conceived out of a desire to correct more errors, but has since found numerous connections to other areas of theoretical CS such as pseudorandomness and complexity theory.
Folded Reed-Solomon (FRS) codes were the first explicit codes to achieve the so-called list decoding capacity. That is, they have polynomial-sized lists up to the information-theoretically optimal radius. However, this list size was far from optimal. Over the last 15 years, numerous attempts have been made to improve this list size, such as the use of subspace evasive sets, algebraic-geometric codes, and the combinatorics of Hamming balls intersecting affine subspaces.
In this work, we first give a simple proof that recovers the best known list size among explicit capacity-achieving codes. Next, we use folded Wronskian determinants to prove a list size for FRS codes that is only a quadratic factor away from optimal. In contrast, all previous list size bounds were exponentially far from optimal.
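For reference, the standard definitions at play here: a code $C$ of rate $R$ is (combinatorially) list-decodable up to relative radius $\rho$ with list size $L$ if every Hamming ball of relative radius $\rho$ contains at most $L$ codewords, and the list decoding capacity theorem says radius $\rho = 1 - R - \varepsilon$ is achievable with small lists, while any radius beyond $1 - R$ forces exponentially large lists.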
Based on the paper available at https://arxiv.org/abs/2410.09031, to appear in SODA 2025.
Wake up babe, new probability just dropped
Ryan O'Donnell from Carnegie Mellon University
November 08, 2024 | CS 105
Do you love probability? But are you sad that all the good stuff was done 100+ years ago? Like, you'll never get to be Boole (1847) inventing the Union Bound, or Chebyshev (1867) inventing Chebyshev's Inequality, or Markov (1906) inventing Markov Chains?
Well, good news! Turns out what you think of as The Laws Of Probability are just a subset of the *actual* laws that govern how random things happen in our universe. So Boole and Chebyshev and Markov were just doing special cases; the *real* Union Bound, Chebyshev Inequality, and theory of Markov Chains were worked out just over the last couple of decades. I'll tell you about it, and maybe you can be the first to work out the general version of *your* favorite probability result.
Yes, this talk is about quantum computing.
Based on joint works with: Ramgopal Venkateswaran; Robin Kothari; and, Scott Aaronson, Mohammad Bavarian, Toby Cubitt, Sabee Grewal, Giulio Gueltrini, and Marien Raat.
Some lower bounds on submodular function minimization
Deeparnab Chakrabarty from Dartmouth College
November 15, 2024 | CS 105
Submodular functions are set functions that appear in many areas: for example, the graph cut function is submodular as a function of a subset of vertices, and so is the "log-determinant" function as a function of the rows/columns selected. It is a remarkable fact that one can find the unconditional minimizer of a general submodular function (the SFM problem) in polynomially many queries, and this generalizes many fundamental combinatorial optimization problems such as global minimum cut, s,t-cut, and even matroid intersection. Exactly what this polynomial dependence is has still not been nailed down: it is still between ~n and ~n^2 (which is a big or small gap, depending on whom one asks).
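For reference, a set function $f : 2^V \to R$ is submodular if $f(A) + f(B) \geq f(A \cup B) + f(A \cap B)$ for all $A, B \subseteq V$; equivalently, marginal gains diminish: $f(A \cup \{x\}) - f(A) \geq f(B \cup \{x\}) - f(B)$ whenever $A \subseteq B$ and $x \notin B$.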
In this talk, I'd like to describe a few lower bounds. For most of the talk, I'll focus on the "parallel complexity" of the problem: how many rounds of queries does one need to minimize a submodular function, given that the total number of queries is at most polynomial? We will see that the answer is ~n rounds! In doing so, we will also encounter the best known query lower bound to date. Time permitting (perhaps not), I'd also like to talk about a few "upper bounds" for approximate SFM, as well as some special cases of submodular functions (like graph cuts).
All this is based on multiple works with collaborators whose union is Yu Chen, Andrei Graur, Haotian Jiang, Sanjeev Khanna, Hang Liao and Aaron Sidford.
Grid-norm regularity for somewhat dense graphs, and some applications
Zander Kelley from Institute for Advanced Study
November 22, 2024 | CS 105
In 2023, Raghu Meka and I proved a substantially improved bound on the size of the smallest set of integers in [N] which contains no 3-term arithmetic progression. Since then, it has become clear that the main new ideas from that work are in fact best thought of as generic graph-theoretic tools (rather than being limited to the study of arithmetic structures). There have since been a number of new applications (for example, in communication complexity and graph algorithms) which rely on and refine these graph-theoretic tools.
In this talk, I will present a summary of these graph-theoretic tools.
In particular, I will discuss the grid-norm, a key definition which is useful for measuring the pseudorandomness of a bipartite graph.
Based on joint works with Amir Abboud, Nick Fischer, Shachar Lovett and Raghu Meka.
SoS Certificates for Sparse Singular Values and Their Applications: Robust Statistics, Subspace Distortion, and More
Sam Hopkins from Massachusetts Institute of Technology
December 06, 2024 | CS 105
What can efficient algorithms say about the maximum sparse singular value of a random matrix? (That is, the maximum of ||Mx|| where x is a sparse vector and M is a random matrix.) This problem turns out to be at the heart of numerous well-studied algorithmic problems: sparse principal component analysis, robust mean and covariance estimation, certification of subspace distortion, and more. I will discuss a new family of polynomial-time algorithms which can certify tighter bounds than any prior algorithm on this quantity, and their applications to improve the state of the art for the aforementioned problems.
Joint work with Ilias Diakonikolas, Ankit Pensia, and Stefan Tiegel
Twist and Shout: A New Approach to Memory-Checking
Justin Thaler from Georgetown University
December 13, 2024 | CS 105
zkVMs are SNARKs that allow an untrusted prover to prove it correctly ran a specified computer program on a witness. They promise to "democratize zero-knowledge" by allowing developers to write witness-checking programs in any high-level language, with no understanding of how the program's execution is actually proven.
However, zkVM provers today remain hundreds of thousands of times slower than native program execution, significantly limiting their applicability. A critical bottleneck lies in "memory-checking arguments": tailored SNARKs for proving correct processing of all memory reads and writes. This talk will present a new approach to memory-checking that achieves substantially better performance. I will also aim to give a broad overview of key issues and recent developments in SNARK design.
Past Semesters Archive
Spring 2024 Schedule
Robust sublinear expanders
Matija Bucic from Princeton University
February 06, 2024 | Friend 004
Expander graphs are perhaps one of the most widely useful classes of graphs ever considered. In this talk, we will focus on a fairly weak notion of expanders called sublinear expanders, first introduced by Komlós and Szemerédi around 30 years ago. They have found many remarkable applications ever since. In particular, we will focus on certain robustness conditions one may impose on sublinear expanders and some applications of this idea, which include:
- recent progress on the classical Erdős-Gallai cycle decomposition conjecture,
- an essentially tight answer to the classical Erdős unit distance problem for "most" real normed spaces, and
- an asymptotic solution to the rainbow Turán problem for cycles, raised by Keevash, Mubayi, Sudakov and Verstraëte, with an interesting corollary in additive number theory.
Based on joint works with Montgomery; Alon and Sauermann; and Alon, Sauermann, Zakharov and Zamir.
Testing Distributional Shift
Arsen Vasilyan from MIT
February 16, 2024 | Computer Science 201
In learning theory it is commonly assumed that the training dataset comes from the same distribution as the data we see during deployment. Yet, due to a distribution shift, this assumption might not hold, which can lead to incorrect predictions. Dealing with distribution shift is a big challenge in machine learning. For example, classifiers trained on data from one hospital often fail to generalize to data from other hospitals.
In our new work, joint with Adam Klivans and Konstantinos Stavropoulos, we introduce a framework called Testable Learning with Distribution Shift (TDS learning) to address this issue. The goal of our work is to develop learning algorithms that let their user verify that performance has not deteriorated due to a distribution shift.
We consider the following setting. We are given a training dataset S_1 consisting of, e.g., Gaussian examples labelled by an unknown function f in the function class. We are also given a new dataset S_2 of unlabelled data points which we need to classify. The concern is that even if a learning algorithm produces a classifier that performs very well on the distribution from which S_1 is drawn, it might perform terribly on S_2 due to a distribution shift.
Whenever distribution shift makes it unsafe to use the classifier given by the learning algorithm on the unlabeled dataset S_2, the TDS learning framework requires the learning algorithm to alert the user. In contrast, if no distribution shift has taken place, then the TDS learning framework requires the learning algorithm not to raise such an alert. In this case, the user can be confident that the classifier given by the learning algorithm is indeed safe to deploy on the dataset S_2.
Although domain adaptation has been studied previously, most prior work bounds the error on S_2 in terms of some notion of distance between the distributions giving rise to S_1 and S_2. These distances seem computationally difficult to evaluate, as they involve enumeration over all elements of the function class.
In our work we give TDS learning algorithms that are not only sample-efficient but also computationally efficient. We handle a number of high-dimensional function classes, including linear classifiers, intersections of linear classifiers, decision trees and low-depth formulas.
To accomplish this, we develop a novel method based on showing that functions in a given function class can be sandwiched between pairs of low-degree polynomials that are close in L_2-distance. We prove the existence of such pairs of L_2-sandwiching polynomials for a wide range of function classes by building on techniques from pseudorandomness. We then show that the existence of these L_2-sandwiching polynomials can be used to achieve efficient TDS-learning algorithms by introducing our Transfer Lemma, which allows us to ensure good performance on the set S_2 by checking that the low-degree moments of S_2 approximately match those of S_1.
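As a concrete illustration of that last step (a minimal sketch only, not the paper's algorithm; the degree d and tolerance tol are placeholder parameters), one can compare all empirical moments of total degree at most d between the two datasets:

    import itertools
    import numpy as np

    def low_degree_moments(X, d):
        # All empirical moments E[x_{i_1} * ... * x_{i_j}] with j <= d.
        n = X.shape[1]
        moments = {}
        for deg in range(1, d + 1):
            for alpha in itertools.combinations_with_replacement(range(n), deg):
                moments[alpha] = np.prod(X[:, alpha], axis=1).mean()
        return moments

    def moments_match(S1, S2, d=2, tol=0.1):
        # Accept (raise no alert) only if every low-degree moment of S2 is
        # close to the corresponding moment of S1.
        m1, m2 = low_degree_moments(S1, d), low_degree_moments(S2, d)
        return all(abs(m1[a] - m2[a]) <= tol for a in m1)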
Furthermore, for TDS learning of linear classifiers under the Gaussian distribution, we give an algorithm that does provably better than any algorithm based on L_2-sandwiching polynomials. To accomplish this, we combine the moment-matching approach with a method inspired by literature on active learning.
Randomness Extraction from Adversarial Sources
Eshan Chattopadhyay from Cornell University
February 23, 2024 | Computer Science 105
The area of randomness extraction is concerned with designing efficient algorithms (extractors) that produce almost truly random bits from defective sources of randomness. A very well-studied model in this area is the independent source model, where the extractor has access to multiple independent sources, each of which contains some amount of entropy. In this talk, I will discuss a recent line of work that explores randomness extraction beyond the independence model, allowing some sources to be adversarial or faulty - such faulty sources may contain no entropy or have unwanted correlations. Our extractors are based on recent advances in the extractor machinery along with interesting use of tools from extremal graph theory and lower bound results in communication complexity.
Based on joint works with Jesse Goodman, Vipul Goyal, Xin Li and David Zuckerman.
Optimal Online Discrepancy Minimization
Victor Reis from Institute for Advanced Study
March 01, 2024 | Computer Science 105
We prove that there exists an online algorithm that for any sequence of vectors v_1, ..., v_T in R^n with Euclidean norm at most 1, arriving one at a time, decides random signs x_1, ..., x_T so that for every t <= T, the prefix sum x_1 v_1 + ... + x_t v_t is 10-subgaussian.
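(Here, "$\sigma$-subgaussian" is meant in the standard moment-generating-function sense: a random vector $X$ is $\sigma$-subgaussian if $E[\exp(\langle u, X \rangle)] \leq \exp(\sigma^2 \|u\|^2/2)$ for every $u$, so every one-dimensional marginal of each prefix sum has subgaussian tails, and a union bound over the $T$ prefixes yields the $O(\sqrt{\log T})$ prefix discrepancy below.)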
This improves over the work of Alweiss, Liu, and Sawhney, who kept prefix sums O(sqrt(log(nT)))-subgaussian, and gives an O(sqrt(log T)) bound on the prefix discrepancy. Our proof combines a generalization of Banaszczyk's prefix balancing result to trees with a cloning argument to find distributions rather than single colorings. We also show a matching Omega(sqrt(log T)) strategy for an oblivious adversary. Joint work with Janardhan Kulkarni and Thomas Rothvoss, to appear in STOC 2024.
Can open decentralized ledgers be economically secure?
Jacob Leshno from Chicago Booth School of Business
March 08, 2024 | Computer Science 105
Traditional payment processors are the subject of antitrust concerns and regulations. Open decentralized ledgers (e.g., Bitcoin) provide an alternative. They do not rely on a central authority, avoiding antitrust and monopoly concerns. However, the open nature of these systems gives rise to many challenges, including fundamental questions about their security.
Using a framework that combines economic theory and distributed systems theory, we define economic security for general open decentralized ledgers.
Limitations of incentives in environments with many anonymous participants call into question the security that Bitcoin's Nakamoto protocol obtains from block rewards.
We present an alternative protocol showing that an open decentralized ledger can be economically secure.
(Joint with Rafael Pass and Elaine Shi)
Decoding Codes via Proofs
Madhur Tulsiani from TTIC
March 22, 2024 | Computer Science 105
In this talk, we will discuss a general framework for obtaining list decoding algorithms for codes, using relaxations based on the Sum-of-Squares (SoS) hierarchy of semidefinite programs. This framework is an adaptation of the well-known “proofs to algorithms” paradigm to the setting of codes. If the proof of the fact that all pairs of codewords have large distance can be expressed in a proof system corresponding to the SoS hierarchy, then one can use it to obtain a list decoding algorithm for the corresponding code. We will discuss a few examples of this phenomenon, including expander-based LDPC codes, for which no list decoding algorithms were previously known.
Based on joint work with Fernando Granha Jeronimo, Tushant Mittal, and Shashank Srivastava.
Strong spatial mixing for colorings on trees and its algorithmic applications
Kuikui Liu from MIT
March 29, 2024 | Computer Science 105
Correlation decay is a fundamental and important property of distributions arising in statistical physics and theoretical computer science. A longstanding conjecture is that the uniform distribution over proper q-colorings on any tree of maximum degree Δ exhibits a strong form of correlation decay whenever q≥Δ+1. It is surprising that such a basic question is still open, but then again it also highlights how much we still have to learn about random colorings. In this talk, I will discuss a near-resolution of this conjecture, as well as its algorithmic implications for sampling colorings on bounded-degree graphs via Glauber dynamics. Based on joint work with Zongchen Chen, Nitya Mani, and Ankur Moitra.
Data-Dependent LSH for the Earth Mover's Distance
Rajesh Jayaram from Google Research
April 05, 2024 | Computer Science 201
In this talk, we discuss the first improvement in approximation for nearest neighbor search under the Earth Mover's Distance (EMD) in over a decade. Given two sets of $s$ vectors $A,B$ in high dimensional space $(R^d)$, the EMD between $A$ and $B$ is the minimum cost of a perfect matching between the vectors in $A$ and $B$ where the edge weights are given by the distances in Euclidean space. EMD is a classic metric in computer science (dating back over 100 years to the Hungarian algorithm of Jacobi), and a standard distance between two sets of points. In nearest neighbor search, one has a collection of $n$ such sets $A_1,\ldots,A_n$, which one pre-processes so that given a query set $Q$, one can quickly return a set $A_i$ whose distance (under EMD) is within a $C$-factor of the nearest neighbor to $Q$.
To date, the only algorithm with sublinear $O(n^{\epsilon})$ query time was given by Andoni, Indyk, and Krauthgamer ([AIK, SODA '08]), who designed a (data-independent) locality sensitive hash function (LSH) for EMD with approximation $O(\log^2 s)$. In this work, we improve this approximation to $\tilde{O}(\log s)$ in the same runtime by designing the first data-dependent LSH functions for EMD. The talk will discuss the main techniques behind sublinear algorithms for EMD, including the tree embedding techniques of [AIK'08] and [Chen, Jayaram, Levi, Waingarten STOC '22], as well as the key insights needed to glue them together into an improved LSH for EMD.
Joint work with Erik Waingarten and Tian Zhang (STOC '24, https://arxiv.org/abs/2403.05041).
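As a point of reference, here is a brute-force implementation of the EMD definition above via an optimal assignment (illustrative only; the talk is precisely about avoiding this expensive exact computation in the nearest neighbor setting):

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def emd(A, B):
        # A, B: (s, d) arrays, one vector per row. EMD is the cost of the
        # minimum-cost perfect matching under Euclidean edge weights.
        cost = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
        rows, cols = linear_sum_assignment(cost)
        return cost[rows, cols].sum()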
Gaussian Polytope Approximators
Anindya De from University of Pennsylvania
April 12, 2024 | Computer Science 105
We study the approximability of general convex sets in R^n by intersections of halfspaces, where the approximation quality is measured with respect to the standard Gaussian distribution and the complexity of an approximation is the number of halfspaces used. While a large body of research has considered the approximation of convex sets by intersections of halfspaces under distance metrics such as the Lebesgue measure and Hausdorff distance, prior to our work there has not been a systematic study of convex approximation under the Gaussian distribution.
We establish a range of upper and lower bounds, both for general convex sets and for specific natural convex sets that are of particular interest. Our results demonstrate that the landscape of approximation is intriguingly different under the Gaussian distribution versus previously studied distance measures. We establish our bounds using techniques from many different areas, including classical results from convex geometry, Cramér-type bounds from probability theory, and—perhaps surprisingly—a range of topics from computational complexity, including computational learning theory, unconditional pseudorandomness, and the study of influences and noise sensitivity in the analysis of Boolean functions.
Based on joint work with Shivam Nadimpalli and Rocco Servedio. Preprint available here: https://arxiv.org/abs/2311.08575.
Efficiently Finding Planted Cliques in Semirandom Graphs
Pravesh Kothari from Princeton University
April 19, 2024 | Computer Science 105
In this talk, I will present a new polynomial time algorithm for recovering planted cliques in the semi-random graph model of Feige and Kilian from 2001. In the Feige-Kilian model, a graph G on vertex set V is chosen by a combination of benign random and adaptive "adversarial" choices and can be thought of as a "robust" counterpart of the well-studied "fully-random" planted clique problem:
1) choose a planted clique on a set of k vertices S,
2) "benign random choice": include each edge between S and V\S independently at random,
3) "adaptive worst-case choice": deleting any set of edges between S and V\S and adding any set of edges with both endpoints in V\S.
The previous best algorithms for this model succeed if the planted clique has size at least n^{2/3} in a graph with n vertices. Our algorithms work for planted clique sizes approaching n^{1/2} -- the information-theoretic threshold in the semi-random model and a conjectured computational threshold even in the easier fully-random model. This result comes close to resolving an open question of Feige (2019) and Steinhardt (2017). Our algorithm is based on a high constant-degree sum-of-squares relaxation of the clique SDP, and the analysis relies on a new connection between finding planted cliques and efficient certificates of upper bounds on the size of unbalanced bicliques in random bipartite graphs.
Based on joint work with Rares Buhai and David Steurer.
Theoretical Foundations of Real-World Cryptography
Vinod Vaikuntanathan from MIT
April 26, 2024 | Computer Science 105
Block ciphers (aka Pseudorandom Permutations) are the workhorses of modern cryptography. An overwhelming fraction of Internet traffic is secured using the Advanced Encryption Standard (AES), a block cipher; the same goes for encrypted file systems, encrypted messaging, and several other applications we use all the time. Yet, despite the fact that the constructions are natural and elegant, we have woefully little mathematical rationale for their security. In this talk, I will describe a research program, together with a collection of recent results and techniques, aimed at achieving a better mathematical understanding of the security of real-world block ciphers including AES.
This talk is based on joint work with Angelos Pelecanos, Tianren Liu and Stefano Tessaro.
An Exponential Lower Bound for Linear 3-Query Locally Correctable Codes
Peter Manohar from Carnegie Mellon University
May 03, 2024 | Computer Science 105
A locally correctable code (LCC) is an error correcting code where one can recover any symbol of the original codeword by querying only a small number of randomly chosen symbols from the received corrupted codeword.
In this talk, we present a proof that the blocklength n of a linear 3-query LCC C: {0,1}^k -> {0,1}^n must be at least \exp(k^{1/8}), i.e., it grows exponentially with k. This improves on the prior best lower bound of k^3 shown in our recent work, which holds even for the weaker setting of 3-query locally decodable codes (LDCs), and comes close to matching the best-known construction of 3-query LCCs based on Reed-Muller codes, which achieve a blocklength of n = \exp(\sqrt{k}). Because there is a 3-query LDC with a strictly subexponential blocklength, as a corollary we obtain the first separation between q-query LCCs and LDCs for any constant q.
We subsequently show that any "design 3-LCC", an LCC with a parity check matrix that is a block design, must have blocklength at least 2^{(1-o(1))\sqrt{k}}. As Reed-Muller codes are design 3-LCCs with blocklength n = 2^{2\sqrt{2k}}, this is tight up to a factor of 2\sqrt{2} in the exponent, and as a corollary resolves the Hamada conjecture for 4-designs up to this constant factor. For nonlinear LCCs, we give a generalization of our approach that proves a superpolynomial lower bound of k^{log k}, improving on the prior best lower bound of k^3.
Fall 2023 Schedule
Sparsifying Generalized Linear Models
Yang Liu from Institute for Advanced Study
September 15, 2023 | Computer Science 105
We study the sparsification of objectives of the form $f(x) = \sum_{i=1}^m f_i(\langle a_i, x \rangle)$, the setting of generalized linear models.
Cryptographic Non-Locality
Alex Lombardi from Princeton University
September 22, 2023 | Computer Science 105
Non-local games are an important concept in quantum information theory that allow two or more provers to demonstrate, to a classical verifier, the ability to prepare certain kinds of quantum systems. Two important applications include (multi-prover) protocols for (1) the demonstration of quantum advantage and (2) verifiable BQP computation. Key to these protocols is the physical assumption that the "far apart" provers cannot communicate.
In this talk, we will discuss how to emulate the effects of non-local games in an interactive, single-prover protocol, replacing the physical non-locality assumption with a cryptographic hardness assumption. These ideas have led to new interactive protocols for quantum advantage as well as for BQP verification.
This talk is based on joint works with Yael Kalai, Vinod Vaikuntanathan, Lisa Yang, and Fermi Ma.
2-dimensional spectral expansion of random geometric graphs
Siqi Liu from Institute for Advanced Study
September 29, 2023 | Computer Science 105
A graph is said to be a 1-dimensional expander if the second eigenvalue of its adjacency matrix is bounded away from 1, or if it has no sparse vertex cuts. There are several natural ways to generalize the notion of expansion to hypergraphs/simplicial complexes; one such way is 2-dimensional spectral expansion, in which the local expansion of vertex links/neighborhoods witnesses global expansion. While 1-dimensional expansion is known to be achieved by, e.g., random regular graphs, very few examples of sparse 2-dimensional expanders are known, and at present all are algebraic. It is an open question whether sparse 2-dimensional expanders are natural and abundant.
In this talk, we give some evidence towards abundance: we show that the set of triangles in a random geometric graph on a high-dimensional sphere yields an expanding simplicial complex of arbitrarily small polynomial degree.
Based on joint work with Sidhanth Mohanty, Tselil Schramm, and Elizabeth Yang.
Thin Trees for Laminar Families
Nathan Klein from Institute for Advanced Study
October 06, 2023 | Computer Science 105
In the strong form of the thin tree conjecture, formulated by Goddyn in 2004, we are given a k-edge-connected graph and wish to find a tree containing at most an O(1/k) fraction of the edges across every cut. This beautiful conjecture, if true, has implications in a number of areas such as graph theory and approximation algorithms. A natural relaxation of this problem is to find a tree with at most O(1/k) edges across a fixed collection of cuts, instead of all cuts. Via a simple reduction technique followed by iterative relaxation, we show that the thin tree conjecture is true for laminar families of cuts. This resolves an open question of Olver and Zenklusen from 2013, a work that showed thin trees exist for chains of cuts. As a byproduct, we also obtain the first constant factor approximation algorithm for the laminar-constrained spanning tree problem. Based on joint work with Neil Olver.
Theoretical foundations for diffusion models
Sitan Chen from Harvard University
October 27, 2023 | Computer Science 105
I will describe recent progress on providing rigorous convergence guarantees for score-based generative models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which constitute the backbone of large-scale real-world generative models such as DALL⋅E. In the first part of the talk, I will show that such SGMs can efficiently sample from essentially any realistic data distribution, even ones which are highly non-log-concave. In the second part of the talk, I will show how to extend these guarantees to deterministic samplers based on discretizing the so-called probability flow ODE, which ultimately leads to better dependence on the dimension. All of these results assume access to an oracle for score estimation; time permitting, at the end I will briefly touch upon how to provably implement this oracle for interesting classes of distributions like Gaussian mixtures.
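For orientation, the probability flow ODE mentioned above is the deterministic dynamics $dx/dt = f(x, t) - \frac{1}{2} g(t)^2 \nabla_x \log p_t(x)$, which shares the same marginals $p_t$ as the forward diffusion with drift $f$ and diffusion coefficient $g$; in the samplers discussed, the learned score estimate stands in for $\nabla_x \log p_t$.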
Streaming Zero-Knowledge Proofs
Marcel Dall'Agnol from Princeton University
November 03, 2023 | Computer Science 105
We define the notion of zero-knowledge in the streaming setting and construct zero-knowledge streaming interactive proofs (SIPs) for the two main building blocks in the streaming interactive proofs literature: the sumcheck and polynomial evaluation protocols. To the best of our knowledge, all known streaming interactive proofs are based on one of these two tools, and indeed, this allows us to obtain zero-knowledge SIPs for central and well-studied streaming problems, such as index, frequency moments, and inner product. Our protocols are efficient both in terms of time and space, as well as communication: the space complexity is polylog(n) and, after a non-interactive setup that uses a random string of near-linear length, the remaining parameters are sub-polynomial.
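For reference, the sumcheck protocol proves claims of the form $H = \sum_{x \in \{0,1\}^m} g(x)$ for a low-degree polynomial $g$: in round $i$ the prover sends the univariate polynomial $g_i(X) = \sum_{x_{i+1}, \ldots, x_m \in \{0,1\}} g(r_1, \ldots, r_{i-1}, X, x_{i+1}, \ldots, x_m)$, and the verifier checks $g_i(0) + g_i(1)$ against the previous round's claim before responding with a uniformly random field element $r_i$.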
En route, we develop a toolkit for designing zero knowledge data stream protocols that may be of independent interest, consisting of an algebraic streaming commitment protocol and a temporal commitment protocol. The analysis of our protocols relies on delicate algebraic and information-theoretic arguments and reductions from average-case communication complexity.
Samplers from high dimensional expanders and agreement testing
Yotam Dikstein from Institute for Advanced Study
November 10, 2023 | Computer Science 105
Sampler graphs are sparse graphs that "derandomize" Chernoff’s inequality. These graphs are equivalent to randomness extractors and are widely used in other areas of TCS. In this talk we will see that high dimensional expanders (HDXs) give rise to nearly optimal samplers. These are hypergraph analogues of expander graphs. Then we will see some applications such as reverse hypercontractivity inequalities on HDXs and new agreement theorems. This is a joint work with Max Hopkins (UCSD).
Testing Convex Truncation
Rocco Servedio from Columbia University
November 17, 2023 | Computer Science 105
We study the basic statistical problem of testing whether normally distributed $n$-dimensional data has been truncated, i.e. altered by only retaining points that lie in some unknown truncation set $S \subseteq R^n$. As our main algorithmic results,
(1) We give a computationally efficient $O(n)$-sample algorithm that can distinguish the standard normal distribution $N(0, I_n)$ from $N(0, I_n)$ conditioned on an unknown and arbitrary convex set $S$.
(2) We give a different computationally efficient $O(n^{0.5})$-sample algorithm that can distinguish $N(0, I_n)$ from $N(0, I_n)$ conditioned on an unknown and arbitrary mixture of symmetric convex sets.
These results stand in sharp contrast with known results for learning or testing convex bodies with respect to the normal distribution or learning convex-truncated normal distributions, where state-of-the-art algorithms require essentially $n^{\sqrt{n}}$ samples. An easy argument shows that no finite number of samples suffices to distinguish $N(0, I_n)$ from an unknown and arbitrary mixture of general (not necessarily symmetric) convex sets, so no common generalization of results (1) and (2) above is possible.
We also prove lower bounds on the sample complexity of distinguishing algorithms (computationally efficient or otherwise) that match our algorithms up to constant factors.
Joint with Anindya De and Shivam Nadimpalli.
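To give a feel for this flavor of tester, here is an illustrative moment-based statistic (a hypothetical sketch, not necessarily the algorithm from the talk): under the untruncated standard normal the mean is 0 and the expected squared norm is n, and truncation tends to shift such low-degree statistics.

    import numpy as np

    def truncation_stats(X):
        # X: (m, n) array of samples purportedly from N(0, I_n).
        # Returns how far the empirical mean and mean squared norm deviate
        # from their untruncated null values (0 and n, respectively).
        n = X.shape[1]
        mean_shift = np.linalg.norm(X.mean(axis=0))
        norm_shift = abs((X ** 2).sum(axis=1).mean() - n)
        return mean_shift, norm_shift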
Communication Complexity of Multiparty Set Disjointness Under Product Distributions
Rotem Oshman from Tel-Aviv University
December 01, 2023 | Computer Science 105
In the multiparty set disjointness problem, we have k players with private inputs X_1, ..., X_k \subseteq [n]. The players' goal is to check whether their sets intersect, or sometimes to compute the full intersection. It is known that in the shared blackboard model of communication, set disjointness requires Omega(n log k + k) bits of communication, and in the coordinator model, it requires Omega(nk) bits. However, these two lower bounds require the players' inputs to be highly correlated.
We study the communication complexity of multiparty set disjointness under product distributions, and ask whether the problem becomes significantly easier, as it is known to become in the two-party case. Our main result is a nearly-tight bound of tilde-Theta(n^{1-1/k}) bits for both the shared blackboard model and the coordinator model. This shows that in the shared blackboard model, as the number of players grows, having independent inputs helps less and less; but in the coordinator model, when k is very large, having independent inputs makes the problem much easier. Both our upper and our lower bounds use new ideas, as the original techniques developed for the two-party case do not scale to more than two players.
A new Information Complexity measure for multi-pass streaming algorithms
Sumegha Garg from Rutgers University
December 08, 2023 | Computer Science 105
In this talk, we will introduce a new notion of information complexity for multi-pass streaming problems, and use it to prove multi-pass lower bounds for the coin problem. In the coin problem, one sees a stream of n i.i.d. uniform bits and would like to compute the majority with constant advantage. We show that any constant-pass algorithm must use Ω(log n) bits of memory, significantly extending an earlier Ω(log n) lower bound for single-pass algorithms [BGW'20]. In the paper, we also use this information complexity notion to prove tight space complexity bounds for the needle problem, which in turn implies tight bounds for the problem of approximating higher frequency moments in a data stream.
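For context, the baseline this lower bound matches is the obvious single-pass counter, which needs just one O(log n)-bit register (a minimal sketch):

    import random

    def stream_majority(n):
        # One pass over n i.i.d. uniform bits with a single signed counter:
        # the counter stays in [-n, n], so this uses O(log n) bits of memory.
        count = 0
        for _ in range(n):
            count += 1 if random.getrandbits(1) else -1
        return 1 if count > 0 else 0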
Joint work with Mark Braverman, Qian Li, Shuo Wang, David P. Woodruff and Jiapeng Zhang.
Tight Bounds for Active lp Regression
David Woodruff from Carnegie Mellon University
December 15, 2023 | Computer Science 105
In active learning, one seeks to learn the parameters of a model given examples, while minimizing the number of labels read. We give bounds, tight up to logarithmic factors, on the number of labels that need to be read for active l_p regression for every value of p, as a function of the number of variables and the desired accuracy. Previously, tight bounds were known only for p = 1 or p = 2. We extend these ideas to a large class of loss functions satisfying certain niceness conditions, obtaining the first non-trivial bounds for such functions.
Based on joint works, one with Musco, Musco, and Yasuda (FOCS '22) and another with Yasuda (STOC '23).
Spring 2023 Schedule
Faster Walsh-Hadamard Transform from Matrix Non-Rigidity
Josh Alman from IAS
February 10, 2023 | Computer Science 105
The Walsh-Hadamard transform is a simple recursively-defined linear transformation with many applications throughout algorithm design and complexity theory. The folklore "fast Walsh-Hadamard transform" algorithm computes it using only N log N arithmetic operations (sketched below), which is believed to be optimal up to constant factors. In this talk, I'll give two improved constructions:
- A new algorithm which uses $(23/24) N log N + O(N)$ arithmetic operations. This is, to my knowledge, the first improvement of the leading constant. (Based on joint work with Kevin Rao.)
- A new depth-2 linear circuit of size $O(N^{1.45})$, improving on size $O(N^{1.5})$ which one gets from the fast Walsh-Hadamard transform approach. (Based on joint work with Ashwin Padaki and Yunfeng Guan.)
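As referenced above, here is the folklore N log N in-place transform (a standard textbook sketch; the talk's improved constructions are different):

    def fwht(a):
        # In-place fast Walsh-Hadamard transform; len(a) must be a power of 2.
        # log N rounds of butterflies, each round doing N adds/subtracts,
        # for N log N arithmetic operations in total.
        n = len(a)
        h = 1
        while h < n:
            for i in range(0, n, 2 * h):
                for j in range(i, i + h):
                    a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
            h *= 2
        return a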
A key idea behind both constructions is to take advantage of the matrix non-rigidity of the Walsh-Hadamard transform. A matrix is called "rigid" if it's impossible to change a "small" number of its entries to make it have "low" rank. L. Valiant showed that if a matrix is rigid, then this implies lower bounds for computing its linear transformation. However, a recent line of work has shown that many matrices of interest, including the Walsh-Hadamard transform, are actually not rigid. Although a formal converse to Valiant's result is not known, we'll make use of non-rigidity in our new constructions.
A Subpolynomial Approximation Algorithm for Graph Crossing Number in Low-Degree Graphs
Zihan Tan from Rutgers / DIMACS
February 17, 2023 | Computer Science 105
Graph Crossing Number is a fundamental and extensively studied problem with wide-ranging applications. In this problem, the goal is to draw an input graph G in the plane so as to minimize the number of crossings between the images of its edges. The problem is notoriously difficult, and despite extensive work, non-trivial approximation algorithms are only known for bounded-degree graphs. Even for this special case, the best current algorithm achieves an O(n)-approximation, while the best current negative results do not rule out a constant-factor approximation. All current approximation algorithms for the problem build on the same paradigm, which is also used in practice: compute a set E' of edges (called a planarizing set) such that G \ E' is planar; compute a planar drawing of G \ E'; then add the drawings of the edges of E' to the resulting drawing. Unfortunately, there are examples of graphs G in which any implementation of this method must incur Ω(OPT^2) crossings, where OPT is the value of the optimal solution. This barrier seems to doom the only currently known approach to designing approximation algorithms for the problem, and to prevent it from yielding a better than O(n)-approximation.
We propose a new paradigm that allows us to overcome this barrier. Using the new paradigm, we reduce the Crossing Number problem to Crossing Number with Rotation System – a variant of the Crossing Number problem, in which the ordering of the edges incident to every vertex is fixed as part of input. We then show a randomized algorithm for this new problem, that allows us to obtain a subpolynomial approximation for Graph Crossing Number on low-degree graphs.
This talk is based on joint work with Julia Chuzhoy and Sepideh Mahabadi.
Almost linear time algorithms for all flows
Sushant Sachdeva from University of Toronto
February 24, 2023 | Computer Science 105
Over the last decade or so, combining methods from continuous optimization and analysis with graph-theoretic insights has led to a revolution in algorithms for classic problems on graphs such as maximum flow.
In this talk, I will present some of the key ideas behind our recent work that gives almost-linear time algorithms for solving all convex flow problems on graphs. This implies almost-linear time algorithms for max-flow, minimum-cost flow, bipartite matching, optimal transport, matrix scaling, isotonic regression, and several other well-studied problems.
Our algorithm is designed using a new Interior Point Method (IPM) that builds the flow as a sequence of almost-linear number of approximate undirected minimum-ratio cycles, each of which is computed and processed very efficiently using a new dynamic data structure.
Joint work with Li Chen, Rasmus Kyng, Yang Liu, Richard Peng, and Maximilian Probst Gutenberg.
Robust and Equitable Uncertainty Estimation
Aaron Roth from University of Pennsylvania
March 03, 2023 | Friend Center 004
Machine learning provides us with an amazing set of tools to make predictions, but how much should we trust particular predictions? To answer this, we need a way of estimating the confidence we should have in particular predictions of black-box models. Standard tools for doing this give guarantees that are averages over predictions. For instance, in a medical application, such tools might paper over poor performance on one medically relevant demographic group if it is made up for by higher performance on another group. Standard methods also depend on the data distribution being static --- in other words, the future should be like the past.
In this talk, I will describe new techniques to address both these problems: a way to produce prediction sets for arbitrary black-box prediction methods that have correct empirical coverage even when the data distribution might change in arbitrary, unanticipated ways and such that we have correct coverage even when we zoom in to focus on demographic groups that can be arbitrary and intersecting. When we just want correct group-wise coverage and are willing to assume that the future will look like the past, our algorithms are especially simple. Our approach is based on a quantile analogue of (mean) multi-calibration, a recent solution concept arising from the algorithmic fairness literature. We close by asking which distributional properties (beyond means and quantiles) we can achieve multicalibration with respect to, and provide a complete characterization via a connection to property elicitation.
This talk is based on two papers, both joint work with Osbert Bastani, Varun Gupta, Chris Jung, Georgy Noarov, and Ramya Ramalingam.
Last-Iterate Convergence in Min-Max Optimization: SOS to the Rescue
Yang Cai from Yale University
March 10, 2023 | Computer Science 105
Min-max optimization is a classical problem in optimization and game theory and has recently played a central role in modern machine learning (ML) applications. These applications range from generative adversarial networks, adversarial examples, robust optimization, to multi-agent reinforcement learning. These ML applications present novel challenges to min-max optimization, necessitating that algorithms (i) demonstrate last-iterate convergence, in contrast to the more traditional average-iterate convergence, and (ii) be robust in nonconvex-nonconcave environments.
In this talk, we first focus on the last-iterate convergence rates of two classical and widely-used algorithms for solving convex-concave min-max optimization – the extragradient (EG) method by Korpelevich (1976) and the optimistic gradient (OG) method by Popov (1980). Despite the considerable attention these algorithms have received from both the optimization and machine learning communities over the years, the last-iterate convergence rates for both algorithms remained open. We resolve this open problem by showing that both EG and OG exhibit a tight O(1/\sqrt{T}) last-iterate convergence rate under arbitrary convex constraints. In the second part of the talk, we discuss some recent progress on designing optimal first-order methods for structured nonconvex-nonconcave min-max optimization. Our results are obtained via a new sum-of-squares programming based approach, which may be useful in analyzing other iterative methods. The talk is based on joint work with Argyris Oikonomou and Weiqiang Zheng.
1) On the Structural Complexities of Matching Mechanisms; 2) A Computational Separation Between Quantum No-cloning and No-teleportation
Clay Thomas and Barak Nehoran from Princeton University
March 24, 2023 | Computer Science 105
Talk 1: School districts in many cities employ school choice mechanisms, i.e., algorithms matching students to schools on the basis of students’ reported preferences, and the priorities students have for attending each school. Since simplicity and ease of participation are paramount to these mechanisms, an exact understanding of their structural complexities—e.g., how the matching can be described, and how one student’s input can affect the outputs—should be a first-order concern.
In this talk, I will give one example of this type of complexity: that of describing the outcome matching. Prior work has observed that for one popular real-world mechanism (DA), the outcome can be described in terms of a single "priority cutoff" per school, even when there are many more students than there are schools. We prove that, for another popular mechanism (TTC), the outcome cannot be described using less than a bit for each pair of schools. This result clarifies and formalizes one way in which, as often informally argued in both the literature and in practice, TTC may be more complex than DA.
Depending on time, I may also mention connections to verification (in the sense of nondeterministic communication), and how we formalize the question "how sensitive is the mechanism to one student’s input" in this setting.
Based on joint work with Yannai Gonczarowski.
Talk 2: Two of the fundamental no-go theorems of quantum information are the no-cloning theorem (that it is impossible to make copies of general quantum states) and the no-teleportation theorem (the prohibition on sending quantum states over classical channels without pre-shared entanglement). They are known to be equivalent, in the sense that a collection of quantum states is clonable if and only if it is teleportable.
Our main result suggests that this is not the case when computational efficiency is considered. We give a collection of quantum states and oracles relative to which these states are efficiently clonable but not efficiently teleportable. Given that the opposite scenario is impossible (states that can be teleported can always trivially be cloned), this gives the most complete oracle separation possible between these two important no-go properties.
In doing so, we introduce a related quantum no-go property, reconstructibility, which refers to the ability to construct a quantum state from a uniquely identifying classical description. We show the stronger result of a collection of quantum states that are efficiently clonable but not efficiently reconstructible. This novel no-go property only exists in relation to computational efficiency, as it is trivial for unbounded computation. It thus opens up the possibility of further computational no-go properties that have not yet been studied because they do not exist outside the computational context.
Joint work with Mark Zhandry.
New algorithms and lower bounds for decision tree learning
Li-Yang Tan from Stanford University
April 07, 2023 | Computer Science 105
Constructing decision tree representations of data is a basic problem of computer science, sitting right at the boundary of our understanding of efficient computation. I will discuss two recent results about this problem, a new algorithm and a new lower bound:
1. An almost-polynomial time membership query algorithm for the uniform-distribution setting. The previous fastest algorithm ran in quasipolynomial time, a consequence of a classic algorithm of Ehrenfeucht and Haussler from 1989. (https://dl.acm.org/doi/abs/10.1145/3561047)
2. A conditional quasipolynomial lower bound for the distribution-free setting, matching the upper bound of Ehrenfeucht and Haussler. This builds on and improves upon prior work by Alekhnovich, Braverman, Feldman, Klivans, and Pitassi from 2004. (https://arxiv.org/abs/2210.06375)
Joint work with Guy Blanc, Caleb Koch, Jane Lange, Mingda Qiao, and Carmen Strassle.
The Kikuchi Matrix Method
Pravesh Kothari from Carnegie Mellon University
April 14, 2023 | Friend Center 004
In this talk, I will present a new method that reduces understanding an appropriate notion of girth in hypergraphs to unraveling the spectrum of a "Kikuchi" matrix associated with the hypergraph.
I will discuss three applications of this technique:
1) Finding a refutation algorithm for smoothed instances of constraint satisfaction problems (obtained by randomly perturbing the literal patterns in a worst-case instance with a small probability) that matches the best running-time vs constraint-density trade-offs for the significantly special and easier case of random CSPs,
2) Confirming Feige's 2008 Conjecture that postulated an extremal girth vs density trade-off (a.k.a. Moore bounds) for k-uniform hypergraphs that generalizes the Alon-Hoory-Linial Moore bound for graphs,
3) Proving a cubic lower bound on the block length of 3-query locally decodable codes, improving on the prior best quadratic lower bound from the early 2000s.
Based on joint works with Omar Alrabiyah (Berkeley), Tim Hsieh (CMU), Peter Manohar (CMU), Sidhanth Mohanty (Berkeley), and Venkat Guruswami (Berkeley).
On Regularity Lemma and Barriers in Streaming and Dynamic Matching
Sanjeev Khanna from University of Pennsylvania
April 21, 2023 | Computer Science 105
We present a new approach for finding matchings in dense graphs by building on Szemerédi's celebrated Regularity Lemma. This allows us to obtain non-trivial albeit (very) slight improvements over long-standing bounds for computing near-optimal matchings in streaming and dynamic graphs. The key idea behind our improved bounds is to maintain a matching cover, which is a "sparse" subgraph that approximately preserves matchings in each induced subgraph of the input graph. However, given the use of the regularity lemma, the improvement obtained by our algorithms over trivial bounds is only by some function of $\log^* n$. Nevertheless, these results show that the "right" answer to these problems is not what is dictated by the previous bounds.
This is joint work with Sepehr Assadi, Soheil Behnezhad, and Huan Li.
Unitary Complexity and the Uhlmann Transformation Problem
Henry Yuen from Columbia University
April 28, 2023 | Computer Science 105
Three decades of quantum complexity theory have largely focused on quantum algorithms for classical tasks -- those with classical inputs and outputs. However there has been increasing interest in studying the computational difficulty of tasks with quantum inputs and/or outputs. Examples include preparing ground states of Hamiltonians or breaking quantum cryptographic protocols. Many techniques and approaches from traditional complexity theory are inadequate for reasoning about such inherently quantum tasks, suggesting a need for a "fully quantum" complexity theory.
In this talk I discuss some facets and themes of a "fully quantum" complexity theory. In particular I show that seemingly-unrelated quantum tasks such as achieving capacity over a noisy quantum channel, breaking a quantum bit-commitment scheme, and decoding the radiation of an evaporating black hole are all polynomial-time equivalent to something called the Uhlmann Transformation Problem. I also show that the Uhlmann Transformation Problem is complete for a "unitary complexity" analogue of SZK. Along the way I describe our framework for classifying the complexity of unitary synthesis problems.
Joint work with John Bostanci, Yuval Efron, Tony Metger, Alexander Poremba, and Luowen Qian.
Estimating the Longest Increasing/Common Subsequences
Alexandr Andoni from Columbia University
May 05, 2023 | Computer Science 105
The decades-old Patience Sorting algorithm computes the Longest Increasing Subsequence (LIS) of a sequence of length $n$ in $O(n \log n)$ time. Yet it remains unknown how well one can estimate the (length of the) LIS in *sublinear* time, say, when it is $>\sqrt{n}$.
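As a refresher, the $O(n \log n)$ Patience Sorting computation of the LIS length looks as follows (standard textbook code, not the sublinear-time estimator of this talk):

```python
import bisect

def lis_length(seq):
    """Length of the longest strictly increasing subsequence, O(n log n).

    piles[i] holds the smallest possible tail of an increasing
    subsequence of length i + 1; each element either extends the
    longest subsequence found so far or improves an existing tail.
    """
    piles = []
    for x in seq:
        i = bisect.bisect_left(piles, x)  # leftmost pile whose top is >= x
        if i == len(piles):
            piles.append(x)   # x extends the longest subsequence
        else:
            piles[i] = x      # x becomes a smaller tail for length i + 1
    return len(piles)

assert lis_length([3, 1, 4, 1, 5, 9, 2, 6]) == 4  # e.g. 1, 4, 5, 9
```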
We present an algorithm that, given a sequence with LIS $> n/k$, approximates the LIS up to a factor of $k^c$ in $k \cdot n^c$ time, for any arbitrarily small constant $c > 0$. Our running time essentially matches the trivial sample-complexity lower bound of $\sim k$. All past work incurred polynomial factors in both the approximation ratio and the extra running time.
An important corollary is the first algorithm to estimate the Longest *Common* Subsequence in $O(n)$ time up to a sub-polynomial, $n^{o(1)}$, factor.
Our algorithm is based on a van Emde Boas style recursion. To accomplish such a recursion for a sublinear algorithm, we develop a generic sampling scheme that allows efficient composition of recursive sampling algorithms.
Joint work with Negev Shekel Nosatzki, Sandip Sinha, and Cliff Stein.
Fall 2022 Schedule
Random restrictions on boolean functions with small influences
Pei Wu from IAS
September 16, 2022 | Friend Center Convocation Room | Talk video
In the talk, we discuss the probability that Boolean functions with small max influence become constant under random restrictions. Let $f$ be a Boolean function such that the variance of $f$ is $\Omega(1)$ and all its individual influences are bounded by $\tau$. We show that when restricting all but a $\tilde{\Omega}((\log 1/\tau)^{-1})$ fraction of the coordinates, the restricted function remains nonconstant with overwhelming probability. This bound is essentially optimal, as witnessed by the tribes function $\mathsf{AND}_{n/C\log n} \circ \mathsf{OR}_{C\log n}$.
We extend it to an anti-concentration result, showing that the restricted function has nontrivial variance with probability $1-o(1)$. This gives a sharp version of the "it ain't over till it's over" theorem due to Mossel, O'Donnell, and Oleszkiewicz.
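For reference, the standard definition of influence used above, together with why tribes is the tight example (my summary of standard facts):

```latex
% Influence of coordinate $i$ on $f : \{0,1\}^n \to \{0,1\}$:
\[
  \mathrm{Inf}_i[f] \;=\; \Pr_{x \sim \{0,1\}^n}\left[ f(x) \neq f(x^{\oplus i}) \right],
\]
% where $x^{\oplus i}$ is $x$ with its $i$-th bit flipped. In the tribes
% function above, the inputs are split into $n/(C \log n)$ blocks of size
% $C \log n$, and the output is $1$ iff every block contains a $1$; for a
% suitable $C$ each coordinate has influence $\tau = \Theta(\log n / n)$.
```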
Negative-Weight Single-Source Shortest Paths in Near-linear Time
Aaron Bernstein from Rutgers University
September 23, 2022 | Friend Center 004
We present a randomized algorithm that computes single-source shortest paths (SSSP) in $O(m\log^8(n)\log W)$ time when edge weights are integral, can be negative, and are $\geq -W$. This essentially resolves the classic negative-weight SSSP problem. The previous bounds are $\tilde{O}((m+n^{1.5})\log W)$ and $m^{4/3+o(1)}\log W$. Near-linear time algorithms were known previously only for the special case of planar directed graphs. Also, independently of our work, the recent breakthrough on min-cost flow [Chen, Kyng, Liu, Peng, Probst-Gutenberg and Sachdeva] implies an algorithm for negative-weight SSSP with running time $m^{1+o(1)}$.
In contrast to all recent developments that rely on sophisticated continuous optimization methods and dynamic algorithms, our algorithm is simple: it requires only a simple graph decomposition and elementary combinatorial tools. In fact, ours is the first combinatorial algorithm for negative-weight SSSP to break through the classic $O(m\sqrt{n}\log(nW))$ bound from over three decades ago [Gabow and Tarjan SICOMP'89].
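For contrast, here is the classical $O(mn)$ Bellman-Ford baseline that all of these bounds improve upon (textbook code, entirely unrelated to the decomposition-based algorithm of the talk):

```python
def bellman_ford(n, edges, source):
    """Single-source shortest paths with (possibly negative) integer
    weights in O(mn) time; edges is a list of (u, v, w) triples.

    Returns the distance array, or None if a negative cycle is
    reachable from the source.
    """
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):          # n - 1 relaxation rounds suffice
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    # One more pass: any further improvement witnesses a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return None
    return dist
```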
Kolmogorov Complexity Characterizes Statistical Zero Knowledge
Eric Allender from Rutgers University
September 30, 2022 | Friend Center 006
We show that a decidable promise problem has a non-interactive statistical zero-knowledge proof system if and only if it is randomly reducible to a promise problem for Kolmogorov-random strings, with a superlogarithmic additive approximation term. This extends recent work by Saks and Santhanam (CCC 2022). We build on this to give new characterizations of Statistical Zero Knowledge (SZK), as well as the related classes NISZK_L and SZK_L.
This is joint work with Shuichi Hirahara and Harsha Tirumala. A paper is available as ECCC TR22-127.
New Directions in Derandomization: Non-Black-Box Techniques, Superfast Algorithms
Roei Tell from IAS
October 07, 2022 | Friend Center 004
This talk will present two new directions in the study of derandomization, which are related to each other. The talk is based on a sequence of recent works with Lijie Chen (some of the works are upcoming).
The first direction is avoiding classical PRGs in favor of *non-black-box techniques*, which extract pseudorandomness for a probabilistic machine from the code of that specific machine. This approach allows constructing derandomization algorithms from weaker hypotheses (compared to classical results), and also constructing very fast derandomization algorithms that cannot be obtained using classical PRGs. The second direction is the fine-grained study of *superfast derandomization*, asking what is the smallest possible overhead that we need to pay when eliminating randomness (and whether we can get away with paying nothing at all).
As examples, we will mention four results:
1. Connecting the BPP = P conjecture to new types of lower bounds.
2. "Free lunch" derandomization with essentially no time overhead.
3. Superfast derandomization of interactive proof systems.
4. Average-case derandomization from weak hardness assumptions. (Joint work with Ron Rothblum.)
Almost Ramanujan Expanders from Arbitrary Expanders via Operator Amplification
Fernando Granha Jeronimo from IAS
October 14, 2022 | Friend Center 004
Expander graphs are fundamental objects in theoretical computer science and mathematics. They have numerous applications in diverse fields such as algorithm design, complexity theory, coding theory, pseudorandomness, group theory, etc.
In this talk, we will describe an efficient algorithm that transforms any bounded degree expander graph into another that achieves almost optimal (namely, near-quadratic, $d \leq 1/\lambda^{2+o(1)}$) trade-off between (any desired) spectral expansion $\lambda$ and degree $d$. The optimal quadratic trade-off is known as the Ramanujan bound, so our construction gives almost Ramanujan expanders from arbitrary expanders.
This transformation preserves structural properties of the original graph, and thus has many consequences. Applied to Cayley graphs, our transformation shows that any expanding finite group has almost Ramanujan expanding generators. Similarly, one can obtain almost optimal explicit constructions of quantum expanders, dimension expanders, monotone expanders, etc., from existing (suboptimal) constructions of such objects.
Our results generalize Ta-Shma's technique in his breakthrough paper [STOC 2017] used to obtain explicit almost optimal binary codes. Specifically, our spectral amplification extends Ta-Shma's analysis of bias amplification from scalars to matrices of arbitrary dimension in a very natural way.
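For orientation, the Ramanujan bound mentioned above comes from the Alon–Boppana theorem; a standard statement, with $\lambda$ the normalized second eigenvalue of a $d$-regular graph:

```latex
\[
  \lambda \;\ge\; \frac{2\sqrt{d-1}}{d} - o_n(1) \;\approx\; \frac{2}{\sqrt{d}},
  \qquad\text{equivalently}\qquad
  d \;\gtrsim\; \frac{4}{\lambda^{2}},
\]
% so degree $d = \Theta(1/\lambda^2)$ is the best possible trade-off, and
% the transformation above achieves $d \le 1/\lambda^{2+o(1)}$ starting
% from any expander, i.e., it is "almost Ramanujan".
```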
Joint work with Tushant Mittal, Sourya Roy, and Avi Wigderson.
Convex Spherical Cubes
Oded Regev from NYU
October 28, 2022 | Friend Center 004
We show that there exists a convex body that tiles $\mathbb{R}^n$ under translations by $\mathbb{Z}^n$ whose surface area is $\tilde{O}(\sqrt{n})$. No asymptotic improvement over the trivial $O(n)$ (achieved by the cube $[0,1]^n$) was previously known. A non-convex body achieving surface area $O(\sqrt{n})$ was constructed by Kindler, O’Donnell, Rao, and Wigderson [FOCS 2008].
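For intuition behind the name "spherical cubes": a Euclidean ball of unit volume has surface area $\Theta(\sqrt{n})$, which is the benchmark the construction approaches. A quick back-of-the-envelope check (standard computation):

```latex
% A ball of radius $r$ in $\mathbb{R}^n$ has volume $V_n r^n$ and surface
% area $n V_n r^{n-1}$. Setting $V_n r^n = 1$ gives
% $r = V_n^{-1/n} = \Theta(\sqrt{n})$, so the surface area is
\[
  n V_n r^{n-1} \;=\; \frac{n}{r} \;=\; \Theta(\sqrt{n}),
\]
% versus surface area $2n$ for the unit cube $[0,1]^n$.
```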
For some fun background on this question, see Kindler et al.'s CACM article: https://cacm.acm.org/magazines/2012/10/155546-spherical-cubes/fulltext and Klarreich's broader review: https://www.jstor.org/stable/27857995.
Joint work with Assaf Naor.
Maintaining Privacy of the Minority without Ignoring It: Differentially Private Oversampling for Imbalanced Learning
Rachel Cummings from Columbia University
November 04, 2022 | Friend Center 004
The problem of learning from imbalanced datasets, where the label classes are not equally represented, arises often in practice. A widely used way to combat this problem is preprocessing that amplifies the minority class, e.g., by reusing or resampling the existing minority instances. However, when the training data are sensitive and privacy is a concern, this will also amplify privacy leakage from members of the minority class. Therefore this oversampling pre-processing step must be designed with privacy in mind. In this work, we present a differentially private synthetic minority oversampling technique (DP-SMOTE) that guarantees formal privacy for the minority class. We show empirically that there is no statistically significant loss in performance relative to existing non-private SMOTE tools. We also show that even when non-private SMOTE is used as a pre-processing step in a DP learning pipeline before downstream private classification, the upstream non-private oversampling increases sensitivity exponentially in the dimensionality of the data, which harms classification accuracy at a corresponding rate. The algorithmic structure also yields insights about the implications for pre-processing before the use of DP algorithms, which may be of independent interest.
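For background, here is a minimal sketch of the classical, non-private SMOTE interpolation step that DP-SMOTE privatizes; this is the textbook heuristic, not the authors' private algorithm, and the function name is mine:

```python
import numpy as np

def smote(minority, n_synthetic, k=5, rng=None):
    """Classical SMOTE: each synthetic point interpolates between a
    random minority sample and one of its k nearest minority neighbors.

    minority: (m, d) numpy array of minority-class feature vectors.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = len(minority)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.integers(m)
        # Distances from sample i to all other minority samples.
        dists = np.linalg.norm(minority - minority[i], axis=1)
        dists[i] = np.inf                  # exclude the point itself
        neighbors = np.argsort(dists)[:k]  # k nearest minority neighbors
        j = rng.choice(neighbors)
        u = rng.random()                   # interpolation coefficient in [0, 1)
        synthetic.append(minority[i] + u * (minority[j] - minority[i]))
    return np.array(synthetic)
```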
(joint work with Yuliia Lut, Eitan Turok, and Marco Avella Medina)
Algorithmic Applications of Hypergraph and Partition Containers
Or Zamir from IAS and Princeton University
November 11, 2022 | Friend Center 004
The relatively new Hypergraph Container Method (Saxton and Thomason 2015; Balogh, Morris, and Samotij 2015), generalizing an earlier version for graphs (Sapozhenko 2001), has been used extensively in recent years in extremal combinatorics. In essence, it shows that independent sets in regular-enough uniform hypergraphs must be clustered in some fashion. We show how to use this method algorithmically to convert algorithms for arbitrary inputs into faster algorithms for almost-regular input instances. Informally, an almost-regular input is an input in which the maximum degree is larger than the average degree by at most a constant factor. This vastly generalizes several families of inputs for which we commonly have improved algorithms.
An important component of our work is the generalization of (hyper-)graph containers to Partition Containers. While standard containers apply to a single independent set, partition containers apply to several independent sets at the same time.
Our first main application is a resolution of a major open problem about Graph Coloring algorithms, for almost-regular graphs. The chromatic number of a graph can be computed in $2^n \cdot \mathrm{poly}(n)$ time (see the formula below); better algorithms are known only for $k$-coloring with $k < 7$. We solve $k$-coloring in $(2-\epsilon)^n$ time for graphs in which the maximum degree is at most $C$ times the average degree, for some $\epsilon > 0$ depending only on $k$ and $C$. This includes, for example, regular graphs of any degree (with $C = 1$), and graphs with at least $\delta n^2$ edges for any fixed constant $\delta$ (with $C = 1/(2\delta)$).
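The $2^n$ baseline mentioned above is, up to polynomial factors, the inclusion-exclusion algorithm of Björklund, Husfeldt, and Koivisto; a compact statement of the formula it evaluates:

```latex
% $G$ is $k$-colorable iff $V$ can be covered by $k$ independent sets,
% which inclusion--exclusion expresses as
\[
  \chi(G) \;\le\; k
  \quad\Longleftrightarrow\quad
  \sum_{S \subseteq V} (-1)^{|V \setminus S|} \, i(S)^{k} \;>\; 0,
\]
% where $i(S)$ counts the independent sets of $G$ contained in $S$;
% all values $i(S)$ can be tabulated in $O^*(2^n)$ time.
```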
Our second main application is the first improved $k$-SAT algorithm for dense formulas. The celebrated sparsification lemma (Impagliazzo, Paturi, and Zane 2001) implies that solving $k$-SAT can be reduced to solving $k$-SAT on sparse formulas. We show a complementary statement: for every $k, b$ there exists a $C$ such that if $k$-SAT can be solved in $b^n$ time, then $k$-SAT on well-spread formulas with at least $C \cdot n$ clauses can be solved in $(b-\epsilon)^n$ time.
We give additional applications including faster algorithms for unweighted and weighted Maximum Independent Set in almost-regular graphs.
Linear Probing Revisited: Tombstones Mark the Demise of Primary Clustering
William Kuszmaul from MIT
November 18, 2022 | Friend Center 004
The linear-probing hash table is one of the oldest and most widely used data structures in computer science. However, linear probing also famously comes with a major drawback: as soon as the hash table reaches a high memory utilization, elements within the hash table begin to cluster together, causing insertions to become slow. This phenomenon, now known as "primary clustering", was first captured by Donald Knuth in 1963; at a load factor of $1 - 1/x$, the expected time per insertion becomes $\Theta(x^2)$, rather than the more desirable $\Theta(x)$.
We show that there is more to the story than the classic analysis would seem to suggest. It turns out that small design decisions in how deletions are implemented have dramatic effects on the asymptotic performance of insertions. If these design decisions are made correctly, then even a hash table that is continuously at a load factor $1 - \Theta(1/x)$ can achieve average insertion time $\tilde{O}(x)$. A key insight is that the tombstones left behind by deletions cause a surprisingly strong "anti-clustering" effect, and that when insertions and deletions are one-for-one, the anti-clustering effects of deletions actually overpower the clustering effects of insertions.
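To make the design decision concrete, here is a minimal sketch of linear probing with tombstone deletions (illustrative only: this is the standard scheme the analysis studies, not the paper's graveyard hashing, and resizing/rebuilds are omitted):

```python
class LinearProbingTable:
    """Linear probing with tombstone deletions (no resizing shown).

    A deletion may not simply empty its slot: that could break the
    probe sequences of keys inserted later in the same run. Instead it
    leaves a tombstone, which lookups skip over and insertions may reuse.
    """
    EMPTY = object()
    TOMBSTONE = object()

    def __init__(self, capacity=16):
        self.slots = [self.EMPTY] * capacity

    def _probe(self, key):
        i = hash(key) % len(self.slots)
        while True:
            yield i
            i = (i + 1) % len(self.slots)

    def get(self, key, default=None):
        for i in self._probe(key):
            s = self.slots[i]
            if s is self.EMPTY:
                return default          # a true empty slot ends the probe
            if s is not self.TOMBSTONE and s[0] == key:
                return s[1]

    def insert(self, key, value):
        first_free = None               # first tombstone seen, if any
        for i in self._probe(key):
            s = self.slots[i]
            if s is self.TOMBSTONE:
                if first_free is None:
                    first_free = i      # remember it, but keep probing
            elif s is self.EMPTY:
                target = i if first_free is None else first_free
                self.slots[target] = (key, value)
                return
            elif s[0] == key:
                self.slots[i] = (key, value)  # update in place
                return

    def delete(self, key):
        for i in self._probe(key):
            s = self.slots[i]
            if s is self.EMPTY:
                return                  # key absent
            if s is not self.TOMBSTONE and s[0] == key:
                self.slots[i] = self.TOMBSTONE
                return
```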
We also present a new variant of linear probing, which we call "graveyard hashing", that completely eliminates primary clustering on any sequence of operations. If, when an operation is performed, the current load factor is $1 - 1/x$ for some $x$, then the expected cost of the operation is $O(x)$.
Based on joint work with Michael A. Bender and Bradley C. Kuszmaul. FOCS '21. Available at https://arxiv.org/abs/2107.01250.
1) On the Hardness of Dominant Strategy Mechanism Design; 2) Memory-Sample Lower Bounds for Learning with Classical-Quantum Hybrid Memory
Shiri Ron and Wei Zhan from Weizmann Institute and Princeton University
December 02, 2022 | Friend 004
Talk 1: We study the communication complexity of dominant strategy implementations of combinatorial auctions. We start with two domains that are generally considered “easy”: multi-unit auctions with decreasing marginal values and combinatorial auctions with gross substitutes valuations. For both domains we have fast algorithms that find the welfare-maximizing allocation with communication complexity that is poly-logarithmic in the input size. This immediately implies that welfare maximization can be achieved in an ex-post equilibrium with no significant communication cost, by using VCG payments. In contrast, we show that in both domains the communication complexity of any dominant strategy implementation that achieves the optimal welfare is polynomial in the input size. We then study the approximation ratios achievable by dominant strategy mechanisms. For combinatorial auctions with general valuations, we show that no deterministic dominant-strategy mechanism achieves an approximation ratio of $m^{1-\epsilon}$, where $m$ is the number of items. In contrast, there is a randomized dominant-strategy mechanism that achieves an $O(\sqrt{m})$ approximation. This establishes the first gap between computationally efficient deterministic dominant strategy mechanisms and randomized ones. Joint work with Shahar Dobzinski and Jan Vondrak.
Talk 2: In this talk, I will present our study of classical learning tasks in the presence of quantum memory. We prove that any quantum algorithm with both classical and quantum memory, for parity learning on $n$ bits, requires either $\Omega(n^2)$ bits of classical memory, or $\Omega(n)$ bits of quantum memory, or an exponential number of samples. The work by Raz showed that any classical algorithm requires either $\Omega(n^2)$ bits of memory or an exponential number of samples, and our results refute the possibility that a small amount of quantum memory significantly reduces the size of the classical memory needed for efficient learning. I will briefly review the techniques used in the proof of the classical memory-sample lower bound, discuss what prevents us from directly applying these techniques to the quantum case, and sketch the actual proof of our results. This is joint work with Qipeng Liu and Ran Raz.
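For concreteness, the parity learning task in Talk 2 is the following standard setup (my summary):

```latex
% An unknown $x \in \{0,1\}^n$ is fixed; the learner receives a stream of
% samples $(a_t, b_t)$ with $a_t$ uniform in $\{0,1\}^n$ and
\[
  b_t \;=\; \langle a_t, x \rangle \bmod 2,
\]
% and must output $x$. With $\Omega(n^2)$ bits of memory, Gaussian
% elimination succeeds with $O(n)$ samples; Raz's lower bound says that
% with $o(n^2)$ bits of classical memory, $2^{\Omega(n)}$ samples are needed.
```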
Optimal Weak to Strong Learning
Kasper Green Larsen from Aarhus University
December 09, 2022 | CS 105
The classic algorithm AdaBoost allows one to convert a weak learner, that is, an algorithm producing a hypothesis slightly better than chance, into a strong learner achieving arbitrarily high accuracy when given enough training data. We present a new algorithm that constructs a strong learner from a weak learner but uses less training data than AdaBoost and all other weak-to-strong learners to achieve the same generalization bounds. A sample complexity lower bound shows that our new algorithm uses the minimum possible amount of training data and is thus optimal. Hence, this work settles the sample complexity of the classic problem of constructing a strong learner from a weak learner.
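For reference, the classic AdaBoost loop in compact form (standard textbook version; the talk's sample-optimal algorithm is a different construction):

```python
import numpy as np

def adaboost(X, y, weak_learner, rounds):
    """AdaBoost with labels y in {-1, +1}. weak_learner(X, y, w) returns
    a hypothesis h (callable on X) with weighted error below 1/2."""
    n = len(y)
    w = np.full(n, 1.0 / n)             # uniform sample weights
    ensemble = []
    for _ in range(rounds):
        h = weak_learner(X, y, w)
        pred = h(X)
        # Weighted error of h, clipped away from 0 and 1 for stability.
        eps = np.clip(np.sum(w[pred != y]), 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - eps) / eps)
        ensemble.append((alpha, h))
        w *= np.exp(-alpha * y * pred)  # up-weight the mistakes
        w /= w.sum()                    # renormalize to a distribution
    return lambda X: np.sign(sum(a * h(X) for a, h in ensemble))
```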
This work was accepted at NeurIPS’22 and is joint work with Martin Ritzert from Aarhus University.
Pseudorandom rotations: explicit approximate designs for some Lie groups
Pedro Paredes from Princeton University
December 16, 2022 | CS 105
Since the introduction of randomness as an algorithmic tool, researchers have been trying to study how to reduce the amount of randomness that these probabilistic algorithms need. This is achieved in many different ways, but a common one is to develop algorithms that produce objects (like sequences of bits) that "look random" enough to fool probabilistic algorithms. These algorithms that produce random-looking objects are known as pseudorandom generators. In addition to the intrinsic motivation of understanding how randomness and computation interact, there is a broader appeal to using these tools to find explicit constructions of objects that are known to exist through probabilistic methods.
One of the many successes of this area is the observation that very often we only need bounded independence. Broadly speaking, sampling a truly uniform random $N$-dimensional object may require $\Omega(N)$ bits of randomness, but for many applications it is sufficient to use objects that only look truly random when inspecting $k \ll N$ dimensions, known as "$k$-wise independent" objects. One basic example is $k$-wise independent permutations: a distribution over permutations of $N$ elements is $k$-wise independent if a permutation sampled from it maps any $k$ distinct elements to any $k$ distinct elements with equal probability (i.e., any subset of $k$ elements looks uniform). Thanks to Kassabov [Kas07] and Kaplan–Naor–Reingold [KNR09], it turns out that such a distribution can be sampled from using only $O(k\log N)$ random bits (up to a small error), much less than in the fully independent case.
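Spelled out, the $k$-wise independence condition for permutations is the following (standard definition):

```latex
% A distribution $\mathcal{D}$ over permutations of $[N]$ is $k$-wise
% independent if for all distinct $x_1,\dots,x_k$ and all distinct
% $y_1,\dots,y_k$ in $[N]$,
\[
  \Pr_{\pi \sim \mathcal{D}}\left[\pi(x_1)=y_1,\dots,\pi(x_k)=y_k\right]
  \;=\; \frac{1}{N(N-1)\cdots(N-k+1)},
\]
% matching the probability under a uniformly random permutation.
```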
In this talk, we present a framework that generalizes these works to obtain $k$-wise independent approximate orthogonal and unitary matrices (formally known as approximate orthogonal/unitary designs). Given a classical group of Lie type with certain properties, our framework uses the properties of the group to "$k$-locally" approximate the Haar (uniform) measure over elements of the group. We do so by first showing how to obtain a pseudorandom generator with a non-trivial but large error, and then showing how to reduce this error using a generalization of expander random walks to operators.
Joint work with Ryan O'Donnell and Rocco Servedio.