TOC Seminars Fall 2012

September 6


Ronald de Wolf

Exponential Lower Bounds for Polytopes in Combinatorial Optimization

We solve a 20-year-old problem posed by Yannakakis and prove that there exists no polynomial-size linear program (LP) whose associated polytope projects to the traveling salesman polytope, even if the LP is not required to be symmetric. Moreover, we prove that this holds also for the cut polytope and the stable set polytope. These results were discovered through a new connection that we make between one-way quantum communication protocols and semidefinite programming reformulations of LPs. Joint work with Samuel Fiorini, Serge Massar, Sebastian Pokutta, and Hans Raj Tiwary, presented at STOC'12 (shared Best Paper Award).

September 11

Costis Daskalakis

Algorithmic Multidimensional Mechanism Design

In his seminal paper, Myerson [1981] provides a revenue-optimal auction for a seller who is looking to sell a single item to multiple bidders. Extending this auction to simultaneously selling multiple heterogeneous items has been one of the central problems in Mathematical Economics. We provide such an extension that is also computationally efficient.

MIT News article

(joint work with Yang Cai, Matt Weinberg)

September 18

Eric Blais

Partially Symmetric Functions are Efficiently Isomorphism-Testable

The function isomorphism testing question is the following basic problem in property testing: given some function f, how many queries do we need to distinguish functions that are identical to f up to relabeling of the input variables from functions that are far from being so? Despite much recent attention to this problem, essentially only two classes of functions were known to be isomorphism-testable with a constant number of queries: symmetric functions and juntas.

In this talk, we will see that all partially symmetric functions---a class of functions first considered by Shannon in '49 that includes juntas, symmetric functions, and many other functions---are isomorphism-testable with a constant number of queries. The two main ingredients in the proof of this result are a new notion of symmetric influence of variables in boolean functions and a new analysis of the junta testing algorithm that replaces earlier Fourier machinery with a purely combinatorial argument involving intersecting families.

Based on joint work with Amit Weinstein and Yuichi Yoshida to appear at FOCS '12.

September 25

Andrew McGregor
UMass Amherst 

Graph Sketching

Applying random linear projections, a.k.a. "sketching", has become a standard technique when analyzing high-dimensional data sets. The resulting algorithms are embarrassingly parallelizable and suitable for stream processing. However, existing results have largely focused on vector-based problems (e.g., estimating norms and reconstructing sparse signals) and linear-algebraic problems (e.g., regression and low-rank approximation). In this work, we ask whether the richer combinatorial structure of graphs can also be analyzed via small sketches. We present a range of algorithmic results and applications including the first single-pass algorithm for constructing graph sparsifiers of fully-dynamic graph streams (i.e., where edges can be added and deleted in the underlying graph). 
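Not part of the talk: as a toy illustration of the vector-sketching primitive that the abstract contrasts with graph problems, here is a minimal NumPy sketch of norm estimation via a random linear projection (the dimensions and the 0.2 error bound are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 200                         # ambient and sketch dimensions
S = rng.normal(size=(d, n)) / np.sqrt(d)   # random linear projection

x = rng.normal(size=n)
sx = S @ x                                 # d numbers summarize x

# Linearity is what makes sketches stream-friendly: an update x += delta
# becomes sx += S @ delta, so insertions and deletions both compose.
rel_err = abs(np.linalg.norm(sx) - np.linalg.norm(x)) / np.linalg.norm(x)
assert rel_err < 0.2
```

Extending this linearity from vectors to combinatorial graph structure (sparsifiers, connectivity) under edge insertions and deletions is the subject of the talk.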

Based on joint work with Kook Jin Ahn and Sudipto Guha (SODA 12, PODS 12).

October 2 SPECIAL ROOM: 32-141

James Orlin
MIT Sloan 

Max flows in O(nm) time, or better

In this paper, we present improved polynomial-time algorithms for the max flow problem defined on a network with n nodes and m arcs. We show how to solve the max flow problem in O(nm) time, improving upon the best previous algorithm, due to King, Rao, and Tarjan, who solved the max flow problem in O(nm log_{m/(n log n)} n) time. In the case that m = O(n), we improve the running time to O(n^2/log n). We further improve the running time in the case that U^* = U_{max}/U_{min} is not too large, where U_{max} denotes the largest finite capacity and U_{min} denotes the smallest non-zero capacity. If log(U^*) = O(n^{1/3} \log^{-3} n), we show how to solve the max flow problem in O(nm / log n) steps. In the case that log(U^*) = O(\log^k n) for some fixed positive integer k, we show how to solve the max flow problem in O(n^{8/3}) time up to poly-logarithmic factors. This latter algorithm relies on a subroutine for fast matrix multiplication.
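For context only (this is not the algorithm of the talk): the textbook Edmonds-Karp method solves max flow in O(nm^2) time by augmenting along shortest residual paths. A minimal Python sketch, with the graph encoded as a dict-of-dicts of capacities (my own encoding choice):

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly augment along a shortest s-t path in the
    residual graph. `cap` maps u -> {v: residual capacity}; mutated in place."""
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:           # BFS for a shortest residual path
            u = q.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:                    # no augmenting path left
            return flow
        path, v = [], t                        # recover the s -> t path
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[u][v] for u, v in path)  # bottleneck capacity
        for u, v in path:
            cap[u][v] -= aug                   # push flow forward...
            rev = cap.setdefault(v, {})
            rev[u] = rev.get(u, 0) + aug       # ...and open the reverse arc
        flow += aug

cap = {'s': {'a': 3, 'b': 2}, 'a': {'t': 2}, 'b': {'t': 3}}
assert max_flow(cap, 's', 't') == 4
```

The talk's O(nm) bound removes the extra factor of m from this kind of augmenting-path baseline.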

Population Recovery and Partial Identification --- SPECIAL LOCATION E25-111

We study several natural problems in which an {\em unknown} distribution over an {\em unknown} population of vectors needs to be recovered from partial or noisy samples. Such problems naturally arise in a variety of contexts in learning, clustering, statistics, data mining and database privacy, where loss and error may be introduced by nature, inaccurate measurements, or on purpose. We give fairly efficient algorithms to recover the data under fairly general assumptions, even when loss and noise are close to the information-theoretic limit (namely, when they nearly completely obliterate the original data).

Underlying one of our algorithms is a new structure we call a {\em partial identification (PID) graph}. While standard IDs are subsets of features (vector coordinates) that uniquely identify an individual in a population, partial IDs allow ambiguity (and "imposters"), and thus can be shorter. PID graphs capture this imposter structure. PID graphs yield strategies for dimension reduction of recovery problems and for the re-assembly of these local pieces of statistical information into a global one. The combinatorial heart of this work is proving that every set of vectors admits partial IDs with "cheap" PID graphs (and hence efficient recovery). We further show how to find such near-optimal PIDs efficiently.

Time permitting, I will also describe our original motivation for studying these recovery problems above: a new learning model we call "restriction access", introduced in earlier work. This model aims at generalizing prevailing "black-box" access to functions when trying to learn the "device" (e.g. circuit, decision tree, polynomial...) which computes them. We propose a "grey-box" access that allows certain partial views of the device, obtained from random restrictions. Our recovery algorithms above allow positive learning results for the PAC-learning analog of our model, for such devices as decision trees and DNFs, which are currently beyond reach in the standard "black-box" version of PAC-learning.

Based on joint works with Zeev Dvir, Anup Rao and Amir Yehudayoff


Alon Orlitsky

Competitive Classification

Building on recent computer-science and information-theoretic approaches, we derive competitive algorithms for the classical classification problem. A classifier takes two training sequences and one test sequence, and associates the test sequence with the training sequence that was generated by the same distribution. With no assumptions on the support size or the distance between the underlying distributions, we construct a linear-complexity classifier that requires at most n^(3/2) samples to attain the n-sample accuracy of the best classifier designed with all essential knowledge of the two distributions. Conversely, we show that for any classifier, there are distributions that require at least n^(7/6) samples to achieve the n-sample accuracy of the best classifier designed with knowledge of the distributions. Joint work with Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Shengjun Pan, and Ananda Suresh.

October 23

No seminar - FOCS 2012


October 30

Irit Dinur
Weizmann and Harvard/MIT 

Covering CSPs

We study the complexity of constraint satisfaction problems (CSPs) from a new "covering" perspective. We define the covering number of a CSP instance C, denoted v(C), to be the smallest number of assignments to the variables, such that each constraint is satisfied by at least one of the assignments. This covering notion describes situations in which we must satisfy all the constraints, and are willing to use more than one assignment to do so. At the same time, we want to minimize the number of assignments. We study the covering CSP problem for different constraint predicates. We also study an interesting relationship between this problem and approximate coloring of graphs. Based on joint work with Gillat Kol.
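A brute-force illustration (mine, not from the talk) of the covering number: each constraint is a predicate on a 0/1 assignment, and we search for the smallest set of assignments that together satisfy every constraint:

```python
from itertools import combinations, product

def covering_number(n_vars, constraints):
    """Smallest k such that some k assignments jointly cover every
    constraint (each constraint satisfied by at least one of them)."""
    assignments = list(product([0, 1], repeat=n_vars))
    for k in range(1, len(assignments) + 1):
        for cover in combinations(assignments, k):
            if all(any(c(a) for a in cover) for c in constraints):
                return k
    return None  # some constraint is satisfied by no assignment at all

# x0 != x1 and x0 == x1 cannot both hold under a single assignment,
# but two assignments cover them, so the covering number is 2.
constraints = [lambda a: a[0] != a[1], lambda a: a[0] == a[1]]
assert covering_number(2, constraints) == 2
```

A satisfiable instance has covering number 1; the talk studies how hard this quantity is to compute or approximate for various constraint predicates.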

November 6

Prasad Raghavendra
UC Berkeley 

Polymorphisms and Polynomial-time algorithms

Why is 2-SAT solvable in polynomial time but 3-SAT NP-complete? Why is MaxCut approximable to a ratio of 0.878, but unique-games hard to approximate any better? Why is minimum s-t cut polytime solvable, but 3-way cut approximable to a 12/11 factor? Why is submodular minimization easy in general? In all these cases, the existence of polymorphisms seems to characterize the existence of polynomial-time algorithms. Roughly speaking, a polymorphism is an operation that combines different solutions to a problem while preserving the constraints or the objective value of the problem.

For the class of exact constraint satisfaction problems (CSPs), the algebraic dichotomy conjecture asserts that a CSP is polytime solvable if and only if it admits non-trivial polymorphisms. The algebraic approach to exact CSPs has been highly successful in characterizing large classes of exact CSPs, including all CSPs over domains of size up to 4. More recently, an analogous theory for approximation of Max-CSPs was developed and shown to be equivalent to the well-known unique games conjecture. Thus polymorphisms seemingly characterize the existence of polytime algorithms for CSPs, both in the exact and the approximation settings.

In this work, we initiate the study of polymorphisms for classes of problems beyond CSPs. Specifically, suppose we are given access to a function F in the value-oracle model, along with an operation/polymorphism on the domain that never increases the value of F; submodular minimization is a well-known example of this nature, with the operations being 2-bit AND and 2-bit OR. We show that under certain restrictions on the operation/polymorphism, the function F can be minimized in polynomial time. This shows that the tractability of submodular minimization in the value-oracle model is a special case of a more general phenomenon.
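As a concrete illustration of a polymorphism (a standard example, not code from the talk): coordinate-wise majority combines any three satisfying assignments of a 2-SAT instance into another satisfying assignment. A small Python check, with clauses encoded as pairs of (variable, required value) literals — the instance and encoding are my own:

```python
from itertools import product

def maj(a, b, c):
    """Coordinate-wise majority of three 0/1 assignments."""
    return tuple(int(x + y + z >= 2) for x, y, z in zip(a, b, c))

def sat(clauses, a):
    """A clause is a list of (var, wanted value) literals; one must hold."""
    return all(any(a[i] == v for i, v in cl) for cl in clauses)

# (x0 or x1) and (not x0 or x2) and (not x1 or not x2)
clauses = [[(0, 1), (1, 1)], [(0, 0), (2, 1)], [(1, 0), (2, 0)]]
sols = [a for a in product([0, 1], repeat=3) if sat(clauses, a)]

# Majority maps every triple of solutions back into the solution set:
# this closure property is what makes it a polymorphism of 2-SAT.
assert sols  # the instance is satisfiable
assert all(sat(clauses, maj(a, b, c)) for a in sols for b in sols for c in sols)
```

3-SAT admits no such non-trivial combining operation, which is one way to read the dichotomy the talk describes.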

November 13

Shay Mozes

Multiple-Source Multiple-Sink Maximum Flow in Directed Planar Graphs in Near-Linear Time

I will present an O(n log^3 n) algorithm that, given an n-node directed planar graph with arc capacities, a set of source nodes, and a set of sink nodes, finds a maximum flow from the sources to the sinks. Previously, the fastest algorithms known for this problem were those for general graphs. The main new tool is a procedure for computing a sequence of exact st-max-flow computations in an n-node graph that takes roughly O(\sqrt n) time per computation. The procedure is based on a beautiful relation between circulations in a planar graph and shortest paths in its planar dual. Based on joint work with Glencora Borradaile, Philip Klein, Yahav Nussbaum and Christian Wulff-Nilsen.

November 20

No seminar - Thanksgiving week



Alex Dimakis

Coding for Distributed Storage

Modern large-scale storage systems are deploying coding to introduce high reliability with limited storage overhead. It was recently discovered that network coding techniques can improve the maintenance of such distributed coded systems compared to standard Reed-Solomon codes. We will cover the developing theory and practice of this new family of network codes called regenerating codes. We will focus on recent developments and connections to interference alignment and locally decodable codes. Finally, we will discuss recent implementations over the Hadoop distributed file system and practical aspects that arise.


Piotr Indyk

Faster Algorithms for the Sparse Fourier Transform

The Fast Fourier Transform (FFT) is one of the most fundamental numerical algorithms. It computes the Discrete Fourier Transform (DFT) of an n-dimensional signal in O(n log n) time. The algorithm plays an important role in many areas. It is not known whether its running time can be improved. However, in many applications (e.g., audio, image or video compression), most of the Fourier coefficients of a signal are "small" or equal to zero, i.e., the output of the transform is (approximately) sparse. In this case, it is possible to compute the set of non-zero coefficients faster than in O(n log n) time. In this talk, I will describe a new set of efficient algorithms for sparse Fourier Transform. One of the algorithms has running time of O(k log n), where k is the number of non-zero Fourier coefficients of the signal. This improves over the runtime of the FFT for any k = o(n). The talk will cover (a subset of) the material from the joint papers with Fadel Adib, Badih Ghazi, Haitham Hassanieh, Dina Katabi, Eric Price and Lixin Shi. The papers are available at
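A toy NumPy illustration (mine, not the algorithm from the talk) of what Fourier sparsity means: a signal with k nonzero Fourier coefficients is dense in the time domain, yet its transform is supported on exactly the k chosen frequencies. The sparse FFT recovers that support in O(k log n) without computing all n coefficients; here we simply use the full FFT to exhibit it:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 1024, 5
freqs = rng.choice(n, size=k, replace=False)            # the sparse support
coeffs = rng.normal(size=k) + 1j * rng.normal(size=k)   # its coefficients

# Time-domain signal: a sum of k complex exponentials (dense in time).
t = np.arange(n)
x = sum(c * np.exp(2j * np.pi * f * t / n) for c, f in zip(coeffs, freqs))

# The full FFT costs O(n log n); its output is k-sparse by construction.
X = np.fft.fft(x) / n
support = np.flatnonzero(np.abs(X) > 1e-8)
assert set(support) == set(freqs)
```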

December 11

Eli Ben-Sasson
Technion and MIT 

An additive combinatorics approach relating communication complexity to rank

One of the outstanding questions in communication complexity is that of relating the rank of a {0,1}-valued matrix M to its communication complexity. Mehlhorn and Schmidt [STOC 1982] showed that

log^{O(1)} rank(M) < Comm. Complexity(M) < O(rank(M)) 

and these asymptotic bounds have not changed in the past 30 years (but for the constants hidden in the asymptotic notation). 
We show, assuming the Polynomial Freiman-Ruzsa (PFR) conjecture in additive combinatorics, and using tools from that field, that 

Comm. Complexity(M) < O(rank(M)/ log rank(M)). 

Our proof is based on the study of the "approximate duality conjecture" [Ben-Sasson and Zewi STOC 2011] which has other applications to the construction of bipartite Ramsey graphs and to locally decodable codes. 

Joint work with Shachar Lovett (IAS) and Noga Ron-Zewi (Technion), which will appear in FOCS 2012.