FOCS 2013 Recaps, Part 1

Editor’s note: don’t forget to add the blog to your favorite RSS reader (link: https://mittheory.wordpress.com/feed/), and follow our Twitter account!

The Foundations of Computer Science — or FOCS for short — conference needs no introduction. Along with STOC and SODA, it is one of the juggernaut meetups in the world of theoretical computer science. This year it was held at the Berkeley Marina, which had the benefit that in those spare moments in between talks, we could meander outside the hotel and soak in the rather pleasant view of the Berkeley yacht club, with misted downtown San Francisco far in the distance.


Of course, we weren’t there for (just) the pretty views and warm weather: from workshops on “real stable polynomials” (including their application to solving the Kadison-Singer Problem!), to OSNAPs in numerical linear algebra, to nearly maximum flows in nearly linear time, there was plenty of theory to be had.

If for some unfortunate reason you weren’t able to attend this year, then I regretfully must inform you that you missed a concert featuring Christos Papadimitriou and the Positive Eigenvalues! But not all is lost, because several students who did attend have selflessly agreed to share their “best of” recaps of FOCS 2013, telling you about their favorite papers and briefly summarizing the talks.

Today, Matt Weinberg will tell us about “A Polynomial Time Algorithm for Lossy Population Recovery”, “The Simple Economics of Approximately Optimal Auctions”, and “Interlacing Families I: Bipartite Ramanujan Graphs of All Degrees”. Next time Clément Canonne and Michael Forbes will share their recaps.

Recaps by Matt Weinberg

A Polynomial Time Algorithm for Lossy Population Recovery
Ankur Moitra and Michael Saks

This paper studies the following problem: imagine that species in a population can be represented as k-bit strings. Each k-bit string has some probability of occurring in the population, and given access to samples from the population, one would like to recover these probabilities. If the samples arrive uncorrupted, the obvious simple algorithm (estimate each probability by its empirical frequency) is the best one can do. Instead, the paper assumes that the samples are corrupted in the following way: each bit of the sample is replaced with a “?” independently with probability p. One would still like to use these corrupted samples to learn the population. The main result of this paper is an algorithm that learns the distribution to within \epsilon and requires only a polynomial number of samples (in k and \epsilon^{-1}) for any constant corruption probability p > 0.
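
To make the corruption model concrete, here is a minimal Python sketch (ours, not from the paper; the toy population below is made up) of how corrupted samples are generated:

```python
# Sketch of the observation model: each bit of a sample is replaced
# by "?" independently with probability p.  (Illustrative only.)
import random

def corrupt(sample: str, p: float) -> str:
    """Replace each bit of a k-bit string with '?' independently w.p. p."""
    return "".join("?" if random.random() < p else bit for bit in sample)

def draw_corrupted_samples(population: dict, n: int, p: float) -> list:
    """population maps k-bit strings to their probabilities (summing to 1)."""
    strings, probs = zip(*population.items())
    return [corrupt(s, p) for s in random.choices(strings, weights=probs, k=n)]

# Toy population over 3-bit strings, observed through 30% corruption.
population = {"000": 0.5, "101": 0.3, "111": 0.2}
print(draw_corrupted_samples(population, n=5, p=0.3))
```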

Here are the high level ideas of the proof:

  • There’s a known reduction that says if you can approximately recover the probability of the all-zero string, then you can approximately recover the probabilities of all strings.
  • Let b be a vector where b_i denotes the probability that an uncorrupted sample from the population has exactly i ones, and let c be a vector where c_i denotes the probability that a corrupted sample from the population has exactly i ones. Then there is an easily computable linear transformation A such that Ab = c, where A_{ij} is just the probability that exactly i of j ones remain after corruption. So we can also compute A^{-1}. We can then take corrupted samples to estimate c, and estimate b = A^{-1}c (a small numerical illustration of this step appears after the list). Unfortunately, the coefficients of A^{-1} are so big that we’d need an exponentially good approximation to c in order to get any meaningful approximation to b by this approach.
  • Instead, they add some error matrix E to A^{-1}. Doing so has two effects: one, it changes the magnitude of the terms in A^{-1}, and two, it adds additional error to the recovered b_0 term. They write an LP to see if any error matrix exists that both makes the terms of A^{-1} + E small enough that not too many samples are needed, and doesn’t add too much additional error to the recovered b_0.
  • They show that the dual of this LP corresponds to searching for a certain kind of polynomial. Any such polynomial that exists implies an upper bound on how good an error matrix can be.
  • Using tools from complex analysis (specifically, the Hadamard Three-Circle Theorem), they show that no polynomial exists that can imply a strong upper bound. Therefore, good error matrices exist and their LP will find one.
  • So in summary, their algorithm takes samples to estimate the vector c, where c_i is the probability that a corrupted sample has exactly i ones. Then they solve their LP to find a good error matrix E, and compute (A^{-1} + E)c to recover a decent approximation to the vector b, where b_i is the probability that an uncorrupted sample has exactly i ones. Finally, they plug this into the reduction mentioned at the beginning.
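
As promised above, here is a small numerical sketch (ours, not part of the paper’s algorithm) of why the naive inversion fails: under the corruption model, A_{ij} = \binom{j}{i} (1-p)^i p^{j-i}, and the entries of A^{-1} grow exponentially in k, so the naive estimate b = A^{-1}c would require exponentially accurate estimates of c.

```python
# Build the corruption matrix A and watch the entries of its inverse
# blow up exponentially in k.  (Illustrative only.)
import numpy as np
from math import comb

def survival_matrix(k: int, p: float) -> np.ndarray:
    """A[i, j] = Pr[exactly i of j ones survive corruption]."""
    A = np.zeros((k + 1, k + 1))
    for j in range(k + 1):          # true number of ones
        for i in range(j + 1):      # surviving (visible) number of ones
            A[i, j] = comb(j, i) * (1 - p) ** i * p ** (j - i)
    return A

for k in (5, 10, 20, 40):
    largest = np.abs(np.linalg.inv(survival_matrix(k, p=0.4))).max()
    print(f"k = {k:2d}: largest entry of A^-1 is {largest:.3e}")
```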

The Simple Economics of Approximately Optimal Auctions
Saeed Alaei, Hu Fu, Nima Haghpanah, and Jason Hartline

This paper studies the following problem: a seller has multiple goods to sell to m unit-demand buyers, and there may be some feasibility constraints on which agents can simultaneously receive goods. The seller wishes to auction the goods in a way that maximizes his revenue. With only a single good to sell, an interpretation of Myerson’s optimal auction by Bulow and Roberts shows that the optimal auction has a particularly simple form and can be interpreted as maximizing “marginal revenue” (discussed below). The main result of this paper is a characterization of when optimal auctions with multiple goods can be interpreted as maximizing marginal revenue as well.

To understand their result, first consider the single-item case. When selling a single item to a single buyer, it’s well known that the best thing to do is to set a take-it-or-leave-it price p, which the buyer will choose to accept with some probability q. For any q in [0,1], one can define \mathrm{REV}(q) to be the revenue obtained by choosing the price that the buyer accepts with probability exactly q. One can also define the marginal revenue to be \mathrm{REV}'(q). The Bulow-Roberts interpretation of Myerson’s optimal auction says exactly that the optimal auction for m buyers is to ask buyers to report bids b_i, map bids to probabilities q_i (the probability that buyer i would purchase the item at price b_i), and then give the item to the agent maximizing \mathrm{REV}'(q_i). In other words, always give the item to the agent with the maximum marginal revenue.
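
As a concrete example (ours, not from the paper): if a buyer’s value is uniform on [0,1], then a price of 1-q sells with probability q, so \mathrm{REV}(q) = q(1-q) and \mathrm{REV}'(q) = 1-2q. A minimal sketch of the resulting allocation rule for i.i.d. uniform bidders:

```python
# Bulow-Roberts style allocation for i.i.d. uniform-[0,1] bidders.
# REV(q) = q * (1 - q), so the marginal revenue is REV'(q) = 1 - 2q.
def marginal_rev(q: float) -> float:
    return 1 - 2 * q

def allocate(bids):
    """Map each bid b_i to q_i = Pr[value >= b_i] = 1 - b_i, then award
    the item to the bidder with the largest marginal revenue.  (The fully
    optimal auction also withholds the item if every REV'(q_i) is negative.)"""
    qs = [1 - b for b in bids]
    winner = max(range(len(bids)), key=lambda i: marginal_rev(qs[i]))
    return winner, marginal_rev(qs[winner])

print(allocate([0.3, 0.8, 0.55]))   # bidder 1 (bid 0.8) wins with REV' = 0.6
```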

In the multi-item case, not much is known even about how to sell many items to a single agent, so it’s not even clear that an analogue of marginal revenue makes sense. The paper proposes defining \mathrm{REV}(q) to be the maximum revenue that could possibly be obtained by selling the items to a single agent in a way such that the agent receives something with probability exactly q. Specialized to the single-item case, this recovers exactly the original definition. In addition, they give a natural mapping from bids (which are now vectors) to probabilities, and define the marginal revenue mechanism to first map bids to probabilities q_i, and then award items to a feasible set of agents chosen to maximize the marginal revenues \mathrm{REV}'(q_i). Each agent in this set receives exactly what they would get in the single-agent scheme that defines \mathrm{REV}(q_i). Unlike in the single-item setting, the marginal revenue mechanism may not always be optimal. But the main result of the paper shows that the mechanism is in fact optimal whenever a natural condition on the input instance, called revenue-linearity, holds. Their result is robust to approximation as well: if the condition only holds approximately, then the mechanism is approximately optimal. They also give some specific instances of multi-dimensional settings where revenue-linearity holds exactly or approximately.
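
To make the selection step concrete, here is a toy sketch (ours; the feasible sets and the rev_prime function below are hypothetical placeholders) that reads “maximize marginal revenue over feasible sets” as maximizing the total \mathrm{REV}'(q_i) over the allowed sets:

```python
# Selection step of a marginal-revenue-style mechanism under explicit
# feasibility constraints.  (Illustrative reading, not the paper's exact
# formulation.)
def select_winners(qs, feasible_sets, rev_prime):
    """qs[i]: service probability for agent i; feasible_sets: tuples of
    agent indices that may be served simultaneously."""
    return max(feasible_sets, key=lambda S: sum(rev_prime(qs[i]) for i in S))

# Toy example: three agents, uniform values (so REV'(q) = 1 - 2q), and
# either agents {0, 1} together or agent {2} alone may be served.
print(select_winners([0.2, 0.45, 0.1], [(0, 1), (2,)], lambda q: 1 - 2 * q))
```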

Interlacing Families I: Bipartite Ramanujan Graphs of All Degrees (Best Paper Award)
Adam Marcus, Daniel Spielman and Nikhil Srivastava

A d-regular graph is said to be Ramanujan if all of its non-trivial eigenvalues (the ones that aren’t d or -d) are at most 2\sqrt{d-1} in absolute value. This paper addresses the question of whether an infinite sequence of d-regular Ramanujan graphs exists for every d > 2, and shows that such a sequence of bipartite Ramanujan graphs indeed exists for every d > 2.
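
As a quick illustration (ours, not from the paper), one can check the Ramanujan condition numerically for a given d-regular graph by computing the adjacency eigenvalues, discarding the trivial ones, and comparing the rest against 2\sqrt{d-1}:

```python
# Check whether a d-regular graph is Ramanujan from its adjacency matrix.
import numpy as np

def is_ramanujan(adj: np.ndarray, d: int, tol: float = 1e-9) -> bool:
    eigs = np.linalg.eigvalsh(adj.astype(float))
    nontrivial = [x for x in eigs if abs(abs(x) - d) > tol]   # drop +/- d
    return all(abs(x) <= 2 * np.sqrt(d - 1) + tol for x in nontrivial)

# K_{3,3} is 3-regular and bipartite; its nontrivial eigenvalues are all 0.
K33 = np.kron([[0, 1], [1, 0]], np.ones((3, 3), dtype=int))
print(is_ramanujan(K33, d=3))   # True
```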

They obtain their result by proving a bipartite version of a conjecture of Bilu and Linial on the existence of certain kinds of signed graphs. Specifically, the conjecture is that every d-regular graph has a signing (that is, a weighting of every edge by either +1 or -1) such that all eigenvalues of the signed adjacency matrix are at most 2\sqrt{d-1} in absolute value. The paper proves this conjecture restricted to bipartite graphs; once that is done, existing machinery of Bilu and Linial yields the main theorem. Below is a high-level overview of the proof:

  • Consider randomly assigning each edge a sign of +1 or -1 independently with probability 1/2. One could then compute the expected largest eigenvalue of the resulting signed adjacency matrix and hope that it’s at most 2\sqrt{d-1}. Unfortunately, this isn’t the case. Instead, they compute the expected characteristic polynomial: that is, the average, over all 2^{|E|} signings, of the characteristic polynomial of the corresponding signed adjacency matrix.
  • It turns out that this expected polynomial was already computed by Godsil and Gutman: it is the matching polynomial of the original graph. That is, the coefficient of x^{n-2i} is (-1)^i times the number of matchings with exactly i edges. (A small numerical check of this appears after the list.)
  • Now that they have the expected polynomial, they want to compute its largest root. Fortunately, it turns out that this has also already been done: Heilmann and Lieb showed that all roots are real and that the largest root is at most 2\sqrt{d-1}.
  • So now they have to relate the roots of the expected characteristic polynomial to the roots of actual signings. The question they address is: when do the roots of an average of polynomials tell you anything about the roots of the polynomials being averaged? The answer is usually “absolutely nothing,” but the main technical result of this paper is a sufficient condition for the answer to be “something useful.” They define interlacing families of polynomials, and show that for every interlacing family, there exists a polynomial in the family whose largest root is no larger than the largest root of the averaged polynomial.
  • The penultimate step is showing that the characteristic polynomials of the signings form an interlacing family (which seems quite involved). With this, they’ve now shown that there exists a signing whose characteristic polynomial has largest root no larger than the largest root of the expected polynomial, which is at most 2\sqrt{d-1}. The last step is observing that if G is bipartite, then all eigenvalues of a signing come in pairs of the form \{\lambda, -\lambda\}. Therefore, every d-regular bipartite graph has a signing with all eigenvalues of absolute value at most 2\sqrt{d-1}.
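
As a sanity check on the averaging facts above, here is a small numerical experiment (ours) on the 4-cycle, which is 2-regular and bipartite: the average of the characteristic polynomials over all 2^{|E|} signings is the matching polynomial x^4 - 4x^2 + 2, and the best signing’s largest root is indeed no larger than the largest root of that average, which in turn is below 2\sqrt{d-1} = 2.

```python
# Average the characteristic polynomials of all signings of C_4 and compare
# the largest roots, as in the interlacing-families argument.  (Illustrative.)
import itertools
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]     # the 4-cycle C_4
n, d = 4, 2

char_polys = []
for signs in itertools.product([1, -1], repeat=len(edges)):
    A = np.zeros((n, n))
    for (u, v), s in zip(edges, signs):
        A[u, v] = A[v, u] = s
    # Coefficients of det(xI - A_s), computed from the (real) eigenvalues.
    char_polys.append(np.poly(np.linalg.eigvalsh(A)))

avg = np.mean(char_polys, axis=0)
print("averaged characteristic polynomial:", np.round(avg, 6))  # x^4 - 4x^2 + 2
print("largest root of the average:", np.roots(avg).real.max())
print("best signing's largest root:", min(np.roots(p).real.max() for p in char_polys))
print("bound 2*sqrt(d-1):", 2 * np.sqrt(d - 1))
```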