Matrix Multiplication Not Yet Doable In Quadratic Time

Last night, a paper appeared on the arXiv titled A \Theta(n^2) time matrix multiplication algorithm.

At first look the paper seemed to score zero on Scott Aaronson’s list of Ten Signs a Claimed Mathematical Breakthrough is Wrong, and it came from a researcher who has worked extensively on the related problem of all-pairs shortest path, so it looked exciting. Unfortunately, it actually scores one on Scott’s list: it contradicts known impossibility results. (Thanks to Yuval Filmus for the references.)

The first step in the algorithm is to show how to compute the inner product of two 3^n-dimensional vectors using n^{O(1)} \cdot 2^n multiplication. This contradicts a lower bound due to Pan (see here the proof of a more general result) that computing the inner product of two N-dimensional vectors require at least N multiplications.

In addition, Raz has proved an \Omega (n^2 \log n) lower bound for matrix multiplication arithmetic circuits, and, to the best of my understanding, the paper’s claims imply an O(n^2) size arithmetic circuit. (Raz’s lower bound assumes that the circuit is not given access to large constants; as far as I can see, the formulas in Han’s paper do not require large constants.)

What’s New at the Simons Institute

This Thursday (December 15) is the deadline to apply for post-doctoral positions at the Simons Institute for the next academic year. In Fall 2017 we will run a single program, instead of two programs in parallel, on optimization. The program will be double the size of our typical programs, and will focus on the interplay between discrete and continuous methods in optimization. In Spring 2018 there will be a program on real-time decision-making and one on theoretical neuroscience.

In a few weeks, our Spring 2017 programs will start. The program on pseudorandomness will have a mix of complexity theorists and number theorists, and it will happen at the same time as a program on analytic number theory at MSRI. It will start with a series of lectures. It will start with a series of lectures in the third week of January. The program on foundations of machine learning has been much anticipated, and it will also start with a series of lectures, in the fourth week of January, which will bring to Berkeley at impressive collection of machine learning practitioners whose research is both applied and rigorous. All lectures will be streamed live, and I encourage you to set up viewing parties.

During the current semester, the Simons Institute had for the first time a journalist in residence. The inaugural resident journalist has been Erica Klarreich, that many of you probably know for her outstanding writing on mathematics and computer science for quanta magazine.

Next semester, our resident journalist will be Pulitzer-prize winner John Markoff, who just retired from the New York Times.

Avi60: Zero Knowledge for all NP

1982 was the annus mirabilis of the foundations of cryptography. In their paper “probabilistic encryption,” Goldwasser and Micali introduced two rigorous definitions of security for encryption, which they proved to be equivalent. One definition required the distributions of encryptions of any two messages to be computationally indistinguishable (a concept they introduce in the paper), the other, which they call semantic security, is the property that whatever can be efficiently computed about a message given the cyphertext can also be efficiently computed without the cyphertext. Later the same year, Blum and Micali gave a rigorous definitions of security for pseudorandom generators, and Yao wrapped all these results in a more general framework requiring generic, rather than number-theoretic, assumptions.

The concept of semantic security inspired most subsequent definitions, and proofs, of security based on the concept of simulation. Instead of trying to specify a list of things than adversary should not be able to do, one defines an idealized model in which the adversary has no access to private and encrypted data, and one defines a given system to be secure if whatever an attacker can efficiently compute given the ability of eavesdrop (and possibly mount an active attack), can also be efficiently computed in the ideal model. One then proves a system to be secure by developing a simulator in the ideal model for every real-world adversary.

Together with Rackoff, Goldwasser and Micali took this idea one step further from encryption to interactive communication, and came up with the idea of Zero-Knowledge Proofs. A zero-knowledge proof is a probabilistic proof system in which a prover can convince a verifier, with high confidence, of the truth of a statement, with the additional property that there is a simulator that is able to sample from the distribution of verifier’s views of the interaction. Thus the verifier is convinced of the truth of the statement being proved, but gains no additional information. In their paper, Goldwasser, Micali and Rackoff introduce the concept and present a zero-knowledge proof for a conjecturally intractable number-theoretic problem. The paper was famously rejected several times, eventually appearing in 1985.

The following year, Goldreich, Micali and Avi Wigderson published a paper giving zero knowledge proof systems for all problems in NP. Their work made zero-knowdge proofs a key tool in the design of secure cryptosystem: it was now possible for a party to publish a commitment to a secret x and then, at any time, be able to prove that x has a certain property without releasing any additional information about x. This ability was a key ingredient in the development of secure multi-party computation in 1987, by the same authors.

So how does one prove in zero knowledge that, say, a graph is 3-colorable? (Once you have zero-knowledge proofs for one NP-complete problems, you immediately have them for all problems in NP.)

Suppose the prover and the verifier know a graph G and the prover knows a 3-coloring. A physical analog of the protocol (which can be implemented using the notion of commitment schemes) is the following: the prover randomizes the color labels, then takes |V| lockboxes, each labeled by a vertex, and puts a piece of paper with the color of vertex v in the lockbox labeled by v, for every v. The prover locks all the lockboxes, and sends them to the verifier. The verifier picks a random edge (u,v) and asks for the keys of the lockboxes for u and for v. If they contain different colors, the verifier accepts, otherwise it rejects.

The protocol is complete, in the sense that if the graph is 3-colorable and the parties follow the protocol, then the verifier accepts with probability 1.

The protocol is sound, in the sense that if the graph is not 3-colorable, then, no matter what the prover does, there will have to some edge (u,v) such that the lockboxes of u and v are the same, and the verifier has probability at least 1/|E| of picking such an edge and rejecting. Thus the verifier accepts with probability at most 1 - 1/|E|, which can be made negligibly small by repeating the protocol several times.

As per the zero-knowledge property, the view of the verifier is the choice of a random edge, two open lockboxes corresponding to the endpoints of the edge, containing two random different colors, and |V|-2 unopened lockboxes. A view with such a distribution can be easily sampled, and the same is true when the physical implementation is replaced by a commitment scheme. (Technically, this is argument only establishes honest-verifier zero knowledge, and a bit more work is needed to capture a more general property.)


The festivities in honor of Avi Wigderson’s 60th birthday start tomorrow in Princeton, with a dream team of speakers. I will not be able to attend, but luckily a livestream will be available.

During the week, I will post a random selection of results of Avi’s.

Did you know that Avi’s first paper was an algorithm to color 3-colorable graphs using O(\sqrt n) colors? Here is the algorithm, which has the flavor of Ramsey theory proofs.

Suppose all nodes have degree < \sqrt n, then we can easily color the graph with \sqrt n colors. Otherwise, there is a node v of degree \geq \sqrt n. The neighbors of v induce a bipartite graph (because, in the 3-coloring that we are promised to exist, they are colored with whichever are the two colors that are different from the color of v), and so we can find in linear time an independent set of size \geq \sqrt n / 2. So we keep finding independent sets (which we assign a color to, and remove) of size \geq \sqrt n /2 until we get to a point where we know how to color the residual graph with \leq \sqrt n colors, meaning that we can color the whole graph with \leq 3 \sqrt n colors.

Congratulations to the 2016 Knuth Prize Selection Committee!

For the excellent choice of recognizing Noam Nisan for his work on complexity lower bounds, derandomization, and mechanism design.

Noam is known to readers of in theory for the development of the Nisan-Wigderson pseudorandom generator construction, which remains at the foundation of conditional derandomization results, and for Nisan’s generator, which is secure against log-space statistical test, and whose O(\log^2 n) seed length has not been improved upon in the past 25+ years. The modern definition of randomness extractors was made in a paper of Noam and David Zuckerman, which was also motivated by space-bounded derandomization.

Besides introducing almost all the techniques used in the main results on derandomization and pseudorandomness, Noam also introduced many of the techniques that underlie the main lower bound results that we can prove in restricted models, including the idea of approximating functions by polynomials, of looking at partial derivates to obtain artihmetic lower bounds and the connection between rank and communication complexity. With Linial and Mansour, he showed that the Hastad switching lemma could be used to bound the Fourier coefficients of functions computable by bounded-depth circuits, leading to quasi-polynomial learning algorithms for them (and to the realization that bounded-depth circuits cannot realize pseudorandom functions).

On November 27, 1989, Noam sent an email to a group of colleagues with a proof that (a decision problem equivalent to) the permanent had a multi-prover interactive proof; this set in motion a flurry of activity which led in a matter of days to the LFKN paper showing that P^{\# P} had a (one-prover) interactive proof and to Shamir’s proof that IP = PSPACE.

At the end of the 1990s, having done enough to keep the computational complexity community occupied for several subsequent decades, Noam wrote a paper with Amir Ronen called Algorithmic mechanism design. Around the same time, Elias Koutsoupias and Christos Papadimitriou published their work on worst-case equilibria and Tim Roughgarden and Eva Tardos published their work on selfish routing. A couple of years later, Christos gave back-to-back invited talks at SODA 2001, STOC 2001, ICALP 2001 and FOCS 2001 on game theory, and algorithmic game theory and algorithmic mechanism design have gone on to become a major part of theoretical computer science in the subsequent time.

Congratulations again to the prize committee, and please use the comments section to talk about the result of Noam’s that I didn’t know well enough to write about.

A conversation on what theory has done for us

[Inspired by Lance Fortnow’s retrospective post on the “Karp report,” Avi Wigderson’s response, and the Monty Python]

And what has the theory of computing done for us in the last twenty years?

Differential privacy? Apple just announced it will be used in iOS 10

Yes, and the application to preventing false discovery and overfitting is now used in production.

Ok, fine, but apart from differential privacy, what has theory done for us in the last twenty years?

Quantum algorithms? There wouldn’t be such a push to realize quantum computers if it wasn’t for Shor’s algorithm.

And quantum error correcting! There would be no hope of realizing quantum computers without quantum error correction

Very well, but apart from differential privacy and quantum computing, what has theory done for us in the …

Streaming algorithms? It all started with a theory paper and now it is a major interdisciplinary effort.

Yes, fair enough, but apart from differential privacy, quantum computing, and streaming algorithms, what has theory done for us…

Linear time decodable LDPC error-correcting codes? The first generation was not practical, but now they are part of major standards

Sure, ok, but apart from differential privacy, quantum computing, streaming algorithms, and error-correcting codes, what has theory…

Homomorphic encryption? The first-generation solutions were inefficient, but it might be only a matter of time before we have usable homomorphic encryption standards.

Linear-time SDD solvers? Algorithms like this and this are implementable and we may be one more idea away from algorithms that can be put in production.

Sublinear time algorithms like sparse FFT?

All right! But apart from differential privacy, quantum computing, streaming algorithms, error-correcting codes, homomorphic encryption, linear-time equation solvers and sub-linear time algorithms, what has the theory of computing ever done for us in the past twenty years?

. . .

[Could be continued. Indeed, please continue in the comments]

CS294 Lecture 16: Zig-Zag Graph Product

In which we give an explicit construction of expander graphs of polylogarithmic degree, state the properties of the zig-zag product of graphs, and provide an explicit construction of a family of constant-degree expanders using the zig-zag product and the polylogarithmic-degree construction.

A family of expanders is a family of graphs {G_n = (V_n,E_n)}, {|V_n|=n}, such that each graph is {d_n}-regular, and the edge-expansion of each graph is at least {h}, for an absolute constant {h} independent of {n}. Ideally, we would like to have such a construction for each {n}, although it is usually enough for most applications that, for some constant {c} and every {k}, there is an {n} for which the construction applies in the interval {\{ k, k+1, \ldots, ck \}}, or even the interval {\{ k, \ldots, ck^c\}}. We would also like the degree {d_n} to be slowly growing in {n} and, ideally, to be bounded above by an explicit constant. Today we will see a simple construction in which {d_n = O(\log^2 n)} and a more complicated one in which {d_n = O(1)}.

Continue reading

What does it mean when it’s hard to find hard instances?

[In the provincial spirit of Italian newspapers, that often have headlines like “Typhoon in South-East Asia causes widespread destruction; what are the consequences for Italian exports?”, and of men who overhear discussions about women’s issue and say things like “yes, but men have issues too,” I am going to comment on how Babai’s announcement affects me and the kind of problems I work on.]

If someone had told me last week: “a quasi-polynomial time algorithm has been found for a major open problem for which only a slightly subexponential algorithm was known before,” I would have immediately thought Unique Games!

Before Babai’s announcement, Graph Isomorphism had certain interesting properties in common with problems such as Factoring, Discrete Log, and Approximate Closest Vector (for approximation ratios of the order of sqrt (n) or more): no polynomial time algorithm is known, non-trivial algorithms that are much faster than brute force are known, and NP-completeness is not possible because the problem belongs to either NP \cap coNP or NP \cap coAM.

But there is an important difference: there are simple distributions of inputs on which Factoring, Discrete Log, and Closest Vector approximation are believed to be hard on average, and if one proposes an efficiently implementable algorithms for such problems, it can be immediately shown that it does not work. (Or, if it works, it’s already a breakthrough even without a rigorous analysis.)

In the case of Graph Isomorphism, however, it is easy to come up with simple algorithms for which it is very difficult to find counterexamples, and there are algorithms that are rigorously proved to work on certain distributions of random graphs. Now we know that there are in fact no hard instances at all, but, even before, if we believed that Graph Isomorphism was hard, we had to believe that the hard instances were rare and strange, rather than common.

It is also worth pointing out that, using Levin’s theory of average-case complexity, one can show that if any problem at all in NP is hard under any samplable distribution, then for every NP-complete problem we can find a samplable distribution under which the problem is hard. And, in “practice,” natural NP-complete problems do have simple distributions that seem to generate hard instances.

What about Small-set Expansion, Unique Games, and Unique-Games-Hard problems not known to be NP-hard, like O(1)-approximation of Sparsest Cut? We don’t know of any distribution for which it is plausible to conjecture that such problems are hard, and we have algorithms (Lasserre relaxations of constant degree) with no known counterexample. Many simple distributions of instances are rigorously solved by known algorithms. So, if we want to believe the Unique Games conjecture, we have to believe that there are hard instances, but they are rare and strange.

I am sure that it is possible, under standard assumptions, to construct an artificial problem L in NP that is in average-case-P according to Levin’s definition but not in P. Such a problem would not be polynomial time solvable, but it would be easy to solve on average under any samplable distribution and, intuitively, it would be a problem that is hard even though hard instances are rare and strage.

But can a natural problem in NP exhibit this behavior? Now that Graph Isomorphism is not a plausible example any more, I am inclined to believe (until the next surprise) that no natural problem has this behavior, and my guess concerning the Unique Games conjectures is going to be that it is false (or “morally false” in the sense that a quasipolynomial time algorithm exists) until someone comes up with a distribution of Unique Games instances that are plausibly hard on average and that, in particular, exhibit integrality gaps for Lasserre relaxations (even just experimentally).

Laci Babai and Graph Isomorphism

Next Tuesday, a week from today, Laci Babai will talk at the University of Chicago about a new algorithm that solves graph isomorphism in quasipolynomial time. There should also be a follow-up talk the following Thursday that, by a lucky coincidence, I will be able to attend, and then report back.

Meanwhile, if you have any gossip on the proof, then, by any means, go ahead and share it in the comments.