Edited 8/16 to correct attributions to Alon and Boppana and to Tardos, thanks to an anonymous commenter for the correction
Yesterday, Norbert Blum posted a preprint with a claimed proof that . An incorrect comment that I made last night and corrected this morning caused a lot of confusion, so let me apologize by summarizing the claims in the paper.
Coincidentally, this week, there is an Oberwolfach workshop on proof complexity and circuit complexity, so I am confident that by the end of the week we will hear substantive comments on the technical claims in the paper.
Recall that if a decision problem is solvable by an algorithm running in time on inputs of length then, for every , it is also solved, on inputs of length , by a circuit of size . Thus, if a problem is solvable in polynomial time it is also solvable by a family of polynomial size circuits (or, in short, it has polynomial circuit complexity).
The paper claims that Clique has exponential circuit complexity, and hence it has no polynomial time algorithm and . The argument leverages known results from monotone circuit complexity.
A monotone circuit is a boolean circuit that has only AND gates and OR gates, but does not have any NOT gates. A decision problem is a monotone problem if, for every fixed input length, it is solvable by a monotone circuit. For example, the problem to decide if a given -vertex graph has a clique of size at least is a monotone problem (if the input graph is presented as a boolean adjacency matrix), and so is the problem of deciding if a given graph has a perfect matching.
In the 1980s, Razborov proved that Clique cannot be computed by polynomial size monotone circuits. Later Andreev proved that there is a monotone problem in NP that requires exponential size monotone circuits, and Tardos Alon and Boppana proved that Clique itself requires exponential size monotone circuits. At the time, it was conjectured that if a monotone problem is in P, then it is solvable by a family of polynomial size monotone circuits. Under this conjecture, Razborov’s result would imply that Clique is not in P, and hence .
Unfortunately, Razborov refuted this conjecture, by showing that the perfect matching problem, which is in P, does not have polynomial size monotone circuits. Tardos showed that the Alon-Boppana exponential lower bound for clique holds for any monotone function sandwiched between the clique number and the chromatic number of a graph, including the Lovasz Theta function. Since the Theta function is polynomial time computable, and hence has polynomial size circuits, this shows that the gap between monotone circuit complexity and general circuit complexity can be exponentially large. (See first comment below.)
Razborov’s proof of the Clique lower bound for monotone circuits introduced the powerful approximation method. Roughly speaking, his approach was to start from a hypothetical polynomial size family of monotone circuits for Clique, and, from that, build a family of approximating circuits, which are just DNF formulas. The approximating circuits constructed by his method do not solve the Clique problem in general, but they solve a “promise” version of the problem, that is, they solve Clique on a certain subset of graphs. Then Razborov finishes the argument by showing that the approximating circuits are so simple that it is impossible for them to even solve Clique on that subset of inputs, thus reaching a contradiction to the assumption that there are monotone polynomial size circuits for Clique. The approximation method, variously modified, was also used to prove the lower bounds for Andreev’s problem and for matching.
Tim Gowers wrote a wonderful exposition of Razborov’s method, trying to show how one would come up with the various ideas, rather than just presenting the proof step by step.
Berg and Ulfberg simplify the proofs of Razborov, Andreev and Tardos for Clique and Andreev’s problem (but not for matching) by showing how to construct an approximator that has both small DNF complexity and small CNF complexity. The stronger claim makes an inductive argument in the construction easier to establish.
(At this point, I should clarify that I have never properly studies these results, so I am probably getting some details wrong. Please post corrections in the comments.)
The main claim in Blum’s paper is Theorem 6, which claims that if polynomial-size monotone circuits for a monotone problem admit a “CNF-DNF approximator” for a promise restriction of the problem (like the Berg-Ulfberg one), then also general circuits for the same problem admit such an approximator. Thus, if the approximator does not exist, it not only follows that the monotone complexity of the problem is super-polynomial, but also the general circuit complexity of the problem is superpolynomial.
Together with the Berg-Ulfberg approximator for monotone circuits for Clique, this implies that Clique is not in P.
Now, what could possibly go wrong with this argument?
- What about natural proofs?
This argument can only applied to (certain) monotone problems, and monotone problems are a vanishing fraction of all problems, hence one does not have the “largeness” property of natural proofs, and the argument is not a natural proof. (This is noted in the paper.)
- What about relativization and algebrization?
The argument starts from a boolean circuit for a given problem. If one is given an oracle circuit, with gates that answer oracle queries, or give evaluation of a polynomial extension of the problem, the argument cannot be applied.
- But this argument is lifting monotone circuit lower bounds to general circuit lower bounds, and so what about perfect matching, which has an exponential monotone circuit lower bound and a polynomial general circuit upper bound?
It is not known how to make the known lower bound for matching work via a “CNF-DNF approximator” and the claims in the paper only concern monotone lower bounds proved in this way. (Edited to add: but what about the Lovasz theta function?)
- But didn’t Razborov prove that the approximation method cannot prove a better-than-quadratic lower bound for general circuits?
Like any no-go theorem, Razborov’s impossibility result makes some assumption on what it means to “apply the approximation method to general circuits” and Blum claims that the assumptions do not apply to his argument.
- But where is the “heavy lifting” done? Shouldn’t every proof of a major result have one or more steps that are unlike anything done before?
I don’t have a good answer to this question. All the work is done in the proof of Theorem 6, which is the construction of the approximator starting from an arbitrary circuit. Maybe the argument is right and the heavy lifting was in Razborov’s work and subsequent extension and simplifications, and the fact that one can handle NOT gates at the bottom can be handled with an argument that is syntactically similar to previous work. Or, something goes wrong in the construction. Either way we will probably know soon.
A small correction: it was Alon and Boppana who showed that the monotone circuit size of the clique function must be exponential. Tardos observed that Razborov’s and Alon-Boppana’s arguments carry over to a function which is computed by a polynomial size non-monotone circuit (the function is a small variant on approximating the Lovasz theta function of the graph). If Berg and Ulfberg’s arguments also apply for Tardos’ function (which is intuitively likely, as their proof seems to be based on Razborov’s proof) then it is clear that that Blum’s current claim cannot be correct. Unfortunately, the author does not discuss this point.
Pingback: Mi opinión sobre la demostración P≠NP de Norbert Blum | Ciencia | La Ciencia de la Mula Francis
Pingback: New top story on Hacker News: On Norbert Blum’s claimed proof that P does not equal NP – The Internet Yard
The fact that the argument starts from a (standard) circuit, and not an oracle-aided one, does not necessary mean one cannot modify it to consider a circuit with oracle gates. Doesn’t this have to be shown explicitly? Is there a clear reason (or step)why one cannot just go through steps in the proof and replace every circuit with an oracle-aided one?
Both theorem 7 and theorem 8 of blum’s.paper have a constraint in them. E.g. s <= m^(2/3) in theorem 7. What is the interpretation of this constraint?
I'm somewhat confused as to whether the restrictions for approximation he imposes on CNF/DNF approximator in page 18 are valid, AND sufficiently optimal, in particular the choices for the constants, which he seemed to choose pretty arbitrarily, though could be based on analysis in the referred papers. I think for validity of the argument requires the choice to not restrict the approximator too much or it would only prevent expansion to non-monotone or P=NP. From how I read it, if this restriction is too strict, the approach cannot prove generality. If it's too lax, the approximator is not different from general CNF/DNF transformation and fails time limitation. And being in middle of those like in the paper triggers split to true fiber and false fiber (test inputs) in way that's against axiom of excluded middle and perhaps tarski's undefinability theorem. Which might be ok since this is limited to NP, and those are not but I'm not understanding here how those constants could be chosen. Why would the test inputs be sufficient to cover the expansion to non-monotone. I don't understand well the lower limit results from the referred papers.
Your review seems very useful. Otherwise I would be generally sympathetic to the question “where the heavy lifting” if a proposed proof were completely trivial. But this one is surely not completely trivial. So the question about “heavy lifting” seems to unjustifiably assume that the lifting should be “very heavy”. However, before one actually gets familiar with a proof, one doesn’t know “how heavy” the lifting should be. Blum’s paper (plus those he builds upon) may very well be “exactly appropriately heavy” as needed to prove that P isn’t NP, cannot it? 😉 P versus NP has often been associated with nearly religious content – its proof must also cause the Second Coming of Jesus Christ, among other things, but do you agree that there isn’t really any mathematical, logical, or even rational probabilistic evidence for this assumption?
Luca, there seems to be a bit of sourness in your tone. Why? Relax and and have some sweet grapes for lunch today!
Pingback: Comp sci world shock: Bonn boffin proposes P≠NP proof, preps for prestige, plump prize | Crawfordwise
Just for the sake of linking things across the Internet: there is an ongoing discussion about this on CSTheory.SE. https://cstheory.stackexchange.com/questions/38803/where-is-norbert-blums-2017-proof-that-p-ne-np-being-discussed/38823
@anon asks: “Is there a clear reason (or step)why one cannot just go through steps in the proof and replace every circuit with an oracle-aided one?”
Some of the fundamental steps in the paper make use of the circuit structure as a network of logical gates (AND or OR or NOT), and specify what to do in each case. To try to extend the proof to a circuit also containing oracle gates, you would have to add a new case to each of those steps about “what to do for an oracle gate”. It is not even clear what kind of new case could make sense there (given the rest of the proof), let alone a specific new case to use.
Thus the “burden of proof” certainly rests on anyone wanting to claim “relativization shows this proof can’t work (even though I’m not pointing to any incorrect step), because it could be applied to a circuit containing oracle gates”, to make that claim more detailed by showing *how* to extend the proof to oracle gates — not on the author, who can’t be expected to rule out every conceivable way of extending the proof to such circuits. We can expect the author to rule out the obvious ways, but he’s already implicitly done that (IMHO) because there are no obvious ways.
It’s important to understand that the “barriers” (natural proofs, relativization, algebrization) don’t have the power to “make an otherwise correct proof wrong”. If they applied to a correct proof, it would either mean that *their own* proofs are wrong (unlikely) or there is a contradiction in our axioms (far more unlikely). If they applied to a correct-*seeming* proof, the most likely reason would be that that proof is not *really* correct. But what they are meant to be (and are) used for is as a shortcut to checking the proof, or an entire class of possible proofs someone is working on — sometimes they allow you to rule out a proof strategy’s correctness without having to check the details, just because it fits a certain structure.
But they say nothing about proofs that don’t fit that structure. It can be subtle or nonobvious whether a proof can be seen as fitting it or not, but if a proof really doesn’t look like it fits that structure, the “burden of proof” in checking it against that barrier would be on someone wanting to show that it does fit it (in some nonobvious way).
I agree with Lubos Motl that the second coming of Jesus Christ is not a necessary condition for a correct proof that P is different from NP. I am keeping an open mind as to whether it is a sufficient condition.
Dammit. OK, I am spitting that small piece of flat bread out right now, then. Not worth it.
I think I found a bug.
Let DNF’ be the reduction operator for DNFs, defined in pages 27-28 as follows:
(i) remove trivial clauses (a clause is trivial if it contains both a variable and its negation);
(ii) in each non-trivial clause where there are both positive and negative variables, remove the negative variables.
The way I understand the paragraph in page 29 just before theorem 5, the following is claimed to hold by induction: Let D1,D2 be two DNFs. Then:
DNF'(DNF'(D1) AND DNF'(D2)) = DNF'(D1 AND D2).
But this is false, eg for D1=x1 AND ~x2, D2=x2.
Then, it’s easy then to construct an example of a network where the two definitions of computing DNF’ are not equivalent, say:
g1=x1, g2=~x2, g3=x2, g4=g1 AND g2, g5=g4 AND g3
It seems to me that the proof in section 6 heavily relies on the two definitions of DNF’ being equivalent.
Any thoughts?
@Shachar do you mind if I reproduce your question (with attribution) on the TCS.SE post above, to increase the chances of getting an answer?
@shachar the problem is that the definition of reduction is so vague and ill-defined that it isn’t possible to know just what is meant by it.
It is not obvious whether we could transform the decision version of Lovasz theta function into a monotone function. I guess we can hardly find the desired CNF/DNF approximator for Lovasz theta function to refute Theorum 6.
Ok, now I *really* need to make my existence interesting enough, so I can write an autobiography and say “I attended a course taught by that guy!!” at some point.
This said, I am really really excited to see where this thing is going to go – with or without the Second Coming (or First, for a big part of the CS community, I think).
Thanks a lot for your exposition, Luca!
@Yijie Tardos’ function does exactly that (transform the Lovasz function to a monotone function), look at her paper.
@Gustav thanks! Tardos’ transformation is not so obvious to me without looking at the paper 🙂 I think the obstacle to construct the approximator is that the original positive tests can tolerate larger perturbation on the output gate causing small total error, although the error of each gate can still be bounded as before.
This is utterly wrong, an error has been detected in Blum’s paper by the honorable von Neumann. He shall publish it soon.
Alfred?
Shachar: I think you’re right. (Definitions are cumbersome and unclear, so no certainties…)
Without negations, prime implicants are nice and the theorem on DNFs holds.
“Or, something goes wrong in the construction. Either way we will probably know soon.” this is a sort of passive attitude. an assumption or several assumptions. its disappointing. honestly you may be one of the biggest world experts on this subj and youre passing the buck. nothing personal! understand its hard to find errors in very large/ complex papers like this even by experts. who is a top world expert thats on top of all the related material? these people are rare, like those who understood relativity when it was first described by einstein. it can take quite awhile to pinpoint errors even by experts. and who knows if they will seriously engage with it. havent really seen any so far. (thx for your effort to write this blog!)
there was a somewhat similar attempt by Hauptmann over a year ago, hes at the same university, the experts generally had no comment online, there was no reaction. it looks like this proof in contrast has gone viral due to some different mix of elements. maybe experts like you commenting!
https://vzn1.wordpress.com/2016/06/20/hauptmann-p%e2%89%a0np-attack-via-polynomial-hierarchy-noncollapse/
❓ 💡 another thought/ conjecture: after many yrs of study in this area, it does seem to me sometimes like maybe there is some reformulation/ reduction of P vs NP to the CNF-DNF approximation problem in the context of monotone circuits. ie something related thats provably NP complete. would like to see a small succinct statement of one. maybe it reduces to counting occurrences of test inputs as in the proofs of this type. maybe its related to sunflower problems which are like measuring lower/ upper bounds on patterns that emerge in random boolean matrices/ “set systems”. also could it maybe reduce to some problems in graph complexity that have been proven equivalent to P vs NP? https://cstheory.stackexchange.com/questions/26017/statements-that-imply-mathbfp-neq-mathbfnp/26027#26027
have skimmed blums paper and think he has a very good grasp of the literature but think his current presentation/ organization could use a lot of improvement. think he is maybe correctly focusing in on the CNF-DNF approximation machinery in the prior proofs but not (yet?) clearly elucidating the core problem/ reduction. what is the nucleus/ crux of the proof so to speak?
If we try to apply the approximation method directly to a standard network, the negated variables tend to make the error at each gate unbounded and the total error at the output gate not large enough. In fact, the positive/negative tests induce a distance measure at each gate. For the approximation method to work, we do not require such measure to be calculated w.r.t the original expression at each gate. By mapping to the monotone expression (simply ignoring negated variables), we are now working on a measure that offers similar bound condition at each gate as in monotone networks. Luckily, such new measure coincides with the original one at the output gate since the function is monotone, so we can calculate the total error as before. Therefore we can apply the original approximator for monotone network to the standard network under new measure. Similar idea was already presented in Blum’s 2009 paper On Negations in Boolean Networks, which argues that Razborov’s no-better-than-quadratic-lower-bound proof can be overcome by using a weaker measure.
Pingback: On the Edge of Eclipses and P=NP | Gödel's Lost Letter and P=NP
Isn’t it is also becoming of non-trivial significance that no white smoke has yet emanated from the (suspiciously!) serendipitously-timed gathering in Oberwolfach? It’s nearly the end of the week…
Pingback: norbert blum P vs NP attack goes CYBER-SCIENCE-VIRAL! | Turing Machine
is this related to [link]https://en.wikipedia.org/wiki/Ellipsoid_method[\link]
or to hhttps://en.wikipedia.org/wiki/Shannon_capacity_of_a_graph
Sorry, messed up the link. https://en.wikipedia.org/wiki/Shannon_capacity_of_a_graph
Norbert Blum says:
“Since the computation of an onetape Turing machine can be simulated by a non-monotone Boolean network of size at most the square of the number of steps [15, Ch. 3.9], a superpolynomial lower bound for the non-monotone network complexity of such a function would imply P≠NP”
[…]
“But until now, no one could prove a non-linear lower bound for the nonmonotone complexity of any Boolean function in NP. An obvious attempt to get a super-polynomial lower bound for the non-monotone complexity of the clique function could be the extension of the method which has led to the proof of an exponential lower bound of its monotone complexity. This is the so-called “method of approximation” developed by Razborov”
I suspect that this logic reasoning is worng.
The goal of Razborov´s method of approximation is to show that CC cannot compute ff, where ff is a function and CC any circuit of mm gates.
Does it mean that if CC cannot compute ff, the problem cannot be computed by CC?
Note that some problem could be resolved with some function that perhaps we’ll never know.
I´m not sure if I am misunderstanding the PvsNP problem. I thought that the PvsNP conjecture referes to problems not to functions.
In my work, when I was young, the computers runs at 133Mhz or less, so the efficiency of the software was very important to solve a problem and sometimes I program an algoritm that solve a problem fully different to other one that probably would be still processing the problem.
I think that each algoritm is a different function (and a different TM) and if some NP-hard or complete problem is resolved deterministacally by an algorithm in a polinomic time-cost, it implies P=NP. If I´m not misundestandig that point, until not is proved that a function ff is the best one possible to solve a problem,this ff is not enough to conclude that P≠NP.
Obviously if Blum (or any one) knows the best possible ff to solve a problem in NP-hard, we don´t need Razborov´s method to prove it, it is trivial.
What I´m missunderstanding?
Pingback: Current Discussion about Norbert Blum’s Paper “A Solution of the P versus NP Problem” – Mathematics, Cryptology, Computing …
Dear Prof Blum,
First of all, you MUST know that a correct proof of P!=NP doesn’t imply P!=NP; yes, it doesn’t imply itself. Why? Because you can still *prove* the conjecture:”P=NP”. Indeed, I strongly urge you to try to prove both conjectures:”P=NP” & “P!=NP” and present the community with two different correct proofs where none of them can disprove the other.
Dear Prof Blum,
You said:
“I´m not sure if I am misunderstanding the PvsNP problem. I thought that the PvsNP conjecture referes to problems not to functions.”
One should be precise here, P vs NP is about both problems and functions. The elements of the classes P and NP are decision problems. However, this is informally. Formally, these elements are languages which may be accepted/not accepted by a TM.
Dear Prof Blum,
You said:
“I think that each algoritm is a different function (and a different TM)”
Restated precisely: “The TM formalizes the informal notion of an algorithm for the purpose of: “Definition of the notion of a computable function”.
With my respect to you and all colleagues, particularly, the blog owner, Prof Luca Trevisan:
Rafee Kamouna.
@kamouna Could you please stop trolling here? The complexity of this discussion does not allow my brain to filter out your incoherent babbling in parallel.
The formula don’t display correctly on my end, I see them as centered images without text around them (they display as display:block)
scratch that, it works now : |
Pingback: What exactly went wrong in Blum’s proof that N is unequal NP? – Mathematics, Cryptology, Computing …
Pingback: Shtetl-Optimized » Blog Archive » HTTPS / Kurtz / eclipse / Charlottesville / Blum / P vs. NP
Pingback: P<NP | Gödel's Lost Letter and P=NP