“

Young man, in mathematics you don’t understand things. You just get used to them.” John von Neumann.

Continuing the discussion on what does it mean to understand a proof, and following up on the discussion on a recent post by Gowers, I would like to bring up the question of what does it mean for a mathematical proof to be *difficult*.

First of all, I think it is important, again, to distinguish between the perspective of the person *conceiving* the proof and the person *reading* (or trying to *understand*) the proof.

**Proofs that are hard to read**

From the perspective of the reader, a proof may seem hard (or at least *harder than it needs to be*) simply because of the style of exposition, as already discussed.

Even some flawlessly presented papers, however, can be hard to follow, because of the complexity of the technical machinery used by the author. In many areas that have developed very sophisticated tools (PCP constructions, for example), every paper might look very complicated to the non-expert, because of the need to “interface” with the very complicated context, or the need to re-prove “standard” results because they were never stated exactly in the way that is needed and so on. To the expert, however, many such paper look relatively simple, in the sense that given the basic ideas, which can often be summarized in a few sentences, it is “routine” to reconstruct the rest of the paper. (More later on what this means for the value of such papers.)

Finally, there are well-presented papers which baffle the experts (sometimes, this includes the authors). I am thinking, for example, of the Khot-Vishnoi paper on sparsest cut. Those are papers that introduce new techniques that are not yet fully understood. (Meaning, for example, that the full extent of their applicability and generality is not clear yet.) They contain the genuinely difficult proofs, and they are the ones that are most useful to study. Note that, here, I am considering difficulty as a transient, not an intrinsic property: at some point the techniques become used over and over, a theory is built around them, we know how to make use of them, and they become “understood,” or “simple” to the experts. Needless to say, this process of clarification and simplification is extremely useful, and should be well regarded and rewarded, as it usually is. (Most of my research is of this type, and I cannot complain for the way it is received.)

**Proofs that are hard to conceive**

What about difficulty from the perspective of authors? Clearly, papers like the ones I just described are as difficult (or more) to come up with than to understand for the experts. One is devising new techniques and exploiting them on the fly, and it is a miracle when it all works out.

What about papers that seem difficult to non-experts and are easy (when properly presented) to experts? Here it depends. Sometime, given the right tools, the result is just routine application of the technical machinery, no harder to conceive than to verify. Sometimes, however, even if the main idea of the paper can be expressed in one sentence, that one idea can be the result of endless attempts, blind alleys, complicated first attempts, later simplifications, and so on. Of course, papers of the first type are nearly (but not completely) useless, and papers of the second type are very good.

**The value of technical work**

There was recently a discussion in the theory community about the value of “conceptual” papers, which, perhaps unpredictably, turned in part into a discussion on the value of “technical” papers. At the time, I completely agreed (as I still do) with statements such as “all other things being equal, a simpler proof is better than a harder one,” but I had a problem with some of the conclusions that were being drawn.

Leaving aside the issue of papers whose main contribution is a new definition or a new model (which are, in fact, the “conceptual” papers that the original statement dealt with), when we look at a paper proving a new result and introducing a new idea, then, all other things being equal, we want the authors the explain their idea and distill their proof in the simplest possible way. However, another point is also true: that given two well presented papers, both simplified as much as they can be, the more complicated one is the one that uses the more complex technical tools, and a new ideas about complex machinery is more valuable than a new ideas about simpler machinery, because the former makes powerful tools even more powerful. And, as argued above, the technical papers which are going to be more useful (by stimulating the construction of new theoretical thinking around their tools) are the *mystifying* ones.

In conclusion:

- writing papers just because you can, employing difficult techniques in routine ways: BAD
(but even such papers may be somewhat useful, I may return to this point in a later post.)

- presenting a needlessly complicated version of a potentially simple argument: VERY BAD
- having a new idea on how to use difficult techniques, and explaining it as transparently as possible: GOOD
(but such papers would still be hard to read for the non-experts)

- making breakthroughs by conceiving new ways of doing things: VERY GOOD
- finding the “right” way to understand a previously mystifying argument: GOOD

(And, of course, coming up with a new definition or model that extends the reach of theoretical work: VERY EXCELLENT; but this was not the subject of this post.)

## 12 comments

Comments feed for this article

January 8, 2009 at 5:15 pm

Ari RabkinI can’t help thinking of that Gilbert and Sullivan line:

And ev’ry one will say,

As you walk your mystic way,

“If this young man expresses himself in terms too deep for me,

Why, what a very singularly deep young man

this deep young man must be!”

January 9, 2009 at 9:25 am

randomwalkerI dug up an old post of mine on this topic that might be somewhat interesting..

January 9, 2009 at 9:15 pm

semi“However, another point is also true: that given two well presented papers, both simplified as much as they can be, the more complicated one is the one that uses the more complex technical tools, and a new ideas about complex machinery is more valuable than a new ideas about simpler machinery, because the former makes powerful tools even more powerful.”

Perhaps this is why complexity gets more coverage at STOC/FOCS. No one wants a complicated algorithm, because it is simply not useful. So it is not true for all of TCS. But since people think it’s true for everything, they end up being biased towards certain areas.

January 10, 2009 at 1:19 am

Gil KalaiThis is an interesting post; but I the conclusions are unclear and to the extent I understand them I disagree.

“In conclusion:

writing papers just because you can, employing difficult techniques in routine ways: BAD

(but even such papers may be somewhat useful, I may return to this point in a later post.)”

I think in some sense we all write papers “just because we can.” its actually important to be able to employ routinely difficult techniques; it tends to make them over time less difficult. (Still, maybe this statement refers to something I can agree with, but I dont know what.)

“presenting a needlessly complicated version of a potentially simple argument: VERY BAD”

What do you mean by “needlessly”? The important value of “prove the damn conjecture at all costs” sometimes lead to proofs which are needlessly complicated. I do not think it is a common practice to present proofs in a needlessly difficult ways. (Sometimes people try to extend the scope of the theorem on the expense of making the proofs more difficult.)

Maybe people should be given more incentives to simplify, but once you finally prove what you want to prove it is often difficult to see how to simplify.

“making breakthroughs by conceiving new ways of doing things: VERY GOOD ”

I think it is hard to disagree that breakthroughs are very good. But don’t we give too much weight to “conceiving new ways of doing things” per se?

“(And, of course, coming up with a new definition or model that extends the reach of theoretical work: VERY EXCELLENT; but this was not the subject of this post.)”

So is this better than all of the above?

January 10, 2009 at 1:27 am

About Mathematics « Combinatorics and more[...] a link to an interesting post regarding proofs and understanding from “in theory”; And another one; And another one by Arvind Narayanan. [...]

January 10, 2009 at 1:36 am

Gil“all other things being equal, a simpler proof is better than a harder one,”

It is very difficult to disagree with this statement like with

“all other things being equal I will buy the cheaper TV”

The problem is that one way to evaluate the theorem is through the difficulty of the proof (and similarly the price of the TV gives us some signal about its quality.) So in some sense it is never the case that “all other things being equal”.

Also, in practice, when people and their proofs are evaluated I am not sure that the rule “the simpler the better” really applies.

January 10, 2009 at 2:11 am

lucaDear Gil, thanks for your comments, which make me realize that I haven’t expressed myself very clearly.

I completely agree with what you say that “it’s actually important to be able to employ routinely difficult techniques; it tends to make them over time less difficult” and this is in fact the subject of a planned future post. Here, responding in part to the discussion that followed the “statement on conceptual contributions”, I wanted to concede, for the sake of argument, the points that routine technical work is not the most useful, and that entirely new conceptual work is very valid, and I wanted to focus on what are the kinds of technical work that I consider most useful.

The remark on “needlessly complicated” refers to the hypothetical scenario in which the authors know of a simple (at least at an informal level) way of explaining their argument, or at least could be reasonably expected to come up with such a simple explanation, but fail to discuss said simple explanation and instead just report the technical argument. This scenario has sometimes been used as a strawman argument against difficult papers.

As for Gil’s second comment, I think that the point is that “all other things are never equal,” and, in particular, results in certain areas tend to come with arguments that are more complex, which should not be taken as a negative (in fact, I try to argue that it is typically a positive). I want, however, to acknowledge the validity of the appreciation for simplicity which was one of the starting points of the statements on conceptual contributions.

January 10, 2009 at 3:18 am

Or Meir“papers whose main contribution is a new definition or a new model (which are, in fact, the “conceptual” papers that the original statement dealt with)”

I do not think that this is the only kind of papers that the original statement dealt with, and I think that the common view of “conceptual papers” as “papers whose main contribution is a new definition or a new model” caused the whole discussion on “conceptual versus technical” to go to the wrong direction. Citing the statement:

“by “conceptual” we mean the aspects that can be communicated succinctly, with a minimum amount of technical notation, and yet their content reshapes our view/understanding.”

“conceptual aspect “along the way” may be an innovative way of modeling, looking at, or manipulating a known object or problem, including establishing a new connection between known objects/problems.”

Note that according to the above lines, conceptual papers may also be the papers that perform the process you described of “clarification and simplification”. In particular, your paper on extractors and pseudorandom generators is a purely conceptual paper according to the above lines of the statement, even though its main emphasis is not on giving a new definition or a new model.

In other words, I think that the “conceptual papers” that the original statement referred to is all the papers that you described as the papers that “look relatively simple, in the sense that given the basic ideas, which can often be summarized in a few sentences, it is “routine” to reconstruct the rest of the paper”.

January 11, 2009 at 2:47 pm

eigo“However, another point is also true: that given two well presented papers, both simplified as much as they can be, the more complicated one is the one that uses the more complex technical tools, and a new ideas about complex machinery is more valuable than a new ideas about simpler machinery, because the former makes powerful tools even more powerful.”

Are you trying to say that your recent work on the smallest eigenvalue and the Max-Cut is more valuable than GW’s work? Your paper is certainly more complicated, but it would be pretty arrogant (and likely incorrect) to assume that your work is more valuable.

January 12, 2009 at 1:41 am

GilWhat a terrible comment, eigo, to end such a nice post and thread of comments.

It is interesting to discuss the general question of judging and comparing the values of contributions. Probably there are many criteria and points of views, and a lot of genuine uncertainty.

January 14, 2009 at 4:29 am

Boaz BarakI am a bit doubtful that an abstract discussion on the value of papers is useful. There are three separate questions: is the paper good for you to READ, is it good to PUBLICIZE for the community as a whole, and is it a GOOD paper in itself. (The latter two questions are not necessarily the same thing – for example there could be papers that contain very little original research contribution but elucidate highly complex previous work so well that one would want to accept them to a conference.) All these questions should be considered on a case-by-case basis, without following some rigid rules or manifestos. (Of course, unless you are a PC member or reading/writing a reference letter, there’s generally no reason for you to waste time thinking on the latter two questions.)

I often recommend to students that they read highly complicated papers (as long as they are also good and important complicated papers of course). I think reading these can be especially beneficial to students because:

1) Obviously, your technical skills improve more as a result of reading a complicated papers than a simple one.

2) Complexity serves as a “barrier to entry”. Many people won’t have time to deeply read a complex paper, even if it’s very important. This includes many experts who may feel they get the general idea, and put reading the paper on their neverending “todo list”… This means that if you do make the effort and read the paper, you are pretty uniquely well-positioned to make further contributions.

3) Often a paper is complicated because the author threw “the kitchen sink” at the problem, using any tool he/she could think of. So, in one paper you get to see all the tricks that they accumulated in many years of work.

4) Another reason a paper is complicated is because the authors themselves still don’t understand the problem very well. This is a sign that there’s still room for new improvements and discoveries in this area. Someone coming with a fresh mind could perhaps see things the authors have missed.

Of course you only get these benefits if you read the paper deeply, thinking hard on how to recast the proof in your own language, whether each trick or complication is necessary, and whether or not a completely different approach should be taken. It’s not always easy to decide, whether a paper is worth such an investment, but in any case time spent reading a paper is always better than time spent reading (or commenting) in a blog… :)

–Boaz

January 14, 2009 at 1:50 pm

lucaeigo, way to completely subvert what I said, which is that if we were to look at two papers proving the same results, then the one with the simpler proof is usually superior (but both are valuable: it is always good to do things in different ways, because different proofs can be generalized in different ways, and trying to understand a common generalization of two seemingly incomparable approaches can be vary useful).

But we don’t typically compare papers proving the same result in different ways: we look at papers that are making advances on different problems by introducing new ideas and applying them to known machinery. In this case, of two similarly innovative ideas, the one that innovates the more powerful machinery is usually the one that is more useful. So (the point was that) whereas

for a specific problemthe best paper is usually the simplest paper, this would be a bad rule of thumb to apply across the board. And one should not be surprised if many of the papers in STOC/FOCS are highly technical. (In the case of algorithmic work, this applies mostly to theanalysisof the algorithm.)Also, I wholeheartedly agree with Boaz’s last paragraph.