The hard science of Sociology

On Monday, the first talk I managed to attend was by Terry Sejnovski, of the Salk Institute, and I only got there towards the end of the talk. Starting from the question of how the brain manages to “do” vision, he talked about fascinating experiments that, among other things, test the rationality of conscious versus unconscious decisions (apparently, in complex decision-making involving lots of variables, “gut feelings” are better than conscious reasoning) and measure the surprisingly good “computational” ability of monkeys.

In the afternoon, Valiant delivered his talk on “accidental algorithms” to a standing-room-only packed audience. His main results are (1) a $\oplus P$-completeness result for $\oplus$-2SAT, even when restricted to planar instances, and (2) a polynomial time algorithm for $mod_7$-2SAT on planar instances. $\oplus P$ is the complexity class of problems that reduce to the question of whether a given non-deterministic Turing machine, on a given input, has an odd number of accepting computations. By the Valiant-Vazirani result, a $\oplus P$-complete problem is also NP-hard under randomized reductions, and, by Toda’s theorem, is actually hard for all the levels of the polynomial hierarchy. The $\oplus$-2SAT problem is to determine whether a given 2SAT instance has an odd number of satisfying assignments. Apparently, this is the first natural $\oplus P$-complete problem for which there is a polynomial time algorithm that checks if the number of solutions is zero or non-zero. The $mod_7$-2SAT problem is, given a 2SAT formula, to determine if the number of assignment is or is not a multiple of 7. Valiant’s algorithm is based on his earlier work on matchgate computations.

Afterwards, most people moved to Uri Feige’s talk in the next room on finding witnesses of unsatisfiability for random 3CNF formulas. Previously, it was known how to find, with high probability, witnesses of unstatisfiability for a given random 3CNF formula with $n$ variables and about $n^{1.5}$ clauses. This new work by Feige, Kim and Ofek shows that witnesses of unsatisfiability exist for most random 3CNF formulas with $n$ variables and about $n^{1.4}$ clauses. (They do not give an efficient algorithm to find such witnesses.)

Nicholas Harvey gave the last talk of the day on his work on reducing the matching problem to matrix multiplication. Previous work had shown that matching can be solved in the same time as matrix multiplication, but the previous reduction was quite complex for general graphs. (And simpler for bipartite graphs.) Harvey gives a considerably more transparent new algorithm that reduces matching in general graphs to matrix multiplication. His talk was very clear and enjoyable, and I am grateful to Moses Charikar for recommending it. (Because of the word “matroid” in the title, I was going to skip to the talk.)

On Tuesday, Jon Kleinberg gave a spectacular invited talk on the kind of radically new results on sociology that one can extract from new datasets (resulting from people using large online retailers, joining online communities such as livejournal and myspace, and so on) and from algorithmic thinking. I did not take notes, and I could not find the slides online, so I can only offer a few snippets that I remember. In one of Jon’s most celebrated results, he studied an idealized model of “social networks” made of a mix of “local” connections and of random “long-distance” connections, and he showed that, in the resulting network, people would be able to find short paths among themselves only if the long-distance connections have a probability that is proportional the inverse of the square of the distance between two people. In the model, the local connections are laid out according to a grid, but the same result holds in a more general model where the local connections define a metric space, and the random connections are such that the probability that $i$ and $j$ are connected is proportional to the inverse of the number of people that are closer to $i$ than $j$ is. A study made on livejournal data shows that the “friends” that people make online are distributed in a way that is in spooky agreement with the above distribution.

Why do people make friends with precisely the only distribution that makes it easy to find short paths between themselves? The answer is unclear, but the outstanding thing is that a “law of human nature” seems to have been discovered, which could have never been formulated without thinking in terms of algorithms. Presumably, any explanation of this phenomenon will have to involve an algorithmic model of how people make connections online.

Another interesting question is the effect of “peer pressure” or community involvment. We are more likely to do something if our friends also do it (forgetting for a moment the question of what is the cause and what is the effect, that is whether we make friends with like-minded people, or whether we are influenced by what others do), but what exactly is the dependency between how many friends do something and how likely we are to do the same? Is there a “diminishing returns” curve, the way we are impressed by the first few friends who buy an ipod, and hence more and more likely to buy one ourselves, and then less and less impressed? Or is it a “critical mass” curve, the way skype is useless unless a lot of friends of ours are using it? There has been amusing work studying this question by looking at communities on livejournal and conferences on dblp. One can see how likely is a person to join a livejournal community as a function of the number of friends who are already members of the community, or see how likely is a person to publish for the first time in a conference as a function of former coauthors who have already published in the conference. The two curves are reasonably similar, and both are of “diminishing returns” type.

So one is more likely to, say, start publishing in a conference the more people one knows who are part of the community of that conference. Now, we can ask, given that one knows the same number of people in the community, does it make it more or less likely to start publishing there if the people know each other (say, have published together in the past), or if they do not know each other? To see the same question in other settings, if we hear a rumor, we are more likely to believe it if we hear it from more people. Furthermore, all other things being equal, we are more likely to believe it if we hear it from unrelated people, because it gives some kind of independent confirmation. If we are deciding whether to go to a party, we are also more likely to go the more people we know who are also going. In this case, however, all other things being equal, we are more likely to go if the people that we know who are going also know each other.

In the case of conferences, the former scenario is right. One is more likely to publish in a conference if the former coauthors who are part of the community are unrelated. Similarly, less “densely clustered” communities are the ones that are more likely to grow. There is a cause-and-effect problem here, in the sense that a community where everybody knows everybody else might be a mature and declining one, and being unappealing for such a reason; another explanation is that densely clustered community appear cliquish to outsiders, and are unappealing for such a reason.

There was more and more, in the talk. I have often heard that when Biology will become a “hard science” with a strong mathematical foundation, the mathematics will look familiar to those working in electrical engineering and in algorithms, and that theoretical computer science will be a major contributor to this future theoretical biology. What I did not know is that Sociology itself can become a hard science, and that theoretical computer science can play a major role in this transformation.

Congratulations, Jon!

There is a long line to enter the conference center this morning. Only at the end, I realize that invited speakers (but of course!) can skip the line and stride in. The line is because of security checks. Everybody passes through a metal detector, and bags are checked. So much security for a math conference? (Later we understand why.)

We start with a string trio (guitar, violin and cello) playing live. The sound of the violin is butchered by the sound system. Then the VIPs come on stage. Chairing the award cerimony is His Majesty Juan Carlos I, King of Spain. This is not going to be like a STOC business meeting. The King is accompanied by a person in a white military uniform with lots of stripes on his shoulders. The King sits in the middle of a table on stage, flanked by congress organizers and Spanish politicians. (The minister of Research and the mayor of Madrid.) The guy with the white uniform takes a seat behind the King. Then, one by one, everybody gets up and gives a speech.

John Ball starts his speech by explaining how mathematicians talk about their work freely, without fear that their work will be stolen, and how work is appreciated solely based on its merits, not on the way it is promoted. This is how the vast majority of mathematicians work, he continues, and exceptions are rare, and noted. It might be a reference to some of the recent events around the proof of the Poincare conjecture.

The mayor of Madrid starts what seems like a canned speech that he always gives at scientific/ technological events. Towards the end, however, he talks quite eloquently about the virtues of mathematics, its ability to make sense of a complex world, its commitment to truth, its trascendence of religion, race, and so on. Finally, he says, it is not just mathematicians that have to make an effort to come closer to the everyman and to explain the practical applications of mathematics. It is everybody’s civic duty to learn more about math and science. (He said it better.)

Griffiths starts by saying “One of the main activities of the IMU [the International Mathematical Union] in the last few years has been the selection of a new logo.” Hilarity ensues in the audience. (He seemingly did not mean it as a joke.) A documentary on the new logo is played.

More and more people talk, and at long last we come to the award of the Fields Medals. As expected, Terry Tao and Grisha Perelman receive the award, plus two other mathematicians that work in areas that I am not familiar with.

Perelman, is announced, has declined the Fields Medal. Someone from the back claps.

The Nevanlinna prize goes to Jon Kleinberg. This is definitely the computer science award with the most distinguished record, and I want to congratulate the committee for, once more, making an excellent choice. Congratulations to Jon too, of course.