Over the past four decades, Avi Wigderson, figuratively, wrote the book on theoretical computer science. Now he has literally done so. I can’t wait for the movie adaptation.
Böhm was one of the founding fathers of Italian computer science. His dissertation, from 1951, was one of the first (maybe the first? I don’t know the history of these ideas very well) examples of a programming language with a compiler written in the language itself. In the 1950s and 1960s he worked at the CNR (an Italian national research institution with its own technical staff), in the IAC (Institute for the Applications of Computing) directed by mathematician Mauro Picone. IAC was the second place in Italy to acquire a computer. In 1970 he moved to the University of Turin, where he was the founding chairman of the computer science department. In 1972 he moved to the Sapienza University of Rome, in the Math department, and in 1989 he was one of the founders of the Computer Science department at Sapienza. He remained at Sapienza until his retirement.
Böhm became internationally known for a 1966 result, joint with Giuseppe Jacopini, in which he showed, roughly speaking, that programs written in a language that includes goto statements (formalized as flow-charts) could be mapped to equivalent programs that don’t. The point of the paper was that the translation was “structural” and the translated program retained much of the structure and the logic of the original program, meaning that programmers could give up goto statements without having to fundamentally change the way they think.
Dijkstra’s famous “Go To Statement Considered Harmful” 1968 letter to CACM had two references, one of which was the Jacopini-Böhm theorem.
Böhm was responsible for important foundational work on lambda calculus, typed functional languages, and the theory of programming languages at large.
He was a remarkable mentor, many of whose students and collaborators (including a notable number of women) became prominent in the Italian community of theory of programming languages, and Italian academia in general.
In the photo above is Böhm with Simona Ronchi, Betti Venneri and Mariangiola Dezani, who all became prominent Italian professors.
You may also recognize the man on the right as a recent recipient of the Turing Award. Silvio Micali went to Sapienza to study math as an undergrad, and he worked with Böhm, who encouraged Silvio to pursue his PhD abroad.
I studied Computer Science at Sapienza, starting the first year that the major was introduced in 1989. I remember that when I first met Böhm he reminded me of Doc Brown from Back to the Future: a tall man with crazy white hair, speaking of wild ideas with incomprehensible technical terms, but with unstoppable enthusiasm.
One year, I tried attending a small elective class that he was teaching. My, probably imprecise, recollection of the first lecture is as follows.
He said that one vertex is a binary tree, and that if you connect two binary trees to a new root you also get a binary tree, then he asked us, how would you prove statements on binary trees by induction? The class stopped until we would say something. After some consultation among us, one of the smart kids proposed “by induction on the number of vertices?” Yes, said Böhm, that would work, but isn’t there a better way? He wanted us to come up by ourselves with the insight that, since binary trees have a recursive definition, one can do induction on the structure of the definition.
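A minimal sketch of what induction on the structure of the definition looks like in code, assuming the two-case definition above. The names `Leaf`, `Node`, `count_leaves` and `count_internal` are illustrative and not from the class. Each function has one branch per case of the definition, and a proof about it follows the same two branches.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    """Base case: a single vertex is a binary tree."""


@dataclass(frozen=True)
class Node:
    """Inductive case: two binary trees joined under a new root."""
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def count_leaves(t: Tree) -> int:
    # One branch per case of the recursive definition.
    if isinstance(t, Leaf):
        return 1
    return count_leaves(t.left) + count_leaves(t.right)


def count_internal(t: Tree) -> int:
    if isinstance(t, Leaf):
        return 0
    return 1 + count_internal(t.left) + count_internal(t.right)


# Claim: count_leaves(t) == count_internal(t) + 1 for every tree t.
# Base case (Leaf): 1 == 0 + 1.
# Inductive case (Node): if the claim holds for left and right, then
#   leaves = (i_left + 1) + (i_right + 1) = (1 + i_left + i_right) + 1.
example = Node(Node(Leaf(), Leaf()), Leaf())
```

The proof in the comments never mentions the number of vertices: it only uses the two ways in which a tree can be built.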
In subsequent lectures, we looked (without being told) at how to construct purely functional data structures. I dropped the class after about a month.
In which we go over a more powerful (but difficult to compute) alternative to the spectral norm, and discuss how to approximate it.
Today we’ll discuss a solution to the issue of high-degree vertices distorting spectral norms, which will prepare us for next lecture’s discussion on community detection in the stochastic block model using SDP. We’ll discuss a new kind of norm, the infinity-to-one norm, and find an efficient way to approximate it using SDP.
Scribed by Chinmay Nirkhe
In which we explore the Stochastic Block Model.
1. The problem
The Stochastic Block Model is a generic model for graphs generated by some parameters. The simplest model, and the one we will consider today, is the $G_{n,p,q}$ problem.
Definition 1 ($G_{n,p,q}$ graph distribution) The $G_{n,p,q}$ distribution is a distribution on graphs of $n$ vertices where $V$ is partitioned into two subsets of equal size: $V = V_1 \sqcup V_2$. Then for each pair $\{i,j\}$ of vertices in the same subset, $\Pr[\{i,j\} \in E] = p$, and otherwise $\Pr[\{i,j\} \in E] = q$.
We will only consider the regime under which $p > q$. If we want to find the partition $V = V_1 \sqcup V_2$, it is intuitive to look at the problem of finding the minimum balanced cut. The cut $(V_1, V_2)$ has expected size $q n^2/4$ and any other cut will have greater expected size.
Our intuition should be that as , the problem only gets harder. And for fixed ratio , as , the problem only gets easier. This can be stated rigorously as follows: If we can solve the problem for then we can also solve it for where , by keeping only edges and reducing to the case we can solve.
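To make the model concrete, here is a minimal sampler sketch. It assumes the model is parameterized by an intra-block edge probability `p` and an inter-block edge probability `q`, and that the planted partition is the first half of the vertices versus the second half. The function names and this choice of partition are assumptions made for illustration.

```python
import random


def sample_sbm(n, p, q, seed=None):
    """Sample a graph on vertices 0..n-1 with two planted blocks of size n/2.

    Pairs inside a block become edges with probability p, pairs across
    blocks with probability q (assumed parameter names).
    Returns (block, edges): block[v] is 0 or 1, edges is a set of pairs (u, v) with u < v.
    """
    if n % 2 != 0:
        raise ValueError("n must be even so that the blocks have equal size")
    rng = random.Random(seed)
    half = n // 2
    block = [0 if v < half else 1 for v in range(n)]
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            prob = p if block[u] == block[v] else q
            if rng.random() < prob:
                edges.add((u, v))
    return block, edges


def cut_size(edges, block):
    """Number of edges crossing the partition described by block."""
    return sum(1 for u, v in edges if block[u] != block[v])
```

For example, `cut_size(*reversed(sample_sbm(1000, 0.1, 0.02, seed=0)))` counts the edges crossing the planted cut, which has (n/2)^2 candidate pairs, each present with probability `q`.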
Recall that for the -planted clique problem, we found the eigenvector corresponding to the largest eigenvalue of . We then defined as the vertices with the largest values of and cleaned up a little to get our guess for the planted clique.
In the Stochastic Block Model we are going to follow a similar approach, but we are instead going to find the largest eigenvalue of . Note this is intuitive as the average degree of the graph is . The idea is simple: Compute the eigenvector corresponding to the largest eigenvalue and define
Scribed by Luowen Qian
In which we use spectral techniques to find certificates of unsatisfiability for random -SAT formulas.
Given a random -SAT formula with clauses and variables, we want to find a certificate of unsatisfiability of such formula within polynomial time. Here we consider as fixed, usually equal to 3 or 4. For fixed , the more clauses you have, the more constraints you have, so it becomes easier to show that these constraints are inconsistent. For example, for 3-SAT,
- In the previous lecture, we showed that if for some large constant , almost surely the formula is not satisfiable. But it’s conjectured that there are no polynomial time, or even subexponential time, algorithms that can find a certificate of unsatisfiability for .
- If for some other constant , we showed last time that, with high probability, we can find in polynomial time a certificate that the formula is not satisfiable.
The algorithm for finding such a certificate is shown below.
- Algorithm 3SAT-refute()
- if 2SAT-satisfiable( restricted to clauses that contain , with )
- if 2SAT-satisfiable( restricted to clauses that contain , with )
- return UNSATISFIABLE
We know that we can solve 2-SAT instances in linear time, and approximately
clauses contain . Similarly when is sufficiently large, the 2-SAT instances will almost surely be unsatisfiable. When a subset of the clauses is not satisfiable, the whole 3-SAT formula is not satisfiable. Therefore we can certify unsatisfiability for random 3-SAT formulas with high probability.
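The refutation idea can be sketched as follows. The encoding is an assumption: literals are nonzero integers (`i` for a variable, `-i` for its negation) and the branching variable is variable 1. The 2-SAT check here uses plain reachability in the implication graph, which is simpler but slower than the linear-time algorithm mentioned above. The function names are illustrative.

```python
from collections import defaultdict


def two_sat_satisfiable(clauses):
    """Decide satisfiability of a 2-CNF given as pairs of nonzero integer literals."""
    graph = defaultdict(set)
    variables = set()
    for a, b in clauses:
        graph[-a].add(b)  # (a or b) means: not a implies b
        graph[-b].add(a)  # and: not b implies a
        variables.update((abs(a), abs(b)))

    def reaches(src, dst):
        stack, seen = [src], {src}
        while stack:
            u = stack.pop()
            if u == dst:
                return True
            for w in graph[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return False

    # Unsatisfiable iff some x implies not x and not x implies x.
    return not any(reaches(x, -x) and reaches(-x, x) for x in variables)


def refute_3sat(clauses, var=1):
    """Return "UNSATISFIABLE" if both restrictions of `var` give an unsatisfiable
    2-SAT sub-formula, and None if no certificate is found this way."""
    for value in (False, True):
        falsified = -var if value else var  # the literal made false by var := value
        residual = []
        for clause in clauses:
            if falsified in clause and -falsified not in clause:
                rest = tuple(lit for lit in clause if lit != falsified)
                if len(rest) == 2:
                    residual.append(rest)
        # Dropping clauses is sound: a subset being unsatisfiable is still a certificate.
        if two_sat_satisfiable(residual):
            return None
    return "UNSATISFIABLE"
```

A return value of `None` only means that this particular test failed to find a certificate. It does not mean that the formula is satisfiable.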
In general for -SAT,
- If for some large constant , almost surely the formula is not satisfiable.
- If for some other constant , we can construct a very similar algorithm, in which we check all assignments to the first variables, and see if the 2SAT part of the restricted formula is unsatisfiable.
Since for every fixed assignment to the first variables, approximately
portion of the clauses remains, we expect the constant and the running time is .
So what about ‘s that are in between? It turns out that we can do better with spectral techniques. The reason spectral techniques work better is that, unlike the previous method, they do not have to try all possible assignments to a subset of the variables in order to find a certificate of unsatisfiability.
2. Reduce certifying unsatisfiability for k-SAT to finding largest independent set
2.1. From 3-SAT instances to hypergraphs
Given a random 3-SAT formula , which is an and of random 3-CNF-SAT clauses over variables (abbreviated as vector ), i.e.
where , and no two are exactly the same. Construct hypergraph , where
is a set of vertices, where each vertex means an assignment to a variable, and
is a set of 3-hyperedges. The reason we’re putting in the negation of is that a 3-CNF clause evaluates to false if and only if all three subclauses evaluate to false. This will be useful shortly.
First let’s generalize the notion of independent set for hypergraphs.
An independent set for hypergraph is a set that satisfies .
If is satisfiable, has an independent set of size at least . Equivalently if the largest independent set of has size less than , is unsatisfiable. Proof: Assume is satisfiable, let be a satisfying assignment, where . Then is an independent set of size . If not, it means some hyperedge , so and the -th clause in evaluates to false. Therefore evaluates to false, which contradicts the fact that is a satisfying assignment.
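The correspondence in the proof can be checked on small examples with the sketch below. The encoding is an assumption: a vertex is a pair `(variable, value)`, a literal is a nonzero integer, and the hyperedge of a clause collects the three assignments that falsify its literals. All names are illustrative.

```python
def clause_to_hyperedge(clause):
    """Map a clause to the set of (variable, value) assignments that falsify it.

    A positive literal i is falsified by (i, False),
    a negative literal -i is falsified by (i, True).
    """
    return frozenset((abs(lit), lit < 0) for lit in clause)


def assignment_to_vertex_set(assignment):
    """Turn a dict {variable: bool} into the set of vertices it selects."""
    return frozenset(assignment.items())


def is_independent(vertex_set, hyperedges):
    """A set is independent if it contains no hyperedge entirely."""
    return not any(edge <= vertex_set for edge in hyperedges)


formula = [(1, 2, 3), (-1, 2, -3)]
hyperedges = [clause_to_hyperedge(c) for c in formula]

satisfying = {1: True, 2: False, 3: False}
falsifying = {1: False, 2: False, 3: False}  # violates the first clause
```

Here `assignment_to_vertex_set(satisfying)` is an independent set with one vertex per variable, while the set built from `falsifying` contains the hyperedge of the first clause.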
We know that if we pick a random graph that’s sufficiently dense, i.e. the average degree , by spectral techniques we will have a certifiable upper bound on the size of the largest independent set of with high probability. So if a random graph has random edges, we can prove that there’s no large independent set with high probability.
But if we have a random hypergraph with random hyperedges, we don’t have any analog of spectral theories for hypergraphs that allow us to do this kind of certification. And from the fact that the problem of certifying unsatisfiability of random formula of clauses is considered to be hard, we conjecture that there doesn’t exist a spectral theory for hypergraphs able to replicate some of the things we are able to do on graphs.
However, what we can do is reduce the hypergraph to a graph, possibly with some loss, and then apply spectral techniques to that graph.
2.2. From 4-SAT instances to graphs
Now let’s look at random 4-SATs. Similarly we will write a random 4-SAT formula as:
where , and no two are exactly the same. Similar to the previous construction, but instead of constructing another hypergraph, we will construct just a graph , where
is a set of vertices and
is a set of edges.
If is satisfiable, has an independent set of size at least . Equivalently if the largest independent set of has size less than , is unsatisfiable. Proof: The proof is very similar to the previous one. Assume is satisfiable, let be a satisfying assignment, where . Then is an independent set of size . If not, it means some edge , so and the -th clause in evaluates to false. Therefore evaluates to false, which contradicts the fact that is a satisfying assignment.
From here, we can observe that is not a random graph because some edges are forbidden, for example when the two vertices of the edge have some element in common. But it’s very close to a random graph. In fact, we can apply the same spectral techniques to get a certifiable upper bound on the size of the largest independent set if the average degree , i.e. if , we can certify unsatisfiability with high probability, by upper bounding the size of the largest independent set in the constructed graph.
We can generalize this result to all even k. For random k-SAT where k is even, if , we can certify unsatisfiability with high probability, which is better than the previous method which requires . The same is achievable for odd k, but the argument is significantly more complicated.
2.3. Certifiable upper bound for independent sets in modified random sparse graphs
Apart from the case of odd k, another question is whether, in this setup, we can do better and get rid of the term. This term comes from the fact that the spectral norm bound breaks down when the average degree . However it’s still true that a random graph doesn’t have any large independent sets even when the average degree is constant. It’s just that the spectral norm isn’t giving us good bounds any more, since the spectral norm is at most . So is there something tighter than spectral bounds that could help us get rid of the term? Could we fix this by removing all the high degree vertices in the random graph?
This construction is due to Feige-Ofek. Given a random graph , where the average degree is some large constant, construct by taking and removing all edges incident on nodes with degree higher than where is the average degree of . We denote by the adjacency matrix of and by that of . And it turns out,
With high probability, .
It turns out to be rather difficult to prove. Previously we saw spectral results on random graphs that use matrix traces to bound the largest eigenvalue. In this case, it’s hard to do so because the contribution to the trace of a closed walk is complicated by the fact that edges have dependencies. The other approach is that given random matrix , we will try to upper bound . A standard way to do this is to count, for every fixed solution, the instances of in which that solution is good, and to argue that the number of solutions is small enough that, with high probability, no solution is good. The problem here is that the set of solutions is infinitely large. So Feige-Ofek discretize the set of vectors, and then reduce the bound on the quadratic form of a discretized vector to a sum of several terms, each of which has to be carefully bounded.
We always have
and so, with high probability, we get a polynomial-time certificate of an upper bound on the size of the independent set for a random graph. This removes the extra term from our analysis of certificates of unsatisfiability for random -SAT when is even.
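The trimming step of the construction in this section can be sketched as below. The graph is assumed to be stored as a dictionary from each vertex to its set of neighbours, and the degree cutoff is passed in as a parameter `threshold` rather than fixed to a particular multiple of the average degree. Both the representation and the function name are assumptions for illustration.

```python
def trim_high_degree(adj, threshold):
    """Remove every edge incident on a vertex of degree greater than threshold.

    adj: dict mapping each vertex to the set of its neighbours (undirected graph).
    Returns a new adjacency dict on the same vertex set; adj is not modified.
    """
    heavy = {v for v, nbrs in adj.items() if len(nbrs) > threshold}
    return {
        v: set() if v in heavy else {u for u in nbrs if u not in heavy}
        for v, nbrs in adj.items()
    }


# A star centred at 0 with leaves 1..4, plus the edge {1, 2}.
star = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}}
trimmed = trim_high_degree(star, threshold=2)
```

Vertices are kept and only edges are deleted, so the adjacency matrices of the two graphs have the same dimensions and every degree in the trimmed graph is at most `threshold`.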
3. SDP relaxation of independent sets in random sparse graphs
In order to show a random graph has no large independent sets, a more principled way is to argue that there is some polynomial time solvable relaxation of the problem whose solution is an upper bound of the problem.
Let SDPIndSet be the optimum of the following semidefinite programming relaxation of the Independent Set problem, which is due to Lovász:
Since it’s the relaxation of the problem of finding the maximum independent set, for any graph . And this relaxation has a nice property.
Proof: First we note that SDPIndSet is at most
and this is equal to
which is at most
Finally, the above optimization is equivalent to the following
which is at most the unconstrained problem
Recall from the previous section that we constructed by removing edges from , which corresponds to removing constraints in our semidefinite programming problem, so , which is by theorem 3 at most with high probability.
4. SDP relaxation of random k-SAT
From the previous section, we get an idea that we can use semidefinite programming to relax the problem directly and find a certificate of unsatisfiability for the relaxed problem.
Given a random -SAT formula :
The satisfiability of is equivalent to the satisfiability of the following equations:
Notice that if we expand the polynomial on the left side, some of the monomials have degree higher than 2, which prevents us from relaxing these equations to a semidefinite programming problem. To resolve this, we introduce . Then we can relax all variables to be vectors, i.e.
For example, if we have a 4-SAT clause
we can rewrite it as
For this relaxation, we have:
- If , the SDP associated with the formula is feasible with high probability, where for every fixed .
- If , the SDP associated with the formula is not feasible with high probability, where is a constant for every fixed even , and for every fixed odd .
Scribed by Jeff Xu
In which we discussed planted clique distribution, specifically, we talked about how to find a planted clique in a random graph. We heavily relied upon our material back in lecture 2 and lecture 3 in which we covered the upper bound certificate for max clique in . At the end of this class, we wrapped up this topic and started the topic of -SAT.
1. Planted Clique
To start with, we describe a distribution of graphs with a planted clique. Suppose that we sample from and we want to modify s.t. it has a size clique, i.e., we have a clique with . The following code describes a sampler for the distribution.
- Pick a subset of vertices from s.t.
- Independently for each pair , make an edge with probability
Note: We are only interested in the case , which is the case in which the planted clique is, with high probability, larger than any pre-existing clique
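A minimal sketch of such a sampler follows. It assumes that pairs inside the planted set always become edges and that every other pair becomes an edge with probability 1/2. These probabilities, the parameter names `n` and `k`, and the function name are assumptions for illustration.

```python
import random
from itertools import combinations


def sample_planted_clique(n, k, seed=None):
    """Sample a graph on vertices 0..n-1 with a planted clique of size k.

    Assumed probabilities: a pair inside the planted set is always an edge,
    any other pair is an edge independently with probability 1/2.
    Returns (clique, edges), where edges is a set of pairs (u, v) with u < v.
    """
    rng = random.Random(seed)
    clique = set(rng.sample(range(n), k))
    edges = set()
    for u, v in combinations(range(n), 2):
        if (u in clique and v in clique) or rng.random() < 0.5:
            edges.add((u, v))
    return clique, edges
```

Since the planted set is returned together with the graph, a recovery algorithm can be checked against it on sampled instances.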
This Saturday, a good way to spend your time before the reception at the Simons Institute is to attend the FOCS’17 workshops. Three workshops will be running in parallel, on sketching, on distribution testing, and on query and communication complexity.
(Photo credit: ACM)
Formally ending a search started in March 2016 (and a process started in the Fall of 2015), we are pleased to finally officially announce that Shafi Goldwasser will take over from Dick Karp as director of the Simons Institute for the Theory of Computing on January 1st, and will return to Berkeley after a 30+ year hiatus.
Shafi is the co-inventor and developer of the notions of semantic security in encryption; of zero-knowledge proofs; of pseudorandom functions; of the connection between PCP and hardness of approximation; and of property testing in sublinear algorithms, among others. She has received the Turing award for her work on cryptography and two Gödel prizes for her work on complexity.
I cannot put in words how happy I am for the Berkeley community, including myself, and for the future of the Institute.
The director search was my first glimpse into how the Berkeley central campus bureaucracy operates, and it was horrifying. The simplest thing couldn’t be done without a sequence of authorities signing off on it, and each authority had a process for that, which involved asking for other things that other authorities had to sign off on, and so on in what at times seemed actual infinite descent.
The announcement linked above was in the works for at least three weeks!
Alistair Sinclair, after two terms as associate director of the Simons Institute, during which his heroic efforts were recognized with the SIGACT service award, also retired from his position at the Institute, and last July 1st was replaced by Berkeley professor Peter Bartlett, a noted pioneer of the study of neural networks.
This weekend, on Saturday, the Simons Institute will host the FOCS reception, which will double as celebration for Alistair’s prize. There will be buses leaving the conference hotel at 6:45pm, and there will be plenty of food (and drinks!) at the Institute. There will also be buses taking people back to the hotel, although once you are in downtown Berkeley on a Saturday evening (bring a sweater) you may want to hang out a bit more and then take a rideshare service back to the hotel.
Scribed by Haaris Khan
In which we study the SDP relaxation of Max Cut in random graphs.
1. Quick Review of Chernoff Bounds
Suppose are mutually independent random variables with values . Let . The Chernoff Bounds claim the following:
3. When we do not know , we can bound as follows:
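As a numerical sanity check of one standard tail bound of this type, the sketch below compares the exact upper tail of a binomial random variable with the Hoeffding form exp(-2t^2/n). This form is an assumption made for the example: it may differ in constants and in shape from the statements used in lecture. The function names are illustrative.

```python
from math import comb, exp


def binomial_upper_tail(n, p, threshold):
    """Exact P[X >= threshold] for X ~ Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(threshold, n + 1)
    )


def hoeffding_bound(n, t):
    """Upper bound exp(-2 t^2 / n) on P[X >= E[X] + t], where X is a sum of
    n independent random variables taking values in [0, 1]."""
    return exp(-2 * t * t / n)


# 100 fair coin flips: probability of at least 60 heads versus the bound with t = 10.
tail = binomial_upper_tail(100, 0.5, 60)
bound = hoeffding_bound(100, 10)
```

The exact tail is noticeably smaller than the bound here, which is the usual picture: the bound is loose by constant factors but has the right exponential shape in t^2/n.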
2. Cutting a Near-Optimal Number of Edges in Via SDP Rounding
Consider where . We show that with probability, the max-degree will be
- Fix v
- For some constant c,
Next, we compute the number of vertices that participate in a triangle. Recall that degree can be bounded by
If a vertex participates in a triangle, there are ways of choosing the other two vertices that participate with v in the triangle. So the expected number of vertices in triangles can be bounded by
So with probability,
- All vertices have degree
- vertices participate in triangles.
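The two quantities in this summary can be measured directly on a sampled graph. The sketch below assumes the graph is given as a dictionary from each vertex to its set of neighbours, with every vertex present as a key. The function names are illustrative.

```python
from itertools import combinations


def max_degree(adj):
    """Largest degree in the graph (0 for an empty graph)."""
    return max((len(nbrs) for nbrs in adj.values()), default=0)


def vertices_in_triangles(adj):
    """Set of vertices that have two neighbours adjacent to each other."""
    in_triangle = set()
    for v, nbrs in adj.items():
        if any(b in adj[a] for a, b in combinations(nbrs, 2)):
            in_triangle.add(v)
    return in_triangle


# A triangle on {0, 1, 2} with a pendant vertex 3 attached to 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

On this small graph, vertex 2 has the largest degree and vertex 3 is the only vertex outside a triangle.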
3. Eigenvalue Computations and SDP
Problems like finding the largest / smallest eigenvalue can be solved using SDP.
Let be symmetric and let be the largest eigenvalue of M. We can formulate this as a quadratic program:
We showed previously that we can relax a Quadratic Program to SDP:
In fact, it happens that these two are equivalent. To show this, we must show that a vector solution of SDP can hold as a solution to the QP and vice versa.
Proving that a QP solution is valid for the SDP: Trivial. Any solution to our Quadratic Program is also a solution to our SDP, since the SDP is a relaxation of the problem; hence the optimum of our QP is less than or equal to the optimum of our SDP.
Proving that an SDP solution is valid for the QP: Consider . We note that our SDP can be transformed into an unconstrained optimization problem as follows:
The cost c can be defined as the value of our solution:
We get a one-dimensional solution when we use the element of , and wish to find the that maximizes this.
We use the following inequality:
4. SDP Max-Cut: Spectral Norm as a SDP Certificate
Consider the SDP relaxation of Max-Cut on Graph :
Let the optimum value for this SDP be . It’s obvious that . Under our constraints, we can rewrite our SDP as
So our new optimization problem is
We can relax our constraint to the following: . Relaxing our constraint will yield an optimization problem with a solution less than the stricter constraint (call this ):
Clearly, we have the following inequalities: . We can rewrite as
Note that our objective function computes the largest eigenvalue of :
For every graph with ,
Recall from previous lectures that for , the adjacency matrix of sampled from has with high probability. This implies that . Semantically, this means that computes in poly-time a correct upper-bound of .
5. Trace and Eigenvalues
Suppose matrix is symmetric with eigenvalues . The following are true:
- eigenvalues are
Then, for .
is defined as the expected number of paths from to that take steps (not necessarily simple paths in a graph)
Our goal with this is to compute the eigenvalues . Since the trace relates the sum of the diagonal and the sum of eigenvalues for symmetric , we can use this to provide an upper bound for symmetric .
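A small pure-Python sketch of how a trace yields an eigenvalue bound. It uses the standard fact that, for a symmetric matrix M and an even power k, the trace of M^k is the sum of the k-th powers of the eigenvalues, so its k-th root is at least the largest absolute eigenvalue. The function names and the example matrix are assumptions for illustration.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def trace_of_power(m, k):
    """Trace of m^k, computed by repeated multiplication."""
    n = len(m)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, m)
    return sum(result[i][i] for i in range(n))


def trace_bound(m, k):
    """Upper bound Tr(m^k)^(1/k) on the largest absolute eigenvalue of a
    symmetric matrix m; k must be even so that every term is non-negative."""
    if k % 2 != 0:
        raise ValueError("k must be even")
    return trace_of_power(m, k) ** (1.0 / k)


# Symmetric matrix with eigenvalues 3 and 1.
example = [[2, 1], [1, 2]]
```

For `example` and k = 4 the trace is 3^4 + 1^4 = 82, and its fourth root is only slightly above the true largest eigenvalue 3. Larger even k tighten the bound further. When the matrix is the adjacency matrix of a graph, the same trace counts closed walks of length k.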
(Photo from facebook.com)
Michael Cohen, one of the most brilliant young minds of our field, recently passed away in Berkeley.
After going to MIT for college, Michael worked for Facebook and was a graduate student at MIT. This semester, he was at Berkeley as a Simons Fellow in connection with the program on optimization at the Simons Institute.
In a few short years, Michael left his mark on a number of problems that are close to the heart of in theory‘s readers.
He was part of the team that developed the fastest algorithm for solving systems of linear equations in which the matrix of constraints is a graph Laplacian (or, more generally, is symmetric and diagonally dominant), running in time where is the number of non-zero entries of the matrix and is the number of variables.
He also worked on matrix approximation via subsampling, on algorithms that approximate random walk properties, on algorithms for flow and shortest paths, and on geometric algorithms.
My favorite result is his single-author paper giving a polynomial time construction of bipartite Ramanujan graphs of all degrees and all sizes, making the approach of Marcus, Spielman and Srivastava constructive.
Michael was a unique person, who gave a lot to our community and touched several lives. His loss is an unspeakable tragedy that I still find very hard to process.