I am grading the final projects of my class, I am trying to clear the backlog of publishing all the class notes, I am way behind on my STOC reviews, and in two days I am taking off for a complicated two-week trip involving planes, trains and a rented automobile, as well as an ambitious plan of doing no work whatsoever from December 20 to December 31.
So, today I was browsing Facebook, and when I saw a post containing an incredibly blatant arithmetic mistake (which none of the several comments seemed to notice) I spent the rest of the morning looking up where it came from.
The goal of the post was to make the wrong claim that people have been paying more than enough money into social security (through payroll taxes) to support the current level of benefits. Indeed, since the beginning, social security has been paying individuals more than they put in, and now that population and salaries have stopped growing, social security is also paying out retired people more than it gets from working people, so that the “trust fund” (whether one believes it is a real thing or an accounting fiction) will run out in the 2030s unless some change is made.
This is a complicated matter, but the post included a sentence to the effect that $4,500 a year, with an interest of 1% per year “compounded monthly”, would add up to $1.3 million after 40 years. This is not even in the right order of magnitude (it adds up to about $220k) and it should be obvious without making the calculation. Who would write such a thing, and why?
My first stop was a July 2012 post on Snopes, which commented on a very similar viral email. Snopes points out various mistakes (including the rate of social security payroll taxes), but the calculation in the Snopes email, while based on wrong assumptions, has correct arithmetic: it says that $4,500 a year, at 5% interest, becomes about $890k after 49 years.
So how did the viral email with the wrong assumptions and correct arithmetic morph into the Facebook post with the same wrong assumptions but also the wrong arithmetic?
I don’t know, but here is an August 2012 post on, you can’t make this stuff up, Accuracy in Media, which Wikipedia describes as a “media watchdog.”
The post is attributed to Herbert London, who has a PhD from Columbia, is a member of the Council on Foreign Relations and used to be the president of a conservative think-tank. Currently, he has an affiliation with King’s College in New York. London’s post has the sentence I saw in the Facebook post:
(…) an employer’s contribution of $375 per month at a modest one percent rate compounded over a 40 year work experience the total would be $1.3 million.
The rest of the post is almost identical to the July 2012 message reported by Snopes.
Where did Dr. London get his numbers? Maybe he compounded this hypothetical saving as 1% per month? No, because that would give more than $4 million. One does get about $1.3 million if one saves $375 a month for thirty years with a return of 1% per month, though.
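For readers who want to check the arithmetic, here is a minimal sketch of the four calculations mentioned in this post. It assumes equal deposits made at the end of each period (an assumption of this example, chosen because it matches the figures quoted above); the function and variable names are made up for illustration.

```python
def future_value(deposit, rate_per_period, periods):
    """Value of `periods` equal deposits, each made at the end of a period."""
    if rate_per_period == 0:
        return deposit * periods
    growth = (1 + rate_per_period) ** periods
    return deposit * (growth - 1) / rate_per_period


# $375 a month for 40 years at 1% per year, compounded monthly.
one_percent_yearly = future_value(375, 0.01 / 12, 40 * 12)
# $4,500 a year for 49 years at 5% per year (the email discussed by Snopes).
five_percent_yearly = future_value(4500, 0.05, 49)
# $375 a month for 40 years at 1% per month.
one_percent_monthly_40y = future_value(375, 0.01, 40 * 12)
# $375 a month for 30 years at 1% per month.
one_percent_monthly_30y = future_value(375, 0.01, 30 * 12)

for label, value in [
    ("1% per year, 40 years", one_percent_yearly),
    ("5% per year, 49 years", five_percent_yearly),
    ("1% per month, 40 years", one_percent_monthly_40y),
    ("1% per month, 30 years", one_percent_monthly_30y),
]:
    print(f"{label}: ${value:,.0f}")
```

The four results come out to roughly $220k, $890k, $4.4 million and $1.3 million respectively, in line with the numbers above.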
Perhaps a more interesting question is why this “fake math” is coming back after five years. In 2012, Paul Ryan put forward a plan to “privatize” Social Security, and such a plan is now being revived. The only way to sell such a plan is to convince people that if they saved in a private account the amount of payroll taxes that “goes into” Social Security, they would get better benefits. This may be factually wrong, but that’s hardly the point.
Currently, when graduate students work as teaching assistants, the university waives their tuition and pays them a stipend. Under current tax law, students pay income tax “only” on their stipend. A provision in the tax bill currently under consideration would count the waived tuition as income, on which the student would have to pay taxes as well.
A calculation by a Berkeley physics graduate student (source) finds that a student who works as a TA for both semesters and the summer, is paid at “step 1” of the UC Berkeley salary scale, and is a California resident, currently pays $2,229 in federal income tax, which would become $3,641 under the proposed tax plan, a 61% increase. The situation for EECS students is a bit different: they are paid at a higher scale, which puts them in a higher bracket, and they are often on an F1 visa, which means that they pay the much higher non-resident tuition, so they would be a lot worse off (on the other hand, they usually TA at most one semester per year). The same calculation for MIT students shows a 240% tax increase. A different calculation (sorry, no link available) shows a 144% increase for a Berkeley EECS student on an F1 visa.
This is one of the tax increases that go to fund the abolition of the estate tax for estates worth more than $10.9 million, a reduction in corporate tax rates, a reduction in high-income tax rates, and other benefits for multi-millionaires.
If you are a US Citizen, and if you think that graduate students should not pay for the estate tax of eight-figure estates, you should let your representative know. Usually calling, and asking to speak with the staffer responsible for tax policy, is much better than emailing or sending physical mail. You can find the phone numbers of your representatives here.
If you have any pull in ACM, this is the kind of matter on which they might want to make a factual statement about the consequences for US computer science education, as they did at the time of the travel ban.
Scribed by Neng Huang
In which we use the SDP relaxation of the infinity-to-one norm and the Grothendieck inequality to give an approximate reconstruction of the stochastic block model.
1. A Brief Review of the Model
First, let’s briefly review the model. We have a random graph $G = (V, E)$ with an unknown partition of the vertices into two equal parts $V_1$ and $V_2$. Edges across the partition are generated independently with probability $q$, and edges inside the partition are generated independently with probability $p$. To abbreviate the notation, we let $a = pn/2$, which is the average internal degree, and $b = qn/2$, which is the average external degree. Intuitively, the closer $a$ and $b$ are, the more difficult it is to reconstruct the partition. We assume $a > b$, although there are also similar results in the complementary model where $b$ is larger than $a$. We also assume $b > 1$ so that the graph is not almost empty.
We will prove the following two results, the first of which will be proved using the Grothendieck inequality.
We note that the first result is essentially tight in the sense that for every , there also exists a constant such that if , then it will be impossible to reconstruct the partition even if an fraction of misclassified vertices is allowed. Also, the constant will go to infinity as goes to 0, so if we want more and more accuracy, needs to be a bigger and bigger constant times . When the constant becomes , we will get an exact reconstruction as stated in the second result.
Over the past four decades, Avi Wigderson, figuratively, wrote the book on theoretical computer science. Now he has literally done so. I can’t wait for the movie adaptation.
Böhm was one of the founding fathers of Italian computer science. His dissertation, from 1951, was one of the first (maybe the first? I don’t know the history of these ideas very well) examples of a programming language with a compiler written in the language itself. In the 1950s and 1960s he worked at the CNR (an Italian national research institution with its own technical staff), in the IAC (Institute for the Applications of Computing) directed by mathematician Mauro Picone. IAC was the second place in Italy to acquire a computer. In 1970 he moved to the University of Turin, where he was the founding chairman of the computer science department. In 1972 he moved to the Sapienza University of Rome, in the Math department, and in 1989 he was one of the founders of the Computer Science department at Sapienza. He remained at Sapienza until his retirement.
Böhm became internationally known for a 1966 result, joint with Giuseppe Jacopini, in which he showed, roughly speaking, that programs written in a language that includes goto statements (formalized as flow-charts) could be mapped to equivalent programs that don’t. The point of the paper was that the translation was “structural” and the translated program retained much of the structure and the logic of the original program, meaning that programmers could give up goto statements without having to fundamentally change the way they think.
Dijkstra’s famous “Go To Statement Considered Harmful” 1968 letter to CACM had two references, one of which was the Jacopini-Böhm theorem.
Böhm was responsible for important foundational work on lambda calculus, typed functional languages, and the theory of programming languages at large.
He was a remarkable mentor, many of whose students and collaborators (including a notable number of women) became prominent in the Italian community of theory of programming languages, and Italian academia in general.
In the photo above is Böhm with Simona Ronchi, Betti Venneri and Mariangiola Dezani, who all became prominent Italian professors.
You may also recognize the man on the right as a recent recipient of the Turing Award. Silvio Micali went to Sapienza to study math as an undergrad, and he worked with Böhm, who encouraged Silvio to pursue his PhD abroad.
I studied Computer Science at Sapienza, starting the first year that the major was introduced in 1989. I remember that when I first met Böhm he reminded me of Doc Brown from Back to the Future: a tall man with crazy white hair, speaking of wild ideas with incomprehensible technical terms, but with unstoppable enthusiasm.
One year, I tried attending a small elective class that he was teaching. My, probably imprecise, recollection of the first lecture is as follows.
He said that one vertex is a binary tree, and that if you connect two binary trees to a new root you also get a binary tree, then he asked us, how would you prove statements on binary trees by induction? The class stopped until we would say something. After some consultation among us, one of the smart kids proposed “by induction on the number of vertices?” Yes, said Böhm, that would work, but isn’t there a better way? He wanted us to come up by ourselves with the insight that, since binary trees have a recursive definition, one can do induction on the structure of the definition.
In subsequent lectures, we looked (without being told) at how to construct purely functional data structures. I dropped the class after about a month.
Scribed by David Dinh
In which we go over a more powerful (but difficult to compute) alternative to the spectral norm, and discuss how to approximate it.
Today we’ll discuss a solution to the issue of high-degree vertices distorting spectral norms, which will prepare us for next lecture’s discussion on community detection in the stochastic block model using SDP. We’ll discuss a new kind of norm, the infinity-to-one norm, and find an efficient way to approximate it using SDP.
Scribed by Chinmay Nirkhe
In which we explore the Stochastic Block Model.
1. The problem
The Stochastic Block Model is a generic model for graphs generated by some parameters. The simplest model, and the one we will consider today, is the $G_{n,p,q}$ problem.
Definition 1 ($G_{n,p,q}$ graph distribution) The $G_{n,p,q}$ distribution is a distribution on graphs of $n$ vertices where $V$ is partitioned into two subsets of equal size: $V = V_1 \sqcup V_2$. Then for every pair of vertices $\{i,j\}$ in the same subset, $\Pr[\{i,j\} \in E] = p$, and otherwise $\Pr[\{i,j\} \in E] = q$.
We will only consider the regime under which $p > q$. If we want to find the partition $V = V_1 \sqcup V_2$, it is intuitive to look at the problem of finding the minimum balanced cut. The cut $(V_1, V_2)$ has expected size $qn^2/4$ and any other cut will have greater expected size.
Our intuition should be that as $p \to q$, the problem only gets harder. And for fixed ratio $p/q$, as $p, q \to 1$, the problem only gets easier. This can be stated rigorously as follows: If we can solve the problem for $p, q$ then we can also solve it for $cp, cq$ where $c > 1$, by keeping only a $1/c$ fraction of the edges and reducing to the case we can solve.
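To make the definition concrete, here is a minimal sketch of a sampler for this distribution, together with a helper that measures the size of the planted cut. The function names, the use of Python's `random` module, and the choice to return the partition as a dictionary are all assumptions of this example rather than anything fixed by the lecture.

```python
import random


def sample_sbm(n, p, q, seed=None):
    """Sample a graph with a hidden balanced partition (n is assumed even).

    Returns (side, edges): side[v] is 0 or 1, edges is a set of pairs (i, j), i < j.
    """
    rng = random.Random(seed)
    vertices = list(range(n))
    rng.shuffle(vertices)
    # The first half of the shuffled order goes to side 0, the rest to side 1.
    side = {v: (0 if idx < n // 2 else 1) for idx, v in enumerate(vertices)}
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # Probability p inside a part, q across the partition.
            prob = p if side[i] == side[j] else q
            if rng.random() < prob:
                edges.add((i, j))
    return side, edges


def cut_size(side, edges):
    """Number of edges whose endpoints lie on different sides."""
    return sum(1 for i, j in edges if side[i] != side[j])


side, edges = sample_sbm(200, 0.5, 0.1, seed=0)
print(cut_size(side, edges))  # should be close to q * n^2 / 4 = 1000
```

With these parameters the planted cut should have size near 1000, while a typical balanced cut chosen at random would cut many more edges.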
Recall that for the -planted clique problem, we found the eigenvector corresponding to the largest eigenvalue of . We then defined as the vertices with the largest values of and cleaned up a little to get our guess for the planted clique.
In the Stochastic Block Model we are going to follow a similar approach, but we are instead going to find the largest eigenvalue of . Note this is intuitive as the average degree of the graph is . The idea is simple: Solve the largest eigenvector corresponding to the largest eigenvalue and define
Scribed by Luowen Qian
In which we use spectral techniques to find certificates of unsatisfiability for random $k$-SAT formulas.
Given a random $k$-SAT formula with $m$ clauses and $n$ variables, we want to find a certificate of unsatisfiability of such a formula in polynomial time. Here we consider $k$ as fixed, usually equal to 3 or 4. For fixed $n$, the more clauses you have, the more constraints you have, so it becomes easier to show that these constraints are inconsistent. For example, for 3-SAT,
The algorithm for finding such a certificate is shown below.
We know that we can solve 2-SATs in linear time, and approximately
clauses contains . Similarly when is sufficiently large, the 2-SATs will almost surely be unsatisfiable. When a subset of the clauses is not satisfiable, the whole 3-SAT formula is not satisfiable. Therefore we can certify unsatisfiability for 3-SATs with high probability.
In general for -SAT,
Since for every fixed assignments to the first variables, approximately
portion of the clauses remains, we expect the constant and the running time is .
So what about $m$’s that are in between? It turns out that we can do better with spectral techniques. The reason that spectral techniques work better is that, unlike the previous method, they do not need to try all the possible assignments and check that each one fails in order to find a certificate of unsatisfiability.
2. Reduce certifying unsatisfiability for k-SAT to finding largest independent set
2.1. From 3-SAT instances to hypergraphs
Given a random 3-SAT formula , which is an and of random 3-CNF-SAT clauses over variables (abbreviated as vector ), i.e.
where , and no two are exactly the same. Construct hypergraph , where
is a set of vertices, where each vertex means an assignment to a variable, and
is a set of 3-hyperedges. The reason we’re putting in the negation of is that a 3-CNF clause evaluates to false if and only if all three subclauses evaluate to false. This will be useful shortly after.
First let’s generalize the notion of independent set for hypergraphs.
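An independent set for a hypergraph $H = (X, E)$ is a set $S \subseteq X$ that satisfies $\forall e \in E,\ e \not\subseteq S$.
If $f$ is satisfiable, $H_f$ has an independent set of size at least $n$. Equivalently, if the largest independent set of $H_f$ has size less than $n$, $f$ is unsatisfiable. Proof: Assume $f$ is satisfiable, and let $x \leftarrow y$ be a satisfying assignment, where $y \in \{0,1\}^n$. Then $S = \{(x_i, y_i)\}_{i \in [n]}$ is an independent set of size $n$. If not, it means some hyperedge $e_j \subseteq S$, so all three literals of the $j$-th clause are false under $y$, and the $j$-th clause in $f$ evaluates to false. Therefore $f$ evaluates to false, which contradicts the fact that $y$ is a satisfying assignment.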
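The construction and the claim above can be checked on a toy example. The sketch below is a minimal illustration, assuming a clause is represented as a tuple of (variable index, is_positive) literals and a vertex as a (variable index, bit) pair; these representations and all function names are choices made for this example, not notation from the lecture.

```python
def clause_hyperedges(clauses):
    """One hyperedge per clause: the (variable, bit) pairs that falsify it."""
    return [
        frozenset((var, 0 if is_positive else 1) for var, is_positive in clause)
        for clause in clauses
    ]


def assignment_vertices(assignment):
    """The n vertices selected by an assignment, one per variable."""
    return {(var, bit) for var, bit in enumerate(assignment)}


def is_independent(vertex_set, hyperedges):
    """True if no hyperedge is entirely contained in vertex_set."""
    return not any(edge <= vertex_set for edge in hyperedges)


def satisfies(assignment, clauses):
    """True if every clause has at least one true literal."""
    return all(
        any(assignment[var] == (1 if is_positive else 0) for var, is_positive in clause)
        for clause in clauses
    )


# (x0 or not x1 or x2) and (not x0 or x1 or x3)
clauses = [((0, True), (1, False), (2, True)), ((0, False), (1, True), (3, True))]
edges = clause_hyperedges(clauses)

good = [1, 1, 0, 0]  # satisfies both clauses
bad = [0, 1, 0, 0]   # falsifies the first clause
print(satisfies(good, clauses), is_independent(assignment_vertices(good), edges))
print(satisfies(bad, clauses), is_independent(assignment_vertices(bad), edges))
```

On this example the satisfying assignment yields an independent set of size 4, while the falsifying assignment contains the hyperedge of the clause it violates.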
We know that if we pick a random graph that’s sufficiently dense, i.e. the average degree , by spectral techniques we will have a certifiable upper bound on the size of the largest independent set of with high probability. So if a random graph has random edges, we can prove that there’s no large independent set with high probability.
But if we have a random hypergraph with random hyperedges, we don’t have any analog of spectral theories for hypergraphs that allow us to do this kind of certification. And from the fact that the problem of certifying unsatisfiability of random formula of clauses is considered to be hard, we conjecture that there doesn’t exist a spectral theory for hypergraphs able to replicate some of the things we are able to do on graphs.
However, what we can do is reduce the hypergraph to a graph, possibly with some loss, so that we can apply spectral techniques.
2.2. From 4-SAT instances to graphs
Now let’s look at random 4-SATs. Similarly we will write a random 4-SAT formula as:
where , and no two are exactly the same. Similar to the previous construction, but instead of constructing another hypergraph, we will construct just a graph , where
is a set of vertices and
is a set of edges.
If is satisfiable, has an independent set of size at least . Equivalently if the largest independent set of has size less than , is unsatisfiable. Proof: The proof is very similar to the previous one. Assume is satisfiable, let be a satisfiable assignment, where . Then is an independent set of size . If not, it means some edge , so and the -th clause in evaluates to false. Therefore evaluates to false, which contradicts the fact that is a satisfiable assignment.
From here, we can observe that is not a random graph because some edges are forbidden, for example when the two vertices of the edge have some element in common. But it’s very close to a random graph. In fact, we can apply the same spectral techniques to get a certifiable upper bound on the size of the largest independent set if the average degree , i.e. if , we can certify unsatisfiability with high probability, by upper bounding the size of the largest independent set in the constructed graph.
We can generalize this result to all even $k$’s. For random $k$-SAT where $k$ is even, if , we can certify unsatisfiability with high probability, which is better than the previous method which requires . The same is achievable for odd $k$, but the argument is significantly more complicated.
2.3. Certifiable upper bound for independent sets in modified random sparse graphs
Setting aside odd $k$’s, another question is whether, in this setup, we can do better and get rid of the term. This term comes from the fact that the spectral norm breaks down when the average degree . However, it’s still true that a random graph doesn’t have any large independent sets even when the average degree is constant. It’s just that the spectral norm isn’t giving us good bounds any more, since the spectral norm is at most . So is there something tighter than spectral bounds that could help us get rid of the term? Could we fix this by removing all the high degree vertices in the random graph?
This construction is due to Feige-Ofek. Given random graph , where the average degree is some large constant. Construct by taking and removing all edges incident on nodes with degree higher than where is the average degree of . We denote for the adjacency matrix of and for that of . And it turns out,
With high probability, .
This turns out to be rather difficult to prove. Previously we saw spectral results on random graphs that use matrix traces to bound the largest eigenvalue. In this case, it’s hard to do so because the contribution to the trace of a closed walk is complicated by the fact that edges have dependencies. The other approach is that given random matrix , we will try to upper bound . A standard way to do this is, for every fixed solution, to count the instances of in which that solution is good, and then argue, by a union bound over all solutions, that with high probability there is no good solution. The problem here is that the set of solutions is infinitely large. So Feige-Ofek discretize the set of vectors, and then reduce the bound on the quadratic form of a discretized vector to a sum of several terms, each of which has to be carefully bounded.
We always have
and so, with high probability, we get an polynomial time upper bound certificate to the size of the independent set for a random graph. This removes the extra term from our analysis of certificates of unsatisfiability for random $k$-SAT when $k$ is even.
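The pruning step of the Feige-Ofek construction described in this section can be sketched in a few lines. This is only an illustration: the threshold multiplier `factor` is a placeholder parameter of this example (the exact constant is not restated here), and the function name and the edge-list representation are likewise assumptions.

```python
def prune_high_degree(n, edges, factor=2.0):
    """Remove every edge incident on a vertex of degree above factor * average degree.

    `edges` is a list of pairs (u, v) on vertices 0..n-1; `factor` is a
    hypothetical threshold multiplier chosen for this example.
    """
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    average = 2 * len(edges) / n
    heavy = {v for v in range(n) if degree[v] > factor * average}
    # Keep only the edges that touch no high-degree vertex.
    return [(u, v) for u, v in edges if u not in heavy and v not in heavy]


# A star centered at vertex 0, plus one extra edge between two leaves.
star_plus_edge = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 2)]
print(prune_high_degree(6, star_plus_edge))
```

In the toy graph the average degree is 2, so only the center of the star exceeds the threshold and only the edge between the two leaves survives.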
3. SDP relaxation of independent sets in random sparse graphs
In order to show a random graph has no large independent sets, a more principled way is to argue that there is some polynomial time solvable relaxation of the problem whose solution is an upper bound of the problem.
Let SDPIndSet be the optimum of the following semidefinite programming relaxation of the Independent Set problem, which is due to Lovász:
Since it’s the relaxation of the problem of finding the maximum independent set, for any graph . And this relaxation has a nice property.
Proof: First we note that SDPIndSet is at most
and this is equal to
which is at most
Finally, the above optimization is equivalent to the following
which is at most the unconstrained problem
Recall from the previous section that we constructed by removing edges from , which corresponds to removing constraints in our semidefinite programming problem, so , which is by theorem 3 at most with high probability.
4. SDP relaxation of random k-SAT
From the previous section, we get an idea that we can use semidefinite programming to relax the problem directly and find a certificate of unsatisfiability for the relaxed problem.
Given a random -SAT formula :
The satisfiability of is equivalent to the satisfiability of the following equations:
Notice that if we expand the polynomial on the left side, some of the monomials have degree higher than 2, which prevents us from relaxing these equations to a semidefinite programming problem. To resolve this, we introduce . Then we can relax all variables to be vectors, i.e.
For example, if we have a 4-SAT clause
we can rewrite it as
For this relaxation, we have:
Scribed by Jeff Xu
In which we discussed the planted clique distribution; specifically, we talked about how to find a planted clique in a random graph. We relied heavily on our material from lecture 2 and lecture 3, in which we covered the upper bound certificate for max clique in $G_{n,\frac12}$. At the end of this class, we wrapped up this topic and started the topic of $k$-SAT.
1. Planted Clique
To start with, we describe a distribution of graphs with a planted clique. Suppose that we sample $G$ from $G_{n,\frac12}$ and we want to modify $G$ s.t. it has a size $k$ clique, i.e., we have a clique $S \subseteq V$ with $|S| = k$. The following code describes a sampler for the distribution.
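A minimal sketch of such a sampler is given here, assuming the clique vertices are chosen uniformly at random and every other pair is joined independently with probability 1/2. The function name, the use of Python's `random` module, and the edge-set representation are assumptions of this example, not the lecture's own listing.

```python
import itertools
import random


def sample_planted_clique(n, k, seed=None):
    """Sample a graph on vertices 0..n-1 with a planted clique of size k.

    Returns (clique, edges): clique is a set of k vertices, edges is a set of
    pairs (u, v) with u < v.
    """
    rng = random.Random(seed)
    # Pick the k clique vertices uniformly at random.
    clique = set(rng.sample(range(n), k))
    edges = set()
    for u, v in itertools.combinations(range(n), 2):
        # Pairs inside the clique are always edges; all others are fair coin flips.
        if (u in clique and v in clique) or rng.random() < 0.5:
            edges.add((u, v))
    return clique, edges


clique, edges = sample_planted_clique(30, 8, seed=3)
print(sorted(clique), len(edges))
```

Sampling the clique first and then flipping coins for the remaining pairs gives the same distribution as sampling the random graph first and then adding the missing clique edges.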
Note: We are only interested in the case , which is the case in which the planted clique is, with high probability, larger than any pre-existing clique