
Suppose that we want to construct a very good family of {d}-regular expander graphs. The Alon-Boppana theorem says that the best we can hope for, from the point of view of spectral expansion, is to have {\lambda_2 \leq 2 \sqrt{d-1}}, and we would certainly be very happy with a family of graphs in which {\lambda_2 \leq O(\sqrt d)}.

Known constructions of expanders produce Cayley graphs (or sometimes Schreier graphs, which is a related notion), because it is easier to analyze the spectra of such graphs. If {\Gamma} is a group with operation {\cdot} and {a^{-1}} is the inverse of element {a}, and {S} is a symmetric set of generators (meaning that {s^{-1} \in S} for every {s\in S}), then the Cayley graph {Cay(\Gamma, S)} is the graph whose vertices are the elements of {\Gamma} and whose edges are the pairs {(a,b)} such that {a\cdot b^{-1} \in S}.

When the group is Abelian, there is good news and bad news. The good news is that the eigenvectors of the graphs are completely characterized (they are the characters of {\Gamma}) and the eigenvalues are given by a nice formula. (See here and here.) The bad news is that constant-degree Cayley graphs of Abelian groups cannot be expanders.
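
Here is a minimal numerical sketch of that formula (my own addition, not part of the original post, and assuming numpy): for a Cayley graph of {{\mathbb Z}/n{\mathbb Z}} with a symmetric set of generators {S}, the eigenvalue associated to the character {k} is {\sum_{s\in S} \omega^{ks}}, where {\omega = e^{2\pi i/n}}. The choice of {n} and {S} below is an arbitrary example.

# Sketch (not from the post): eigenvalues of a Cayley graph of the Abelian
# group Z_n via the character formula lambda_k = sum_{s in S} omega^{k s},
# omega = exp(2 pi i / n), checked against a direct computation with numpy.
import numpy as np
n = 20
S = {1, n - 1, 5, n - 5}  # an arbitrary symmetric set of generators of Z_n
# Adjacency matrix of Cay(Z_n, S): a ~ b iff (a - b) mod n is in S
A = np.array([[1 if (a - b) % n in S else 0 for b in range(n)] for a in range(n)])
# Character formula: the eigenvalue of the k-th character is sum_{s in S} omega^{k s}
omega = np.exp(2j * np.pi / n)
char_eigs = np.array([sum(omega ** (k * s) for s in S) for k in range(n)])
char_eigs = np.sort(char_eigs.real)  # imaginary parts vanish because S is symmetric
matrix_eigs = np.sort(np.linalg.eigvalsh(A))
print(np.allclose(char_eigs, matrix_eigs))  # True: the two spectra agree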

That’s very bad news, but it is still possible to get highly expanding graphs of polylogarithmic degree as Cayley graphs of Abelian groups.

Here we will look at the extreme case of a family of graphs of degree {d_n = n/2}, where {n} is the number of vertices. Even with such high degree, the weak version of the Alon-Boppana theorem still implies that we must have {\sigma_2 \geq \Omega(\sqrt{d_n})}, and so we will be happy if we get a graph in which {\sigma_2 \leq O(\sqrt{d_n}) = O(\sqrt n)}. Highly expanding graphs of degree {n/2} are interesting because they have some of the properties of random graphs from the {G_{n,\frac 12}} distribution. In turn, graphs from {G_{n,\frac 12}} have all kinds of interesting properties with high probability, including being essentially the best known Ramsey graphs and having the kind of discrepancy property that gives seedless extractors for two independent sources. Unfortunately, these properties cannot be certified by spectral methods. The graph that we will study today is believed to have such stronger properties, but there is no known promising approach to prove such conjectures, so we will content ourselves with proving strong spectral expansion.

The graph is the Paley graph. If {p} is a prime such that {p \equiv 1 \pmod 4}, {{\mathbb Z} / p {\mathbb Z}} is the group of addition modulo {p}, and {Q} is the set of nonzero elements of {{\mathbb Z}/ p{\mathbb Z}} of the form {r^2 \bmod p}, then the graph is just {Cay ( {\mathbb Z}/p{\mathbb Z}, Q)}. That is, the graph has a vertex {v} for each {v\in \{ 0,\ldots,p-1\}}, and two distinct vertices {a,b} are adjacent if and only if there is an {r\in \{ 1,\ldots,p-1\}} such that {a-b \equiv r^2 \pmod p}. (The condition {p \equiv 1 \pmod 4} ensures that {-1} is a quadratic residue, so that {Q} is symmetric and the graph is a well-defined undirected graph.)
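
As a quick sanity check (again my own sketch, not part of the post, and assuming numpy), one can build the Paley graph for a small prime {p \equiv 1 \pmod 4} and verify that every eigenvalue other than the top one has absolute value {\frac{1+\sqrt p}{2} = O(\sqrt p)}; the prime {p = 101} below is an arbitrary choice.

# Sketch (not from the post): the Paley graph Cay(Z_p, Q) for a prime
# p = 1 (mod 4), with Q the set of nonzero quadratic residues mod p.
# All eigenvalues other than the top one have absolute value (1 + sqrt(p))/2.
import numpy as np
p = 101  # an arbitrary prime with p = 1 (mod 4)
Q = {(r * r) % p for r in range(1, p)}  # nonzero quadratic residues mod p
A = np.array([[1 if (a - b) % p in Q else 0 for b in range(p)] for a in range(p)])
eigs = np.sort(np.linalg.eigvalsh(A))
d = (p - 1) // 2  # the degree of the Paley graph
print(eigs[-1])                          # = d = 50
print(max(abs(eigs[0]), abs(eigs[-2])))  # = (1 + sqrt(101))/2, about 5.52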

Read the rest of this entry »

Let {G=(V,E)} be a {d}-regular graph, and let

\displaystyle  d= \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n

be the eigenvalues of the adjacency matrix {A} of {G}, counted with multiplicities and sorted in descending order.

How good can the spectral expansion of {G} be?

1. Simple Bounds

The simplest bound comes from a trace method. We have

\displaystyle  trace(A^2) = \sum_i \lambda_i^2

by using one definition of the trace and

\displaystyle  trace(A^2) = \sum_{v \in V} (A^2)_{v,v} \geq dn

using the other definition and observing that {(A^2)_{v,v}} counts the paths that go from {v} to {v} in two steps, of which there are at least {d}: follow an edge to a neighbor of {v}, then follow the same edge back. (There could be more if {G} has multiple edges or self-loops.)

So we have

\displaystyle  dn \leq d^2 + \sum_{i=2,\ldots,n} \lambda_i ^2

and so

\displaystyle  \max_{i=2,\ldots,n} |\lambda_i | \geq \sqrt d \cdot \sqrt{\frac {n-d}{n-1}}

The condition {d \leq n(1-\epsilon)} is necessary to get lower bounds of {\Omega(\sqrt d)}; in the clique, for example, we have {d=n-1} and {\lambda_2 = \cdots = \lambda_n = -1}.
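
As a small illustration (my own sketch, not from the post, and assuming numpy), the following checks the bound on the 4-dimensional hypercube and on the clique {K_{16}}, where the bound is tight.

# Sketch (not from the post): checking max_{i>=2} |lambda_i| >= sqrt(d (n-d)/(n-1))
# on the 4-dimensional hypercube and on the clique K_16 (where it is tight).
import numpy as np
from itertools import product
def check(A):
    n = A.shape[0]
    d = int(A[0].sum())
    eigs = np.sort(np.linalg.eigvalsh(A))[::-1]  # descending order
    lhs = max(abs(eigs[1]), abs(eigs[-1]))       # max_{i >= 2} |lambda_i|
    rhs = np.sqrt(d) * np.sqrt((n - d) / (n - 1))
    print(f"n={n} d={d}: {lhs:.3f} >= {rhs:.3f}")
# 4-dimensional hypercube: 16 vertices, 4-regular, bipartite, so lambda_n = -4
V = list(product([0, 1], repeat=4))
A = np.array([[1 if sum(x != y for x, y in zip(u, v)) == 1 else 0 for v in V] for u in V])
check(A)
# Clique K_16: d = n - 1 and lambda_2 = ... = lambda_n = -1, so the bound is tight
check(np.ones((16, 16)) - np.eye(16))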

A trace argument does not give us a lower bound on {\lambda_2}, and in fact it is possible to have {\lambda_2=0} and {d= n/2}, for example in the complete bipartite graph.

If the diameter of {G} is at least 4, it is easy to see that {\lambda_2 \geq \sqrt d}. Let {a,b} be two vertices at distance at least 4. Define a vector {x} as follows: {x_a = 1}, {x_v = 1/\sqrt d} if {v} is a neighbor of {a}, {x_b=-1}, {x_v = - 1/\sqrt d} if {v} is a neighbor of {b}, and {x_v = 0} otherwise. Note that there cannot be any edge between a neighbor of {a} and a neighbor of {b}. Then we see that {||x||^2 = 4}, that {x^T A x \geq 4\sqrt d} (because there are {2d} edges, each counted twice, that give a contribution of {1/\sqrt d} to {\sum_{u,v} A_{uv} x_u x_v}), and that {x} is orthogonal to {(1,\ldots,1)}, so that {\lambda_2 \geq \frac{x^TAx}{x^Tx} \geq \sqrt d}.
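
Here is a sketch of this test vector on a concrete graph (not from the post, and assuming numpy): the cycle {C_{12}}, which is {2}-regular with diameter {6 \geq 4}, taking {a=0} and {b=6}.

# Sketch (not from the post): the test vector above on the cycle C_12,
# which is 2-regular and has diameter 6 >= 4; here a = 0 and b = 6.
import numpy as np
n, d = 12, 2
A = np.array([[1 if (i - j) % n in (1, n - 1) else 0 for j in range(n)] for i in range(n)])
a, b = 0, 6
x = np.zeros(n)
x[a], x[b] = 1.0, -1.0
x[A[a] == 1] = 1 / np.sqrt(d)   # neighbors of a get  1/sqrt(d)
x[A[b] == 1] = -1 / np.sqrt(d)  # neighbors of b get -1/sqrt(d)
print(abs(x.sum()) < 1e-9)                 # x is orthogonal to (1, ..., 1)
print(x @ A @ x / (x @ x))                 # Rayleigh quotient = sqrt(2), about 1.414
print(np.sort(np.linalg.eigvalsh(A))[-2])  # lambda_2 = 2 cos(2 pi / 12), about 1.732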

2. Nilli’s Proof of the Alon-Boppana Theorem

Nilli’s proof of the Alon-Boppana theorem gives

\displaystyle  \lambda_2 \geq 2 \sqrt{ d-1 } - O \left( \frac {\sqrt{d-1}}{diam(G)-4} \right)

where {diam(G) \geq \frac {\log |V|}{\log (d-1)}} is the diameter of {G}. This means that if one has an infinite family of (constant) degree-{d} graphs, and every graph in the family satisfies {\lambda_2 \leq \lambda}, then one must have {\lambda \geq 2 \sqrt{d -1}}. This is why families of Ramanujan graphs, in which {\lambda_2 \leq 2 \sqrt{d-1}}, are special, and so hard to construct, or even to prove the existence of.

Friedman proves a stronger bound, in which the error term goes down with the square of the diameter. Friedman's proof is the one presented in the Hoory-Linial-Wigderson survey. I like Nilli's proof, even if it is a bit messier than Friedman's, because it starts off with something that clearly is going to work; the first two or three ways you try to establish the bound don't work (believe me, I tried, because I didn't see why some steps in the proof had to be that way), but eventually you find the right way to break up the estimate, and it works.

So here is Nilli’s proof. Read the rest of this entry »

Today, after a lecture in the spectral graph theory boot camp at the Simons Institute, I was asked what the expander mixing lemma looks like in graphs that are not regular.

I don’t know if I will have time to return to this tomorrow, so here is a quick answer.

First, for context, here is the expander mixing lemma for regular graphs. Say that {G=(V,E)} is a {d}-regular undirected graph, and {A} is its adjacency matrix. Then let the eigenvalues of the normalized matrix {\frac 1d A} be

\displaystyle  1 = \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n

We are interested in graphs for which all the eigenvalues are small in absolute value, except {\lambda_1}, that is, if we define

\displaystyle  \sigma_2 := \max\{ |\lambda_2|, |\lambda_3 | , \ldots , | \lambda_n | \}

we are interested in graphs for which {\sigma_2} is small. The expander mixing lemma is the fact that for every two disjoint sets {S} and {T} of vertices we have

\displaystyle  \left | E(S,T) - \frac d n \cdot |S| \cdot |T| \right | \leq \sigma_2 d \sqrt{ |S| \cdot |T|} \ \ \ \ \ (1)

The inequality (1) says that, if {\sigma_2} is small, then the number of edges between every two large sets of vertices is almost determined just by the size of the sets, and it is equal to the expected number of edges between the two sets in a random {d}-regular graph, up to an error term that depends on {\sigma_2}.

For the proof, we observe that, if we call {J} the matrix that has ones everywhere, then

\displaystyle  \sigma_2 = \max_{x \perp (1,\ldots , 1) } \frac { |x^T A x|}{d\cdot x^Tx} = \max_{x} \frac { \left|x^T \left( A - \frac dn J \right) x \right|}{d\cdot x^T x} = \max_{x,y} \frac { \left|x^T \left( A - \frac dn J \right) y \right|}{d\cdot ||x|| \cdot ||y||}

and then we substitute {x := {\mathbf 1}_S} and {y:= {\mathbf 1}_T} in the above expression and do the calculations.
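
As a sketch of what this gives (not from the post; it assumes numpy, and the graph and the sets {S,T} below are arbitrary choices), one can check inequality (1) numerically, for example on the Paley graph with {p=29}.

# Sketch (not from the post): checking inequality (1) on the Paley graph with
# p = 29, for two randomly chosen disjoint sets S and T of vertices.
import numpy as np
rng = np.random.default_rng(0)
p = 29
Q = {(r * r) % p for r in range(1, p)}
A = np.array([[1 if (a - b) % p in Q else 0 for b in range(p)] for a in range(p)])
n, d = p, (p - 1) // 2
eigs = np.sort(np.linalg.eigvalsh(A / d))  # eigenvalues of the normalized matrix (1/d) A
sigma2 = max(abs(eigs[0]), abs(eigs[-2]))  # second largest in absolute value
perm = rng.permutation(n)
S, T = perm[:8], perm[8:18]                # two disjoint sets of vertices
e_ST = A[np.ix_(S, T)].sum()               # number of edges between S and T
lhs = abs(e_ST - d / n * len(S) * len(T))
rhs = sigma2 * d * np.sqrt(len(S) * len(T))
print(lhs, "<=", rhs)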

In the case of an irregular undirected graph {G=(V,E)}, we are going to consider the normalized adjacency matrix {M:= D^{-\frac 12} A D^{-\frac 12 }}, where {A} is the adjacency matrix of {G} and {D} is the diagonal matrix such that {D_{v,v} = d_v}, where {d_v} is the degree of {v}. As in the regular case, the eigenvalues of the normalized adjacency matrix satisfy

\displaystyle  1 = \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq -1

Let us define

\displaystyle  \sigma_2:= \max \{ |\lambda_2| , |\lambda_3| , \ldots, |\lambda_n| \}

the second largest eigenvalue in absolute value of {M}.

We will need two more definitions: for a set of vertices {S}, its volume is defined as

\displaystyle  vol(S):= \sum_{v\in S} d_v

the sum of the degrees of its vertices; we also let {\bar d = \frac 1n \sum_v d_v} denote the average degree, so that {vol(V) = \bar d n }. Now we have

Lemma 1 (Expander Mixing Lemma) For every two disjoint sets of vertices {S}, {T}, we have

\displaystyle   \left | E(S,T) - \frac {vol(S) \cdot vol(T) }{vol(V)} \right | \leq \sigma_2 \sqrt{vol(S) \cdot vol(T) } \ \ \ \ \ (2)

So, once again, we have that the number of edges between {S} and {T} is what one would expect in a random graph in which the edge {(u,v)} exists with probability {d_ud_v/vol(V)}, up to an error that depends on {\sigma_2}.
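
Here is an analogous numerical check of Lemma 1 (my own sketch, assuming numpy; the random graph and the sets below are arbitrary choices) on an irregular graph sampled from {G_{n,\frac 12}}.

# Sketch (not from the post): checking Lemma 1 on an irregular graph sampled
# from G(n, 1/2), for two randomly chosen disjoint sets S and T.
import numpy as np
rng = np.random.default_rng(1)
n = 40
U = np.triu(rng.integers(0, 2, size=(n, n)), k=1)  # random strict upper triangle
A = U + U.T                                        # symmetric 0/1 adjacency matrix, no self-loops
deg = A.sum(axis=1)
assert deg.min() > 0                               # needed to form D^{-1/2}
D_inv_sqrt = np.diag(1 / np.sqrt(deg))
M = D_inv_sqrt @ A @ D_inv_sqrt                    # normalized adjacency matrix
eigs = np.sort(np.linalg.eigvalsh(M))
sigma2 = max(abs(eigs[0]), abs(eigs[-2]))          # second largest in absolute value
perm = rng.permutation(n)
S, T = perm[:10], perm[10:25]                      # two disjoint sets of vertices
vol = lambda X: deg[X].sum()                       # volume = sum of degrees
e_ST = A[np.ix_(S, T)].sum()                       # number of edges between S and T
lhs = abs(e_ST - vol(S) * vol(T) / deg.sum())      # deg.sum() = vol(V)
rhs = sigma2 * np.sqrt(vol(S) * vol(T))
print(lhs, "<=", rhs)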

Read the rest of this entry »

The spectral norm of the infinite {d}-regular tree is {2 \sqrt {d-1}}. We will see what this means and how to prove it.

When talking about the expansion of random graphs, about the construction of Ramanujan expanders, as well as about sparsifiers, community detection, and several other problems, the number {2 \sqrt{d-1}} comes up often, where {d} is the degree of the graph, for reasons that tend to be related to properties of the infinite {d}-regular tree.
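
As a quick numerical illustration (not from the post, and assuming numpy), one can compute the largest adjacency eigenvalue of the {d}-regular tree truncated at a finite depth and watch it increase toward {2\sqrt{d-1}}.

# Sketch (not from the post): the largest adjacency eigenvalue of the d-regular
# tree truncated at depth k increases toward 2*sqrt(d-1) as k grows.
import numpy as np
def truncated_tree_top_eig(d, depth):
    edges, frontier, next_id = [], [0], 1  # vertex 0 is the root
    for _ in range(depth):
        new_frontier = []
        for v in frontier:
            children = d if v == 0 else d - 1  # the root has d children, other internal vertices have d - 1
            for _ in range(children):
                edges.append((v, next_id))
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    A = np.zeros((next_id, next_id))
    for u, v in edges:
        A[u, v] = A[v, u] = 1
    return np.linalg.eigvalsh(A)[-1]
d = 4
print(2 * np.sqrt(d - 1))  # 3.4641...
for depth in (2, 4, 6):
    print(depth, truncated_tree_top_eig(d, depth))  # approaches 3.4641... from below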

Read the rest of this entry »

A regular connected graph is Ramanujan if and only if its Ihara zeta function satisfies a Riemann hypothesis.

The purpose of this post is to explain all the words in the previous sentence, and to show the proof, except for the major step of proving a certain identity.

There are at least a couple of reasons why more computer scientists should know about this result. One is that it is nice to see a connection, even if just at a syntactic level, between analytic facts that imply that the primes are pseudorandom and analytic facts that imply that good expanders are pseudorandom (the connection is deeper in the case of the Ramanujan Cayley graphs constructed by Lubotzky, Phillips and Sarnak). The other is that the argument looks at eigenvalues of the adjacency matrix of a graph as roots of a characteristic polynomial, a view that is usually not very helpful in achieving quantitative results, with the important exception of the work of Marcus, Spielman and Srivastava on interlacing polynomials.

Read the rest of this entry »

In preparation for the special program on spectral graph theory at the Simons Institute, which starts in a week, I have been reading up on the topics of the theory that I don't know much about: the spectrum of random graphs, properties of highly expanding graphs, spectral sparsification, and so on.

I have been writing some notes for myself, and here is something that bothers me: what do you call the second largest, in absolute value, eigenvalue of the adjacency matrix of a graph, without resorting to the sentence I just wrote? And how do you denote it?

I have noticed that the typical answer to the first question is “second eigenvalue,” but this is a problem when it creates confusion with the actual second largest eigenvalue of the adjacency matrix, which could be a very different quantity. The answer to the second question seems to be either a noncommittal “{\lambda}” or a rather problematic “{\lambda_2}.”

For my own use, I have started to use the notation {\lambda_{2abs}}, which can certainly use some improvement, but I am still at a loss concerning terminology.

Perhaps one should start from where this number comes from, and it seems that its important property is that, if the graph is {d}-regular, has {n} vertices, and has adjacency matrix {A}, then this number is the spectral norm of {A - \frac dn J} (where {J} is the matrix with ones everywhere), so that it measures the distance of {A} from the "perfect {d}-regular expander" in a norm that is useful to reason about cuts and also tractable to compute.
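
For what it's worth, here is a small check of that property (my own sketch, assuming numpy): for a {d}-regular graph, the second largest eigenvalue in absolute value coincides with the spectral norm of {A - \frac dn J}; the Cayley graph of {{\mathbb Z}/16{\mathbb Z}} below is an arbitrary example.

# Sketch (not from the post): for a d-regular graph with adjacency matrix A,
# the second largest eigenvalue in absolute value equals the spectral norm of
# A - (d/n) J, checked here on an arbitrary small Cayley graph of Z_16.
import numpy as np
n = 16
S = {1, n - 1, 3, n - 3, 8}  # a symmetric generating set; the graph is 5-regular
A = np.array([[1.0 if (a - b) % n in S else 0.0 for b in range(n)] for a in range(n)])
d = len(S)
eigs = np.sort(np.linalg.eigvalsh(A))
lambda_2abs = max(abs(eigs[0]), abs(eigs[-2]))        # max_{i >= 2} |lambda_i|
J = np.ones((n, n))
spectral_norm = np.linalg.norm(A - d / n * J, ord=2)  # largest singular value
print(np.isclose(lambda_2abs, spectral_norm))         # True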

So, since it is the spectral norm of a modification of the adjacency matrix, how about calling it {<}adjective{>} spectral norm? I would vote for shifted spectral norm because I would think of subtracting {\frac dn J} as a sort of shift.

Please, do better in the comments!

This year, perhaps because of a mistake, the winners of the Fields Medals and the Nevanlinna Prize were made public before the opening ceremony of the ICM.

Congratulations to my former colleague Maryam Mirzakhani for being the first Fields Medal winner from Iran, a nation that can certainly use some good news, and a nation that has always done well in identifying and nurturing talent in mathematics and related fields. She is also the first woman to receive this award in its 78-year history.

And congratulations to Subhash Khot for a very well deserved Nevanlinna Prize; one can read about his work in his own words, in my words, and about the latest impact of his work in the words of Barak and Steurer.

The Simons Foundation has excellent articles up about their work and the work of Artur Avila, Manjul Bhargava, and Martin Hairer, the other Fields Medal recipients. An unusual thing about Manjul Bhargava's work is that one can actually understand the statements of some of his results.

The New York Times has a fascinating article according to which the Fields Medal got its current status because of Steve Smale and cold war paranoia. I don’t know if they are overstating their case, but it is a great story.

A few weeks ago, the Proceedings of the National Academy of Sciences published an article on a study conducted by a group of Cornell researchers at Facebook. They picked about 600,000 users and then, for a week, a subset of them saw fewer “negative” posts (up to 90% were filtered) than they would otherwise see, a subset saw fewer “positive” posts (same), and a control group got a random subset.

After the week, the users in the “negative” group posted fewer, and more negative, posts, and those in the “positive” group posted more, and more positive, posts.

Posts were classified according to an algorithm called LIWC2007.

The study runs contrary to the conventional wisdom that people find it depressing to see on Facebook good things happening to their friends.

The paper has caused considerable controversy for being a study with human subjects conducted without explicit consent. Every university, including of course Cornell, requires experiments involving people to be approved by a special committee, and participants must sign informed consent forms. Facebook maintains that the study is consistent with its terms of service. The highly respected privacy organization EPIC has filed a complaint with the FTC. (And they have been concerned with Facebook’s terms of service for a long time.)

Here I would like to explore a different angle: almost everybody thinks that observational studies about human behavior can be done without informed consent. This means that if the Cornell scientists had run an analysis on old Facebook data, with no manipulation of the feed generation algorithm, there would not have been such a concern.

At the same time, the number of posts that are fit for the feed of a typical user vastly exceeds what can fit in one screen, and so there are algorithms that pick a rather small subset of posts that are evaluated to be of higher relevance, according to some scoring function. Now suppose that, if N posts fit on the screen, the algorithm picks the 2N highest scoring posts, and then randomly picks half of them. This seems rather reasonable because the scoring function is going to be an approximation of relevance anyway.

The United States has roughly 130 million Facebook subscribers. Suppose that the typical user looks, in a week, at 200 posts, which seems reasonable (in our case, those would be a random subset of roughly 400 posts). According to the PNAS study, roughly 50% of the posts are positive and 25% are negative, so of the initial 400, roughly 200 are positive and 100 are negative. Let’s look at the 100,000 users for which the random sampling picked the fewest positive posts: we would be expecting roughly 3 standard deviations below the mean, so about 80 positive posts instead of the expected 100; the 100,000 users with the fewest negative posts would get about 35 instead of the expected 50.

This is much less variance than in the PNAS study, where they would have got, respectively, only 10 positive and only 5 negative, but it may have been enough to pick up a signal.

Apart from the calculations, which I probably got wrong anyway, what we have is that in the PNAS study they picked a subset of people and then they varied the distribution of posts, while in the second case you pick random posts for everybody and then you select the users with the most variance.

If you could arrange distributions so that the distributions of posts seen by each user are the same, would it really be correct to view one study as experimental and one as observational? If the PNAS study had filtered 20% instead of 90% of the positive/negative posts, would it have been ethical? Does it matter what the intention is when designing the randomized algorithm that selects posts? If Facebook were to introduce randomness in the scoring algorithm with the goal of later running observational studies, would it be ethical? Would they need to let people opt out? I genuinely don’t know the answers to these questions, but I haven’t seen them discussed elsewhere.

I have always known that I don’t completely understand how fiat money works; however, I have recently realized that I don’t understand it at all! Maybe some of my readers can clarify my confusion.

So let’s start a national economy from scratch, without international trade: a group of people move to a deserted island and declare independence; the people have all kinds of skills, they bring all kinds of materials and machinery with them, the island has all kinds of natural resources, and maybe a lottery system gives some people ownership of various plots of land and mining rights.

Now some would-be entrepreneurs would like to begin hiring people with the right skills, buying or renting various stuff, and starting some businesses. It seems that there is no loss of generality if I think of the entrepreneurs as just one person for the sake of what I want to think about. She is going to need to get a loan to start her business(es). Meanwhile, the new island government creates a central bank, which issues theory dollars, or thollars, which are the currency of the island. The central bank “creates” thollars and lends them to banks, then the banks keep a fractional reserve and lend to the entrepreneur. Again, for the sake of what I want to say, there is no loss of generality if I identify the central bank and the other banks as just one entity.

So the central bank lends thollars to the entrepreneur, and she uses the money to start the business and begin to pay people. At this point money starts circulating in the economy, and people will hire gardeners and babysitters, and lawyers, they will give to charities, they will hire computer science theory tutors for their kids, they will buy and sell houses to each other, and, crucially, will buy whatever goods and services the entrepreneur is selling. Now she is making a profit and she can pay back the loan to the central bank and invest in more … wait, she can never pay back the loan!

That’s because all the thollars in circulation are the ones that the central bank lent her, so there is no way that, as the money circulates, she can ever make more money than she owes!

Ok, so maybe people will also take loans to buy houses and stuff, so that’s more money that circulates, but is it really the case that, overall, it is impossible for everybody to be debt-free? That all the cash that any debt-free person has needs to be compensated by an equivalent amount of debt from other people? This is not how things seem to be in practice.

Now, clearly it is possible for everybody to have positive net worth, because, at the start, people have stuff and that stuff is worth money, but it seems strange that not everybody can be debt-free.

Maybe the problem is deflation? That if the entrepreneur borrowed money to create a business that creates new wealth (because it makes stuff that people find more valuable than the value of the raw material and the value of the work that went into it), but the amount of circulating money stays the same, then there is deflation, and her debt is spiraling out of control in deflation-adjusted terms, even if the interest rate is zero?

It seems that there are only two ways in which you can have everybody be debt-free and have a positive amount of cash: (i) the central bank starts buying stocks of private companies, (ii) the government runs a deficit, and the central bank buys government debt.

Is this correct? Normally, we think of people and companies being debt-free (or having more cash than debt) as ideal, and a government running no deficit as ideal, and usually central banks don’t buy stocks (they only buy bonds, which is formally equivalent to lending), so are these three conditions contradictory?

Where you least expect them:

a common [definition] for “population” is a geographical cluster of people who mate more within the cluster than outside of it

