Today I would like to discuss the Khot-Naor approximation algorithm for the 3-XOR problem, and an open question related to it.

In 3-XOR, we have a system of linear equations modulo 2, with three variables per equation, that might look something like

The above system is not satisfiable (if we add up the left-hand sides we get 0, if we add up the right-hand sides we get 1), but it is possible to satisfy of the equations, for example by setting all the variables to 1. In Max 3-XOR problem (which we will simply refer to as “3-XOR” from now on), given a system of equations mod 2 with three variables per equation, we want to find an assignment that satisfies as many equations as possible.

Either setting all the variables to zero or setting all the variables to one will satisfy half of the equations, and the interesting question is how much better than 1/2 it is possible to do on a given instance. Khot and Naor provide an algorithm that, given an instance in which it is possible to satisfy an fraction of equations, finds a solution that satisfies at least fraction of equations, where is the number of variables. The algorithm is randomized, it runs in polynomial time, and it succeeds with high probability. I believe that it is still the state of the art in terms of worst-case approximation guarantee.

Like the approximation algorithm for sparsest cut in Abelian Cayley graphs implied by the result of Bauer et al. that was the subject of the last two posts, the result of Khot and Naor *does not* prove a bound on the integrality gap of a relaxation of the problem.

I will describe the Khot-Naor algorithm and describe how it manages to use convex optimization to provide an approximation algorithm, but without establishing an integrality gap bound. I thank my student Lucas Pesenti for explaining the algorithm to me and for thinking together about this problem.

If our 3-XOR instance has equations and variables, then the problem of maximizing the number of satisfied equations can be rewritten as

so that our goal is to approximate the combinatorial optimization problem

Up to a constant factor loss in the approximation guarantee, Khot and Naor show that the above is equivalent to

where is a symmetric 3-tensor with entries in and with non-zero entries.

Before continuing, let us recall that if is an matrix, then its -to- operator norm has the characterization

We could also define the “Grothendieck norm” of a matrix as the following natural semidefinite programming relaxation of the -to- norm:

where the and are arbitrary vectors. The Grothendieck inequality is

where the is an absolute constant, known to be less than . Furthermore, the above inequality has a constructive proof, and it leads to a polynomial time constant factor approximation for the problem of finding values and in that maximize (2).

Basically, we can see problem (1) as the natural generalization of (2) to tensors, and one would like to see a semidefinite programming relaxation of (1) achieving something resembling the Grothendieck inequality, but with a loss of something like . As I mentioned above, this remains an open question, as far as I know.

The idea of Khot and Naor is the following. Suppose that we are given an instance of problem (1), and suppose that is an optimal solution, and let us call

the value of the optimum (the algorithm will not need to know or guess ).

The key step is now to see that if we pick a *random* , there is at least a probability that

This is a bit difficult, but it is really easy to see that with probability we have

and we can do that by seeing that by defining a vector such that , so that

So we have

which, using Cauchy-Schwarz, gives

Now, for a random , we have

and

so by Paley–Zygmund we have, let’s say

which, together with the definition of and the fact that the distribution of is symmetric around zero, gives us the claim (3).

Now suppose that we have been lucky and that we have found such an . We define the matrix

and we see that our claim can be written as

At this point we just apply the algorithm implied by the Grothendieck inequality to the matrix , and we find and in such that

meaning that

Summarizing, our algorithm is to pick a random vector and to find a constant-factor approximation for the problem

using semidefinite programming. We do that times, and take the best solution.

The analysis can be turned into an upper bound certificate in the following way. For the (suboptimal) analysis using Paley-Zygmund, we only need the entries of the random to be 4-wise independent, and there are distributions on where the entries are unbiased and 4-wise independent, and such that the sample space is of size polynomial in . Thus, one could write an SDP relaxation of (4) for each in the support of such a distribution, and then take the maximum of these SDPs, multiply it by , and it would be a certified upper bound. Such an upper bound, however, would not come from a relaxation of the 3-XOR problem, and I find it really strange that it is not clear how to turn these ideas into a proof that, say, the standard degree-4 sum-of-squares semidefinite programming relaxation of 3-XOR has an integrality gap at most .

Luca, thanks for the post!

I think this paper (https://arxiv.org/pdf/1611.05998.pdf) does something related to what you are looking for.

They are working over the sphere rather than the hypercube.

But my recollection is that they show integrality gap upper bounds for the usual SoS relaxation.