20090923


We're going to prove that a problem is NP-Complete by using a reduction from CNF-SAT.

The problem we will address is the k-Clique Problem:  Given a graph G and an integer k, does G contain a set of k vertices that are all mutually adjacent (connected by edges)?

First we observe that k-Clique is in NP:  if the answer to an instance of k-Clique is YES and we know the details of the answer, we can easily verify that the required edges are present in the graph.

To show that CNF-SAT reduces to k-Clique, we start with an arbitrary instance of CNF-SAT: a boolean expression E in CNF.

We need to construct an instance of k-Clique (i.e. a graph G and an integer k) in such a way that G has a k-clique iff E is satisfiable.

Note that even though we cannot put any restrictions on E, we can use all details of  E when we create G and k.  

To create the vertex set of G, we create a group of  vertices for each clause of E (call these groups clause-groups because each one corresponds to a specific clause of E), and label each vertex with one of the literals from the corresponding clause.  This may result in many vertices getting the same label - this is ok.

To create the edge set of G, we add an edge between every pair of vertices except for
    - vertices in the same clause-group
    - vertices whose labels are direct negations of each other (for example, a vertex labelled "x1" and another vertex labelled "not x1")

Clearly G can be constructed in O(n^2) time, where n is the length of E.

To complete the constructed instance of k-Clique we need the integer k.  Note that any clique in G can contain at most one vertex from each of the clause-groups, so the number of clauses in E is an upper bound on the size of the largest clique in G.  It turns out this is exactly the size of clique we want to find, so we let k = the number of clauses in E.

So far we have shown that we can transform any instance of CNF-SAT into an instance of k-Clique in polynomial time.  Now we have to show that the reduction is answer-preserving.

We will show that the answer to the instance of CNF-SAT is YES iff the answer to the instance of k-Clique is YES.  (We could also show "NO iff NO", or "CNF-SAT YES implies k-Clique YES, CNF-SAT NO implies k-Clique NO" or "k-Clique YES implies CNF-SAT YES, k-Clique NO implies CNF-SAT NO").

First direction.  Suppose the CNF-SAT answer is YES - i.e. suppose E is satisfiable.  Consider any assignment of True and False to the literals in E that makes E true. There must be at least one literal in each clause that is true.  For each clause, choose a literal in that clause that is true, and select the corresponding vertex in G.  Claim: these vertices must form a k-clique.  First, there are certainly k of them.  Suppose that some pair of these vertices are not adjacent.  That could only happen if they were in the same clause-group (they aren't) or their labels negated each other.  But if their labels negated each other, that would mean some literal was both true and false in the truth assignment that satisfies E - which can't happen.  Thus all of the selected vertices are adjacent to each other, and they form a k-clique.

Second direction.  Suppose G contains a k-clique.  Due to the way we constructed G, the k-clique must contain one vertex from each clause-group.  For each of the vertices in the k-clique, select the corresponding literal in the corresponding clause of E.  Set all these literals to be True.  There is no danger of attempting to assign True to a variable and to its negation, because the k-clique in G cannot contain any edges between things that negate each other.  Now we have one True literal in each clause.  Assign True and False to any unassigned literals in any consistent way (ie don't attempt to assign both True and False to the same variable) .  These assignments are effectively irrelevant because we already have one True literal in each clause of E, which means that E is satisfiable.

Thus our polynomial time reduction from CNF-SAT to k-Clique is answer preserving.  Therefore k-Clique is NP-Complete.



Note that when we said "suppose E is satisfiable", we never said anything about how we obtain the truth assignment that we then use to select vertices in G.  Similarly when we said "Suppose G contains a k-clique" we never said anything about how we find the k-clique that lets us choose literals in E.  There is no suggestion in this proof that either of these problems can be solved in polynomial time (in fact, the implication is that they cannot).  What the proof shows is that IF the answer to either instance is YES, then the answer to the other instance is also YES.  

We are still no closer to solving either of these problems but now that we know they are both NP-Complete, we know that finding a polynomial-time algorithm for either one would also solve the other one ... and in practice, we conclude that k-Clique is almost certainly not solvable in polynomial time.