20090923
We're going to prove that a problem is NP-Complete by using a reduction from CNF-SAT.
The problem we will address is the k-Clique Problem:
Given a graph G and an integer k, does G contain a set of k
vertices that are all mutually adjacent (connected by edges)?
First
we observe that k-Clique is in NP: if the answer to an instance
of k-Clique is YES and we know the details of the answer, we can easily
verify that the required edges are present in the graph.
To show
that CNF-SAT reduces to k-Clique, we start with an arbitrary instance of CNF-SAT: a boolean expression E in CNF.
We
need to construct an instance of k-Clique (i.e. a graph G and an
integer k) in such a way that G has a k-clique iff E is satisfiable.
Note that even though we cannot put any restrictions on E, we can use all details of E when we create G and k.
To create the vertex set of G, we create a group of vertices for each clause of E (call these groups clause-groups
because each one corresponds to a specific clause of E), and label each
vertex with one of the literals from the corresponding clause.
This may result in many vertices getting the same label - this is
ok.
To create the edge set of G, we add an edge between every pair of vertices except for
- vertices in the same clause-group
- vertices whose labels are direct negations of each other (for
example, a vertex labelled "x1" and another vertex labelled "not x1")
Clearly G can be constructed in O(n^2) time, where n is the length of E.
To
complete the constructed instance of k-Clique we need the integer k.
Note that any clique in G can contain at most one vertex from
each of the clause-groups, so the number of clauses in E is an upper
bound on the size of the largest clique in G. It turns out this
is exactly the size of clique we want to find, so we let k = the number
of clauses in E.
So far we have shown that we can transform any
instance of CNF-SAT into an instance of k-Clique in polynomial time.
Now we have to show that the reduction is answer-preserving.
We
will show that the answer to the instance of CNF-SAT is YES iff the
answer to the instance of k-Clique is YES. (We could also show
"NO iff NO", or "CNF-SAT YES implies k-Clique YES, CNF-SAT NO implies
k-Clique NO" or "k-Clique YES implies CNF-SAT YES, k-Clique NO implies CNF-SAT NO").
First direction. Suppose the CNF-SAT answer is
YES - i.e. suppose E is satisfiable. Consider any assignment of
True and False to the literals in E that makes E true. There must be at
least one literal in each clause that is true. For each clause,
choose a literal in that clause that is true, and select the
corresponding vertex in G. Claim: these vertices must form a
k-clique. First, there are certainly k of them. Suppose
that some pair of these vertices are not adjacent. That could
only happen if they were in the same clause-group (they aren't) or
their labels negated each other. But if their labels negated each
other, that would mean some literal was both true and false in the
truth assignment that satisfies E - which can't happen. Thus all
of the selected vertices are adjacent to each other, and they form a
k-clique.
Second direction. Suppose G contains a k-clique.
Due to the way we constructed G, the k-clique must contain one
vertex from each clause-group. For each of the vertices in the
k-clique, select the corresponding literal in the corresponding clause
of E. Set all these literals to be True. There is no danger
of attempting to assign True to a variable and to its negation, because
the k-clique in G cannot contain any edges between things that negate
each other. Now we have one True literal in each clause.
Assign True and False to any unassigned literals in any
consistent way (ie don't attempt to assign both True and False to the
same variable) . These assignments are effectively irrelevant
because we already have one True literal in each clause of E, which
means that E is satisfiable.
Thus our polynomial time reduction from CNF-SAT to k-Clique is answer preserving. Therefore k-Clique is NP-Complete.
Note
that when we said "suppose E is satisfiable", we never said anything
about how we obtain the truth assignment that we then use to select
vertices in G. Similarly when we said "Suppose G contains a
k-clique" we never said anything about how we find the k-clique that
lets us choose literals in E. There is no suggestion in this
proof that either of these problems can be solved in polynomial time
(in fact, the implication is that they cannot). What the proof
shows is that IF the answer to either instance is YES, then the answer
to the other instance is also YES.
We are still no closer
to solving either of these problems but now that we know they are both
NP-Complete, we know that finding a polynomial-time algorithm for
either one would also solve the other one ... and in practice, we
conclude that k-Clique is almost certainly not solvable in polynomial
time.