Approximation Algorithms
The Vertex Cover Problem
Given
a graph G on n vertices and m edges, we want to find the smallest
set A of vertices such that each edge of G has at least one end in the
set A. (This has applications in network monitoring and
maintenance.)
This problem is NP-Complete (to be more precise,
it is the related decision problem that is NP-complete but the decision
problem and the optimization problem are so closely related that we can
treat them as the same problem) so we should not expect to find a
polynomial-time algorithm that will correctly solve all instances.
The question to be addressed is whether we can find a
polynomial-time algorithm which always gets reasonably close to the
optimal answer.
We might consider approaching this problem with a greedy method, such as
repeat
choose a vertex with the largest number of incident edges
add this vertext to A
remove from G all the edges incident on the vertex just chosen
until G has no remaining edges
Exercise 35.1-3 of the text asks you to determine if this algorithm always finds a good solution.
We will consider an algorithm which initially seems too simple-minded to be likely to give a good result:
repeat
arbitrarily select an edge of G
add both ends of the selected edge to A
remove from G all edges incident on either of the two vertices just added to A
until G has no remaining edges
There are some obvious questions to ask about this algorithm
1. Does it run in polynomial time?
The repeat loop executes at most O(n^2) times, and
each iteration of the loop requires no more work than a traversal of
the list of the edges ... this clearly takes polynomial time.
2. Does it find a vertex cover?
The edges that are removed on each iteration are all
covered by the selected vertices. By repeating the loop until all
edges are removed from G, we ensure that all edges of G are covered by
the selected set of vertices.
3. Does it find a good vertex cover?
This is the interesting question, and it
turns out to be pretty straightforward to answer.
Let's let C*
be an optimal solution to this problem. A is the set of vertices
chosen by the algorithm. Let B be the set of edges chosen during
the execution of the algorithm. (Note that B is not the solution returned by the algorithm. That solution is A. The set B is useful for our analysis of the algorithm.)
Since
C* is a vertex cover, we know it covers the edges in B.
Furthermore, since none of the edges in B share endpoints, it is
clear that no vertex in C* can cover more than one edge in B.
Thus for each edge in B, we can find a vertex in C* that covers
that edge and no other edge in B. This leads to the conclusion
that |C*| >= |B|
The significance of this is that the size of
B is a lower bound on the size of the optimal solution. Note that
because the algorithm chooses edges arbitrarily, we have no way of
predicting the size of B - and it may be different if we run the
algorithm twice on the same graph. However, as we have seen, the
size of B will never exceed the size of the optimal solution.
Now
we will relate the size of B to the size of A. This is very easy
- for each edge in B, two vertices are added to A. So |A| = 2|B|
Putting these two pieces together, we see that |A| <= 2|C*|
This
is exactly where we want to be: we have shown that no matter how the
algorithm chooses its edges, the size of A will never be more than
twice the size of the optimal solution.