L4: Graph Algorithms II

In this lab we consider the remaining problems from Jutge.org, section EDA \rightarrow Graph Algorithms. The problems of this section that we see in this lab are of two different types.

Dijkstra’s Algorithm

Given a weighted directed graph G = (V,E), Dijkstra’s algorithm (1959) finds the shortest paths from a given vertex (the source) to all other vertices in the directed graph. The weights on the edges are given by a function w\colon E\to \mathbb R^+.

Given two vertices u,v in G, their distance1 d_G(u,v) in the weighted directed graph G is the minimum weight of a path from u to v, where the weight of a path is taken as the sum of the weights of the edges in it.

If weights were allowed to be negative, the directed graph could contain negative-weight cycles, and then shortest paths would not be well defined.

Dijkstra’s algorithm works iteratively. The input is a weighted directed graph G=(V,E) and a vertex s. The algorithm will return a vector D, where D[u] is the minimum distance of u from the starting point s.

On each iteration the set of vertices is partitioned into two parts: the vertices that we have already visited and those that have not been visited yet, also called candidates. The invariant of the loop establishes that

  1. For all visited vertices u, D[u] is the minimum weight of a path from s to u,
  2. For all candidate vertices u, D[u] is the weight of the shortest path from s to u which passes exclusively through visited vertices, i.e., only the last vertex in the path, u, is a candidate.

Each iteration of the main loop selects a candidate vertex u, updates D and marks u as visited. When all vertices have been visited, we have the desired answer.

Which candidate vertex do we need to select on each iteration of Dijkstra’s algorithm?

Intuitively, the candidate to choose is the one with minimal D. Let u be that vertex. According to the invariant, D[u] is the minimum weight among all paths from s to u that pass only through visited vertices. But it is also the minimum weight over all paths. Indeed, suppose there were a shorter path from s to u; it would have to go through some candidate vertex, and we may take x to be the first candidate vertex on it. The weight of that path would be at least D[x] + d_G(x,u), and strictly smaller than D[u]. Since d_G(x,u) \geq 0 we would have D[x] < D[u], and that’s a contradiction since by assumption D[u] is minimal among all candidate vertices.

As we have just seen, D[v] is the minimum weight of a path from s to v for all visited vertices v, as well as for the vertex u that is about to be visited.

We now need to consider how to maintain the second part of the invariant. Consider a candidate vertex v. Since u becomes a visited vertex in the current iteration, we may have a new shortest path from s to v going only through visited vertices, which now include u. A simple reasoning shows that if such a shortest path includes u, then u must be the immediate predecessor of v in the path (assuming the contrary leads to a contradiction).

It follows that D[v] can change if and only if v is a successor of u; in particular, if D[v] > D[u] + w, where w is the weight of the edge (u,v), then D[v] must be updated: D[v] := D[u] + w. The condition above is never true if v has already been visited.

See an animation of a run of Dijkstra’s algorithm!

Or also this animation from Wikipedia.
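A possible C++ implementation with simple arrays is sketched below (only a sketch: the representation of the digraph as adjacency lists of (successor, weight) pairs and the name dijkstra are assumptions, not something fixed by the exercises):

#include <vector>
#include <list>
#include <utility>
#include <limits>
using namespace std;

using arc = pair<int,double>;        // (successor, weight); an assumed representation
using digraph = vector<list<arc>>;

const double INF = numeric_limits<double>::infinity();

// Returns D, where D[v] is the distance from s to v (INF if unreachable).
vector<double> dijkstra(const digraph& G, int s) {
  int n = G.size();
  vector<double> D(n, INF);
  vector<bool> visited(n, false);    // a vertex is a candidate iff not visited
  D[s] = 0.0;

  for (int it = 0; it < n; ++it) {
    // select the candidate u with minimum D[u]
    int u = -1;
    for (int i = 0; i < n; ++i)
      if (not visited[i] and (u == -1 or D[i] < D[u])) u = i;
    if (u == -1 or D[u] == INF) break;   // the remaining candidates are unreachable
    visited[u] = true;
    // update D for the successors of u
    for (const arc& a : G[u]) {
      int v = a.first;
      double w = a.second;
      if (D[u] + w < D[v]) D[v] = D[u] + w;
    }
  }
  return D;
}

With this representation, each iteration scans all vertices to select the minimum, which corresponds to the \Theta(n^2 + m) cost discussed below.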

Cost of Dijkstra’s Algorithm

Let n = |V(G)| and m = |E(G)|, as usual. Assume that the digraph is implemented with an adjacency matrix. Then the cost of the algorithm is \Omega(n^2), since there are n iterations and we incur a cost \Omega(n) to go through the successors v of the chosen candidate (because we are scanning the corresponding row in the matrix). Finding the minimum D at each iteration is O(n) and therefore the cost of the algorithm is actually \Theta(n^2).

If the graph is implemented with adjacency lists, then the cost of the internal loop that goes through the successors of u is \Theta(\mathrm{deg}(u)\cdot \text{cost of updating }D).

If the set of candidates and the table D are implemented as simple arrays (cand[i] = true if and only if i is a candidate), then the cost of the algorithm is \Theta(n^2 + m) = \Theta(n^2), since we incur a cost \Theta(i + \mathrm{deg}(u)) during the iteration in which there are still i candidates and u is selected to be visited.

To improve the efficiency we need to convert cand/D to a priority queue; the elements are the candidate vertices and the priority of candidate v is its D[v]. This will give (see theory slides) a worst case cost of \Theta((n+m)\log n).
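A possible sketch of this version is the following (same assumed representation and naming conventions as in the previous sketch; outdated entries left in the queue are discarded lazily when popped, instead of decreasing their priority in place):

#include <vector>
#include <list>
#include <queue>
#include <functional>
#include <utility>
#include <limits>
using namespace std;

using arc = pair<int,double>;        // (successor, weight); an assumed representation
using digraph = vector<list<arc>>;

const double INF = numeric_limits<double>::infinity();

vector<double> dijkstra_pq(const digraph& G, int s) {
  int n = G.size();
  vector<double> D(n, INF);
  vector<bool> visited(n, false);
  // min-priority queue of pairs (tentative distance, vertex)
  priority_queue<pair<double,int>,
                 vector<pair<double,int>>,
                 greater<pair<double,int>>> Q;
  D[s] = 0.0;
  Q.push({0.0, s});

  while (not Q.empty()) {
    int u = Q.top().second;
    Q.pop();
    if (visited[u]) continue;        // outdated entry: u was already visited
    visited[u] = true;
    for (const arc& a : G[u]) {
      int v = a.first;
      double w = a.second;
      if (D[u] + w < D[v]) {
        D[v] = D[u] + w;
        Q.push({D[v], v});
      }
    }
  }
  return D;
}

With lazy deletion the queue may hold up to m entries, but since \log m = O(\log n) the cost remains O((n + m)\log n).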

Jutge exercises

The Jutge exercises that can be thought of as variations on Dijkstra’s algorithm are the following:

Once you have chosen a problem to solve, download the .zip archive of the problem and unzip it. In the resulting folder you will find:

  • a .pdf file with the official problem statement
  • one or more text files with sample inputs (sample.inp)
  • the corresponding correct outputs (sample.cor)

After creating a C++ source file to solve the problem, compile it with g++ -std=c++11.

Common error(s)

Not using the flag -std=c++11.

Possible useful flags are -Wall -Wextra -Wno-overflow -Wpedantic -Werror -D_GLIBCXX_DEBUG

Test the executable file obtained on the given example(s), for instance with ./a.out < sample.inp > output. Also, test your code against simple/trivial inputs, e.g. a graph with no edges.

Common error(s)

Writing the input by hand.
Not checking the exact formatting of the output.

To check whether the output is the same as sample.cor, one way is to use diff output sample.cor

Once the code compiles and gives the correct output on the given samples, submit it to the Jutge for automatic evaluation.

Common error(s)

Sending to the Jutge code that was not even compiled locally.
Not checking if the output was correct on the given inputs.

You are “stuck” on a problem if you have spent around 1 hour trying to solve it but the Jutge is still rejecting your solution.

Possible approaches to getting “un-stuck” are:

  • do not think about that problem for some time and come back to your code with fresh eyes

  • look again at the theory

  • try to explain your code to another student

  • ask for hints, e.g. by email. Be specific about the type of help you need. Include in the email the code of your approach.


From this block of exercises, in class we solve together the ones marked with 🔶. They are the same as the ones listed below.2

Minimum Spanning Trees

Given a weighted undirected connected graph G = (V,E), with weights w\colon E \to\mathbb R, a minimum spanning tree (MST) of G is a spanning tree T whose total weight w(T) = \sum_{e\in E(T)} w(e) is minimum among all possible spanning trees of G.

The Jarník-Prim algorithm is a greedy algorithm that, given as input a weighted undirected connected graph, finds a minimum spanning tree of it. In this algorithm we maintain a subset of visited vertices; each iteration of the algorithm selects the edge of minimum weight that joins a visited vertex u with a non-visited vertex v. Such an edge cannot create a cycle, and it is added to the current set of edges of the spanning tree; furthermore, vertex v is marked as visited. The set of edges built so far is a spanning tree of the visited vertices. Since at each step we select the edge of minimum weight crossing from the visited to the non-visited vertices, at the end the resulting set of edges will be a spanning tree of minimum weight.

See an animation of a run of Jarník-Prim’s algorithm!

A possible C++ implementation of Jarník-Prim algorithm is the following:

#include <vector>
#include <list>
#include <queue>
#include <functional>
#include <utility>
using namespace std;

using edge = pair<int,int>;
using wedge = pair<double,edge>;  // weighted edge: (weight, (u,v))
using graph = vector<list<wedge>>;

// Assumes G is connected and MST is empty on entry.
void Prim(const graph& G, list<edge>& MST, double& MST_weight) {
  int n = G.size();
  vector<bool> visited(n, false);
  // min-priority queue of candidate edges, ordered by weight
  priority_queue<wedge, vector<wedge>, greater<wedge>> Candidates;

  visited[0] = true;
  MST_weight = 0.0;

  for (const wedge& e : G[0]) Candidates.push(e);

  while (int(MST.size()) < n - 1) {
    double weight = Candidates.top().first;
    edge uv = Candidates.top().second;
    Candidates.pop();
    int v = uv.second;
    if (not visited[v]) {
      visited[v] = true;
      MST.push_back(uv);
      MST_weight += weight;
      for (const wedge& e : G[v]) Candidates.push(e);
    }
  }
}
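As an illustration, the function above could be used as follows; the input format assumed here (n and m on the first line, then m lines u v w with 0-based vertex numbers) is just a made-up example and not the format of any particular Jutge problem:

#include <iostream>
// ... plus the includes and declarations of the listing above

int main() {
  int n, m;
  cin >> n >> m;                     // assumed format: number of vertices and edges
  graph G(n);
  for (int i = 0; i < m; ++i) {
    int u, v;
    double w;
    cin >> u >> v >> w;
    G[u].push_back({w, {u, v}});     // undirected edge: store it in both adjacency lists
    G[v].push_back({w, {v, u}});
  }
  list<edge> MST;
  double MST_weight;
  Prim(G, MST, MST_weight);
  cout << MST_weight << endl;        // e.g. print only the total weight of the MST
}
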
Cost of Jarník-Prim’s Algorithm

The cost of Jarník-Prim’s algorithm depends on how we implement the selection of the minimum weight edge. If we use a priority queue, the selection of the edge to add at each step can be done in time \Theta(\log m) = \Theta(\log n). We incur an additional \Theta(1) for the visit of a vertex on each iteration. The internal loop through the adjacent vertices, adding new candidate edges and removing edges that connect the just visited vertex with previously visited vertices, has cost \Theta(\mathrm{deg}(v)\cdot \log m) = \Theta(\mathrm{deg}(v)\cdot \log n) when the currently visited vertex is v.

If we add the contributions of the costs for all vertices v (the main loop visits one vertex on each iteration) we get \Theta(m\log n).
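In symbols, using that \sum_{v\in V} \mathrm{deg}(v) = 2m:

\sum_{v\in V} \Theta(\mathrm{deg}(v)\cdot \log n) \;=\; \Theta\Big(\log n \cdot \sum_{v\in V} \mathrm{deg}(v)\Big) \;=\; \Theta(m \log n).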

Jutge exercises

The Jutge exercises that can be thought of as variations on Jarník-Prim’s algorithm are the following:

Remember that for any doubts about the exercises you can also ask me by email, up until 6 days before the lab exam.

Footnotes

  1. This is not really a distance because in general it is not symmetric.↩︎

  2. Recall that marks possibly harder problems.↩︎