This presentation is an adaptation of material from Antoni Lozano and Conrado Martinez
Before we start: Questions on previous material?
In a Breadth-First Search (BFS) (cat: recorregut en amplada, esp: recorrido en anchura) of a graph G from a source vertex s, vertices are visited in non-decreasing order of their distance to s.
Note
If d = the distance to s of the last visited vertex, then every vertex at distance strictly smaller than d has already been visited.
#include <vector>
#include <list>
#include <queue>
using namespace std;

using graph = vector<vector<int>>;   // adjacency lists (assumed representation)

// Returns the list of visited vertices, in the order they are visited.
list<int> bfs(const graph& G) {
    int n = G.size();
    list<int> L;
    queue<int> Q;
    vector<bool> vis(n, false);
    for (int u = 0; u < n; ++u) {
        Q.push(u);
        while (not Q.empty()) {
            int v = Q.front();
            Q.pop();
            if (not vis[v]) {
                vis[v] = true;
                L.push_back(v);
                for (int w : G[v]) {
                    Q.push(w);
                }
            }
        }
    }
    return L;
}
bfs returns the list of visited vertices, in the order in which they are visited.
bfs is the same as an iterative depth-first search, but using a queue instead of a stack (see last class).
A vertex may enter Q more than once, but it is visited (marked and appended to L) exactly once, and its adjacency list is traversed exactly once. Hence the total cost of BFS is
\begin{align*} \text{BFS-cost}&=\sum_v(\Theta(1)+\Theta(\deg(v))) \\ &=\Theta(|V|)+\Theta(|E|) \end{align*}
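For concreteness, here is a minimal usage sketch of bfs (the example graph is invented for illustration and assumes the adjacency-list type graph = vector<vector<int>> declared with the code above):

#include <iostream>

int main() {
    // Undirected graph with edges 0-1, 0-2, 1-3; each edge is stored
    // in the adjacency lists of both endpoints.
    graph G = {{1, 2}, {0, 3}, {0}, {1}};
    for (int v : bfs(G)) cout << v << ' ';   // prints: 0 1 2 3
    cout << endl;
    return 0;
}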
Let G be a connected graph
The diameter of G is the maximum distance between any pair of vertices.
The center of G is a vertex for which the maximum distance to any other vertex is minimum.
We can compute the diameter using a BFS starting from each vertex in G, as in the sketch below. The cost is \Theta(n(n+m)) = \Theta(nm), since the graph is connected and hence m\geq n-1.
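A possible sketch of this computation (the helper bfs_distances below is not part of the original code; it is an ordinary BFS from s that records distances, assuming the same adjacency-list representation):

#include <algorithm>
#include <queue>
#include <vector>
using namespace std;
using graph = vector<vector<int>>;

// BFS from s recording the distance (in edges) to every vertex.
vector<int> bfs_distances(const graph& G, int s) {
    int n = G.size();
    vector<int> dist(n, -1);           // -1 means "not reached yet"
    queue<int> Q;
    dist[s] = 0;
    Q.push(s);
    while (not Q.empty()) {
        int v = Q.front(); Q.pop();
        for (int w : G[v])
            if (dist[w] == -1) {
                dist[w] = dist[v] + 1;
                Q.push(w);
            }
    }
    return dist;
}

// One BFS per vertex: Theta(n(n+m)) = Theta(nm) on a connected graph.
int diameter(const graph& G) {
    int n = G.size();
    int diam = 0;
    for (int s = 0; s < n; ++s)
        for (int d : bfs_distances(G, s)) diam = max(diam, d);
    return diam;
}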
Many graphs/networks arising in many areas, especially in the social sciences, are large, but their diameter is small \to six degrees of separation.
A weighted directed graph G= (V,E) is a directed graph where the weights are given by a function w\colon E\to \mathbb R.
We assume the weights to be always non-negative.
Examples of weighted graphs?
Given two vertices u,v in G, their distance d_G(u,v) in the weighted directed graph G is the minimum weight of a path from u to v, where the weight of a path is the sum of the weights of the edges in it.
Why do we assume the weights to be non-negative?
If weights were allowed to be negative then the directed graph could contain negative-weight cycles, and then shortest paths would not be well defined.
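For instance, take two vertices u,v with edges of weights w(u,v)=1 and w(v,u)=-3. The cycle u\to v\to u has weight -2, so walking from u to v and then going around the cycle k more times gives walks from u to v of weight 1-2k, which can be made arbitrarily small; no minimum exists.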
Dijkstra’s algorithm (1959) solves this problem by finding the shortest paths from s to all other vertices in G.
Dijkstra’s algorithm works iteratively.
The input is a weighted directed graph G=(V,E) and a vertex s.
For the moment we focus on returning just the minimum distance from s to any other vertex.
The algorithm will return a vector D, where D[u] is the minimum distance of u from the starting point s.
On each iteration the set of vertices is partitioned in two parts: the visited vertices and the candidate (not yet visited) vertices.
We want to maintain the invariant that, for every visited vertex u, D[u] is the distance from s to u, and for every candidate vertex v, D[v] is the minimum weight of a path from s to v passing only through visited vertices.
At each iteration we select a candidate vertex, mark it as visited, and update the values D[v] of the candidate vertices v that may be affected.
When all vertices have been visited, we have the desired answer.
Which candidate vertex do we select on each iteration?
Choosing as candidate the vertex u with minimal D[u] allows us to maintain the invariant.
Let u be a candidate vertex with minimal D[u]. By the invariant, D[u] is the minimum weight of a path from s to u passing only through visited vertices.
We now mark u as visited, so D[u] must also be the smallest weight among all paths from s to u.
Suppose there were a shorter path from s to u; it would have to pass through some candidate vertex, and let x be the first such vertex on it. The weight of that path would be at least D[x]+d_G(x,u) and smaller than D[u]. Since d_G(x,u) \geq 0 we would have D[x] < D[u], a contradiction, since by assumption D[u] is minimal among all candidate vertices.
We need now to consider how to maintain the second part of the invariant. Consider a candidate vertex v. Since u becomes a visited vertex in the current iteration, we may have a new shortest path from s to v going only through visited vertices, which now include u. A simple argument shows that if such a shortest path includes u, then u must be the immediate predecessor of v in the path (assuming the contrary leads to a contradiction).
It follows that D[v] can change if and only if v is a successor of u; in particular, if D[v] > D[u] + w, where w is the weight of the edge (u,v), then D[v] must be updated: D[v] := D[u] + w. The condition above is never true if v has already been visited.
To compute the shortest paths themselves (not just their weights) we can use the following key observation: whenever D[v] is updated to D[u] + w, the vertex u is the current predecessor of v on a best-known path, so it suffices to record p[v] := u at every update.
We can hence compute an (implicit) shortest paths tree.
#include <vector>
#include <queue>
#include <limits>
using namespace std;

using vertex = int;
using graph = vector<vector<pair<vertex,double>>>;   // G[u] = list of (v, weight of edge (u,v))
using P = pair<double,vertex>;                        // (distance, vertex)
using Pvector = pair<vector<double>,vector<int>>;

Pvector dijkstra(const graph& G, vertex s) {
    int n = G.size();
    vector<double> d(n, numeric_limits<double>::infinity());
    d[s] = 0;
    vector<int> p(n, -1);   // p[v] = predecessor of v in the shortest path tree
    vector<bool> visited(n, false);
    priority_queue<P, vector<P>, greater<P>> Q;   // min-heap ordered by distance
    Q.push({0, s});
    while (not Q.empty()) {
        int u = Q.top().second;
        Q.pop();
        if (not visited[u]) {
            visited[u] = true;
            for (auto a : G[u]) {
                vertex v = a.first;
                double c = a.second;
                if (d[v] > d[u] + c) {    // relaxation of edge (u,v)
                    d[v] = d[u] + c;
                    p[v] = u;
                    Q.push({d[v], v});
                }
            }
        }
    }
    return {d, p};
}
d stores all the distances from s.
p stores implicitly a shortest path tree.
In P, the weight (distance) comes first; why? And why do we declare priority_queue<P, vector<P>, greater<P>> Q; rather than just priority_queue<P> Q;?
When we do Q.push({d[v],v}) and an element {k,v} (for some k) was already in the priority queue, why is there no need to remove it?
If all operations on the priority queue have cost \mathcal O(\log n), then the cost of dijkstra is
\begin{align*} \text{cost}(n,m)&=\sum_{v\in V}\mathcal O(\log n)\,(1+\textsf{out-degree}(v)) \\ &=\mathcal O(n\log n)+\mathcal O(\log n)\sum_{v\in V}\textsf{out-degree}(v)\\ &=\mathcal O(n\log n)+\mathcal O(m\log n)=\mathcal O((n+m)\log n)\ . \end{align*}
The cost of the initializations is \mathcal O(n\log n). Then the total cost is \mathcal O((n+m)\log n).
The worst-case is actually \Theta((n+m)\log n) but the cost in practice is usually smaller.
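As a minimal sketch (the helper shortest_path below is an illustration, not part of the original code), the predecessor vector p returned by dijkstra can be turned into an explicit path by walking back from the target to s and reversing:

#include <algorithm>
#include <vector>
using namespace std;

// Returns the vertices of the shortest path from s to t, in order,
// or an empty vector if t is not reachable from s.
vector<int> shortest_path(const vector<int>& p, int s, int t) {
    vector<int> path;
    if (t != s and p[t] == -1) return path;       // unreachable
    for (int v = t; v != -1; v = p[v]) path.push_back(v);
    reverse(path.begin(), path.end());            // now path = s, ..., t
    return path;
}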
Given a weighted undirected connected graph G= (V,E), with weights w: E \to \mathbb R, a minimum spanning tree (MST) (cat: arbre d’expansió mínim) of G is a spanning tree T with total weight w(T)=\sum_{e\in E(T)}w(e) minimum among all possible spanning trees of G.
There are many different algorithms to compute MSTs.
All of them follow the greedy scheme below.
A \gets \emptyset \qquad(A is the set of edges in the tree so far)
C \gets E \qquad(C is the set of candidate edges to add to the tree)
while |A|\neq |V(G)|-1 do
\qquad Select an edge e\in C that does not close a cycle in T=(V,A)
\qquad A\gets A\cup\{e\}
\qquad C \gets C - \{e\}
end while
Definition 1 (Promising set of edges) A subset of edges A\subseteq E(G) is promising if and only if it can be extended to a minimum spanning tree of G, that is, there is an MST T of G such that A\subseteq E(T).
Definition 2 (Cut in a Graph) A cut in a graph G is a partition of the set of vertices V into two subsets: a set C and its complement V\setminus C. An edge e respects the cut if both endpoints of e belong to the same subset; otherwise, we say that e crosses the cut.
Theorem 1 (MST Theorem) Let A be a promising set of edges that respects some cut of G. Let e be an edge of minimum weight among those crossing the cut. Then A\cup\{e\} is promising.
Let A' be the set of edges of some MST T' such that A\subseteq A' (this is well defined, as A is assumed to be promising). Since T' is connected, there must exist at least one edge e'\in A' crossing the cut.
If e\in A' (in particular, if e'=e), then A\cup\{e\}\subseteq A' and A\cup\{e\} is promising; we are done.
Otherwise, if we add the edge e to T' we get a cycle, which crosses the cut at e and at some other edge of T', say e''. Note that e''\notin A, since A respects the cut. As e has minimum weight among the edges crossing the cut, exchanging e'' for e in T' gives a new spanning tree of weight no larger than that of T', hence also an MST; this MST contains A\cup\{e\}, so A\cup\{e\} is promising.
The MST theorem gives us a recipe to design MST algorithms:
Maintain a promising set of edges A (initially A=\emptyset, which is trivially promising); at each step choose a cut of G that A respects and add to A an edge e of minimum weight among those crossing the cut. Since A respects the cut and e crosses it, the addition of e to A cannot create a cycle, and by the theorem A remains promising. Any algorithm following this scheme is automatically guaranteed to be correct.
The Jarník-Prim algorithm is a greedy algorithm that, given as input a weighted undirected connected graph, finds a minimum spanning tree of it.
In this algorithm we maintain a subset of visited vertices; at each iteration we select an edge (u,v) of minimum weight among those joining a visited vertex u to a non-visited vertex v, that is, among the edges crossing the cut between visited and non-visited vertices.
Such an edge cannot create a cycle, and it is added to the current set of edges of the spanning tree; furthermore, vertex v is marked as visited.
#include <vector>
#include <list>
#include <queue>
using namespace std;

using edge  = pair<int,int>;
using wedge = pair<double,edge>;    // weighted edge: (weight, (u,v))
using graph = vector<list<wedge>>;  // G[u] = list of weighted edges incident to u

void Prim(const graph& G, list<edge>& MST, double& MST_weight) {
    int n = G.size();
    vector<bool> visited(n, false);
    // min-heap of candidate edges, ordered by weight
    priority_queue<wedge, vector<wedge>, greater<wedge>> Candidates;
    visited[0] = true;
    MST_weight = 0.0;
    for (const wedge& e : G[0]) Candidates.push(e);
    while (int(MST.size()) < n - 1) {
        double weight = Candidates.top().first;
        edge uv = Candidates.top().second;
        Candidates.pop();
        int v = uv.second;              // endpoint that may still be unvisited
        if (not visited[v]) {
            visited[v] = true;
            MST.push_back(uv);
            MST_weight += weight;
            for (const wedge& f : G[v]) Candidates.push(f);
        }
    }
}
The cost of the Jarník-Prim algorithm depends on how we implement the selection of the minimum-weight edge. If we use a priority queue, as above, the total cost is \Theta(m\log n).
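For concreteness, a small usage sketch of Prim (the graph below is invented for illustration; it assumes the type declarations above, with every edge stored once per endpoint so that the second component of its pair is the other endpoint):

#include <iostream>

int main() {
    int n = 4;
    graph G(n);
    // Hypothetical helper to insert an undirected weighted edge.
    auto add_edge = [&](int u, int v, double w) {
        G[u].push_back({w, {u, v}});
        G[v].push_back({w, {v, u}});
    };
    add_edge(0, 1, 1.0);
    add_edge(1, 2, 2.0);
    add_edge(0, 2, 4.0);
    add_edge(2, 3, 3.0);

    list<edge> MST;
    double MST_weight;
    Prim(G, MST, MST_weight);
    cout << MST_weight << endl;                       // prints 6 (= 1 + 2 + 3)
    for (auto [u, v] : MST) cout << u << " - " << v << endl;
    return 0;
}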
April 3rd: partial exam on the following topics
Questions?