This presentation is an adaptation of material from Antoni Lozano and Conrado Martinez
AVL Trees
Heaps & Priority Queues
Before we start: Questions on previous material?
A pair of sets (V,E) is a graph if
The elements of E are called edges.
For e=\{u,v\} the vertices u,v are the extremes of the edge e, they are adjacent and the edge e is incident to u,v.
Graphs are a very useful abstraction.
What could be some real world scenario that is useful to represent using a graph?
An undirected graph G=(V,E) can be drawn using for each vertex in V a “dot” (\bullet) and each edge e=\{u,v\}\in E as a continuous line joining the dots for u and v. Same for digraphs, but the line is an arrow \to.
A graph is planar if it can be embedded in the plane, i.e., it can be drawn on the plane in such a way that its edges intersect only at their endpoints.
Given a graph G=(V,E) usually we denote |V| as n and |E| as m. 0\leq m \leq \binom{n}{2}=\frac{n(n-1)}{2}\ .
A graph with m=\binom{n}{2} is complete (or a n-clique) and it is denoted with K_n.
When a graph has m=\Theta(n^2) it is called dense. Otherwise, it is sparse.
Is K_4 planar? And K_5?
Can planar graphs be dense?
Given an undirected graph G=(V,E) and v\in V, its degree is the number of its neighbors N(v):
\deg(v)=|N(v)|=|\{u\in V\colon \{u,v\}\in E\}|\ .
A graph is d-regular if for every vertex v, \deg(v)=d.
For directed graphs we distinguish between the in-degree and the out-degree
\deg_{in}(v)=|\{u\in V\colon (u,v)\in E\}|\ .
\deg_{out}(v)=|\{u\in V\colon (v,u)\in E\}|\ .
For every graph G=(V,E), \sum_{v\in V}\deg(v)=2|E|\ .
For every digraph \sum_{v\in V}\deg_{in}(v)=\sum_{v\in V}\deg_{out}(v)=|E|\ .
A d-regular graph is sparse or dense? (d is a constant)
A path in a graph from u to v is a sequence of vertices v_1,\dots,v_\ell such that v_1=u, v_\ell=v and \{v_i,v_{i+1}\}\in E for each i=1,\dots,\ell-1. (For a digraph the definition is analogous, the difference is that we require (v_i,v_{i+1})\in E)
A graph is connected if given any two vertices there is a path from one to the other. For digraphs, the definition is the same but we say the digraph is strongly connected.
A path is called simple if no vertex appears more than once in the sequence, except possibly v_1 and v_\ell. If v_1 = v_\ell the path is called a cycle.
All the vertices reachable by a path from a vertex v form the connected component containing v.
Theorem 1 A graph with n vertices and m edges has at least n-m connected components.
Theorem 2 The following conditions for a graph G with n vertices are all equivalent definitions of a tree:
A forest is a graph where each connected component is a tree.
In particular, if G is a tree, then m=n-1. If G=(V,E) is a forest with \ell connected components, then m=n-\ell.
We often consider graphs where the edges have weights. That is, we have a function w\colon E\to \mathbb R assigning to each edge its weight.
What could be some examples of graphs where the edges have naturally associated weights?
Which of the above are dense graphs? Which ones are planar?
A graph G=(V,E) with n vertices can be represented by a Boolean n\times n matrix A, i.e. A\in\{0,1\}^{n\times n}. A_{ij} = \begin{cases} 1 &\text{if } (i,j)\in E\\ 0 &\text{otherwise} \end{cases}
For weighted (di)graph A_{ij} stores the weight of the edge (i,j).
typedef vector<vector<bool>> graph;
If g
has type graph
(with n vertices) then
g
is \Theta(n^2)g
is \Theta(n^2)g[i]
contains the info on the neighbors of i
g[i][j]
and g[j][i]
are always the same (waste of space…)This is useful if the graph is dense or to compute properties of graphs that can be computed easily by linear algebra.
an array or vector T
such that for a vertex u:
T[u]
points to a list of the edges incident to u or a list of the vertices v\in V which are adjacent to u.
For digraphs, we will have the list of the successors of a vertex u, for all vertices u, and, possibly, the list of predecessors of a vertex u, for all vertices u.
typedef vector< vector<int>> graph;
If g
has type graph
(with n vertices) then
g
has space cost \Theta(n+m), i.e. linearAdjacency lists are adequate for sparse graphs. Usually this is the preferred choice.
for very regular graphs, e.g. the grid or a chessboard.
Adjacency matrices need space \Theta(n^2) to represent the graph with n vertices, regardless of the number of edges in the graph.
As a general rule, the preferred implementation will be adjancecy lists. They require space \Theta(n+ m): linear in the size of the graph.
Adjacency matrices are fine to represent dense graphs or when we need to answer very efficiently whether an edge (u,v) \in E or not.
typedef int vertex;
typedef pair<vertex, vertex> edge;
class Graph {
public:
Graph();
Graph(int n);
void add_vertex();
void add_edge(vertex u, vertex v);
void add_edge(edge e);
int nr_vertices() const;
int nr_edges() const;
list<edge> adjacent(vertex u) const;
...
private:
int n, m;
vector<list<edge>> T;
// alternatives: vector<list<vertex>> T;
// vector<vector<vertex>> T;
...
}
n
where n is the number of vertices in the graph, before adding the new vertex
In a Depth-First Search (DFS) (cat: recorregut en profunditat, esp: recorrido en profundidad) of a graph G we visit a vertex v and from there we traverse, recursively, each non-visited vertex w which is adjacent/successor of v.
To avoid visiting multiple times the same vertices we mark the vertices already visited.
list<int> dfs_recursive (const graph& G) {
int n = G.size();
list<int> L;
vector<boolean> visited(n, false);
for(int u = 0; u < n; ++u) {
dfs_recursive(G, u, visited, L);
}
return L;
}
void dfs_recursive (const graph& G, int u, vector<boolean>& vis, list<int>& L) {
if (not vis[u]) {
vis[u] = true;
L.push_back(u);
for (int v : G[u]) {
dfs_recursive(G, v, vis, L);
}
}
}
for
loop assures we visit the whole graph even if it is not connected.
dfs_recursive
is \mathcal O(n+m): each vertex is marked only once (in total \mathcal O(n) cost), each edge i,j is visited twice: from i and from j (in total cost \mathcal O(m))
A call to dfs_rec(g,v,...)
visits the connected component of v and induces a spanning tree of that component. All edges in the connected component can be thus classified in two types:
A full DFS of a graph induces a spanning forest.
A directed acyclic graph (DAG) is a digraph that contains no cycles (as its name indicates).
DAGs have many applications since they model well requirements and precedence.
For instance, in a large complex software system, each node is a subsystem and each edge (A,B) indicates that subsystem A uses subsystem B. Vertices with no predecessors are called roots; vertices with no successors are called leaves of the DAG.
A topological ordering of a DAG G is a sequence of all its vertices such that for any vertex w, if v precedes w in the sequence then (w,v)\notin E, i.e. no vertex is visited until all its predecessors have been visited.
Suppose that we have pred[v]
, the number of predecessors of v in G that have not yet been visited.
Then if any vertex v has pred[v]
= 0 we can visit it and decrement by one pred[w]
, for all its successors w
. The algorithm has thus two steps:
pred[v]
for all vertices v\in Gpred[w]
for all its successors.L ← Empty list that will contain the sorted elements
S ← Empty set of nodes that will contain the ones with no incoming edges
for each vertex v
set pred[v] to 0
for each vertex v
for each edge (v,w)
increment pred[w] by 1
for each vertex v
if pred[v] equals 0 then add v to S
while(S is not empty)
remove a node v from S
add v to L
for each edge e=(v,w) {
decrement pred[w] by 1
if pred[w] is 0 then
insert w into S
return L
See the algorithm idea in action
What is the cost of this approach?
Questions?