Spectral Graph Theory

Biologists use microscopes to study cells and microorganisms. Astronomers use telescopes to study the stars. But what does a mathematician use to study networks?

There are many tools and "instruments" for studying networks, but one of the most powerful is spectral graph theory.

Although you may not have heard of "spectral graph theory", you already know what graphs are, and I've mentioned before what the spectrum of a matrix is: it's the set of eigenvalues of the matrix.

Put together, spectral graph theory just means the study of graphs by inspecting the spectrum of matrices associated with graphs. Many properties of a graph or network can be seen just by observing the spectrum of its matrices.

Over the next few weeks, you'll be introduced to 4 different matrices associated with a graph:

  • The adjacency matrix $A$
  • The degree matrix $D$
  • The random walk matrix $W$
  • The Laplacian matrix $L$

You've already seen the definition of the adjacency matrix. This week, we'll learn about the degree matrix and the random walk matrix.

Some notation

Let $G$ be an graph with $n$ vertices $\{1, 2, \dots, n\}$. For simplicity, we'll assume that $G$ is undirected, and that there is at most one edge between two vertices. The notation $$i \sim j$$ means that vertex $i$ and vertex $j$ are adjacent (i.e. directly connected by an edge). If $i$ and $j$ are adjacent, we also say that they are neighbours.

Since $G$ is undirected, if $i \sim j$, then we also have $j \sim i$.

The adjacency and degree matrices

We've already seen one example of a matrix associated with it: its adjacency matrix $A$. If $G$ has $n$ vertices, then $A$ is an $n \times n$ matrix with entries

$$ a_{ij} = \begin{cases} 1 \quad \text{if } i \sim j \\ 0 \quad \text{otherwise.} \end{cases} $$

The degree of a vertex $i$, written $d(i)$, is the number of neighbours that $i$ has.

Another simple matrix is the degree matrix $D$, which is the diagonal matrix with the entries $d(1), d(2), \dots, d(n)$ along the diagonal.

Question 1

Consider the pictures of the graphs $G$, $H$ and $K$ below (you may ignore the code):

In [1]:
import networkx as nx
from matplotlib import pyplot as plt

G = nx.Graph([(1,2),(1,3),(2,4),(1,5), (2,3)])
H = nx.cycle_graph([1,2,3,4,5])
K = nx.complete_graph([1,2,3,4,5])

fig, axes = plt.subplots(ncols=3, figsize=(12,4))
axes[0].set_title('Graph G')
axes[1].set_title('Graph H')
axes[2].set_title('Graph K')
nx.draw(G,  with_labels = True, node_color = 'white', ax=axes[0])
nx.draw(H,  with_labels = True, node_color = 'white', ax=axes[1])
nx.draw(K,  with_labels = True, node_color = 'white', ax=axes[2])

a) Write out their adjacency and degree matrices.

For parts (b) and (c), you may use the examples $G,H,K$ to guide you. But your answer should apply to any graph on $n$ vertices.

b) Come up with a formula for $d(i)$ in terms of the $a_{ij}$s.

c) Let $d = \begin{pmatrix} d(1) \\ d(2) \\ \vdots \\ d(n) \end{pmatrix}$. Then $d$ is given by $Ax = d$ for some vector $x$. What is $x$? (Hint: use your formula in part (b))

The random walk matrix

Now suppose you're a smart ant on graph $G$, walking around from node to node along the edges.

Starting at a node, at every time step you randomly pick one of the edges (connected to that node) to walk along. You do this using the uniform distribution on edges: if there are 3 edges connected to the node you're on (such as node 2), then all those edges have a $1/3$ chance of being picked.

So starting at node 2, after one time step you have a $1/3$ chance of being at nodes 1, 3 or 4. You then repeat this process at your new node, and keep going. This whole process is know as a random walk on the graph $G$.

This process can be represented using the random walk matrix $W$: if a graph has $n$ vertices, then $W$ is an $n \times n$ matrix with entries

$$ w_{ij} = \begin{cases} \frac{1}{d(j)} \quad \text{if } i \sim j \\ 0 \quad \text{otherwise.} \end{cases} $$

In other words, $w_{ij} = \frac{a_{ij}}{d(j)}$, using the notation $a_{ij}$ and $d(j)$ from the previous section.

Question 2

a) Write out the random walk matrices for the graphs $G,H,K$ above. Is the random walk matrix always symmetric?

b) Write a formula for $W$ in terms of $A$ and $D$.

If a graph has $n$ nodes, a probability vector is an $n$-dimensional vector

$$ p = \begin{pmatrix} p(1) \\ p(2) \\ \vdots \\ p(n) \end{pmatrix} $$

where $p(i)$ is the probability that the ant is at node $i$. Since these are probabilities, we have $0 \leq p(i) \leq 1$ and $p(1) + p(2) + \dots + p(n) = 1$.

c) Let $p$ be the initial probability vector for an ant on a graph, and suppose the ant takes a random walk on the graph. What is the probability vector after one time step? What about after $m$ time steps?

d) A probability vector is an equilibrium vector if the probability doesn't change under a random walk. Find an equilibrium vector for each of the graphs $G,H,K$. Can you deduce a formula for the equilibrium vector of an arbitrary graph?

Question 3

This question is about matrices in general, and their spectrum. We will apply this result to graphs next time, but for now you can just think of these are arbitrary matrices.

Let $M$ be a square matrix, and let $I$ denote the identity matrix of the same shape.

a) If $v$ is an eigenvector of $M$ with corresponding eigenvalue $\lambda$, show that $v$ is also an eigenvector of $M + I$. What is its eigenvalue?

b) More generally, show that $v$ is an eigenvector of $aM + bI$, where $a$ and $b$ are constants. What is the corresponding eigenvalue?

Let $A$ and $B$ be square matrices of the same shape. We say that $A$ and $B$ are similar if there is some invertible matrix $P$ such that

$$ A = P^{-1} B P. $$

c) Suppose $\lambda$ is an eigenvalue of $A$, with corresponding eigenvector $v$. If $A$ and $B$ are similar, show that $\lambda$ is also an eigenvalue of $B$. What is the corresponding eigenvector?

Something to think about

Here are some possible graphs and networks that we discussed:

  • Air traffic graphs (nodes are airports)
  • Sea traffic graphs (nodes are ports)
  • Transport graphs (nodes are bus or train stations)

Note that I only wrote what the nodes are, but not what the edges are. This is because there can be more than one way of representing these things as graphs! Can you think of a few?

Also, you can play one level of the Mini Metro game here for free: https://www.coolmathgames.com/0-mini-metro-london

It's a game about building a metro/MRT network to connect people in a city. If you do play the game, take a screenshot of your high score! Are there some strategies you adopt? (e.g. are circles or hub-and-spokes better?)