# Force-directed graph drawing

This article uses the MathML standard to display mathematical equations. It is however not supported by most browsers yet. Consider using Mozilla Firefox which has the best support for it.

## Introduction

### Motivation

Graph drawing [gra01] is a computationally difficult topic. No efficient algorithm can be found to represent a graph in the visually most appealing way. A graph consists of vertices and edges which have to be positioned. Typical criteria for good graph representations are

• Low number of intersection of edges
• Vertices are distributed uniformly among the plane or in comprehensible symmetric geometric objects
• Uniform edge length
• Vertices should not be collinear

Force-directed graph drawing (FDG) is a layouting algorithm for graphs. We define forces (in an “update function”) to move vertices towards a direction and iteratively come to better appealing results. Once certain constraints (we call it “criterion”) are satisfied we terminate the algorithm and accept the intermediate result as final result.

The initial_graph is the problem provided as input to our algorithm. Our goal in this tutorial is to discuss criterion and update functions. I will provide a framework so we can focus on those functions and neglect implementation components like visualization.

Wikipedia [wik01] lists 6 advantages of a force-directed graph drawing approach to other alternatives:

Good-quality results
real-world implementations; uniform edge length, uniform vertex distribution and showing symmetry as criteria
Flexibility
can be adjusted to further constraints or more dimensions
Intuitive
easy to predict and understand
Simplicity
few lines of code
Interactivity
pull several nodes out of their equilibrium state and watch migration back into position
Strong theoretical foundations
theory from physics, optimization & statistics research in math

Among the disadvantages the following items are listed:

High running time
We vaguely estimate the running time of FDG algorithms: Given a graph with n vertices, approximately n iterations are required. In every iteration one vertex has to consider all other n (or n-1) vertices to determine the force [01]. Those three n yield a runtime of $Ο\left(n³\right)$.
Poor local minima
Depending on the definition of the criterion, certain special graphs might satisfy the criterion very early (local maximum) even though better layouts (global maximum) exist.

### Requirements

What do you need?

• Interest in FDG layouting algorithms
• Understanding of programming and python3 in particular
• We will revise fundamentals of trigonometry briefly
• Graph theory is not important at all. Physics might be helpful and inspiring for your own algorithms.
• Approximately one hour to go through this tutorial. You might play around with the framework afterwards on your own.

The framework requires the following software

On debian-based systems (Ubuntu, Damn Small Linux, Grml, Xandros) you can install those packages using:

sudo apt-get install python3 python3-pyside

## Mathematical preliminaries redux

### Coordinate system

The coordinate system is defined by Qt [qtc01] using positive x to move horizontally to the right and positive y to move vertically downwards.

However I mapped the coordinate system to apply the usual coordinate system used in geometry. So if you want to specify the position of a vertex, you have to use the following coordinate system: Bottom-Left is coordinate (0, 0) and all positions are positive. The coordinate of the position Top-Right is given as (width, height).

If you specify a motion, you don't specify the target position, but a relative position. For example a position <5, 5> and motion of <-2, 1> yields the new position <3, 6>. As you can see in that coordinate system negative values are also possible.

Coordinates in both systems are specified with Vector instances.

### Trigonometry and Geometry

To specify my desired graphs I also had to use basic trigonometry. I always computed the center point on my canvas and put my vertices along a circle around this central point. So the question is:

Given a central point, a radius and an angle in degree, compute the xy-coordinate of the point lying on a circle specified by a central point and a radius with the given angle.

We can use the following definition of sine and cosine:

$\sin(\alpha) = \frac{\text{opposite side}}{\text{hypotenuse}}$

The length of the opposite side (a) is the y value of C. The length of the adjacent side (b) is the x value of C. Given angle with 41.33°, its sine is:

Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux
>>> import math
0.6603949384518162

Be aware that python3's math.sin accepts radians as parameter but not degrees. Hence we use math.radians to convert it (which is basically a multiplication with $\frac{\pi}{180}$).

>>> math.cos(math.radians(41.33))
0.7509184544724027

Now we can use that to compute x and y:

Hence point C has coordinates (3.064177772475912, 2.571150438746157)

### Simple higher-degree polynomials

I am pretty sure you know that from high school:

We only look at simple polynomials. The polynomials we will use and their corresponding plots are given below. Roughly speaking a linear function states that one step in one dimension, will always lead to one step in the other dimension. A quadratic or cubic function will not take one step in the other dimension, but several depending on the size of the first step.

## Implementation

This tutorial comes with a framework. Several python3 files are given. The framework is provided because we don't want to be concerned with visualization in this tutorial. We want to focus on the definition of update and criterion functions. Also the discussed concepts here are already available as functions in the framework.

Properties:

• This framework only works with two dimensions. A Vector implementation is used to store coordinates.
• We only use discrete values; hence integer division is always rounded and will result in asymmetric structures. This lacks beauty, but simplifies the math and in the end also the canvas is a discrete raster.
• If you want to download the whole project including the article, you can download the complete package as zip or as tar.bz2. An API documentation is also included.

### Vector

An abstraction for two dimensional vectors.

Create a vector:

v = Vector(int x, int y)

Retrieve length of vector:

len(v)

The length of a vector on a canvas or plane is defined by

Retrieve unit vector of the given vector (same direction, length 1):

v.unit()

Access its coordinates:

v.x
v.y

### Graph

A graph is a tuple of vertices and edges.

• Vertices are specified as dictionary associating a vertex identifier with a Vector specifying the position of this vertex.
{ int : Vector(x, y) }
• Edges are specified as set of tuples. Each tuple contains the two vertex identifiers which are connected. We consider undirected edges here.
{ (vertex1, vertex2) }

For example the Petersen graph is specified by the following data structure:

({0: <460.000000, 200.000000>, 1: <349.000000, 352.000000>, 2: <171.000000, 294.000000>, 3: <171.000000, 106.000000>, 4: <349.000000, 48.000000>, 5: <380.000000, 200.000000>, 6: <324.000000, 276.000000>, 7: <236.000000, 247.000000>, 8: <236.000000, 153.000000>, 9: <324.000000, 124.000000>}, {(0, 5), (0, 1), (5, 7), (1, 6), (1, 2), (6, 8), (2, 7), (2, 3), (7, 9), (3, 8), (3, 4), (8, 5), (4, 9), (4, 0), (9, 6)})

where <x, y> specifies the xy-coordinates of a Vector instance respectively.

### Motion specification

A motion specification (motion_spec) is an association of every vertex of the graph to a Vector instance specifying in which direction the vertex should move in the next iteration.

{ int: Vector }

One example would be to move all vertices of the Petersen graph 1 pixel towards bottom-left:

{ 0: <-1, -1>,
1: <-1, -1>,
2: <-1, -1>,
3: <-1, -1>,
4: <-1, -1>,
5: <-1, -1>,
6: <-1, -1>,
7: <-1, -1>,
8: <-1, -1>,
9: <-1, -1> }

### criterion interface

A criterion specifies when a graph update [won't] be performed, because a local minimum has been reached. Criteria are implemented as functions which take four parameters:

graph
The graph before the iteration.
motion_spec
A motion specification which was used in the last iteration ({} in the first iteration).
iterations
How many iterations have been executed so far? This parameter contains a continuous integer.
cum
A vector cumulating the total distance of all motion specification vectors of previous iterations so far.

### update interface

Given a graph, which motion should every vertex perform for the next iteration? Updates are implemented as functions which take four parameters:

graph
The graph before the iteration.
vertex
The vertex (as identifier) to consider.
width
The width of the canvas.
height
The height of the canvas.

## Our first run

Now we want to learn to use the framework. We will use the update function to compute forces between components and compute motion specifications which will be applied on vertices.

First of all we will execute python3 to run the current implementation.

\$ python3 draw.py

The following window should appear:

Per default, a random graph is created and as the name indicates, your random graph will most likely look different than mine. If you click on “Update”, the next iteration will be computed and visualized. Okay, let's see which graph, criterion and update function is used:

def main():
"""Main routine."""
logging.basicConfig(level=logging.INFO)

# fdg run
width = 600
height = 400

def cont():
"""One FDG iteration."""
graph = graphs.random_graph(width=width, height=height)
criterion = criteria.avg_inside_criterion
count = 0

for graph in fdg.run(graph, update, criterion, width, height):
count += 1
yield (True, graph, count)

while True:
count += 1
yield (False, graph, count)

# GUI initialization
app = QtGui.QApplication(sys.argv)
window = GraphDrawing(cont, width, height)

window.show()
return app.exec_()

The main function defines and sets some parameters, defines the cont function and initializes the GUI. GUI initialization happens with the cont function and the canvas dimensions as parameters. GraphDrawing is defined above and not shown in this snippet. Then the Qt event loop is started.

The cont function defines which initial graph, update function and criterion will be used. Predefined ones are stored respectively in:

• graphs.py
• criteria.py

We will now use other parameters. We replace

        update = updates.center_update
graph = graphs.random_graph(width=width, height=height)
criterion = criteria.avg_inside_criterion

with

        update = updates.const_update
graph = graphs.circle_graph(nodes=5, width=width, height=height)
criterion = criteria.fifty_criterion

The following video shows how the implementation behaves now:

Instead of a random graph we use a graph where 5 vertices are positioned in a circle. The update function is defined as follows and implements the update interface:

def const_update(graph, vertex, width=600, height=400):
return Vector(-1, -1)

The criterion function is implemented as follows (and correspondingly implements the criterion interface) (docstrings have been omitted):

def fifty_criterion(graph, motion_spec, iterations, cum):
return iterations < 50

The criterion always allows exactly 50 iterations. This might be helpful for testing the functions, but not to find good graph representations.

## Update functions

### Vertex rejection

Obviously the constant update function does not provide us beautiful graphs. We define theorems, implement them and look at random graphs to improve upon our update function.

A vertex is rejected by other vertices. The rejection is proportional to the distance to all other vertices.

The rejection direction r of a vertex v is defined as follows:

Intuitively, the rejection is defined as sum of vertices pointing from any vertex w to the given vertex v normalized to length 1. In code it looks like this:

def simple_vertex_rejection(graph, vertex, width=600, height=400):
cum = Vector(0, 0)
for w, pos in graph[0].items():
if w != vertex:
cum = graph[0][vertex] - pos

return cum.unit()

But the result does not look promising:

You should recognize that force only rejects components from each other. Nothing attracts them. So vertices move away from each other. Before we fix that, we apply a different theorem.

A vertex is rejected by other vertices. The rejection is quadratic to the distance to all other vertices.

The corresponding equation and code is defined as follows:

def quadratic_vertex_rejection(graph, vertex, width=600, height=400):
cum = Vector(0, 0)
for w, pos in graph[0].items():
if w != vertex:
cum = graph[0][vertex] - pos

if len(cum) == 0:
return Vector(0, 0)

return (cum / len(cum)**2)

Nothing really happens. In general vertices close to each other should move more than vertices far away. The force we defined is too small. Most of the times the computed motion of a vertex does not even exceed the 1-pixel threshold. We can simply apply a factor like 100, but we will discuss that later on.

### Edge attraction

To define attraction we apply the following theorem:

Vertices connected by an edge are attracted by each other. Edge length should converge towards some distance dependent on the canvas width or height and number of edges or vertices.

I claim the following theorem:

The perfect uniform edge length is given as the distance between two vertices if the vertices are distributed uniformly in a grid on the canvas.

Let's discuss that more visually:

The question raised in Figure 16 is unsolvable. Squares do not necessarily cover the whole canvas. We cannot determine the number of rows, number of columns, number of additional vertices and finally the size of the squares. The problem is under-determined.

We consider a different approximation: Consider width $w$ and height $h$ as ratio of rectangles. We divide both values by the greatest common divisor yielding ${w}_{g}$ and ${h}_{g}$. If the product ${h}_{g}·{w}_{g}$ is too large, we divide both values by 2 and round down. If the product is too small, we multiply both values by 2.

This approximation might not be perfect and share your thoughts on that, but I think it suffices and it provides good performance measures. Let us write the corresponding update function.

The direction of the motion is given by the adjacent vertex of an edge. The length is given by the distance between both vertices minus the perfect edge length. We don't want the vertex to jump directly towards the perfect edge length. Therefore we only take 10% of the mentioned length.

We can observe in the video that two vertices connected by a vertex converge towards a certain length. After infinite iterations, it jumps between two positions because the vertex of degree 2 is attracted in two directions does not find a perfect position to achieve the desired length. Vertices without edges are kept the same way.

Edge attraction is implemented in the functions uniform_edge_attraction and the helper function compute_uniform_edge_attraction. We use functools.lru_cache [lru01] to enable memoization [02].

### Wall rejection

We ignored a problem. Vertices might be rejected and walk outside the canvas. We want to avoid that, because all vertices should always be visible.

All vertices are rejected from the wall in an elliptical vignette.

If a point lies outside the ellipse, we want to compute a vector pointing towards the ellipse.

The general equation for points on an ellipse is given as

x and y refer to the x and y-coordinates of p. From trigonometry we know that

We express both equations by variable y and use equality:

And now we solve it by x. y follows from the second equation.

We have solved our problem and our rejection vector is given as vector between q and p. If the vertex is inside the ellipse, no force is applied. A vertex is inside an ellipse if the right-hand side of the ellipse equation is smaller 1. Greater 1 means outside.

We introduce the update function [03]:

def vertex_wall_rejection(graph, vertex, width=600, height=400):
a, b = width * 0.4, height * 0.4
c = Vector(int(width // 2), int(height // 2))
q = graph[0][vertex] - c

# inside or outside?
rhs = q.x**2 / a**2 + q.y**2 / b**2
if rhs <= 1:
return Vector(0, 0)

# determine p, given q
tan_alpha = q.y / q.x
x = math.sqrt(b**2 / (tan_alpha**2 * a**2 + b**2))
y = tan_alpha * x

# compute motion vector
motion = Vector(x, y) - q
return motion * 0.01

The grid_graph beautifully shows us which points are affected by this rejection. The implementation is given in updates.uniform_edge_attraction.

### Modelling Newton's theory of gravitation

Sir Isaac Newton's theory of gravitation is based on a formula:

where m1 and m2 define masses of two objects. r defines the radius or distance between those objects and G is the gravitational constant. Then F yields the force towards each other.

Just for fun, we want to apply the same principle to our vertices. If we want to apply this formula, we need to solve three issues:

• What is the mass of a vertex? I suggest we define the degree of a vertex to be its mass. A vertex connected to 3 other vertices has mass and degree 3. A vertex connected to no other vertices has mass and degree 0.
• Does F represent an attraction or rejection? As far as vertices with high degree are considered important they require a lot of space (their neighborhood is already filled with edges). Hence, I assume this formula determines vertex rejection.
• The gravitational constant 6.673×10-11 Nm/kg2 is not seriously appropriate. It refers to meters; we are using pixels. It uses kg; we are using number of adjacent vertices (i.e. degree). I think a value like the minimum of width and height of the canvas is more appropriate.

Time for a new update function:

def newton_update(graph, vertex, width=600, height=400):
def degree(v):
degree = 0
for edge in graph[1]:
if edge[0] == v or edge[1] == v:
degree += 1
return degree

motion = Vector(0, 0)
deg = degree(vertex)
g = min(width, height)

for v in graph[0]:
if v == vertex:
continue

deg2 = degree(v)
diff = graph[0][vertex] - graph[0][v]
dist = len(diff)
force = g * deg * deg2 / dist

motion += diff.unit() * force

return motion

If we add wall rejection and edge attraction, this might result in a nice force-directed graph drawing algorithm.

## Criterion functions

Finding good criteria is a difficult task. If we allow too little iterations, a suboptimal solution will be returned. If we allow too many iterations, the running time is high. The following criteria are my suggestions:

constant_criterion
A fixed number of iterations will certainly always lead to a result without exceeding a runtime threshold. But if the number of vertices varies, a fixed number of iterations will certainly lead to a bad result. I only recommend this criterion for debugging purposes.
average_motion_criterion
Given a motion specification of one iteration. Sum up the total distance of the motion. If the distance is under the limit, terminate. I consider this criterion as the most useful and pragmatic.
maximal_motion_criterion
Similar to the average_motion_criterion you can also require that the greatest motion vector of one iteration exceeds a certain length.
structural criteria
If you have domain-specific graphs, where the number of vertices is constant or a certain structure is desirable, you can introduce criteria like the following: the distance between vertices must be within a specific boundary. Or: All vertices must touch a circle with a certain border-width. In that particular case it is required that the update function asks all vertices to position themselves in a circle.
Certainly those criteria require special constraints regarding the update function. Therefore those criteria are not provided as implementation in the framework.

A good idea is to combine those criteria. For example we use the average_motion_criterion, but if the number of iteration exceeds 200 iterations, the graph layout is returned in any case (this criterion is implemented in criteria.average_motion_maximal_iterations_criterion).

## Putting it together

We have looked at various update functions, criteria and you can select from a variety of sample graphs. Use the framework to build your own force-directed graph drawing algorithm!