Abstract
Graphviz is an awesome tool for software documentation and visualizing graphs. In this post I explain the core concepts that you need to get started with Graphviz, with examples.
Boxes and Lines
If you studied some form of computer science or software engineering, you would have come across class diagrams. You also would have encountered flow charts. If you then went into the business world, you would have encountered the corporate alternative: boxes with words in them joined arbitrarily by lines, usually on a whiteboard. I've been drawing a few of those myself recently.
I want to introduce to you an open source tool named Graphviz. It allows you to create diagrams of the sort that you would have been writing up on the whiteboard. You tell it what your boxes are, and how to connect them. It outputs a picture.
Graphs
"Graph" is unfortunately an ambiguous word. When I hear it, I can't help but think of a chart with an X axis and a Y axis. That is not what I mean by graph in this post.
A graph, in the context of computer science, is a data structure. A graph has a collection of "nodes" which are joined by "edges".
This simple data structure, when drawn out on paper, can convey many different situations. It can be how computers are connected in a network (a network topology diagram). It can be how the steps in an algorithm relate to each other (a flow chart). It can be the relationships between different classes in an object-oriented system (a class diagram). If each node is a concept and the edges are how they relate to each other, then you have a mind map.
Graphs are especially useful as a visualization tool, since many people understand the meaning of boxes connected by lines.
Minimal documentation
A picture speaks a thousand words, but in the land of software development, in my personal experience at least, we spend very little time actually drawing well-considered pictures. Documentation easily descends into hastily written and awkwardly phrased paragraphs. I know that I am personally guilty of writing some pretty nasty prose to explain a feature under time pressure.
Sometimes, just a simple picture of the high level idea can make a big difference to how easy something is to understand.
Enter Graphviz
Graphviz is an open source tool where you write a text description of your graph (this is usually quite concise) and have it generate the picture for you. You can then take the picture and embed it in documents, email it around, or really use it however you want.
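To give a feel for how concise that text description can be, here is a minimal sketch of a small network topology diagram. The device names are invented for illustration, and the syntax itself is explained in the cheat sheet further down.

```dot
// Hypothetical office network; all device names are made up for illustration
graph {
  // each line joins two devices with an undirected edge
  router -- switch
  switch -- laptop
  switch -- printer
}
```

Four short lines of description are enough for Graphviz to place the four boxes and draw the three connecting lines.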
To get started, you can download Graphviz from the Graphviz website, or install it with your system's package manager.
If you just want to try it out without installing anything, there is an online version available.
If you're an Emacs user, love keeping notes in org-mode like me, and use Babel for embedding and running code blocks in your notes, then you should enable Graphviz in your org-babel config.
Cheat Sheet
All of the examples below go into a text file with the extension .dot. The executable that you need to run is also called dot.
For example, to create a PNG out of the file hello.dot, you would run
dot -Tpng -ohello.png hello.dot
I think the best way to give an idea of what you might do with Graphviz is through some examples.
The Basics
Graphs can be either undirected
graph {
  // C-style comments work
  A -- B
}
or directed
digraph {
  A -> B
}
Layout Direction
If you don't want your graph to be laid out from top to bottom, you can set rankdir to one of the other three directions: "LR" (left to right), "RL" (right to left), or "BT" (bottom to top). My usual choice is left to right.
digraph {
  // rankdir is a graph-level attribute
  rankdir="LR"

  // you can declare multiple edges on one line
  A -> B -> C
}
Attributes
You can declare nodes separately from the edges that connect them. This is useful when a node needs attributes of its own, such as a label or a shape.
digraph {
  rankdir="LR"

  // square brackets hold attributes
  A [ label="First Node" shape="circle" ]
  B [ label="Second Node" shape="square" ]

  // edges have attributes too
  A -> B [ arrowhead="halfopen" ]
}
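Putting direction, node attributes and edge attributes together is already enough for the kind of flow chart mentioned earlier. The following is only a sketch: the login flow and its step names are invented for illustration, and it relies on the standard "oval", "diamond" and "box" shapes plus the "label" attribute on edges.

```dot
// Hypothetical login flow; step names are made up for illustration
digraph {
  rankdir="LR"

  Start   [ shape="oval"    label="Start" ]
  Check   [ shape="diamond" label="Password OK?" ]
  Welcome [ shape="box"     label="Show dashboard" ]
  Retry   [ shape="box"     label="Show error" ]

  Start -> Check

  // edge labels mark which branch is taken
  Check -> Welcome [ label="yes" ]
  Check -> Retry   [ label="no" ]

  // loop back for another attempt
  Retry -> Check
}
```

Declaring the nodes first keeps the edge list at the bottom readable as a plain description of the flow.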
Ranks and Subgraphs
To work out the layout, Graphviz uses a system it calls "ranks". Each node is assigned a higher rank than the highest ranked node that points to it. If your rank direction is set to left to right (rankdir=LR), then nodes with a higher rank are placed further to the right.
Notice how all of the arrows on this graph point right.
digraph {
  rankdir="LR"

  // Nothing points to A, so its rank is 1
  // A points to B, so B's rank is 2
  A -> B

  // B's rank is higher than A's, so C gets a rank
  // of 3 to be higher than B's rank
  A -> C
  B -> C

  // D gets a rank of 4
  C -> D
}
You can use subgraphs to logically group nodes. Importantly for layout, you can force all nodes in a subgraph to have the same rank.
digraph {
  rankdir="LR"

  A -> B
  A -> C
  B -> C
  C -> D

  subgraph subs {
    rank="same"
    B
    C
  }
}
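If you only need the rank constraint and not a named group, DOT also accepts an anonymous subgraph written with bare braces. The following is a minimal sketch of the same graph using that shorthand. It is an alternative spelling, not something the named version requires.

```dot
digraph {
  rankdir="LR"

  A -> B
  A -> C
  B -> C
  C -> D

  // anonymous subgraph: no 'subgraph' keyword or name needed
  { rank="same"; B; C }
}
```

The named form is still worth keeping when you want to refer to the group by name or turn it into a cluster.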
Clusters
If your subgraph name starts with 'cluster', then Graphviz draws a box around its nodes. You can even put a heading on a cluster.
digraph {
  rankdir="LR"

  // the normal ranking algorithm doesn't know what to
  // do with clusters with rank="same".
  // If you opt in to the 'new' ranking algorithm, it
  // works as expected.
  newrank="true"

  A -> B
  A -> C
  B -> C
  C -> D

  subgraph cluster_subs {
    label="Bs and Cs"
    rank="same"
    B
    C
  }
}
Records
The 'record' node shape is great for UML class diagrams.
digraph {
  rankdir="RL"

  // fields in the label are separated by a |
  // ending a line in \l left aligns it
  Hero [
    shape="record"
    label="Hero|+ health : int\l|+ save(k : Kingdom) : bool\l"
  ]

  // do whatever you want with whitespace
  Villain [ shape="record" label="Villain|+ health : int\l|+ brood() : void\l" ]

  Character [ shape="record" ]

  Hero -> Character [ arrowhead="empty" ]
  Villain -> Character [ arrowhead="empty" ]
}
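If your rankdir is vertical, the fields of a record are laid out side by side by default, so you need to wrap the label in {} to flip the record's direction and stack the fields on top of each other.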
digraph {
  rankdir="BT"

  Hero [
    shape="record"
    label="{Hero|+ health : int\l|+ save(k : Kingdom) : bool\l}"
  ]

  Villain [
    shape="record"
    label="{Villain|+ health : int\l|+ brood() : void\l}"
  ]

  Character [ shape="record" ]

  Hero -> Character [ arrowhead="empty" ]
  Villain -> Character [ arrowhead="empty" ]
}
Further Reference
For more information on the dot language, and how to make best use of it, check the official dot documentation and the official list of all supported attributes.
Bonus: COLOURS
You can specify all of the colours used in the graph, either per node and per edge or as defaults that apply to every node and edge.
digraph {
  rankdir="BT"
  bgcolor="#222222"

  // defaults for edges and nodes can be specified
  node [ color="#ffffff" fontcolor="#ffffff" ]
  edge [ color="#ffffff" ]

  2 [fillcolor="#f22430" style=filled color="#000000" fontcolor="#000000"]
  4 [fillcolor="#f22430" style=filled color="#000000" fontcolor="#000000"]
  5 [fillcolor="#f22430" style=filled color="#000000" fontcolor="#000000"]
  6 [fillcolor="#f22430" style=filled color="#000000" fontcolor="#000000"]
  7 [fillcolor="#f22430" style=filled color="#000000" fontcolor="#000000"]

  1 -> 2
  1 -> 3
  1 -> 5
  2 -> 4 [color="#f22430"]
  2 -> 6 [color="#f22430"]
  3 -> 4
  3 -> 7
  4 -> 8
  5 -> 6 [color="#f22430"]
  5 -> 7 [color="#f22430"]
  6 -> 8
  7 -> 8
}