A Simple Tidy Tree Layout Implementation in Python
I bet you need a tidy tree layout
Trees are everywhere, not only in forests, but also in computer science, biology, data science and other sciences. Some see science as a whole as one big tree with ever-ramifying branches of knowledges.
When it comes to visualizing them, the “tidier” layout introduced by Reingold and Tilford in 1981 is a time-proven way to place nodes in a clean and aesthetically pleasing manner. While the Reingold-Tilford algorithm or its variations have been implemented in many libraries, I was not able to find an implementation that would be:
a) written in pure Python (i.e., not depending on Graphviz or other tools)
b) written in a simple way emphasizing the recursive nature of the algorithm and
c) compatible with NetworkX, which is my go-to library for all things graph-related.
This article attempts to fill this gap with a Python tidy tree layout implementation prioritizing intuition and simplicity over computational efficiency.
Four aesthetics
The Reingold-Tilford “tidier” layout is based on the assumption of the four following “aesthetics”. I am quoting from the original paper, which refers to parent and child nodes as “father” and “sons”.
- Aesthetic 1:
Nodes at the same level of the tree should lie along a straight line, and the straight lines defining the levels should be parallel.
This is a very general heuristic, corresponding to the concept of “layered layout” which can more generally be applied to directed graphs.
- Aesthetic 2:
A left son should be positioned to the left of its father and a right son to the right.
This implies a predefined ordering of children, which may be natural in some cases (e.g. a decision tree where the children corresponding to one side of the split are always on the same side) but not in others.
- Aesthetic 3:
A father should be centered over its sons.
This makes it easier to recognize parent-child relationships and also involves symmetry in the case of binary trees.
- Aesthetic 4:
A tree and its mirror image should produce drawings that are reflections of one another; moreover, a subtree should be drawn the same way regardless of where it occurs in the tree.
While the 3 previous aesthetics had already been considered by previous authors (Wetherell, C., & Shannon, A. (1979). Tidy drawings of trees. IEEE Transactions on software Engineering, (5), 514-520.), this fourth aesthetic was introduced by Reingold and Tilford, and justified by examples of symmetric trees weirdly drawn in an asymmetric way by the old algorithm.
A 3-step simplified algorithm
The algorithm proposed by Reingold and Tilford perfectly fulfills the four aesthetics presented above, but it is not really straightforward, involving a concept of threads which does not appeal to my personal intuition. Instead, we will start from the fundamental heuristic on which the Reingold-Tilford algorithm is based: two subtrees of a node should be formed independently, and then placed as close together as possible… and keep it so simple, you will not even need to know what post-order traversal means.
Following this idea, to determine a layout for a given tree with root node R, all you need to do is:
- determine a layout for each subtree $S_i$ descending from a child of R. Each of these subtrees is itself a tree but strictly smaller than the original tree, so you can use recursion to get each subtree. Recursion ends when a subtree has only one node, in which case you can trivially place it at the origin.
- shift the resulting subtree layouts by just the right amount, such that the right contour of subtree $S_i$ is at a distance of one unit from the left contour of subtree $S_{i+1}$.
- finally, place the root node centered above its children.
The code
"""Tidy tree layout following the Reingold-Tilford aesthetics"""
from collections import defaultdict
from itertools import zip_longest
from typing import Any, Dict, List, Optional, Tuple
import networkx as nx
MIN_DISTANCE = 1
def tidy_tree_layout(
tree: nx.DiGraph, from_node: Any = None, level: int = 0
) -> Dict[Any, Tuple[float, int]]:
"""Calculate a tidy layout for a tree following the Reingold-Tilford aesthetics
Args:
tree: a tree in the form of a networkx.DiGraph
from_node: the node from which to start the layout
level: the level of the node from which to start the layout
0 for the root, 1 for the first level, etc
Returns:
pos: a dict with node positions in the form {node: (x, y)}
as used by networkx.draw_networkx
"""
if from_node is None:
from_node = find_tree_root(tree)
pos = {from_node: (0.0, -level)}
children = sorted(list(tree.successors(from_node)))
if len(children) == 0:
# leaf node
return pos
# calculate (yet unshifted) positions for children subtrees
for child in children:
child_pos = tidy_tree_layout(tree, child, level + 1)
pos = {**pos, **child_pos}
# shift children subtrees along the x axis
previous_contour = None
for left_child, next_child in zip(children, children[1:]):
move_next_by, previous_contour = _determine_shift(
tree, pos, left_child, next_child, previous_contour
)
subset_to_move = [next_child] + list(nx.descendants(tree, next_child))
pos = _translate_pos(pos, x_shift=move_next_by, node_subset=subset_to_move)
# reposition node in the middle of its children
x_mid = (pos[children[-1]][0] + pos[children[0]][0]) / 2
pos[from_node] = (x_mid, -level)
return pos
def find_tree_root(tree: nx.DiGraph):
"""Find the root of a tree"""
roots = [node for node in tree.nodes if tree.in_degree(node) == 0]
assert len(roots) == 1, f"tree should have one root, got {roots}"
return roots[0]
def _translate_pos(
pos: Dict[Any, Tuple[float, int]],
x_shift: float = 0.0,
y_shift: int = 0,
node_subset: Optional[List[Any]] = None,
):
"""Translate positions in pos by x_shift along x and y_shift along y"""
if node_subset is None:
node_subset = list(pos.keys())
for node in node_subset:
pos[node] = (pos[node][0] + x_shift, pos[node][1] + y_shift)
return pos
def _subtree_contour(
tree: nx.DiGraph,
pos: Dict[Any, Tuple[float, int]],
from_node: Any,
left: bool = True,
) -> List[float]:
"""Get the contour of a subtree
Args:
tree: a tree in the form of a networkx.DiGraph
pos: a dict with node positions in the form {node: (x, y)}
from_node: the node from which to start the contour
left: true for the left contour, false for the right contour
Returns:
a list of x-coordinates of the contour
"""
nodes_by_level = defaultdict(list)
node_levels = nx.shortest_path_length(tree, source=from_node)
for node, level in node_levels.items():
nodes_by_level[level].append(node)
def _x_coord(node):
"""Get x coordinate of node"""
return pos[node][0]
contour = []
for level in sorted(nodes_by_level.keys()):
level_nodes = sorted(nodes_by_level[level], key=_x_coord)
if left:
extreme_node = level_nodes[0]
else:
extreme_node = level_nodes[-1]
contour.append(_x_coord(extreme_node))
return contour
def _determine_shift(
tree: nx.DiGraph, pos, left_child, next_child, previous_contour=None
) -> Tuple[float, List[float]]:
"""Determine how much next child should be shifted to the right away from left child"""
left_child_contour = _subtree_contour(tree, pos, left_child, left=False)
if previous_contour is not None:
previous_contour = [
max(new, prev)
for new, prev in zip_longest(
left_child_contour, previous_contour, fillvalue=-1000
)
]
else:
previous_contour = left_child_contour
next_child_contour = _subtree_contour(tree, pos, next_child, left=True)
margin = min(
right - left for left, right in zip(previous_contour, next_child_contour)
)
move_next_by = MIN_DISTANCE - margin
return move_next_by, previous_contour
As written above, the code presented here prioritizes intuition and simplicity over computational efficiency. Can you see why? Please do not use it to compute layouts for trees with thousands of nodes.
References
- The original 1981 article by Reingold and Tilford is of course worth reading: Tidier Drawings of Trees by Edward M. Reingold and John S. Tilford.
- A proof of the relevance of the Reingold-Tilford algorithm is that it is used as default tree layout in the powerful D3.js library, as described in the D3.js documentation.
- Kay Jan Wong wrote a nice explanatory article of the Reingold-Tilford algorithm: Reingold Tilford Algorithm Explained With Walkthrough. And she even implemented it in Python within her bigtree package.
- The first volume of Donald Knuth’s The Art of Computer Programming contains very interesting discussions of trees, which he calls “the most important nonlinear structures that arise in computer algorithms”.
- The Book of Trees: Visualizing Branches of Knowledge by Manuel Lima beautifully illustrates the ubiquity of trees in information visualization, in the 21st century as well and in past centuries.
- This article on Medium
- The Reingold-Tilford algorithm in my museum of data visualization (work in progress)