A Simple Tidy Tree Layout Implementation in Python

A Simple Tidy Tree Layout Implementation in Python

February 16, 2025

I bet you need a tidy tree layout

Trees are everywhere, not only in forests, but also in computer science, biology, data science and other sciences. Some see science as a whole as one big tree with ever-ramifying branches of knowledges.

Random trees
Some tidily laid out random trees

When it comes to visualizing them, the “tidier” layout introduced by Reingold and Tilford in 1981 is a time-proven way to place nodes in a clean and aesthetically pleasing manner. While the Reingold-Tilford algorithm or its variations have been implemented in many libraries, I was not able to find an implementation that would be:

  • a) written in pure Python (i.e., not depending on Graphviz or other tools)

  • b) written in a simple way emphasizing the recursive nature of the algorithm and

  • c) compatible with NetworkX, which is my go-to library for all things graph-related.

This article attempts to fill this gap with a Python tidy tree layout implementation prioritizing intuition and simplicity over computational efficiency.

Tree of life
A very approximate and heavily pruned 'tree of life', one of the many tree structures in nature.

Abstract syntax tree
Abstract syntax tree of the Python code below, one of the many tree structures in engineering.

Four aesthetics

The Reingold-Tilford “tidier” layout is based on the assumption of the four following “aesthetics”. I am quoting from the original paper, which refers to parent and child nodes as “father” and “sons”.

  • Aesthetic 1:

Nodes at the same level of the tree should lie along a straight line, and the straight lines defining the levels should be parallel.

This is a very general heuristic, corresponding to the concept of “layered layout” which can more generally be applied to directed graphs.

  • Aesthetic 2:

A left son should be positioned to the left of its father and a right son to the right.

This implies a predefined ordering of children, which may be natural in some cases (e.g. a decision tree where the children corresponding to one side of the split are always on the same side) but not in others.

  • Aesthetic 3:

A father should be centered over its sons.

This makes it easier to recognize parent-child relationships and also involves symmetry in the case of binary trees.

  • Aesthetic 4:

A tree and its mirror image should produce drawings that are reflections of one another; moreover, a subtree should be drawn the same way regardless of where it occurs in the tree.

While the 3 previous aesthetics had already been considered by previous authors (Wetherell, C., & Shannon, A. (1979). Tidy drawings of trees. IEEE Transactions on software Engineering, (5), 514-520.), this fourth aesthetic was introduced by Reingold and Tilford, and justified by examples of symmetric trees weirdly drawn in an asymmetric way by the old algorithm.

A 3-step simplified algorithm

The algorithm proposed by Reingold and Tilford perfectly fulfills the four aesthetics presented above, but it is not really straightforward, involving a concept of threads which does not appeal to my personal intuition. Instead, we will start from the fundamental heuristic on which the Reingold-Tilford algorithm is based: two subtrees of a node should be formed independently, and then placed as close together as possible… and keep it so simple, you will not even need to know what post-order traversal means.

Full tree, left subtree, right subtree
Full tree, left subtree, right subtree. The layout of the full tree is composed of the two subtree layouts, juxtaposed as close together as possible.

Following this idea, to determine a layout for a given tree with root node R, all you need to do is:

  • determine a layout for each subtree $S_i$ descending from a child of R. Each of these subtrees is itself a tree but strictly smaller than the original tree, so you can use recursion to get each subtree. Recursion ends when a subtree has only one node, in which case you can trivially place it at the origin.
  • shift the resulting subtree layouts by just the right amount, such that the right contour of subtree $S_i$ is at a distance of one unit from the left contour of subtree $S_{i+1}$.
  • finally, place the root node centered above its children.

Right contour of the left subtree and left contour of the right subtree
Right contour of the left subtree and left contour of the right subtree. These two contours should not intersect.

The code

"""Tidy tree layout following the Reingold-Tilford aesthetics"""

from collections import defaultdict
from itertools import zip_longest
from typing import Any, Dict, List, Optional, Tuple

import networkx as nx

MIN_DISTANCE = 1


def tidy_tree_layout(
    tree: nx.DiGraph, from_node: Any = None, level: int = 0
) -> Dict[Any, Tuple[float, int]]:
    """Calculate a tidy layout for a tree following the Reingold-Tilford aesthetics

    Args:
        tree: a tree in the form of a networkx.DiGraph
        from_node: the node from which to start the layout
        level: the level of the node from which to start the layout
            0 for the root, 1 for the first level, etc

    Returns:
        pos: a dict with node positions in the form {node: (x, y)}
            as used by networkx.draw_networkx
    """
    if from_node is None:
        from_node = find_tree_root(tree)

    pos = {from_node: (0.0, -level)}
    children = sorted(list(tree.successors(from_node)))
    if len(children) == 0:
        # leaf node
        return pos

    # calculate (yet unshifted) positions for children subtrees
    for child in children:
        child_pos = tidy_tree_layout(tree, child, level + 1)
        pos = {**pos, **child_pos}

    # shift children subtrees along the x axis
    previous_contour = None
    for left_child, next_child in zip(children, children[1:]):
        move_next_by, previous_contour = _determine_shift(
            tree, pos, left_child, next_child, previous_contour
        )
        subset_to_move = [next_child] + list(nx.descendants(tree, next_child))
        pos = _translate_pos(pos, x_shift=move_next_by, node_subset=subset_to_move)

    # reposition node in the middle of its children
    x_mid = (pos[children[-1]][0] + pos[children[0]][0]) / 2
    pos[from_node] = (x_mid, -level)

    return pos


def find_tree_root(tree: nx.DiGraph):
    """Find the root of a tree"""
    roots = [node for node in tree.nodes if tree.in_degree(node) == 0]
    assert len(roots) == 1, f"tree should have one root, got {roots}"
    return roots[0]


def _translate_pos(
    pos: Dict[Any, Tuple[float, int]],
    x_shift: float = 0.0,
    y_shift: int = 0,
    node_subset: Optional[List[Any]] = None,
):
    """Translate positions in pos by x_shift along x and y_shift along y"""
    if node_subset is None:
        node_subset = list(pos.keys())
    for node in node_subset:
        pos[node] = (pos[node][0] + x_shift, pos[node][1] + y_shift)
    return pos


def _subtree_contour(
    tree: nx.DiGraph,
    pos: Dict[Any, Tuple[float, int]],
    from_node: Any,
    left: bool = True,
) -> List[float]:
    """Get the contour of a subtree

    Args:
        tree: a tree in the form of a networkx.DiGraph
        pos: a dict with node positions in the form {node: (x, y)}
        from_node: the node from which to start the contour
        left: true for the left contour, false for the right contour

    Returns:
        a list of x-coordinates of the contour
    """
    nodes_by_level = defaultdict(list)
    node_levels = nx.shortest_path_length(tree, source=from_node)
    for node, level in node_levels.items():
        nodes_by_level[level].append(node)

    def _x_coord(node):
        """Get x coordinate of node"""
        return pos[node][0]

    contour = []
    for level in sorted(nodes_by_level.keys()):
        level_nodes = sorted(nodes_by_level[level], key=_x_coord)
        if left:
            extreme_node = level_nodes[0]
        else:
            extreme_node = level_nodes[-1]
        contour.append(_x_coord(extreme_node))

    return contour


def _determine_shift(
    tree: nx.DiGraph, pos, left_child, next_child, previous_contour=None
) -> Tuple[float, List[float]]:
    """Determine how much next child should be shifted to the right away from left child"""
    left_child_contour = _subtree_contour(tree, pos, left_child, left=False)
    if previous_contour is not None:
        previous_contour = [
            max(new, prev)
            for new, prev in zip_longest(
                left_child_contour, previous_contour, fillvalue=-1000
            )
        ]
    else:
        previous_contour = left_child_contour
    next_child_contour = _subtree_contour(tree, pos, next_child, left=True)
    margin = min(
        right - left for left, right in zip(previous_contour, next_child_contour)
    )
    move_next_by = MIN_DISTANCE - margin
    return move_next_by, previous_contour

As written above, the code presented here prioritizes intuition and simplicity over computational efficiency. Can you see why? Please do not use it to compute layouts for trees with thousands of nodes.

References