Python Packages for Data Visualization in 2025

January 21, 2025

Python’s rich package ecosystem is perhaps its greatest strength, but also an endless source of doubt for its users. Am I really using the best available tool for data visualization? Having asked myself this question for 10 years, I am finally feeling ready to share my thoughts on this difficult subject. I will not only share my thoughts, but also support them with quantitative indicators and illuminate them with insightful visuals, making this (I hope) the best resource on the subject.

Python packages sized by number of GitHub stars (mostly for decoration). Scroll down for more informative plots.

This story will start by answering the most pressing question with a simple decision tree. Next, we will look at quantitative indicators of package importance and maintenance health. Finally, the last section will devote a short paragraph to each package.

Which one should I choose?

If you are just here for a quick tip, you may quickly traverse the two levels of the following decision tree and walk away with a valuable suggestion. Still, it is advised to read on - and try out a few options.

flowchart LR;
    start@{ shape: circle, label: "Visualize data with Python" }
    staticOrDynamic@{ shape: diamond, label: "Static or dynamic?" }
    start-->staticOrDynamic;
    staticOrDynamic -->|static|whatData;
    whatData@{ shape: diamond, label: "Tell me one word" }
    whatData -->|anything|matplotlib;
    whatData -->|large dataset|datashader;
    whatData -->|statistical|seaborn;
    whatData -->|ggplot|plotnine;
    whatData -->|grammar|altair;
    matplotlib@{ shape: stadium, label: "Matplotlib" }
    datashader@{ shape: stadium, label: "Datashader" }
    seaborn@{ shape: stadium, label: "Seaborn" }
    altair@{ shape: stadium, label: "Altair" }
    plotnine@{ shape: stadium, label: "Plotnine" }
    staticOrDynamic -->|both|bothHowMuchTime;
    bothHowMuchTime@{ shape: diamond, label: "Got time?" }
    bothHowMuchTime -->|only 5 seconds|hvplot;
    bothHowMuchTime -->|a bit more|holoview;
    holoview@{ shape: stadium, label: "Holoviews" }
    hvplot@{ shape: stadium, label: "hvPlot" }
    staticOrDynamic -->|dynamic|dynamicControlDecision;
    dynamicControlDecision@{ shape: diamond, label: "Need full control?" }
    dynamicControlDecision -->|yes|bokeh;
    bokeh@{ shape: stadium, label: "Bokeh" }
    dynamicControlDecision -->|not necessarily|plotly;
    plotly@{ shape: stadium, label: "Plotly" }
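
If you prefer to traverse the tree in code, here is a minimal sketch of the same two-level decision as a lookup table. The branch labels and package names are copied from the flowchart above; the function name `suggest_package` is my own and purely illustrative.

```python
# Branch labels and package names are copied from the flowchart above.
DECISION_TREE = {
    "static": {
        "anything": "Matplotlib",
        "large dataset": "Datashader",
        "statistical": "Seaborn",
        "ggplot": "Plotnine",
        "grammar": "Altair",
    },
    "both": {
        "only 5 seconds": "hvPlot",
        "a bit more": "Holoviews",
    },
    "dynamic": {
        "yes": "Bokeh",
        "not necessarily": "Plotly",
    },
}


def suggest_package(interactivity: str, detail: str) -> str:
    """Return the suggestion for a first-level and a second-level answer."""
    try:
        return DECISION_TREE[interactivity][detail]
    except KeyError:
        # Unknown answers are reported explicitly instead of guessing.
        raise ValueError(
            f"No suggestion for {interactivity!r} / {detail!r}"
        ) from None


print(suggest_package("static", "statistical"))  # prints "Seaborn"
```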

The choice of a data visualization package depends on your requirements. As the flowchart above illustrates, a key question is whether you need static visualizations (e.g. when writing for Medium or other publications) or dynamic visualizations, which can be embedded in a website or dashboard. The type of data is of course another fundamental aspect. Many visualization tools focus on tabular data and offer shortcuts for visualizing dataframes: think df.plot() in pandas and similar methods in hvPlot (see below). Other types of data, such as geospatial data or graph data, have specialized packages dedicated to them, which are not considered here.
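
To make the df.plot() shortcut concrete, here is a minimal sketch, assuming pandas and Matplotlib are installed. The numbers are made up for illustration, and the output file name is arbitrary.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import pandas as pd

# Made-up numbers, for illustration only.
df = pd.DataFrame({"year": [2021, 2022, 2023], "value": [1.0, 2.5, 4.0]})

# One call turns the dataframe into a Matplotlib line chart and returns the Axes.
ax = df.plot(x="year", y="value", kind="line", title="df.plot() in one line")
ax.figure.savefig("df_plot.png")
```

With hvPlot installed, importing hvplot.pandas adds an analogous df.hvplot accessor to dataframes.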

Data-driven indicators

You do not want to invest time and energy in a package that is not maintained any more, and maybe you trust that your fellow developers have some sort of collective intelligence steering them towards more useful packages. The following indicators quantify some of these notions.

PyPI downloads across time

Looking at how downloads from the Python Package Index have evolved over the last few years, it almost looks like every package is growing.

Number of PyPI downloads per year per package from 2020 to 2024

Still, the numbers span several orders of magnitude (hence my choice of a logarithmic scale). One thing to keep in mind here is that a package may be downloaded because a user wants to use it, or because it is a dependency of another package. The number of downloads of Seaborn will never reach that of Matplotlib, because every Seaborn installation requires Matplotlib. Also, I suspect that the sharp rise in Altair downloads may be partly due to it being a dependency of the popular Streamlit.
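
To show why a logarithmic scale helps when series differ by orders of magnitude, here is a minimal Matplotlib sketch. The package names and counts are placeholders, not real PyPI statistics, and this is not the code behind the plot above.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

years = [2020, 2021, 2022, 2023, 2024]
# Placeholder values, NOT real PyPI statistics.
downloads = {
    "package_a": [1e8, 2e8, 3e8, 5e8, 8e8],
    "package_b": [1e5, 3e5, 6e5, 1e6, 2e6],
}

fig, ax = plt.subplots()
for name, counts in downloads.items():
    ax.plot(years, counts, marker="o", label=name)

# On a linear axis package_b would be squashed against zero;
# the log scale keeps both curves readable.
ax.set_yscale("log")
ax.set_xticks(years)
ax.set_xlabel("Year")
ax.set_ylabel("Downloads per year")
ax.legend()
fig.savefig("downloads_log.png")
```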

Multi-indicator heatmap

The numbers of forks and stars on GitHub provide different measures of a project’s popularity, which, as seen in the following shaded matrix, are correlated, but only to a certain degree. Finally, I find the number of contributors to a project and the number of commits over time to be revealing, albeit simple, indicators of a project’s health, which gets me slightly worried about Seaborn.
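
Indicators such as stars, forks and contributors live on very different scales, so a shaded matrix usually needs some per-indicator rescaling first. The following is a sketch of one common choice (min-max scaling per column). It is an assumption on my part about how such a heatmap could be prepared, and the package names and numbers are placeholders, not real GitHub figures.

```python
def min_max_by_indicator(table):
    """Rescale each indicator (column) to [0, 1] across packages (rows)."""
    indicators = list(next(iter(table.values())).keys())
    scaled = {name: {} for name in table}
    for indicator in indicators:
        values = [row[indicator] for row in table.values()]
        low, high = min(values), max(values)
        span = high - low
        for name, row in table.items():
            # A constant column carries no information; map it to 0.0.
            scaled[name][indicator] = (row[indicator] - low) / span if span else 0.0
    return scaled


# Placeholder values, NOT real GitHub statistics.
raw = {
    "package_a": {"stars": 20000, "forks": 8000, "contributors": 1500},
    "package_b": {"stars": 12000, "forks": 2000, "contributors": 200},
    "package_c": {"stars": 4000, "forks": 500, "contributors": 100},
}

scaled = min_max_by_indicator(raw)
print(scaled["package_b"]["stars"])  # prints 0.5
```

Each cell of the rescaled table can then be mapped to a shade, with one row per package and one column per indicator.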

Multi-indicator heatmap for our ten selected Python data visualization packages

Individual profiles

Here is some more information, along with my opinions, on each of the ten selected packages.

Matplotlib - The mighty Grandfather

Website: matplotlib.org/stable/ GitHub: matplotlib/matplotlib PyPI: Matplotlib

In short, use it… if you want to make static visualizations and only want to install one package

Official description

matplotlib: plotting with Python

My take on Matplotlib

It has been around for 22 years, and given the number of complaints about its syntax and difficulty, you would think it should have been abandoned long ago, but this is not the case. It is actually by far the most downloaded Python data visualization package, not only because it is the foundation of Seaborn and others, but also because it really lets you do anything – as far as static plots are concerned, and as long as you are willing to put in the effort.
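
To give a feel for the "anything, with effort" trade-off, here is a minimal sketch using Matplotlib's object-oriented API. Every visual detail (spines, annotation, labels) is set explicitly; the data and the file name are arbitrary choices for illustration.

```python
import math

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

# A sine curve sampled on [0, 6.2].
x = [i / 10 for i in range(63)]
y = [math.sin(v) for v in x]

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(x, y, color="tab:blue", linewidth=2)

# Fine-grained control: hide two of the four frame lines ...
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

# ... and point an arrow at the maximum of the curve.
ax.annotate(
    "maximum",
    xy=(math.pi / 2, 1.0),
    xytext=(3.0, 0.8),
    arrowprops={"arrowstyle": "->"},
)

ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
fig.tight_layout()
fig.savefig("sine.png")
```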

Bokeh - Beauty and pain

Website: bokeh.org GitHub: bokeh/bokeh PyPI: Bokeh

In short, use it… if you want to make dynamic visualizations and have full control over them, and you do not mind writing a few more lines of code

Official description

Interactive Data Visualization in the browser, from Python

My take on Bokeh

Bokeh gives you the ability to produce beautiful interactive visualizations, but requires you to write many lines to do so. And you may have to hunt through numerous examples to figure out basic functionality. But Bokeh’s stability and documentation have greatly improved in recent years, and I have even heard about code being generated automatically, so why not go for it?

Plotly - The Enterprise Darling

Website: plotly.com/python/ GitHub: plotly/plotly.py PyPI: Plotly

In short, use it… if you want to make highly dynamic visualizations and make them fast

Official description

The interactive graphing library for Python :sparkles: This project now includes Plotly Express!

My take on Plotly

The balance between the simplicity of its API and its power to generate shiny interactive plots is really good. Still, things get a bit cloudier when plotly.express is not enough and you need to dive into the graph_objects module. While open-source, the plotly package is the only one on the list to be mostly maintained by employees of a single company (Plotly is also the name of the company).

Seaborn - statistically friendly

Website: seaborn.pydata.org GitHub: mwaskom/seaborn PyPI: Seaborn

In short, use it… if you want to visualize statistical data

Official description

Statistical data visualization in Python

My take on Seaborn

Seaborn offers good-looking statistical visualizations with minimal effort, with a lot of predefined plot types. And if this is not enough, you can always fine-tune the resulting figures with Matplotlib. The only troubling thing is that the number of contributions to the repository seems to have stalled in 2024 (see above). Hopefully the project will continue to be maintained and improved.

Altair - The new Grammarian

Website: altair-viz.github.io/ GitHub: vega/altair PyPI: Altair

In short, use it… if the consistency and elegance of the API is the most important thing for you

Official description

Declarative visualization library for Python

My take on Altair

Your first reaction upon reading that “Altair is built on top of the Vega-Lite high-level grammar for interactive graphics which is based on the ‘grammar of graphics’ idea proposed by Leland Wilkinson” may not be one of great interest, at least if these terms do not ring a bell. But it does provide “a clear mental model based on a set of graphical primitives” – quite a relief coming from – say – Matplotlib. And it supports interactivity. Overall, quite a strong contender, and it seems to show in the number of downloads.

Plotnine - The R Grammarian

Website: plotnine.org GitHub: has2k1/plotnine PyPI: Plotnine

In short, use it… if you think ggplot2 is the best visualization library ever, but you need to use Python

Official description

A Grammar of Graphics for Python

My take on Plotnine

Plotnine brings R’s ggplot2 to Python. Arguably, this is not only a matter of nostalgia, but also a question of how you think about plots, since it implements the “grammar of graphics” concept… except that Altair also does that. Still, the number of PyPI downloads has been exploding in recent years (see above), so I might be missing something here.

Holoviews - The multi-backend Integrator

Website: holoviews.org GitHub: holoviz/holoviews PyPI: Holoviews

In short, use it… if you want to efficiently generate both static and dynamic visualizations

Official description

With Holoviews, your data visualizes itself.

My take on Holoviews

I like the idea of “bundling data together with the appropriate metadata”, so as to make visualization as automatic as possible. I also like the possibility to use both Matplotlib and Bokeh to render the actual plots. But the abstraction level gets even higher with hvPlot, so do I really need to bother about Holoviews?

Hvplot - The higher-level Integrator

Website: hvplot.holoviz.org GitHub: holoviz/hvplot PyPI: Hvplot

In short, use it… if you want both static and dynamic visualizations (see HoloViews) but with fewer lines of code

Official description

A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews

My take on Hvplot

A convenient high-level plotting API that works with multiple backends. Is it the magic you have been waiting for? Maybe, until you need to dive into the documentation of multiple libraries (probably including HoloViews and either Bokeh or Matplotlib) to debug something or adjust a plot.

Datashader - The one for large datasets

Website: http://datashader.org GitHub: holoviz/datashader PyPI: Datashader

In short, use it… if the number of data points you need to visualize in one plot is really high

Official description

Quickly and accurately render even the largest data.

My take on Datashader

Datashader is quite good at what it is supposed to do: efficiently turning massive datasets into beautiful visualizations. But it requires you to think a bit differently, and understand its pipeline concept, which may not be for everybody.
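
To illustrate the idea behind such a pipeline, here is a conceptual sketch in plain NumPy. It is not Datashader's API, only my simplified reading of the principle: instead of drawing a million markers, the points are first aggregated onto a fixed grid, the counts are then transformed, and finally mapped to pixel values. The grid size, the value range and the log transform are arbitrary choices for illustration.

```python
import numpy as np

# One million random points, far too many to draw as individual markers.
rng = np.random.default_rng(seed=0)
n_points = 1_000_000
x = rng.normal(size=n_points)
y = rng.normal(size=n_points)

# Step 1, aggregation: count the points falling into each cell of a
# fixed 300 x 200 grid (points outside the range are ignored).
counts, x_edges, y_edges = np.histogram2d(
    x, y, bins=(300, 200), range=[[-4, 4], [-4, 4]]
)

# Step 2, transformation: compress the dynamic range so that sparse
# regions remain visible next to dense ones.
intensity = np.log1p(counts)
intensity /= intensity.max()

# Step 3, colormapping: turn intensities into an 8-bit grayscale image,
# transposed so that rows correspond to y and columns to x.
image = (255 * intensity.T).astype(np.uint8)
print(image.shape)  # prints (200, 300)
```

The cost of the final image depends on the grid size rather than on the number of points, which is why this way of thinking scales to very large datasets.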

Pyvista - The three-dimensional one

Website: docs.pyvista.org GitHub: pyvista/pyvista PyPI: Pyvista

In short, use it… if your data is actually three-dimensional

Official description

3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK)

My take on Pyvista

You could argue that 3D and scientific visualization are actually a different world, but I included PyVista as a representative from this world. It is pythonic enough to be considered as something more than an integration with VTK.

Forgotten ones

I did not include:

  • proplot, which proposed a wrapper around Matplotlib with a different API
  • toyplot, which is great if you never liked all those ticks
  • Pygal, which boasted “beautiful Python charting” and was indeed quite pretty, but does not seem actively maintained
  • colorcet, because it only gives you colormaps, which is useful but only auxiliary
  • PyWaffle, because it is really only about “waffle plots” and this is quite a restricted use case
  • and many more niche or unmaintained packages
  • as well as packages addressing adjacent domains, e.g. those for dashboarding, scientific visualization and cartography.

References and further reading