Plotly Tips

Here’s a demo of various Plotly features and hacks.

show_plot(  # (8)
  px.scatter(votes, x='year', y='percent_yes', color='country',
    facet_col='issue', facet_col_wrap=3, facet_col_spacing=.1,  # (1)
    trendline='lowess',  # (2)
    labels={"percent_yes": "% Yes Votes", "year": "Year", "country": "Country"},  # (3)
    title="Percentage of 'Yes' votes in the UN General Assembly"  # (4)
  )
  .update_traces(marker_size=2)  # (5)
  .for_each_annotation(lambda a: a.update(text=a.text.split("=", 1)[-1]))  # (6)
  .update_yaxes(tickformat=",.0%"))  # (7)
1. Faceting: facet by columns, wrap after 3 columns, and set the spacing between columns.
2. Trendline: add a LOWESS trendline. (Another option: ols for a linear regression.)
3. Labels: use human-readable labels for the axes and legend.
4. Plot title.
5. update_traces sets attributes on every trace that gets drawn. marker_size=2 is shorthand for marker={'size': 2}. See Styling Markers.
6. Facet labels: keep only the text after the = in each facet annotation, so a label of the form column=value shows just the value. See Annotations.
7. Tick format: use a percentage format for the y-axis. See Tick Formatting.
8. Hack for making plots show in RStudio. Incidentally, it also gives us an open paren, which is helpful because Python does not treat the statement as finished until that paren is closed at the end. See the Quarto page in these notes for details on this hack.

Plot types we’ve used

Ticks

Ticks are the little marks on the axes that show where the values are. You can customize them using update_xaxes and update_yaxes. Useful attributes:

  • nticks: set the number of ticks. Example: nticks=4
  • tickformat: format the tick labels. Examples:
    • tickformat=",.2f": commas, 2 decimal places
    • tickformat=",.0%": percentage format with no decimal places
    • tickformat=",.1%": percentage format with 1 decimal place
    • tickformat="$,.0f": dollar format with no decimal places
  • tickvals: put ticks at specific values. Example: tickvals=[0, 1, 2, 3] or tickvals=gapminder['year'].unique()
  • ticktext: use specific text for the tick labels. Example: ticktext=["zero", "one", "two", "three"]

More documentation and the full list of attributes.

Violins and Ridgeline Plots

Ridgeline plots are useful for visualizing how distributions change. You can make them in Plotly by tweaking a violin plot.

(
  px.violin(data, x=..., y=...)
  .update_traces(side='positive', width=5, points=False)
  .update_yaxes(showgrid=True)
)

See docs

Category ordering

Plotly’s default is to put categories in the order they first appear in the data. Sometimes this is exactly what you want; you can just sort the data in exactly the order you want the categories to appear (using sort_values, for example).

But sometimes you want to override this. Plotly provides two ways to control the order of categories on an axis: the category_orders argument to some px.* functions, and the categoryorder attribute of the axis.

If you know the order you want to use (or you’re mapping the categorical to color or something that doesn’t have a corresponding axis), you can use category_orders.

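If you just want to sort alphabetically or by total, you can use the categoryorder attribute of the axis (set it with update_xaxes or update_yaxes).

Useful values:

  • "category ascending": sort alphabetically
  • "category descending": sort reverse alphabetically
  • "total ascending": sort by the total of the values in each category, smallest first
  • "total descending": sort by the total of the values in each category, largest first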

For a custom order on an axis, set categoryorder to "array" and set categoryarray to a list of category names in the order you want them to appear.

Typical imports

import numpy as np
import pandas as pd
import plotly.express as px

Templates

The default Plotly template puts a gray background behind the plot. You can change the template by setting pio.templates.default:

import plotly.io as pio
import plotly.express as px

pio.templates.default = "plotly_white"

(You can also set a template on any individual px.* function by passing template="plotly_white" as an argument.)

Examples of working around limitations of Plotly Express

Sometimes you need to make a plot that’s not quite supported by Plotly Express. Here are some examples of how to do that.

Multiple traces on the same plot

When the layouts are compatible (i.e., same axes and such), you can just add the traces together.

import plotly.express as px

gapminder = px.data.gapminder()
gapminder['neg_pop'] = -gapminder['pop']

fig1 = px.area(
  gapminder.query("country == 'United States' or country == 'Canada'"),
  x='year',
  y='pop',
  color="country",
  color_discrete_sequence=['#636EFA', '#EF553B'])

fig2 = px.area(
  (
    gapminder
    .query("country == 'China' or country == 'India'")
  ),
  x='year',
  y='neg_pop', color="country",
  color_discrete_sequence=['#00CC96', '#AB63FA'])
fig1.add_traces(fig2.data)
fig1

Stacking Groups

Bars and areas can stack in “groups”. Plotly Express doesn’t seem to give an interface for this, but you can go in and change the traces after the fact.

gapminder_4_countries = gapminder.query(
  "country in ['United States', 'Canada', 'China', 'India']")
fig = px.area(
  gapminder_4_countries,
  x='year',
  y='pop',
  color='country',
)
for trace in fig.data:
  if trace.name in ['China', 'India']:
    trace.stackgroup = 2
    trace.y = -trace.y
  print(trace.name, trace.stackgroup)
fig
Canada 1
China 2
India 2
United States 1