import plotly.express as px
gapminder = px.data.gapminder()
gapminder['neg_pop'] = -gapminder['pop']
fig1 = px.area(
gapminder.query("country == 'United States' or country == 'Canada'"),
x='year',
y='pop',
color="country",
color_discrete_sequence=['#636EFA', '#EF553B'])
fig2 = px.area(
(
gapminder
.query("country == 'China' or country == 'India'")
),
x='year',
y='neg_pop', color="country",
color_discrete_sequence=['#00CC96', '#AB63FA'])
fig1.add_traces(fig2.data)
fig1Plotly Tips
Here’s a demo of various Plotly features and hacks.
8show_plot(
px.scatter(votes, x='year', y='percent_yes', color='country',
1 facet_col='issue', facet_col_wrap=3, facet_col_spacing=.1,
2 trendline='lowess',
3 labels={"percent_yes": "% Yes Votes", "year": "Year", "country": "Country"},
4 title="Percentage of 'Yes' votes in the UN General Assembly"
)
5 .update_traces(marker_size=2)
6 .for_each_annotation(lambda a: a.update(text=a.text.split("=", 1)[-1]))
7 .update_yaxes(tickformat=",.0%"))- 1
-
Faceting: facet by
columns.wrapat 3 columns.spacingbetween columns. - 2
-
Trendline: add a
lowesstrendline. (Other options:olsfor a linear regression) - 3
- Labels: use human-readable labels for the axes and legend.
- 4
- Plot title.
- 5
-
update_tracessets attributes on everything that gets drawn.marker_size=2is shorthand formarker={'size': 2}. See Styling Markers - 6
-
Remove the
=from the annotation text. See Annotations - 7
- Tick format: use a percentage format for the y-axis. See Tick Formatting
- 8
- Hack for making plots show in RStudio. This also, incidentally, gives us an open paren, which is helpful because Python doesn’t try to think that the code is done until we close that paren at the end. See the Quarto page in these notes for details on this hack.
Plot types we’ve used
px.scatter: API referencepx.histogram: API referencepx.bar: API referencepx.box: API referencepx.violin: API reference
Ticks
Ticks are the little marks on the axes that show where the values are. You can customize them using update_xaxes and update_yaxes. Useful attributes:
nticks: set the number of ticks. Example:nticks=4tickformat: format the tick labels. Examples:tickformat=",.2f": commas, 2 decimal placestickformat=",.0%": percentage format with no decimal placestickformat=",.1%": percentage format with 1 decimal placetickformat="$,.0f": dollar format with no decimal places
tickvals: put ticks at specific values. Example:tickvals=[0, 1, 2, 3]ortickvals=gapminder['year'].unique()ticktext: use specific text for the tick labels. Example:ticktext=["zero", "one", "two", "three"]
More documentation and the full list of attributes.
Violins and Ridgeline Plots
Ridgeline plots are useful for visualizing how distributions change. You can make them in Plotly by tweaking a violin plot.
(
px.violin(data, x=..., y=...)
.update_traces(side = 'positive', width=5, points=False)
.update_yaxes(showgrid=True)
)See docs
Category ordering
Plotly’s default is to put categories in the order they first appear in the data. Sometimes this is exactly what you want; you can just sort the data in exactly the order you want the categories to appear (using sort_values, for example).
But sometimes you want to override this. Plotly provides two ways to control the order of categories on an axis: the category_orders argument to some px.* functions, and the categoryorder attribute of the axis.
If you know the order you want to use (or you’re mapping the categorical to color or something that doesn’t have a corresponding axis), you can use category_orders.
If you just want to sort alphabetically or by total, you can use the categoryorder attribute.
Useful ones:
"category ascending": sort alphabetically"category descending": sort reverse alphabetically"total ascending": sort by value"total descending": sort by value, descending
For more complex sorting, you can use a list of category names. You set categoryorder to "array" and then set categoryarray to a list of category names in the order you want them to appear.
Typical imports
import numpy as np
import pandas as pd
import plotly.express as pxTemplates
The default Plotly template puts a gray background behind the plot. You can change the template by setting pio.templates.default:
import plotly.io as pio
import plotly.express as px
pio.templates.default = "plotly_white"(You can also set a template on any individual px.* function by passing template="plotly_white" as an argument.)
Examples of working around limitations of Plotly Express
Sometimes you need to make a plot that’s not quite supported by Plotly Express. Here are some examples of how to do that.
Multiple traces on the same plot
When the layouts are compatible (i.e., same axes and such), you can just add the traces together.
Stacking Groups
Bars and areas can stack in “groups”. Plotly Express doesn’t seem to give an interface for this, but you can go in and change the traces after the fact.
gapminder_4_countries = gapminder.query("country == 'United States' or country == 'Canada' or country == 'China' or country == 'India'")
fig = px.area(
gapminder_4_countries,
x='year',
y='pop',
color='country',
)
for trace in fig.data:
if trace.name in ['China', 'India']:
trace.stackgroup = 2
trace.y = -trace.y
print(trace.name, trace.stackgroup)
figCanada 1
China 2
India 2
United States 1