A Crash Course on Python Dataviz

Author

Eshin Jolly

Published

January 13, 2026

In this tutorial we’ll learn how to create statistical visualizations in Python using Seaborn. Just like polars gave us a consistent, intuitive way to work with DataFrames, seaborn gives us a consistent, intuitive way to visualize data.

Seaborn is built on top of matplotlib (another plotting library), but provides a much higher-level interface designed specifically for exploring and creating statistical visualizations.

seaborn, matplotlib…um what?

Python’s plotting ecosystem has a long history and started as an attempt to replicate plotting behavior in MATLAB. The first, and still foundational plotting library is called matplotlib. You can use it to create virtually any visualization you can imagine. The problem?

matplotlib requires you to think about how to draw things - coordinates, artists, axes, figures, patches, collections… It’s powerful but what we call low-level. It can be great for very quick plots, but for more complicated visuals it can get tedious fast.

seaborn is a much high-level plotting library that lets you think about what your data means - which column goes on which axis, how to split by categories, what relationships to highlight, and it handles the drawing details for you (similar to ggplot).

Fun fact: seaborn is developed and maintained by Michael Waskom a fellow Psych/Neuro PhD when he was a graduate student. It’s now the go-to library for statistical plotting.

How to use this notebook

This notebook is designed for you to work through at your own pace and use as a reference later.

As you go through this notebook, regularly consult the seaborn documentation.* Learning to quickly scan and read read API docs is a critical skill - the docs show you every parameter and option available.

*Note: we’re NOT using seaborn’s “Objects interface” which is even more like ggplot, but still has a several rough edges that we wanted to avoid for the purposes of this class.

You can also always use the help() function from within this notebook:

import seaborn as sns

help(sns.relplot)

Getting Started

We import seaborn with the conventional alias sns, along with polars for data manipulation:

Code

import seaborn as sns
import polars as pl
from polars import col

Seaborn includes several built-in datasets we can use to learn. Let’s load the famous penguins dataset - body measurements for three penguin species from three islands in Antarctica:

Code

# Load dataset and convert to polars DataFrame
penguins = pl.DataFrame(sns.load_dataset("penguins"))
penguins.head()

shape: (5, 7)

species	island	bill_length_mm	bill_depth_mm	flipper_length_mm	body_mass_g	sex
str	str	f64	f64	f64	f64	str
"Adelie"	"Torgersen"	39.1	18.7	181.0	3750.0	"Male"
"Adelie"	"Torgersen"	39.5	17.4	186.0	3800.0	"Female"
"Adelie"	"Torgersen"	40.3	18.0	195.0	3250.0	"Female"
"Adelie"	"Torgersen"	null	null	null	null	null
"Adelie"	"Torgersen"	36.7	19.3	193.0	3450.0	"Female"

Painpoint: seaborn really prefers pandas not polars

Most seaborn functions are supposed to work with polars (our preferred) DataFrame library out-of-the-box. However, there are still a few rough-edges here and there.

So we highly recommend using a DataFrame’s .to_pandas() method, when you are calling a seaborn function instead of using the DataFrame directly. All the examples throughout this notebook use that pattern to remind you.

It’s also easy to convert the otherway by just passing a pandas DataFrame to pl.DataFrame():

# polars -> pandas for seaborn plotting
mydf.to_pandas()

# pandas -> polars, if ever needed, e.g. loading built-in seaborn datasets
mydf = pl.DataFrame(pandas_df)

The `seaborn` mental model

Similar to ggplot in R, you build figures by mapping column names to aesthetic properties

Aesthetic Mappings

Mapping	What it controls
`x`	Position on x-axis
`y`	Position on y-axis
`hue`	Color of points/bars/lines
`col`	Create columns of subplots
`row`	Create rows of subplots
`style`	Marker style (scatter) or line style
`size`	Size of points

Main functions

You an mix and match how you use seaborn between two kinds of approaches:

(recommended) Use one of the 3 main figure-level functions with the kind argument to pick a type
Use any of the axis-level functions directly if you don’t need faceting (but you’ll probably need to interact with matplotlib)

Function	Purpose	Plot Types
`sns.relplot()`	Relationships between numeric variables	`sns.scatter`, `sns.line`
`sns.displot()`	Distributions of variables	`sns.hist`, `sns.kde`, `sns.ecdf`
`sns.catplot()`	Categorical comparisons	`sns.strip`, `sns.box`, `sns.violin`, `sns.bar`, `sns.point`
`sns.lmplot()` (not pictured above)	Regression models	`sns.regplot` (not pictured above)

Each function accepts the same mappings (along with some data) using and returns a FacetGrid - a container that can hold one or more subplots.

Visualizing Relationships: `relplot()`

Use sns.relplot() when you want to see how numeric variables relate to each other. Let’s explore the relationship between flipper length and body mass in penguins:

Note

Remember Python cares about white-space and indentation except between (). So like the previous notebook, we’re using new lines to separate inputs to each seaborn function just for clarity. You can put them on a single line and it works the same

Code

sns.relplot(
    data=penguins.to_pandas(), # <- remember .to_pandas() to convert on-the-fly, no need for another variable
    x="flipper_length_mm",
    y="body_mass_g"
)

There’s a clear positive relationship - penguins with longer flippers tend to be heavier.

But wait - are there differences by species? Let’s map hue to the species column:

Code

sns.relplot(
    data=penguins.to_pandas(),
    x="flipper_length_mm",
    y="body_mass_g",
    hue="species"
)

Now we can see that Gentoo penguins (green) are generally larger than Adelie (blue) and Chinstrap (orange).

What if we want separate plots for each species? Map col to species:

Code

sns.relplot(
    data=penguins.to_pandas(),
    x="flipper_length_mm",
    y="body_mass_g",
    hue="species",
    col="species"
)

Controlling Layout

You can control the size and arrangement of subplots: - height: Height of each subplot (in inches) - aspect: Width-to-height ratio - col_wrap: Maximum number of columns before wrapping

Code

sns.relplot(
    data=penguins.to_pandas(),
    x="flipper_length_mm",
    y="body_mass_g",
    col="species",
    # we left out hue so species is only mapped to column
    height=3,
    aspect=1.2
)

Adding Regression Lines with `lmplot()`

When you want to see the trend in a relationship, use sns.lmplot(). It works just like relplot() but adds a regression line with an optional confidence band:

Code

sns.lmplot(
    data=penguins.to_pandas(),
    x="flipper_length_mm",
    y="body_mass_g",
    col="species"
)

Confidence bands/bars

One of the strengths of seaborn (and over ggplot) is that it calculates confidence bands/intervals using bootstrap resampling by default. We’ll cover cover statistical uncertainty and bootstrapping in more depth later in the course, but this is really nice for quick visuals without having to calculate additional statistics yourself, even if when working with small samples.

You can control this using the numerous additional arguments (inputs) that many seaborn functions support. Here are some common ones and the lmplot help page:

n_boot (1000 by default)
units (None by default - great when you have repeated observations per cluster/participant/etc)
errrorbar/ ci (lmplot only) (95 by default)

Your Turn

Create a scatter plot showing the relationship between bill_length_mm and bill_depth_mm, colored by species.

What pattern do you notice? Does the overall trend match the within-species trends?

Code

# Your code here

Visualizing Distributions: `displot()`

Use sns.displot() when you want to understand the distribution of a variable - how values are spread out, where they cluster, whether there are outliers.

The kind parameter controls the type of distribution plot: - "hist": Histogram (default) - "kde": Kernel density estimate (smoothed histogram) - "ecdf": Empirical cumulative distribution function

Histograms

Let’s look at the distribution of body mass:

Code

sns.displot(
    data=penguins.to_pandas(),
    x="body_mass_g",
    kind="hist"
)

Hmm, this distribution doesn’t look unimodal - there might me more than one peak. Let’s try increasing the granularity by increasing the number of automatically calculated bins

Code

sns.displot(
    data=penguins.to_pandas(),
    x="body_mass_g",
    kind="hist",
    bins=30
)

Ah we can see it a bit better now, but it’s probably driven the fact that we’re ignoring difference species

Code

sns.displot(
    data=penguins.to_pandas(),
    x="body_mass_g",
    kind="hist",
    hue="species",
    bins=30
)

The bimodality comes from Gentoo penguins being much heavier than the other two species.

When histograms overlap, you can use multiple="stack" to stack them:

Code

sns.displot(
    data=penguins.to_pandas(),
    x="body_mass_g",
    kind="hist",
    hue="species",
    bins=30,
    multiple="stack"
)

KDE Plots

Kernel Density Estimation (KDE) creates a smoothed version of a histogram. It’s often easier to compare shapes:

Code

sns.displot(
    data=penguins.to_pandas(),
    x="body_mass_g",
    kind="kde",
    hue="species",
    fill=True  # Fill under the curves
)

Choosing bins and smoothing

Histograms have bins (bins or binwidth parameters), KDE has smoothing (bw_adjust parameter).

Too few bins / too much smoothing can hide patterns. Too many bins / too little smoothing creates noise.

Always try a few values to make sure you’re not missing something important.

Bivariate Distributions

You can visualize the joint distribution of two variables by specifying both x and y:

Code

sns.displot(
    data=penguins.to_pandas(),
    x="flipper_length_mm",
    y="body_mass_g",
    kind="kde",
    hue="species"
)

The contour lines show regions of equal density - like a topographic map of where the data concentrates.

Your Turn

Create a histogram of bill_length_mm split by sex using col="sex".

What do you notice about the distributions?

Code

# Your code here

Visualizing Categories: `catplot()`

Use sns.catplot() when one of your variables is categorical (like species, sex, or experimental condition).

The kind parameter offers several options:

Show individual observations: - "strip": Jittered points - "swarm": Points arranged to avoid overlap

Show distributions: - "box": Box plots (median, quartiles, outliers) - "violin": Violin plots (KDE + box plot hybrid)

Show summaries: - "bar": Mean with error bars - "point": Mean with error bars as points/lines

Showing Raw Data: Strip and Swarm Plots

Strip plots show every data point, jittered horizontally to reduce overlap:

Code

sns.catplot(
    data=penguins.to_pandas(),
    x="species",
    y="body_mass_g",
    kind="strip"
)

Swarm plots arrange points so they don’t overlap, making the distribution shape visible:

Code

sns.catplot(
    data=penguins.to_pandas(),
    x="species",
    y="body_mass_g",
    kind="swarm"
)

Add hue to compare subgroups within each category:

Code

sns.catplot(
    data=penguins.to_pandas(),
    x="species",
    y="body_mass_g",
    hue="sex",
    kind="swarm"
)

Showing Distributions: Box and Violin Plots

Box plots summarize the distribution with quartiles:

Code

sns.catplot(
    data=penguins.to_pandas(),
    x="species",
    y="body_mass_g",
    hue="sex",
    kind="box"
)

Reading a box plot

Box: Middle 50% of data (25th to 75th percentile) Line in box: Median (50th percentile) Whiskers: Extend to 1.5x the box width Diamonds: Outliers beyond the whiskers

Violin plots combine a box plot with a KDE, showing the full distribution shape:

Code

sns.catplot(
    data=penguins.to_pandas(),
    x="species",
    y="body_mass_g",
    hue="sex",
    kind="violin"
)

Showing Summaries: Bar and Point Plots

By default, bar plots show the mean of each group with error bars (95% CI by default):

Code

sns.catplot(
    data=penguins.to_pandas(),
    x="species",
    y="body_mass_g",
    hue="sex",
    kind="bar"
)

We can change this by passing in a different estimator either as a string that seaborn understands, or a custom function

Code

sns.catplot(
    data=penguins.to_pandas(),
    x="species",
    y="body_mass_g",
    hue="sex",
    kind="bar",
    estimator="median" # <- using the median
)

Point plots are similar but use points and lines, which can be better for comparing groups within categories:

Code

sns.catplot(
    data=penguins.to_pandas(),
    x="sex",
    y="body_mass_g",
    hue="species",
    kind="point",
    estimator="min", # <- using the minimum
    errorbar=None,  # <- no uncertainty calculations
)

Layering Plots

Show all data when you can

Remember our first statistical principle: aggregation Bar and point plots show only the summary statistic and uncertainty. They don’t tell you much about the actual range and shape (distribution) of the data.

To help layer multiple plots onto the same FacetGrid you can use the .map_dataframe() method that the FacetGrid has:

Code

grid = sns.catplot(
    data=penguins.to_pandas(),
    x="species",
    y="body_mass_g",
    hue="sex",
    kind="bar",
)

# Saving the output of catplot (FacetGrid) to a variable
# makes it easy to get help on its *method*
help(grid.map_dataframe)

Help on method map_dataframe in module seaborn.axisgrid:

map_dataframe(func, *args, **kwargs) method of seaborn.axisgrid.FacetGrid instance
    Like ``.map`` but passes args as strings and inserts data in kwargs.
    
    This method is suitable for plotting with functions that accept a
    long-form DataFrame as a `data` keyword argument and access the
    data in that DataFrame using string variable names.
    
    Parameters
    ----------
    func : callable
        A plotting function that takes data and keyword arguments. Unlike
        the `map` method, a function used here must "understand" Pandas
        objects. It also must plot to the currently active matplotlib Axes
        and take a `color` keyword argument. If faceting on the `hue`
        dimension, it must also take a `label` keyword argument.
    args : strings
        Column names in self.data that identify variables with data to
        plot. The data for each variable is passed to `func` in the
        order the variables are specified in the call.
    kwargs : keyword arguments
        All keyword arguments are passed to the plotting function.
    
    Returns
    -------
    self : object
        Returns self.

Let’s use the method. We can pass in sns.striplot as the function and then provide our mappings:

Code

grid.map_dataframe(
    sns.stripplot,
    x="species",
    y="body_mass_g",
    hue="sex",
    dodge=True,
    alpha=0.5
)

Your Turn

Create a box plot of flipper_length_mm by island, with separate subplots (col) for each species.

Are there island differences within species?

Try layering the data on top and trying to understand how it works

Code

# Your code here

Multi-Panel Views

Seaborn has two special functions for getting a quick overview of your data:

sns.jointplot(): One relationship with marginal distributions
sns.pairplot(): All pairwise relationships at once

Joint Plots

jointplot() shows the relationship between two variables AND their individual distributions:

Code

sns.jointplot(
    data=penguins.to_pandas(),
    x="flipper_length_mm",
    y="body_mass_g",
    hue="species"
)

Pair Plots

pairplot() creates a matrix of scatter plots for all pairs of numeric variables:

Code

# pairplot needs pandas, so we convert
sns.pairplot(
    data=penguins.to_pandas(),
    hue="species"
)

The diagonal shows distributions for each variable. Off-diagonal shows scatter plots between pairs.

This is a great way to quickly explore relationships in a new dataset!

Customizing Your Plots

Seaborn provides several ways to customize plots without diving into matplotlib.

Themes and Contexts

Themes control the overall style (background, grid lines, etc.): - "darkgrid" (default), "whitegrid", "dark", "white", "ticks"

Contexts control the scale (good for different output sizes): - "paper", "notebook" (default), "talk", "poster"

Use sns.set_theme() to change both for all subsequent plots:

Code

# Set a clean theme with larger text
sns.set_theme(style="whitegrid", context="talk")

sns.relplot(
    data=penguins.to_pandas(),
    x="flipper_length_mm",
    y="body_mass_g",
    hue="species"
)

Code

# Reset to defaults
sns.set_theme(style="darkgrid", context="notebook")

Color Palettes

Seaborn has many built-in color palettes. Pass a palette name to the palette parameter:

Code

sns.catplot(
    data=penguins.to_pandas(),
    x="species",
    y="body_mass_g",
    hue="species",
    kind="box",
    palette="Set2"
)

Common palettes: - Categorical: "Set1", "Set2", "Paired", "tab10" - Sequential: "Blues", "Greens", "viridis", "rocket" - Diverging: "coolwarm", "RdBu", "vlag"

FacetGrid Methods

The FacetGrid returned by figure-level functions has methods for customization:

Method	Purpose
`.set_axis_labels(x, y)`	Change axis labels
`.set_titles(template)`	Change subplot titles
`.set(xlim=, ylim=)`	Set axis limits
`.tight_layout()`	Adjust spacing
`.figure.suptitle()`	Add overall title

Code

g = sns.relplot(
    data=penguins.to_pandas(),
    x="flipper_length_mm",
    y="body_mass_g",
    col="species",
    height=3
)

# Customize using FacetGrid methods
g.set_axis_labels("Flipper Length (mm)", "Body Mass (g)")
g.set_titles("{col_name}")
g.figure.suptitle("Penguin Body Measurements", y=1.02)

Text(0.5, 1.02, 'Penguin Body Measurements')

Controlling Order

By default, seaborn orders categorical variables as they appear in the data. Use these parameters to control order:

order: Order for the x-axis variable
hue_order: Order for hue categories
col_order, row_order: Order for subplot columns/rows

Code

sns.catplot(
    data=penguins.to_pandas(),
    x="species",
    y="body_mass_g",
    kind="bar",
    order=["Chinstrap", "Adelie", "Gentoo"]  # Custom order
)

Your Turn

Create a violin plot of bill_length_mm by species, with: - A “whitegrid” theme - The “pastel” color palette - Species ordered alphabetically - Axis labels “Species” and “Bill Length (mm)”

Code

# Your code here

Pro Tips

These are common customization tasks that aren’t obvious from the basic API. Bookmark this section for future reference!

Moving or Removing Legends

Seaborn places legends automatically, but you often want them elsewhere. Use sns.move_legend():

Code

myplot = sns.relplot(
    data=penguins.to_pandas(),
    x="flipper_length_mm",
    y="body_mass_g",
    hue="species",
    height=4
)

# Move legend outside the plot
sns.move_legend(myplot, "upper left", bbox_to_anchor=(1, 1))

# Since we saved the plot to a variable we need to type the variable name to see it
myplot

To remove a legend entirely, set legend=False in the plotting function:

Code

sns.relplot(
    data=penguins.to_pandas(),
    x="flipper_length_mm",
    y="body_mass_g",
    hue="species",
    legend=False,  # No legend
    height=4
)

Rotating Tick Labels

Long category names often overlap. Use .set_xticklabels() with rotation:

Code

rotated = sns.catplot(
    data=penguins.to_pandas(),
    x="island",
    y="body_mass_g",
    hue="species",
    kind="bar"
)

# Rotate x-axis labels
rotated.set_xticklabels(rotation=45, ha="right")  # ha = horizontal alignment

# Show it
rotated

Cleaner Figures with `despine()`

Remove the top and right borders (“spines”) for a cleaner look:

Code

nospine = sns.relplot(
    data=penguins,
    x="flipper_length_mm",
    y="body_mass_g",
    height=4
)

# Remove top and right spines
sns.despine()

# Show it
nospine

Saving Publication-Quality Figures

Use .savefig() with these key parameters: - dpi=300 (or higher) for print quality - bbox_inches='tight' to prevent labels being cut off - Vector formats (PDF, SVG) for publications

Code

saveme = sns.relplot(
    data=penguins,
    x="flipper_length_mm",
    y="body_mass_g",
    hue="species",
    height=4
)
# Update axis labels
saveme.set_axis_labels("Flipper Length (mm)", "Body Mass (g)")
# Despine
sns.despine()

# Save with high quality settings
saveme.savefig("my_figure.png", dpi=300, bbox_inches="tight")
saveme.savefig("my_figure.pdf", bbox_inches="tight")  # Vector format

# You should see them in the file explorer to the left!

Pro Tip: Publication Template

For consistent publication-quality figures, set your theme once at the start of your script:

sns.set_theme(
    style="ticks",
    context="paper",
    font_scale=1.0,
    rc={"savefig.dpi": 300}
)

Then use sns.despine() after each plot and save with bbox_inches="tight".

Putting It Together

Let’s recreate a classic visualization of Anscombe’s Quartet - four datasets with identical summary statistics but very different distributions:

First we’ll load the data and take a quick look

Code

# Load Anscombe's quartet
anscombe = pl.DataFrame(sns.load_dataset("anscombe"))
anscombe.head()

shape: (5, 3)

dataset	x	y
str	f64	f64
"I"	10.0	8.04
"I"	8.0	6.95
"I"	13.0	7.58
"I"	9.0	8.81
"I"	11.0	8.33

Now we aggregate by dataset and confirm they have the same means and correlations:

Code

anscombe.group_by("dataset").agg(
    x_mean=col("x").mean(),
    y_mean=col("y").mean().round(2),
    correlation=pl.corr("x", "y").round(2) # handy to get correlations btwn cols
)

shape: (4, 4)

dataset	x_mean	y_mean	correlation
str	f64	f64	f64
"II"	9.0	7.5	0.82
"IV"	9.0	7.5	0.82
"III"	9.0	7.5	0.82
"I"	9.0	7.5	0.82

Now let’s create a nice figure trying to replicate the styles above as closely as possible. We’ll:

Set the theme and create the base plot with lmplot()
Define custom tick ranges using list() and range() with slicing
Use FacetGrid methods to customize limits, labels, titles, and spacing

Code

# Set the theme and font
sns.set_theme(style="white", font="Avenir")

# Make the plot
ansplot = sns.lmplot(
    data=anscombe,
    x="x",
    y="y",
    col="dataset",
    col_wrap=2,
    height=2,
    aspect=1.2,
    truncate=False, # dont restrict regression line to data range
    ci=None, # lmplot uses ci, but everything else uses errorbar
    line_kws={"color": "steelblue"}, # regression line
    scatter_kws={"edgecolors": "sienna", "color": "darkorange"} # points
)

# Define custom tick ranges (skip 0)
xticks = list(range(0, 25, 5))[1:]
yticks = list(range(0, 16, 4))[1:]

# Set the limits and ticks
ansplot.set(
    xlim=(0, 25),
    ylim=(0, 16),
    xticks=xticks,
    yticks=yticks,
)

# Hide ticks and adjust label font-size
ansplot.tick_params(length=0, labelsize=8)

# Remove individual sub-plot labels
ansplot.set_axis_labels("","")

# Adjust subplot titles
ansplot.set_titles("{col_name}", size=10)

# Add an overall title
ansplot.figure.suptitle("Anscombe's Quartet", fontsize=12)

# Auto-adjust title & label spacing to not overlap
ansplot.figure.tight_layout()

# Show it
ansplot

Text(0.5, 0.98, "Anscombe's Quartet")

Appendix

Quick Reference

Figure-level functions:

sns.relplot(data, x, y, hue, col, row, kind="scatter"|"line")
sns.displot(data, x, y, hue, col, row, kind="hist"|"kde"|"ecdf")
sns.catplot(data, x, y, hue, col, row, kind="strip"|"swarm"|"box"|"violin"|"bar"|"point")
sns.lmplot(data, x, y, hue, col, row)  # scatter + regression

Quick overview functions:

sns.jointplot(data, x, y, hue)   # one relationship + marginals
sns.pairplot(data, hue)          # all pairwise relationships

Customization:

sns.set_theme(style="...", context="...", palette="...")
g.set_axis_labels("x label", "y label")
g.set_titles("{col_name}")
g.set(xlim=(a, b), ylim=(c, d))
g.figure.suptitle("Overall title")

For `ggplot` experts

If you’re coming from R’s ggplot2, this section maps familiar concepts to their seaborn equivalents. Use the tabs to toggle between the two syntaxes.

Aesthetic Mappings

ggplot(df, aes(x = var1, y = var2, color = group))

Mapping	`aes()` parameter
Position	`x`, `y`
Color	`color` / `colour`
Fill	`fill`
Shape	`shape`
Size	`size`
Transparency	`alpha`

sns.scatterplot(data=df, x="var1", y="var2", hue="group")

Mapping	Parameter
Position	`x=`, `y=`
Color	`hue=`
Fill	`hue=` (context-dependent)
Shape	`style=`
Size	`size=`
Transparency	`alpha=`

Plot Types

Task	Function
Scatter	`geom_point()`
Line	`geom_line()`
Bar (counts)	`geom_bar()`
Bar (values)	`geom_col()`
Histogram	`geom_histogram()`
Density	`geom_density()`
Boxplot	`geom_boxplot()`
Violin	`geom_violin()`
Regression	`geom_smooth()`

Task	Function
Scatter	`sns.scatterplot()` or `relplot(kind="scatter")`
Line	`sns.lineplot()` or `relplot(kind="line")`
Bar (counts)	`sns.countplot()`
Bar (values)	`sns.barplot()`
Histogram	`sns.histplot()` or `displot(kind="hist")`
Density	`sns.kdeplot()` or `displot(kind="kde")`
Boxplot	`sns.boxplot()` or `catplot(kind="box")`
Violin	`sns.violinplot()` or `catplot(kind="violin")`
Regression	`sns.lmplot()` or `sns.regplot()`

Faceting

ggplot(df, aes(x, y)) +
  geom_point() +
  facet_wrap(~group)

Faceting	Syntax
Wrap by one variable	`facet_wrap(~var)`
Wrap with column limit	`facet_wrap(~var, ncol=2)`
Grid by two variables	`facet_grid(row ~ col)`

sns.relplot(data=df, x="x", y="y", col="group")

Faceting	Syntax
Wrap by one variable	`col="var"`
Wrap with column limit	`col="var", col_wrap=2`
Grid by two variables	`row="row", col="col"`

Theming

ggplot(df, aes(x, y)) +
  geom_point() +
  theme_minimal()

Theme	Description
`theme_gray()`	Gray background with grid
`theme_bw()`	White background with grid
`theme_minimal()`	Minimal, clean
`theme_classic()`	Classic with axis lines

Output scaling: Set in ggsave(width=, height=, dpi=)

sns.set_theme(style="whitegrid")
sns.relplot(data=df, x="x", y="y")

Style	Description
`"darkgrid"`	Gray background with grid
`"whitegrid"`	White background with grid
`"white"`	Minimal, clean
`"ticks"`	Classic with axis ticks

Output scaling: Use context= parameter

"paper" — smallest
"notebook" — default
"talk" — larger for presentations
"poster" — largest

Labels & Titles

ggplot(df, aes(x, y)) +
  geom_point() +
  labs(
    title = "My Title",
    x = "X Label",
    y = "Y Label"
  )

Figure-level functions (relplot, displot, catplot):

g = sns.relplot(data=df, x="x", y="y")
g.set_axis_labels("X Label", "Y Label")
g.figure.suptitle("My Title", y=1.02)

Axes-level functions (scatterplot, histplot, etc.):

ax = sns.scatterplot(data=df, x="x", y="y")
ax.set(xlabel="X Label", ylabel="Y Label", title="My Title")

Additional Resources

Seaborn Tutorial - Official comprehensive guide
Seaborn Example Gallery - Visual examples with code
Seaborn API Reference - Complete function documentation
Color Palette Guide - Choosing colors

How to use this notebook

Getting Started

The seaborn mental model

Aesthetic Mappings

Main functions

Visualizing Relationships: relplot()

Controlling Layout

Adding Regression Lines with lmplot()

Visualizing Distributions: displot()

Histograms

KDE Plots

Bivariate Distributions

Visualizing Categories: catplot()

Showing Raw Data: Strip and Swarm Plots

Showing Distributions: Box and Violin Plots

Showing Summaries: Bar and Point Plots

Layering Plots

Multi-Panel Views

Joint Plots

Pair Plots

Customizing Your Plots

Themes and Contexts

Color Palettes

FacetGrid Methods

Controlling Order

Pro Tips

Moving or Removing Legends

Rotating Tick Labels

Cleaner Figures with despine()

Saving Publication-Quality Figures

Putting It Together

Appendix

Quick Reference

For ggplot experts

Aesthetic Mappings

Plot Types

Faceting

Theming

Labels & Titles

Additional Resources

The `seaborn` mental model

Visualizing Relationships: `relplot()`

Adding Regression Lines with `lmplot()`

Visualizing Distributions: `displot()`

Visualizing Categories: `catplot()`

Cleaner Figures with `despine()`

For `ggplot` experts