What is Seaborn?

Seaborn is a Python data visualization library built on top of Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. Matplotlib is powerful, but common statistical plots often take a lot of code. Seaborn makes it easier to:

  • Create complex plots like multi-panel categorical plots, violin plots, and heatmaps.
  • Work directly with Pandas DataFrames.
  • Apply beautiful default styles and color palettes.

Think of Seaborn as a tool that automates many of the common customization steps you would manually perform in Matplotlib.
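To make that concrete, here is a minimal sketch that draws the same grouped scatter plot twice: once with a manual Matplotlib loop and once with a single Seaborn call. The small DataFrame and the variable names (df, ax_mpl, ax_sns) are invented for illustration and are not part of any real dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Small hypothetical dataset, invented for illustration only
df = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 15.0, 25.0, 35.0],
    "tip": [1.5, 3.0, 4.5, 2.0, 4.0, 5.0],
    "smoker": ["Yes", "No", "Yes", "No", "Yes", "No"],
})

fig, (ax_mpl, ax_sns) = plt.subplots(1, 2, figsize=(10, 4))

# Matplotlib: split the data by group and plot each group by hand
for label, group in df.groupby("smoker"):
    ax_mpl.scatter(group["total_bill"], group["tip"], label=label)
ax_mpl.set_xlabel("total_bill")
ax_mpl.set_ylabel("tip")
ax_mpl.legend(title="smoker")

# Seaborn: one call maps the column to color, labels the axes,
# and builds the legend
sns.scatterplot(data=df, x="total_bill", y="tip", hue="smoker", ax=ax_sns)
```

Calling plt.show() afterwards displays both panels side by side. The Seaborn version takes the column names straight from the DataFrame, which is where most of the saved code comes from.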

Effortless Styling with Themes

One of Seaborn's most popular features is how quickly it improves the look of your plots. Call sns.set_theme() once at the beginning of your script; it updates Matplotlib's global defaults, so every plot created afterwards picks up the theme.

Python


import seaborn as sns
import matplotlib.pyplot as plt

# Apply a global theme. Available styles: 'darkgrid', 'whitegrid', 'dark', 'white', 'ticks'
sns.set_theme(style="whitegrid", palette="viridis")
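As a rough illustration of what a theme does under the hood, the sketch below inspects the settings behind two styles and then applies a style temporarily. It relies on sns.axes_style(), which returns the style's parameters as a dictionary and can also be used as a context manager; the variable names are illustrative only.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Each named style is a set of Matplotlib rc parameters
white = sns.axes_style("white")
whitegrid = sns.axes_style("whitegrid")
print(white["axes.grid"], whitegrid["axes.grid"])  # grid off vs. grid on

# set_theme writes the chosen style into Matplotlib's global rcParams
sns.set_theme(style="whitegrid")
grid_global = plt.rcParams["axes.grid"]

# Temporary styling: "dark" applies only inside the with-block
with sns.axes_style("dark"):
    grid_inside = plt.rcParams["axes.grid"]
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])

# After the block, the global theme is back in effect
grid_after = plt.rcParams["axes.grid"]
print(grid_global, grid_inside, grid_after)
```

The context-manager form is handy when one figure in a script should look different from the rest without resetting the global theme.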

A Full Example: Exploring Restaurant Tips

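Let's use Seaborn's built-in tips dataset to quickly create several common statistical plots. Each row describes one restaurant bill: the total bill, the tip, the payer's sex, whether the party included smokers, the day, the time (lunch or dinner), and the party size.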

Python


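import seaborn as sns
import matplotlib.pyplot as plt

# Apply the theme so that every plot below picks it up
sns.set_theme(style="whitegrid", palette="viridis")

# Load one of Seaborn's example datasets (downloaded on first use, then cached)
tips_df = sns.load_dataset("tips")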

# --- 1. Inspect the Data ---
print("First 5 rows of the tips dataset:")
print(tips_df.head())

# --- 2. Distribution Plot: How are total bills distributed? ---
# A histogram with a Kernel Density Estimate (KDE) line.
plt.figure(figsize=(8, 6))  # Create a Matplotlib figure to control the size
sns.histplot(data=tips_df, x="total_bill", kde=True)
plt.title('Distribution of Total Bill Amounts')
plt.xlabel('Total Bill ($)')
plt.ylabel('Count')
plt.show()

# --- 3. Relational Plot: Is there a relationship between the bill and the tip? ---
# A scatter plot where points are colored by another category (e.g., 'smoker').
plt.figure(figsize=(8, 6))
sns.scatterplot(
    data=tips_df,
    x="total_bill",
    y="tip",
    hue="smoker",   # Color points based on the 'smoker' column
    style="time",   # Change marker style based on the 'time' column
    size="size"     # Change marker size based on the party 'size'
)
plt.title('Total Bill vs. Tip Amount')
plt.xlabel('Total Bill ($)')
plt.ylabel('Tip ($)')
plt.show()

# --- 4. Categorical Plot: How do tips differ between days of the week? ---
# A box plot is excellent for comparing distributions across categories.
plt.figure(figsize=(8, 6))
sns.boxplot(data=tips_df, x="day", y="tip", order=["Thur", "Fri", "Sat", "Sun"])
plt.title('Distribution of Tips by Day of the Week')
plt.xlabel('Day of the Week')
plt.ylabel('Tip ($)')
plt.show()

# --- 5. Matrix Plot: What are the correlations between numerical variables? ---
# A heatmap is perfect for visualizing a correlation matrix.
# Select every numeric column, whatever its exact integer or float width
numeric_cols = tips_df.select_dtypes(include="number").columns
correlation_matrix = tips_df[numeric_cols].corr()

plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Correlation Matrix of Numerical Features')
plt.show()

In this example, we created four distinct statistical plots (a histogram, a scatter plot, a box plot, and a heatmap) with very little code. Seaborn handled the mapping of data variables to visual properties (hue, style, and size), and a single call to sns.set_theme() is all it takes to give every plot a consistent, attractive theme.
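The multi-panel categorical plots mentioned at the start follow the same pattern. The sketch below uses sns.catplot() with the col parameter to draw one box-plot panel per value of a column. The tiny DataFrame is made up for illustration (it only mimics the column names of the tips dataset), so the numbers carry no meaning.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data that mimics a few columns of the tips dataset
df = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri"] * 2,
    "time": ["Lunch"] * 4 + ["Dinner"] * 4,
    "tip": [2.0, 3.0, 2.5, 3.5, 3.0, 4.0, 3.5, 5.0],
})

# Figure-level function: col="time" creates one panel per time of day
g = sns.catplot(
    data=df,
    x="day",
    y="tip",
    col="time",
    kind="box",
    height=4,
)
g.set_axis_labels("Day of the Week", "Tip ($)")
```

Unlike sns.boxplot(), which draws onto a single Matplotlib axes, sns.catplot() creates and manages its own figure and returns a FacetGrid, so the figure size is set through height and aspect rather than plt.figure().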