Data Visualization with Matplotlib and Plotly
Summary
This chapter teaches students how to create effective data visualizations using matplotlib, Seaborn, and the modern interactive library Plotly. Students will learn visualization architecture (figures and axes), create various plot types (line, scatter, bar, histogram, box, pie), and customize their visualizations with titles, labels, legends, and colors. The chapter emphasizes choosing appropriate visualizations for different data types and creating interactive charts that engage viewers. By the end of this chapter, students will be able to create publication-quality visualizations that effectively communicate insights from data.
Concepts Covered
This chapter covers the following 25 concepts from the learning graph:
- Data Visualization
- Matplotlib Library
- Figure
- Axes
- Plot Function
- Line Plot
- Scatter Plot
- Bar Chart
- Histogram
- Box Plot
- Pie Chart
- Subplot
- Figure Size
- Title
- Axis Labels
- Legend
- Color
- Markers
- Line Styles
- Grid
- Annotations
- Save Figure
- Plot Customization
- Seaborn Library
- Statistical Plots
Prerequisites
This chapter builds on concepts from:
Show, Don't Tell: The Visual Superpower
You've loaded data. You've cleaned it. You've wrangled it into perfect shape. But here's the thing—a spreadsheet full of numbers is about as exciting as reading the phone book. Nobody ever changed the world by emailing a CSV file.
Data visualization is where data science becomes VISIBLE. It's the superpower that lets you take thousands of numbers and transform them into a single image that tells a story. A well-crafted chart can reveal patterns that would take hours to find in a table. It can convince skeptics, inspire action, and make the invisible visible.
Think about it: every powerful presentation you've ever seen probably had a chart. Every news story about trends shows a graph. Every scientific breakthrough gets communicated through visualization. This is the skill that takes your analysis from "interesting to me" to "interesting to everyone."
In this chapter, you'll learn multiple visualization tools—from the classic Matplotlib to the beautiful Seaborn to the modern, interactive Plotly. By the end, you'll be creating charts that don't just display data—they ENGAGE with it.
Diagram: Visualization Library Comparison
Python Visualization Library Landscape
Type: infographic
Bloom Taxonomy: Understand (L2)
Learning Objective: Help students understand when to use different visualization libraries
Purpose: Compare the major Python visualization libraries and their strengths
Layout: Three-column comparison card layout
Column 1: MATPLOTLIB
- Icon: Classic line graph
- Tagline: "The Foundation"
- Color: Blue
- Strengths:
  - Complete control over every element
  - Publication-quality static images
  - Huge community and documentation
  - Works everywhere
- Best for:
  - Academic papers
  - Print publications
  - Maximum customization
- Learning curve: Medium-High
- Interactivity: Limited (static by default)
Column 2: SEABORN
- Icon: Statistical plot with confidence intervals
- Tagline: "Beautiful Statistics"
- Color: Teal
- Strengths:
  - Beautiful defaults
  - Built-in statistical visualizations
  - Works with pandas DataFrames
  - Less code for common plots
- Best for:
  - Statistical analysis
  - Exploratory data analysis
  - Quick beautiful plots
- Learning curve: Low-Medium
- Interactivity: Limited (built on matplotlib)
Column 3: PLOTLY
- Icon: Interactive 3D scatter plot with cursor
- Tagline: "Interactive & Modern"
- Color: Purple
- Strengths:
  - Interactive by default (zoom, pan, hover)
  - Web-ready (HTML output)
  - Beautiful modern aesthetics
  - 3D visualizations
  - Dashboards with Dash
- Best for:
  - Web applications
  - Presentations
  - Data exploration
  - User engagement
- Learning curve: Low-Medium
- Interactivity: Full (native)
Bottom section: Decision flowchart
- "Need print/PDF?" → Matplotlib
- "Statistical focus?" → Seaborn
- "Need interactivity?" → Plotly
- "Quick exploration?" → Seaborn or Plotly
Interactive elements:
- Hover over each library to see code examples
- Click to see sample output images
Implementation: HTML/CSS grid with JavaScript hover effects
The Classic: Matplotlib Library
Let's start with the grandfather of Python visualization: the Matplotlib library. First released in 2003, Matplotlib is the foundation that many other Python visualization libraries, including Seaborn, build upon. It's powerful, flexible, and gives you fine-grained control over every element of a figure.
Understanding Figures and Axes
Matplotlib has a specific architecture you need to understand. A figure is the entire window or page—think of it as your canvas. Axes are the actual plots within that figure (yes, the name is confusing—it's not about x-axis and y-axis, but the plot area itself).
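Here is a minimal sketch of that idea, using made-up numbers: `plt.subplots()` hands back the figure and a single axes, and the drawing happens on the axes.

```python
import matplotlib.pyplot as plt

# One figure (the canvas) holding one axes (the plot area).
fig, ax = plt.subplots()

# Drawing methods belong to the axes, not the figure.
ax.plot([1, 2, 3, 4], [10, 20, 25, 30])  # illustrative values
ax.set_title("One figure, one axes")
```

In a script, calling `plt.show()` afterwards opens the figure in a window; notebooks usually display it automatically.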
This figure/axes separation becomes important when you create multiple plots. You can have one figure with many axes (subplots), giving you complete control over complex layouts.
The Plot Function
The plot function is your basic drawing tool. At its simplest, it connects points with lines:
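A minimal sketch with illustrative values (y is simply x squared). It uses `ax.plot()`, the axes-level counterpart of `plt.plot()`:

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]  # illustrative values: y = x squared

fig, ax = plt.subplots()
(line,) = ax.plot(x, y)  # plot() returns a list of Line2D objects
```

With no extra arguments you get a solid line and no markers.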
But plot() can do much more with its many parameters for line styles, markers, and colors.
| Parameter | Example | Description |
|---|---|---|
| color or c | 'red', '#FF5733', 'C0' | Line/marker color |
| linestyle or ls | '-', '--', ':', '-.' | Line pattern |
| linewidth or lw | 2, 0.5 | Line thickness |
| marker | 'o', 's', '^', '*' | Point markers |
| markersize or ms | 10, 5 | Marker size |
The Format String Shortcut
Matplotlib has a shortcut: plt.plot(x, y, 'ro--') means red (r), circles (o), dashed line (--). It's compact but can be cryptic—use named parameters for clarity in your code.
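To see that the shortcut and the named parameters describe the same style, this small sketch (with made-up data) draws the same line both ways:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [2, 4, 3, 5]  # illustrative values

fig, ax = plt.subplots()

# Format string: r = red, o = circle markers, -- = dashed line.
(short,) = ax.plot(x, y, "ro--")

# The same style spelled out with named parameters.
(named,) = ax.plot(x, y, color="red", marker="o", linestyle="--")
```

Both lines end up with circle markers and a dashed line style; the second call just says so in words.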
Essential Plot Types
Different data calls for different visualizations. Let's master the essential types.
Line Plot: Trends Over Time
A line plot connects data points with lines, perfect for showing how values change over time or across a sequence.
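As a small sketch, here is a line plot of monthly sales. The numbers are invented for illustration:

```python
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [120, 135, 128, 150, 162, 171]  # made-up values

fig, ax = plt.subplots()
ax.plot(months, sales, marker="o")  # markers show where the real data points are
ax.set_xlabel("Month")
ax.set_ylabel("Sales (units)")
ax.set_title("Monthly sales")
```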
When to use: Time series, trends, continuous data, showing progression.
Scatter Plot: Relationships Between Variables
A scatter plot shows individual data points without connecting them, revealing relationships between two variables.
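A minimal sketch with invented study-time data; `ax.scatter()` draws one marker per (x, y) pair and no connecting line:

```python
import matplotlib.pyplot as plt

hours_studied = [1, 2, 3, 4, 5, 6, 7, 8]
exam_score = [52, 55, 61, 64, 70, 74, 79, 85]  # made-up values

fig, ax = plt.subplots()
points = ax.scatter(hours_studied, exam_score, alpha=0.7)  # alpha makes overlaps visible
ax.set_xlabel("Hours studied")
ax.set_ylabel("Exam score")
```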
When to use: Correlation analysis, comparing two numeric variables, finding clusters or outliers.
Bar Chart: Comparing Categories
A bar chart uses rectangular bars to compare values across categories.
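For example, a sketch comparing revenue across four regions (the region names and numbers are made up):

```python
import matplotlib.pyplot as plt

regions = ["North", "South", "East", "West"]
revenue = [240, 180, 310, 205]  # made-up values, in thousand USD

fig, ax = plt.subplots()
bars = ax.bar(regions, revenue)  # one bar per category
ax.set_ylabel("Revenue (thousand USD)")
ax.set_title("Revenue by region")
```

For many categories or long labels, `ax.barh()` draws the same chart with horizontal bars.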
When to use: Comparing categories, showing rankings, discrete data.
Histogram: Distribution of Values
A histogram shows how values are distributed across ranges (bins). Unlike bar charts, histograms show continuous data grouped into intervals.
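Here is a sketch using 500 randomly generated "heights". The data is simulated, not real, and the choice of 20 bins is arbitrary:

```python
import random

import matplotlib.pyplot as plt

random.seed(0)  # fixed seed so the simulated data is reproducible
heights = [random.gauss(170, 10) for _ in range(500)]  # simulated heights in cm

fig, ax = plt.subplots()
# hist() returns the count per bin, the bin edges, and the drawn bars.
counts, bin_edges, _ = ax.hist(heights, bins=20, edgecolor="black")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Count")
```

Try changing `bins`: too few hides the shape, too many makes the chart noisy.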
When to use: Understanding distribution shape, finding outliers, comparing to normal distribution.
Box Plot: Statistical Summary
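A box plot (or box-and-whisker plot) summarizes a distribution with five numbers: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. Note that Matplotlib's default whiskers stop at the most extreme points within 1.5 times the interquartile range of the box, and anything beyond that is drawn as an individual outlier point. It's excellent for comparing distributions and spotting outliers.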
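A small sketch comparing two invented sets of exam scores. The 99 in Class A sits far enough from the rest that Matplotlib draws it as a separate outlier point:

```python
import matplotlib.pyplot as plt

# Made-up exam scores for two classes.
class_a = [62, 68, 70, 71, 73, 75, 78, 80, 84, 99]
class_b = [55, 60, 64, 66, 69, 70, 72, 74, 77, 81]

fig, ax = plt.subplots()
result = ax.boxplot([class_a, class_b])  # one box per list, at x = 1 and x = 2
ax.set_xticks([1, 2])
ax.set_xticklabels(["Class A", "Class B"])
ax.set_ylabel("Exam score")
```

`boxplot()` returns a dictionary of the drawn pieces (`"boxes"`, `"medians"`, `"whiskers"`, `"fliers"`), which is handy if you want to restyle them.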
When to use: Comparing distributions, identifying outliers, showing spread and central tendency.
Pie Chart: Parts of a Whole
A pie chart shows proportions of a whole. Use them sparingly—they're often harder to read than bar charts.
When to use: Showing proportions of a whole (when you have 2-5 categories). Avoid for comparisons.
Diagram: Chart Type Selection Guide
Which Chart Should I Use?
Type: infographic
Bloom Taxonomy: Apply (L3)
Learning Objective: Help students choose the appropriate chart type for their data and question
Purpose: Decision guide for selecting visualization types
Layout: Flowchart/decision tree with visual examples
Starting question: "What do you want to show?"
Branch 1: "Comparison"
- Few categories → Bar Chart (vertical)
- Many categories → Bar Chart (horizontal)
- Over time → Line Chart (multiple lines)
- Visual: Small example of each
Branch 2: "Distribution"
- Single variable → Histogram
- Compare distributions → Box Plot
- Density estimate → KDE Plot
- Visual: Small example of each
Branch 3: "Relationship"
- Two variables → Scatter Plot
- Three variables → Bubble Chart (size = 3rd var)
- Many variables → Pair Plot
- Visual: Small example of each
Branch 4: "Composition"
- Static → Pie Chart (2-5 parts only!)
- Over time → Stacked Area Chart
- Many parts → Treemap
- Visual: Small example of each
Branch 5: "Trend"
- Over time → Line Chart
- With uncertainty → Line + Confidence Band
- Multiple series → Multiple Lines + Legend
- Visual: Small example of each
Warning callouts:
- "Pie charts: Only use with 2-5 categories"
- "3D charts: Avoid! They distort perception"
- "Dual y-axes: Use carefully, can mislead"
Interactive elements:
- Hover over each chart type to see larger example
- Click to see code snippet
Visual style: Clean flowchart with colorful chart thumbnails
Implementation: SVG with interactive JavaScript
Customizing Your Visualizations
Raw plots are just the beginning. Professional visualizations need polish. Let's master plot customization.
Title and Axis Labels
Every chart needs a title that explains what it shows and axis labels that explain the variables:
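A minimal sketch with invented revenue figures, showing the three calls involved:

```python
import matplotlib.pyplot as plt

years = [2019, 2020, 2021, 2022]
revenue = [3.1, 2.4, 3.8, 4.2]  # made-up values

fig, ax = plt.subplots()
ax.plot(years, revenue)
ax.set_title("Annual revenue", fontsize=14)
ax.set_xlabel("Year")
ax.set_ylabel("Revenue (million USD)")  # include units in the label
```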
Legend
A legend identifies multiple data series. Position it where it doesn't obscure data:
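In this sketch (made-up data, hypothetical product names), each line gets a `label`, and `ax.legend()` collects those labels into the legend box:

```python
import matplotlib.pyplot as plt

quarters = [1, 2, 3, 4]
product_a = [10, 14, 13, 18]  # made-up values
product_b = [8, 9, 12, 11]

fig, ax = plt.subplots()
ax.plot(quarters, product_a, label="Product A")
ax.plot(quarters, product_b, label="Product B")

# loc picks the corner; "best" lets Matplotlib choose a spot for you.
legend = ax.legend(loc="upper left")
```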
Grid and Annotations
A grid helps readers estimate values. Annotations highlight specific points:
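A sketch with invented visitor counts: the grid is switched on with `ax.grid()`, and `ax.annotate()` points an arrow at the highest value.

```python
import matplotlib.pyplot as plt

days = [1, 2, 3, 4, 5, 6, 7]
visitors = [40, 55, 48, 95, 60, 52, 47]  # made-up values

fig, ax = plt.subplots()
ax.plot(days, visitors, marker="o")
ax.grid(True, linestyle="--", alpha=0.5)  # light dashed grid

# Find the peak and label it with an arrow.
peak = visitors.index(max(visitors))
note = ax.annotate(
    "Peak",
    xy=(days[peak], visitors[peak]),  # the point being highlighted
    xytext=(5, 90),                   # where the text sits
    arrowprops={"arrowstyle": "->"},
)
```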
Figure Size and Saving
Control figure size for different outputs and save your work:
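A sketch of both steps. The file names are placeholders, and the files go into a temporary folder here only so the example cleans up easily; in your own work you would pass a path of your choosing.

```python
import os
import tempfile

import matplotlib.pyplot as plt

# figsize is (width, height) in inches.
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot([1, 2, 3], [2, 4, 8])  # illustrative values

out_dir = tempfile.mkdtemp()  # placeholder output folder
png_path = os.path.join(out_dir, "chart.png")
pdf_path = os.path.join(out_dir, "chart.pdf")

# The file extension decides the format; dpi matters for raster formats like PNG.
fig.savefig(png_path, dpi=300, bbox_inches="tight")
fig.savefig(pdf_path, bbox_inches="tight")
```

`bbox_inches="tight"` trims extra whitespace around the chart.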
| Format | Best For | File Size |
|---|---|---|
| PNG | Web, presentations | Medium |
| PDF | Publications, print | Small |
| SVG | Web (scalable) | Small |
| JPG | Photos (avoid for charts) | Small |
Subplots: Multiple Views
Subplots let you show multiple related visualizations together:
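A compact sketch with made-up numbers: one figure, two axes side by side, and a shared y-axis so the bars can be compared fairly.

```python
import matplotlib.pyplot as plt

regions = ["North", "South", "East"]
sales_2023 = [120, 95, 140]  # made-up values
sales_2024 = [135, 90, 170]

# 1 row, 2 columns; sharey=True keeps both y-axes on the same scale.
fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

axes[0].bar(regions, sales_2023)
axes[0].set_title("2023")
axes[1].bar(regions, sales_2024)
axes[1].set_title("2024")

fig.suptitle("Sales by region")
fig.tight_layout()  # avoid overlapping titles and labels
```

`axes` is a NumPy array, so each plot is reached by index, for example `axes[0]`.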
Sharing Axes
Use sharex=True or sharey=True in subplots() to align axes across plots—essential for fair comparisons.
Seaborn: Beautiful Statistics
The Seaborn library builds on matplotlib to provide beautiful default styles and specialized statistical plots. It's perfect for exploratory data analysis.
Seaborn's statistical plots include:
- sns.histplot(): Enhanced histograms with KDE
- sns.boxplot(): Box plots with category support
- sns.violinplot(): Distribution shape visualization
- sns.heatmap(): Correlation matrices
- sns.pairplot(): All pairwise relationships
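As a sketch of the Seaborn style, here is a box plot drawn straight from a pandas DataFrame. The tiny restaurant-bill table is invented for illustration, and the column names are assumptions:

```python
import pandas as pd
import seaborn as sns

# A small made-up dataset: bill totals on two days.
df = pd.DataFrame({
    "day": ["Thu", "Thu", "Thu", "Thu", "Fri", "Fri", "Fri", "Fri"],
    "total_bill": [12.5, 18.0, 21.3, 15.2, 25.1, 30.4, 19.8, 27.6],
})

sns.set_theme(style="whitegrid")  # Seaborn's styled defaults

# Column names do the work: one box per category in "day".
ax = sns.boxplot(data=df, x="day", y="total_bill")
ax.set_title("Bill totals by day")
```

Seaborn returns a regular Matplotlib axes, so everything you learned about titles, labels, and saving still applies.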
Plotly: Interactive Visualization
Now for the exciting part! Plotly creates interactive visualizations that users can explore—zoom, pan, hover for details, and more. This is what modern data visualization looks like.
When you run this, you get a chart where you can:
- Hover over points to see exact values
- Zoom by clicking and dragging
- Pan by holding shift and dragging
- Download as PNG with one click
- Toggle data series on/off via legend
Why Plotly Changes Everything
| Feature | Matplotlib | Plotly |
|---|---|---|
| Default output | Static image | Interactive HTML |
| Hover tooltips | Manual coding | Automatic |
| Zoom/Pan | Interactive backends only | Built-in |
| Web embedding | Export as image | Native HTML |
| Learning curve | Medium-High | Low-Medium |
| Customization | Maximum | High |
Plotly Express: The Fast Lane
plotly.express (imported as px) provides high-level functions for common chart types:
Diagram: Plotly Interactive Features MicroSim
Interactive Chart Exploration Playground
Type: microsim
Bloom Taxonomy: Apply (L3)
Learning Objective: Let students experience and practice using Plotly's interactive features
Canvas layout (850x600px):
- Main area (850x450): Interactive Plotly chart
- Bottom panel (850x150): Feature buttons and instructions
Visual elements:
- Sample scatter plot with 50+ data points
- Multiple colored categories
- Visible toolbar (zoom, pan, select, download)
- Hover tooltip showing data values
Interactive features to demonstrate:
1. HOVER: Move mouse over points to see tooltips
2. ZOOM: Click-drag to zoom into a region
3. PAN: Shift+drag to pan around
4. BOX SELECT: Draw box to select points
5. LASSO SELECT: Freeform selection
6. RESET: Double-click to reset view
7. DOWNLOAD: Click camera icon to save PNG
8. LEGEND: Click legend items to toggle series
Challenge tasks (bottom panel):
- "Zoom into the cluster in the upper right"
- "Select all points in category A"
- "Find the outlier with the highest y-value"
- "Download the chart as PNG"
Progress tracker:
- Checkboxes for each feature used
- "You've explored X of 8 interactive features!"
Behavior:
- Track which features student has used
- Provide hints for unexplored features
- Celebrate when all features discovered
Visual style: Modern dashboard aesthetic
Implementation: Embedded Plotly.js chart with custom tracking overlay
Interactive Line Charts
Interactive Bar Charts
Interactive Scatter Plots
This creates the well-known "Gapminder" visualization popularized by Hans Rosling: an animated bubble chart showing how countries develop over time!
Interactive Histograms and Box Plots
Customizing Plotly Charts
Plotly offers extensive customization through update_layout() and update_traces():
Subplots in Plotly
Saving Plotly Charts
Diagram: Plotly Code Pattern Reference
Plotly Express Quick Reference Card
Type: infographic
Bloom Taxonomy: Remember (L1)
Learning Objective: Provide quick reference for common Plotly Express patterns
Purpose: Cheat sheet for Plotly Express functions and parameters
Layout: Four-quadrant reference card
Quadrant 1: "Common Chart Functions"
Quadrant 2: "Essential Parameters"
Quadrant 3: "Layout Customization"
Quadrant 4: "Saving Options"
Bottom strip: "Templates" - plotly, plotly_white, plotly_dark - ggplot2, seaborn, simple_white - Visual swatches of each
Color scheme: Purple gradient (Plotly brand color)
Interactive elements: - Hover for expanded code examples - Click to copy code snippet
Implementation: HTML/CSS grid with copy-to-clipboard JavaScript
Real-World Visualization Workflow
Let's put it all together with a complete workflow:
Diagram: Visualization Design MicroSim
Chart Design Playground
Type: microsim
Bloom Taxonomy: Create (L6)
Learning Objective: Let students design and customize their own visualizations interactively
Canvas layout (900x650px):
- Left panel (300x650): Controls and options
- Right panel (600x650): Live chart preview
Control panel sections:
Section 1: "Chart Type"
- Radio buttons: Line, Scatter, Bar, Histogram, Box
- Visual icon for each type
Section 2: "Data Selection"
- Dropdown: X-axis variable
- Dropdown: Y-axis variable
- Dropdown: Color by (optional)
- Dropdown: Size by (optional)
Section 3: "Customization"
- Text input: Title
- Text input: X-axis label
- Text input: Y-axis label
- Color picker: Primary color
- Dropdown: Color palette (categorical)
- Slider: Marker size (5-50)
- Slider: Line width (1-5)
- Toggle: Show grid
- Toggle: Show legend
Section 4: "Export"
- Button: "Copy Code"
- Button: "Download PNG"
- Button: "Download HTML"
Sample dataset:
- Pre-loaded "tips" style dataset
- Columns: total_bill, tip, day, time, size, smoker
Chart preview:
- Updates in real-time as controls change
- Fully interactive (zoom, pan, hover)
- Shows Plotly toolbar
Code panel (collapsible):
- Shows Python code that would generate current chart
- Updates dynamically with changes
- Syntax highlighted
Behavior:
- Every control change immediately updates preview
- Code panel reflects exact current configuration
- Copy code button copies to clipboard
- Download buttons generate files
Educational features:
- Tooltips explaining each option
- "Design tips" suggestions based on data types selected
- Warnings for bad practices (pie chart with too many categories, etc.)
Visual style: Modern design tool interface (think Canva/Figma)
Implementation: p5.js for controls + embedded Plotly.js for preview
Choosing the Right Visualization
The most important skill isn't knowing how to make a chart—it's knowing WHICH chart to make. Here's your decision framework:
| Your Question | Best Chart Type | Why |
|---|---|---|
| "How does X change over time?" | Line chart | Shows trends and patterns |
| "How are X and Y related?" | Scatter plot | Reveals correlations |
| "How do categories compare?" | Bar chart | Easy comparison |
| "What's the distribution?" | Histogram | Shows shape and spread |
| "How do groups compare statistically?" | Box plot | Shows median, quartiles, outliers |
| "What's the composition?" | Pie chart (2-5 parts) | Shows parts of whole |
| "How do multiple variables relate?" | Pair plot / Scatter matrix | See all relationships |
Visualization Pitfalls to Avoid
- Truncated axes: Starting the y-axis at a non-zero value exaggerates differences, especially in bar charts
- 3D charts: They look cool but distort perception—avoid them
- Too many colors: Stick to 5-7 distinct colors maximum
- Missing labels: Every chart needs title, axis labels, and legend (if needed)
- Pie charts with many slices: More than 5 categories? Use a bar chart instead
Best Practices Summary
The Visualization Checklist
Before sharing any visualization, verify:
- [ ] Clear, descriptive title
- [ ] Labeled axes with units
- [ ] Legend (if multiple series)
- [ ] Appropriate chart type for the data
- [ ] Accessible colors (colorblind-friendly)
- [ ] No unnecessary 3D effects
- [ ] Source cited (if using external data)
- [ ] Interactive features work (for Plotly)
Code Organization Patterns
Chapter 5 Checkpoint: Test Your Understanding
Question: You have a dataset with columns: date, sales, region, product_category. You want to show:
1. How sales change over time
2. Sales comparison across regions
3. The distribution of sales values
What chart types would you use for each, and would you use matplotlib or Plotly?
Click to reveal answer:
Why Plotly? Interactive features let viewers explore the data themselves—hover for details, zoom into interesting regions, and click legend items to focus on specific categories.
Achievement Unlocked: Visual Storyteller
You can now transform raw numbers into compelling visual narratives. Whether you need static publication graphics (matplotlib), beautiful statistical plots (Seaborn), or interactive web-ready visualizations (Plotly), you have the tools. This is the skill that gets your insights SEEN.
Key Takeaways
- Data visualization transforms numbers into insights that everyone can understand; it's how data science becomes visible.
- Matplotlib is the foundational library with complete control; understand figures (canvas) and axes (plot areas).
- Seaborn provides beautiful statistical plots with minimal code, which makes it great for exploration.
- Plotly creates interactive visualizations with zoom, pan, and hover tooltips: the modern standard for web and presentations.
- Choose chart types based on your question: line for trends, scatter for relationships, bar for comparisons, histogram for distributions, box for statistical summaries.
- Customize your plots: meaningful titles, axis labels with units, legends for multiple series, appropriate colors.
- Subplots let you show multiple related views together for comprehensive analysis.
- Save your work: PNG/PDF for static uses, HTML for interactive sharing.
- Plotly Express (px) provides high-level functions that create professional interactive charts in one line.
- The best visualization is one that answers a question clearly: not the fanciest chart, but the most appropriate one.
You've now mastered the art of visual communication. In the next chapter, you'll learn the statistical foundations that give your visualizations mathematical backing—the numbers behind the pictures!