Data Science Hero's Journey

About This MicroSim

This interactive visualization presents the data science workflow as a hero's journey, inspired by Joseph Campbell's monomyth structure. Just as every great hero follows a journey from the ordinary world through trials and transformation, data scientists follow a similar path from initial questions to actionable insights.

The Seven Stages

  1. Problem Definition - "The Call to Adventure"
       • Every data science project begins with a question
       • Vague questions lead to vague answers

  2. Data Collection - "Gathering Allies"
       • Seek data from databases, surveys, APIs, and experiments
       • Often the most challenging part of the journey

  3. Data Cleaning - "Trials and Tribulations"
       • Fix errors, handle missing values, tame the chaos
       • The hero must face challenges before reaching the goal

  4. Exploratory Analysis - "The Revelation"
       • Visualize and explore to find patterns
       • The fog begins to clear as insights emerge

  5. Modeling - "Forging the Weapon"
       • Build your predictive model
       • Create the tool that will help you conquer uncertainty

  6. Evaluation - "The Ultimate Test"
       • Does your model actually work?
       • Test your creation against reality

  7. Communication - "Return with the Elixir"
       • Share your discoveries with the world
       • The journey is complete when knowledge is shared
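
For readers who prefer to see the stages as data, the list above can be written as an ordered collection. This is a minimal Python sketch, not the MicroSim's p5.js source; the `Stage` class, the `STAGES` list, and the `describe` helper are names assumed here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One workflow stage paired with its hero's-journey label (hypothetical structure)."""
    number: int
    name: str
    journey_label: str


# Stage names and labels are copied from the list above; only the structure is assumed.
STAGES = [
    Stage(1, "Problem Definition", "The Call to Adventure"),
    Stage(2, "Data Collection", "Gathering Allies"),
    Stage(3, "Data Cleaning", "Trials and Tribulations"),
    Stage(4, "Exploratory Analysis", "The Revelation"),
    Stage(5, "Modeling", "Forging the Weapon"),
    Stage(6, "Evaluation", "The Ultimate Test"),
    Stage(7, "Communication", "Return with the Elixir"),
]


def describe(stage):
    """Format a stage the same way the list above presents it."""
    return f'{stage.number}. {stage.name} - "{stage.journey_label}"'


for stage in STAGES:
    print(describe(stage))
```

Keeping the stages in one ordered list means the order itself carries the forward path of the journey.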

Interactive Features

  • Hover over any stage to see a detailed description
  • Click any stage to view real-world examples
  • Auto-animate toggle cycles through stages with a glowing effect
  • Dotted return arrows show the iterative nature of data science

Learning Objective

Help students understand the iterative nature of the data science workflow and see it as an adventure rather than a checklist. The circular layout emphasizes that data science is not a linear process - you often need to return to earlier stages as you learn more.
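
A circular layout like the one described above can be produced by spacing the stages evenly around a circle. The sketch below shows one way to do that in Python; it is not the MicroSim's actual code, and the function names, the canvas center (300, 300), and the radius (200) are assumptions chosen only for illustration.

```python
import math


def circular_layout(n_stages, center_x, center_y, radius):
    """Return (x, y) positions for n_stages points evenly spaced on a circle.

    The first stage sits at the top of the circle. In screen coordinates,
    where y grows downward, later stages proceed clockwise.
    """
    positions = []
    for i in range(n_stages):
        # Start at -90 degrees (top) and advance by an equal slice per stage.
        angle = -math.pi / 2 + 2 * math.pi * i / n_stages
        x = center_x + radius * math.cos(angle)
        y = center_y + radius * math.sin(angle)
        positions.append((x, y))
    return positions


def next_stage(index, n_stages):
    """Advance to the following stage, wrapping from the last back to the first."""
    return (index + 1) % n_stages


# Hypothetical canvas values; the real MicroSim dimensions may differ.
positions = circular_layout(7, 300, 300, 200)
for i, (x, y) in enumerate(positions, start=1):
    print(f"Stage {i}: x={x:.1f}, y={y:.1f}")
```

The modulo in `next_stage` is what makes the layout a loop rather than a line: after the seventh stage, the index wraps back to the first.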

Embedding This MicroSim

You can include this MicroSim on your website using the following iframe:

<iframe src="https://dmccreary.github.io/data-science-course/sims/data-science-heros-journey/main.html" height="602px" scrolling="no"></iframe>

Lesson Plan

Introduction (5 minutes)

  • Discuss the concept of the hero's journey in storytelling
  • Ask students: "What challenges do heroes face on their journeys?"

Exploration (10 minutes)

  • Have students explore the MicroSim, hovering over each stage
  • Ask them to click on stages and read the real-world examples
  • Discuss: "Which stage do you think is most challenging? Why?"

Discussion (10 minutes)

  • Focus on the dotted "return" arrows
  • Ask: "Why would a data scientist need to go back to earlier stages?"
  • Examples: a model that performs poorly may point to bad data; findings may raise new questions
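
To support this discussion, the return arrows can be modeled as a small mapping from a stage to the earlier stages it might send you back to. The mapping below is a hypothetical illustration built only from the two examples above; the specific arrows, the `RETURN_ARROWS` name, and the sample path are assumptions, not the arrows drawn in the MicroSim.

```python
STAGE_ORDER = [
    "Problem Definition",
    "Data Collection",
    "Data Cleaning",
    "Exploratory Analysis",
    "Modeling",
    "Evaluation",
    "Communication",
]

# Hypothetical return arrows (assumed for illustration):
RETURN_ARROWS = {
    # A model that does not work might mean bad data.
    "Evaluation": ["Data Collection", "Data Cleaning"],
    # New questions can emerge from findings.
    "Exploratory Analysis": ["Problem Definition"],
}


def classify_step(current, nxt):
    """Label a move between two stages as 'forward', 'return', or 'invalid'."""
    i = STAGE_ORDER.index(current)
    j = STAGE_ORDER.index(nxt)
    if j == i + 1:
        return "forward"
    if nxt in RETURN_ARROWS.get(current, []):
        return "return"
    return "invalid"


# An example journey in which evaluation reveals a data problem,
# so the project loops back to Data Cleaning before finishing.
path = [
    "Problem Definition", "Data Collection", "Data Cleaning",
    "Exploratory Analysis", "Modeling", "Evaluation",
    "Data Cleaning", "Exploratory Analysis", "Modeling",
    "Evaluation", "Communication",
]
steps = [classify_step(a, b) for a, b in zip(path, path[1:])]
for (a, b), kind in zip(zip(path, path[1:]), steps):
    print(f"{a} -> {b}: {kind}")
```

Students could extend `RETURN_ARROWS` with their own answers to the question above and check whether a proposed journey uses only forward steps and declared returns.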

Activity (15 minutes)

  • Give students a scenario (e.g., "Predict which students might need tutoring")
  • Have them walk through all seven stages, describing what they would do at each
  • Emphasize that it's okay to go back - that's part of the process!

References

  • Campbell, Joseph. "The Hero with a Thousand Faces" (1949)
  • CRISP-DM: Cross-Industry Standard Process for Data Mining
  • Wickham, Hadley. "R for Data Science" - Data Science Workflow chapter