
Machine Learning and Graph ML

Summary

This chapter covers machine learning fundamentals and their application to graph-structured organizational data. Students learn about supervised and unsupervised learning, feature engineering, training and evaluation, and then advance to graph-specific ML techniques including graph neural networks, node embeddings, link prediction, and graph classification. The chapter also addresses bias in analytics as a critical consideration when applying ML to HR data.

Concepts Covered

This chapter covers the following 11 concepts from the learning graph:

  1. Machine Learning
  2. Supervised Learning
  3. Unsupervised Learning
  4. Feature Engineering
  5. Training and Evaluation
  6. Graph Machine Learning
  7. Graph Neural Networks
  8. Node Embeddings
  9. Link Prediction
  10. Graph Classification
  11. Bias in Analytics

Prerequisites

This chapter builds on concepts from:


From Patterns to Predictions

Aria the Analytics Ant

"My antennae are tingling — we're onto something big! Up until now, we've been describing organizational networks. Today, we learn to predict what happens next." — Aria

Let's dig into this! Over the past several chapters, you've built an impressive toolkit. You can model organizational data as graphs, compute centrality scores, detect communities, measure similarity, and extract meaning from text. Those capabilities let you answer the question "What's happening in this organization right now?"

But leaders don't just want a snapshot. They want to know: Which employees are likely to leave? Where will the next cross-team collaboration emerge? Which departments are drifting into silos? These are prediction questions — and prediction is where machine learning enters the picture.

This chapter bridges two worlds. We'll start with the fundamentals of machine learning — supervised and unsupervised learning, feature engineering, and model evaluation — then advance to the exciting frontier of graph machine learning, where algorithms learn directly from the structure of your organizational network. By the end, you'll understand how graph neural networks, node embeddings, link prediction, and graph classification can transform your organizational analytics from descriptive to predictive.

We'll also spend serious time on a topic that matters deeply when you're pointing ML models at people data: bias in analytics. Because a model that predicts flight risk based on biased training data doesn't just produce wrong answers — it produces harmful ones.

Machine Learning: The 30,000-Foot View

Machine learning is the branch of artificial intelligence where systems learn patterns from data rather than following explicitly programmed rules. Instead of writing code that says "if tenure is less than 18 months and engagement score is below 3, flag as flight risk," you provide the algorithm with historical examples of employees who left and employees who stayed, and it discovers the patterns itself.

This distinction matters in organizational analytics because the patterns that predict outcomes like turnover, collaboration success, or leadership potential are often too complex, too multidimensional, and too context-dependent for hand-written rules to capture.

The machine learning workflow follows a consistent pattern regardless of the specific algorithm:

| Step | What Happens | Organizational Example |
|---|---|---|
| 1. Define the problem | Specify what you're trying to predict or discover | "Predict which employees will leave within 6 months" |
| 2. Collect data | Gather relevant features and outcomes | Graph metrics, tenure, performance scores, event streams |
| 3. Prepare features | Transform raw data into model-ready inputs | Compute centrality scores, normalize tenure, encode departments |
| 4. Train the model | Algorithm learns patterns from historical data | Feed labeled examples of leavers and stayers |
| 5. Evaluate | Measure how well the model generalizes | Test on held-out data the model hasn't seen |
| 6. Deploy and monitor | Use the model in production and watch for drift | Score current employees monthly, retrain quarterly |

There are two broad families of machine learning relevant to organizational analytics: supervised learning and unsupervised learning.

Diagram: ML Workflow Pipeline

ML Workflow Pipeline

Type: flowchart

Bloom Taxonomy: Understand (L2) Bloom Verb: describe Learning Objective: Students will describe the steps in a machine learning workflow and identify where organizational graph data enters the pipeline.

Purpose: Show the end-to-end ML workflow as a left-to-right pipeline, with annotations showing where graph-specific data and features enter the process.

Layout: Six connected stages flowing left to right: 1. "Define Problem" (indigo #303F9F) — example text: "Predict flight risk" 2. "Collect Data" (indigo #303F9F) — branches showing "Graph Metrics," "HR Records," "Event Streams" 3. "Engineer Features" (amber #D4880F) — example text: "Centrality, community, tenure" 4. "Train Model" (amber #D4880F) — example text: "Random forest, GNN" 5. "Evaluate" (amber #D4880F) — example text: "Precision, recall, AUC" 6. "Deploy & Monitor" (gold #FFD700) — example text: "Monthly scoring, quarterly retrain"

Arrows connect each stage sequentially. A feedback arrow from "Deploy & Monitor" loops back to "Collect Data" labeled "Retrain cycle."

Interactive elements: - Hover over each stage for a tooltip explaining that step - Click a stage to highlight it and show a brief organizational example beneath the diagram

Visual style: Clean pipeline with rounded rectangular stages. Aria color scheme. White background.

Responsive design: Wrap to two rows on narrow screens.

Implementation: p5.js with canvas-based hover and click detection

Supervised Learning

Supervised learning is the ML paradigm where you train a model on labeled examples — data points where you already know the correct answer. The model learns the relationship between input features and the output label, then applies that learned relationship to new, unseen data.

In organizational analytics, supervised learning powers some of the highest-value use cases:

  • Flight risk prediction — Given an employee's graph metrics, tenure, performance history, and communication patterns, will they leave within the next 6 months? The label is binary: left or stayed.
  • Performance classification — Based on collaboration patterns, mentoring relationships, and project involvement, will this employee receive a high performance rating? The label comes from historical performance reviews.
  • Promotion readiness — Does this employee's network position, skill profile, and trajectory match the patterns of people who were successfully promoted? The label is the historical promotion outcome.

The key requirement for supervised learning is labeled historical data. You need past examples where you know the outcome. For flight risk, that means historical records of employees who left (positive class) and employees who stayed (negative class), along with the feature values as they existed before those departures.

Common supervised learning algorithms used in organizational analytics include the following (a short training sketch appears after the list):

  • Logistic regression — Models the probability of a binary outcome. Simple, interpretable, and often surprisingly effective. A good starting point for flight risk prediction.
  • Random forests — Ensembles of decision trees that vote on the outcome. Handle mixed feature types well and provide feature importance rankings, which helps you understand why the model predicts what it predicts.
  • Gradient boosted trees (XGBoost, LightGBM) — Sequentially build trees that correct each other's mistakes. Often the highest-performing traditional ML approach for tabular organizational data.
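
Here's a minimal training sketch using scikit-learn's gradient boosted trees; the feature table is a hypothetical toy example, not a real HR dataset:

# A hedged sketch: fit a gradient boosted classifier on toy flight-risk data.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical feature table: one row per employee, with graph-derived
# and HR features plus a binary label (1 = left within 6 months).
df = pd.DataFrame({
    "degree_centrality":  [0.42, 0.11, 0.35, 0.08],
    "tenure_months":      [28, 60, 14, 90],
    "performance_rating": [3.5, 4.2, 2.8, 4.0],
    "left_within_6mo":    [1, 0, 1, 0],
})

X = df.drop(columns="left_within_6mo")
y = df["left_within_6mo"]

model = GradientBoostingClassifier().fit(X, y)

# Feature importances hint at why the model predicts what it predicts.
print(dict(zip(X.columns, model.feature_importances_)))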

Aria's Insight

Here's a secret that experienced data scientists know: for tabular organizational data — the kind stored in rows and columns with features like tenure, centrality scores, and department — gradient boosted trees almost always win. Neural networks get the headlines, but for structured HR data with dozens of features, a well-tuned XGBoost model is hard to beat. Save the neural networks for when we get to graph-structured learning later in this chapter.

Unsupervised Learning

Unsupervised learning works with unlabeled data — you don't tell the algorithm what to look for. Instead, it discovers hidden structure and patterns on its own. If supervised learning is answering a specific question, unsupervised learning is exploring the data and asking "what patterns exist here that I haven't thought to look for?"

In organizational analytics, unsupervised learning excels at discovery tasks:

  • Discovering informal teams — Clustering employees by their communication patterns, collaboration networks, and shared project histories reveals teams that don't appear on any org chart. These informal groups often predict how work actually gets done far better than formal department structures.
  • Identifying behavioral archetypes — Grouping employees by their network behavior (connectors, brokers, specialists, peripherals) helps you understand the different roles people play in the organizational network.
  • Anomaly detection — Flagging employees whose communication patterns suddenly change — a dramatic drop in cross-team connections, an unusual spike in after-hours activity, or a shift away from their normal collaboration group — can be an early warning of disengagement or burnout.

Key unsupervised learning techniques for organizational data include the following (a short clustering sketch appears after the list):

  • K-means clustering — Partitions employees into k groups based on feature similarity. Requires you to specify the number of clusters, but it's fast and interpretable.
  • Hierarchical clustering — Builds a tree of clusters that can be cut at different levels, revealing organizational groupings at multiple scales.
  • DBSCAN — Density-based clustering that can find arbitrarily shaped groups and automatically identifies outliers. Particularly useful for finding communication clusters that don't conform to neat boundaries.
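
As a concrete illustration, here is a minimal k-means sketch with scikit-learn; the feature rows are invented for illustration:

# A hedged sketch: cluster employees by network features with k-means.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical rows: [degree_centrality, cross_dept_edges, clustering_coeff]
features = np.array([
    [0.42, 12, 0.67],
    [0.08,  1, 0.90],
    [0.51, 15, 0.40],
    [0.05,  0, 0.95],
    [0.47, 11, 0.55],
])

X = StandardScaler().fit_transform(features)  # put features on one scale
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # cluster assignment per employee, e.g. connectors vs. peripherals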

The community detection algorithms you learned in Chapter 8 — Louvain, label propagation, modularity optimization — are actually forms of unsupervised learning that operate directly on graph structure. We're now extending that idea to incorporate node features alongside structural information.

Feature Engineering

Feature engineering is the process of transforming raw data into the input variables (features) that a machine learning model uses to make predictions. It's often called the most creative and impactful part of the ML workflow — and in organizational analytics, it's where your graph skills truly shine.

The fundamental insight is this: the graph metrics you've already learned to compute are features. Every centrality score, community membership, clustering coefficient, and path length you calculated in Chapters 7 and 8 becomes a powerful input variable for machine learning.

Here's how graph metrics translate into predictive features:

| Graph Metric | ML Feature | What It Captures |
|---|---|---|
| Degree centrality | degree_centrality | How connected is this employee? |
| Betweenness centrality | betweenness_centrality | Does this person bridge groups? |
| Closeness centrality | closeness_centrality | How quickly can they reach everyone? |
| PageRank | pagerank_score | How influential are they in the network? |
| Clustering coefficient | clustering_coeff | Is their local network tightly knit? |
| Community ID | community_id (one-hot encoded) | Which informal group do they belong to? |
| Average path length to team | avg_path_to_team | How integrated are they with their team? |
| Cross-department edge count | cross_dept_edges | Do they collaborate outside their silo? |
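
To make the translation concrete, here is a minimal NetworkX sketch that computes several of these features; the built-in karate club graph stands in for a real communication graph:

# A hedged sketch: derive tabular ML features from a graph with NetworkX.
import networkx as nx

G = nx.karate_club_graph()  # stand-in for an organizational communication graph

degree = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)
pagerank = nx.pagerank(G)
clustering = nx.clustering(G)

# One feature row per employee (node), ready for a tabular model.
rows = [
    {
        "degree_centrality": degree[n],
        "betweenness_centrality": betweenness[n],
        "pagerank_score": pagerank[n],
        "clustering_coeff": clustering[n],
    }
    for n in G.nodes
]
print(rows[0])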

Beyond graph metrics, you'll combine features from multiple data sources to build a rich feature set:

# Example: Building a flight risk feature set
features = {
    # Graph-derived features
    'degree_centrality': 0.42,
    'betweenness_centrality': 0.15,
    'pagerank': 0.008,
    'clustering_coefficient': 0.67,
    'cross_dept_connections': 12,
    'communication_trend_30d': -0.23,  # declining

    # HR system features
    'tenure_months': 28,
    'time_since_promotion': 18,
    'performance_rating': 3.5,
    'salary_band_position': 0.45,  # 45th percentile in band

    # NLP-derived features (from Chapter 9)
    'avg_sentiment_30d': 0.12,  # slightly positive
    'sentiment_trend': -0.08,   # declining sentiment
}

The power of organizational analytics lies in this convergence. No single feature reliably predicts flight risk. But when you combine declining communication centrality, a drop in cross-department connections, flatlined promotion trajectory, and decreasing sentiment in written communications — the signal becomes strong.

Feature Leakage

Be careful about temporal alignment when engineering features. If you're predicting whether an employee will leave in the next 6 months, you must only use features that were available before the prediction window. Including data from the period when someone was already leaving — like a spike in recruiter website visits or a sudden drop in meeting attendance during their notice period — creates data leakage that inflates model performance artificially. Your model will look brilliant in testing and fail in production.

Training and Evaluation

Training and evaluation is the process of fitting your model to historical data and measuring how well it will perform on new, unseen data. This is where you learn whether your carefully engineered features actually predict anything useful.

The Train-Test Split

The fundamental principle is simple: never evaluate a model on the same data you used to train it. A model that memorizes its training data (overfitting) will score perfectly on training examples but fail miserably on new employees.

The standard approach is to split your data:

  • Training set (70-80%) — The model learns patterns from these examples
  • Test set (20-30%) — The model is evaluated on these held-out examples it has never seen

For organizational data, you should typically split by time rather than randomly. Train on employees from 2022-2024, test on employees from 2025. This temporal split reflects real-world usage — you're always predicting the future based on the past.
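
A minimal pandas sketch of a temporal split, assuming a hypothetical snapshot_date column on the feature table:

# A hedged sketch: split training and test data by time, not randomly.
import pandas as pd

df = pd.DataFrame({
    "snapshot_date": pd.to_datetime(
        ["2022-06-01", "2023-06-01", "2024-06-01", "2025-01-01"]),
    "degree_centrality": [0.42, 0.11, 0.35, 0.08],
    "left_within_6mo": [1, 0, 1, 0],
})

cutoff = pd.Timestamp("2025-01-01")
train = df[df["snapshot_date"] < cutoff]   # learn from the past
test = df[df["snapshot_date"] >= cutoff]   # evaluate on the future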

Evaluation Metrics

For classification tasks like flight risk prediction, the key metrics are:

Precision: Of all employees the model flagged as flight risks, what percentage actually left?

\[ \text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} \]

Recall: Of all employees who actually left, what percentage did the model identify?

\[ \text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \]

F1 Score: The harmonic mean of precision and recall, balancing both:

\[ F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \]

The tradeoff between precision and recall matters deeply in HR contexts. A flight risk model with high precision but low recall identifies a small number of at-risk employees with high confidence — but misses many who will actually leave. A model with high recall but low precision catches almost everyone who might leave — but also flags many false alarms, wasting managers' time and potentially creating anxiety for employees who aren't actually at risk.
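
These metrics are one function call each in scikit-learn; the labels below are invented for illustration:

# A hedged sketch: compute precision, recall, and F1 for a classifier.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = actually left
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # model's predictions

print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP) = 0.75
print("recall:", recall_score(y_true, y_pred))        # TP / (TP + FN) = 0.75
print("f1:", f1_score(y_true, y_pred))                # harmonic mean = 0.75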

Diagram: Precision-Recall Tradeoff

Precision-Recall Tradeoff

Type: interactive chart

Bloom Taxonomy: Analyze (L4) Bloom Verb: differentiate Learning Objective: Students will differentiate between precision and recall in the context of HR prediction models and analyze the consequences of optimizing for each.

Purpose: Visualize how adjusting a classification threshold affects precision and recall, with concrete organizational consequences shown for each setting.

Layout: - Top: Slider labeled "Classification Threshold" (0.0 to 1.0) - Center-left: Two vertical bar charts showing current Precision and Recall values - Center-right: A confusion matrix showing TP, FP, FN, TN counts with employee icons - Bottom: Text box showing the organizational consequence of the current threshold setting

Interactive elements: - Drag the threshold slider to see precision and recall change inversely - At low threshold (0.2): High recall (~0.95), low precision (~0.30) — consequence text: "You flag 50 employees as flight risks. 15 actually leave, but 35 were false alarms. Managers are overwhelmed with intervention meetings." - At high threshold (0.8): Low recall (~0.40), high precision (~0.85) — consequence text: "You flag 8 employees as flight risks. 7 actually leave, but you missed 10 others who also left. Those surprised departures cost the organization." - At balanced threshold (0.5): Moderate both (~0.70/~0.70) — consequence text: "You flag 20 employees. 14 actually leave, 6 were false alarms, but you missed 4 who left. A reasonable tradeoff."

Sample data: 100 employees, 18 actually left in the test period.

Color scheme: Precision bar in indigo (#303F9F), Recall bar in amber (#D4880F). TP in green, FP in amber, FN in red, TN in gray.

Responsive design: Stack charts vertically on narrow screens.

Implementation: p5.js with canvas-based slider and dynamic bar charts

The Confusion Matrix

A confusion matrix provides the complete picture of classification performance:

| | Predicted: Stay | Predicted: Leave |
|---|---|---|
| Actually Stayed | True Negative (TN) | False Positive (FP) |
| Actually Left | False Negative (FN) | True Positive (TP) |

In organizational analytics, the costs of errors are asymmetric. A false positive (predicting someone will leave when they won't) might lead to an unnecessary retention conversation — awkward, but not catastrophic. A false negative (failing to predict someone who does leave) means you lose a valuable employee you could have retained with timely intervention. Most HR teams should optimize for recall while keeping precision above a practical threshold.

Graph Machine Learning

Now we arrive at the frontier. Everything we've covered so far — supervised learning, unsupervised learning, feature engineering — works with traditional tabular data where each employee is a row and features are columns. Graph machine learning takes a fundamentally different approach: it learns directly from the structure of the network itself.

Why does this matter? Because traditional ML treats each employee as an independent data point. It can use graph-derived features (like centrality scores), but it can't capture the rich, recursive structure of who-knows-whom. In a graph, an employee's context is defined not just by their own attributes, but by the attributes and connections of their neighbors, their neighbors' neighbors, and so on.

Graph ML captures this relational context natively. It answers questions that traditional ML simply can't:

  • Which structural positions in the network predict certain outcomes?
  • How does an employee's entire neighborhood influence their behavior?
  • Where will new connections form next in the organizational network?
  • Do certain subgraph patterns correspond to high-performing teams?

The field of graph machine learning has exploded in recent years, driven by advances in graph neural networks, scalable embedding algorithms, and the growing availability of graph-structured datasets. For organizational analytics, it represents the next generation of predictive capability.

"You know how ant colonies use collective intelligence — no single ant knows the whole picture, but the colony collectively solves optimization problems that would stump any individual? Graph ML works the same way. Each node learns from its neighbors, and the network's collective structure becomes the signal. It's ant colony optimization meets artificial intelligence, and honestly, it makes my antennae tingle." — Aria

Graph Neural Networks

Graph neural networks (GNNs) are neural networks specifically designed to operate on graph-structured data. Unlike traditional neural networks that expect fixed-size input vectors, GNNs can process graphs of any size and shape, learning representations that incorporate both node features and network topology.

The core idea behind GNNs is message passing. At each layer of the network, every node:

  1. Gathers feature information from its neighbors
  2. Aggregates that information (by summing, averaging, or applying an attention mechanism)
  3. Updates its own representation by combining its current features with the aggregated neighbor information

After several rounds of message passing, each node's representation encodes information not just about itself, but about its local network neighborhood. A two-layer GNN means each node has gathered information from nodes up to two hops away — exactly the kind of structural context that matters for organizational analytics.

The message passing update for a node \( v \) at layer \( k \) can be expressed as:

\[ h_v^{(k)} = \text{UPDATE}\left(h_v^{(k-1)},\; \text{AGGREGATE}\left(\left\{h_u^{(k-1)} : u \in \mathcal{N}(v)\right\}\right)\right) \]

where \( h_v^{(k)} \) is the representation of node \( v \) at layer \( k \), \( \mathcal{N}(v) \) is the set of neighbors of \( v \), and UPDATE and AGGREGATE are learned functions.

Diagram: GNN Message Passing

GNN Message Passing

Type: animated diagram

Bloom Taxonomy: Understand (L2) Bloom Verb: explain Learning Objective: Students will explain how graph neural networks aggregate neighborhood information through message passing and how multiple layers capture multi-hop context.

Purpose: Animate the message passing process on a small organizational graph, showing how each node's representation evolves by incorporating neighbor information across multiple layers.

Layout: - A small graph with 7 nodes arranged in a cluster: one central "target" employee node connected to 4 direct neighbors, with 2 additional second-hop neighbors - Each node displays a small feature vector (3-4 values) that visually updates each round - Control buttons: "Layer 1," "Layer 2," "Reset"

Animation sequence: - Initial state: Each node shows its original features (e.g., [centrality, tenure, performance]) - Layer 1: Arrows animate from neighbors to target node. Target node's feature vector visually blends with neighbor features (color mixing effect). All nodes update simultaneously. - Layer 2: Repeat, but now each neighbor already contains information from their neighbors. Target node now has 2-hop context. Its feature vector shows a richer color blend.

Node types: - Target employee (large circle, indigo #303F9F) — "Maria" - Direct neighbors (medium circles, amber #D4880F) — 4 colleagues - Second-hop neighbors (smaller circles, light amber) — 2 more distant colleagues

Interactive elements: - Click "Layer 1" to animate the first round of message passing - Click "Layer 2" to animate the second round - Hover over any node to see its current feature vector values - "Reset" returns all nodes to their original features

Visual style: Clean graph with animated message arrows. Feature vectors shown as small colored bars inside each node. Aria color scheme.

Implementation: p5.js with canvas-based animation and button controls

In practice, GNNs for organizational analytics are typically implemented using frameworks like PyTorch Geometric or DGL (Deep Graph Library). You don't need to implement message passing from scratch — these libraries handle the graph operations while you focus on defining the architecture and preparing the data.

Common GNN architectures include the following (a minimal model sketch appears after the list):

  • GCN (Graph Convolutional Network) — The foundational GNN architecture. Aggregates neighbor features with a normalized sum. Simple and effective.
  • GraphSAGE — Samples a fixed number of neighbors rather than using all of them. Scales better to large organizational graphs.
  • GAT (Graph Attention Network) — Uses attention mechanisms to learn which neighbors matter more. Particularly relevant in organizational settings where not all connections carry equal weight.
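
As a minimal sketch, here is what a two-layer GCN for node-level flight risk prediction might look like in PyTorch Geometric; the class name and layer sizes are illustrative choices, not a prescribed architecture:

# A hedged sketch: a two-layer GCN producing per-employee predictions.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class FlightRiskGNN(torch.nn.Module):  # hypothetical model name
    def __init__(self, num_features, hidden_dim=32, num_classes=2):
        super().__init__()
        self.conv1 = GCNConv(num_features, hidden_dim)  # aggregates 1-hop context
        self.conv2 = GCNConv(hidden_dim, num_classes)   # extends to 2-hop context

    def forward(self, x, edge_index):
        # x: [num_employees, num_features]; edge_index: [2, num_edges]
        h = F.relu(self.conv1(x, edge_index))  # message passing, layer 1
        return self.conv2(h, edge_index)       # per-node logits (stay vs. leave)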

Node Embeddings

Node embeddings are low-dimensional vector representations of nodes that capture their structural role and neighborhood in the graph. Think of them as a way to compress a node's entire graph context into a compact numerical vector — typically 64 to 256 dimensions — that can be used as input to any machine learning algorithm.

The key insight is that nodes with similar structural positions in the graph should have similar embeddings. Two employees who serve as bridges between the same pair of departments should have embeddings close together in vector space, even if their individual attributes (title, tenure, salary) are completely different.

The dimensionality of the embedding vector represents a tradeoff:

\[ d = C \cdot \log_2(|V|) \]

where \( d \) is the embedding dimension, \( |V| \) is the number of nodes, and \( C \) is a constant typically between 2 and 8. For an organization with 10,000 employees, this suggests embedding dimensions between 26 and 104. In practice, 64 or 128 dimensions are common choices.

Popular node embedding algorithms include the following (a short Node2Vec sketch appears after the list):

  • Node2Vec — Performs biased random walks on the graph, then uses a skip-gram model (similar to Word2Vec from NLP) to learn embeddings. The bias parameters let you control whether walks emphasize local neighborhood structure (like clustering coefficient) or global structural roles (like bridge positions). This flexibility makes Node2Vec particularly powerful for organizational networks where both local team dynamics and cross-organizational roles matter.

  • DeepWalk — A precursor to Node2Vec that uses unbiased random walks. Simpler but less flexible.

  • Graph Autoencoders — Neural network approaches that learn to compress and reconstruct the graph's adjacency structure. The compressed representation becomes the embedding.
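
A minimal sketch using the open-source node2vec package (its API is assumed from the project's documentation) on a NetworkX graph:

# A hedged sketch: learn Node2Vec embeddings for every node in a graph.
import networkx as nx
from node2vec import Node2Vec

G = nx.karate_club_graph()  # stand-in for a communication graph

# p and q bias the random walks: lower q pushes walks outward (global
# structural roles); higher q keeps walks local (neighborhood structure).
n2v = Node2Vec(G, dimensions=64, walk_length=30, num_walks=100, p=1, q=0.5)
model = n2v.fit(window=10, min_count=1)  # skip-gram training, as in Word2Vec

embedding = model.wv[str(0)]  # 64-dimensional vector for node 0
print(embedding.shape)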

Once you have node embeddings, you can use them for virtually any downstream task:

  • Feed them into a classifier for flight risk prediction
  • Cluster them to discover informal organizational groups
  • Compute distances between employees to find structurally similar people
  • Visualize them in 2D (using t-SNE or UMAP) to create organizational network maps

| Embedding Method | Walk Strategy | Best For | Scalability |
|---|---|---|---|
| DeepWalk | Uniform random walks | General-purpose embeddings | Good |
| Node2Vec | Biased walks (BFS/DFS blend) | Capturing local and global structure | Good |
| Graph Autoencoders | Neural reconstruction | Learning from node features + structure | Moderate |
| GNN-based | Message passing | Joint feature and structure learning | Moderate |

Link Prediction

Link prediction answers one of the most valuable questions in organizational analytics: Where will new connections form? Given the current state of the organizational network, which pairs of employees who aren't yet connected are likely to collaborate, communicate, or develop working relationships in the future?

This capability has immediate practical applications:

  • Predicting future collaborations — Identifying pairs of employees who should be working together but aren't yet connected. An introduction or a shared project assignment could catalyze a valuable collaboration.
  • Anticipating mentoring relationships — Finding senior employees whose network position and expertise align with junior employees who could benefit from mentoring.
  • Breaking down silos — Detecting potential cross-department bridges before they form naturally, and actively encouraging those connections.
  • Organizational design — Predicting how restructuring will affect collaboration patterns before implementing changes.

Link prediction works by scoring every possible pair of unconnected nodes and ranking them by likelihood of forming a connection; a short scoring sketch appears after the method lists below. The scoring can use:

Topology-based methods (using graph structure alone):

  • Common Neighbors — Two employees who share many connections are likely to connect themselves. Simple but effective.
  • Jaccard Coefficient — The number of shared neighbors divided by the total number of unique neighbors. Normalizes for node degree.
  • Adamic-Adar Index — Weights shared neighbors by the inverse log of their degree. A shared connection through a selective connector is worth more than one through a hub who connects to everyone.

Embedding-based methods (using learned representations):

  • Compute node embeddings for all employees, then score pairs by the similarity (cosine similarity or dot product) of their embeddings. Pairs with high embedding similarity occupy similar structural positions and are likely to form connections.

GNN-based methods (end-to-end learning):

  • Train a GNN to directly predict whether an edge should exist between two nodes. The model learns the complex nonlinear patterns that predict new connections.
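
The topology-based scores are built into NetworkX; here is a minimal sketch, with node pairs chosen arbitrarily from the built-in karate club graph:

# A hedged sketch: score unconnected pairs with topology-based methods.
import networkx as nx

G = nx.karate_club_graph()  # stand-in for a collaboration graph

pairs = [(0, 33), (5, 24)]  # unconnected pairs to score

print(len(list(nx.common_neighbors(G, 0, 33))))  # shared-neighbor count

for u, v, score in nx.jaccard_coefficient(G, pairs):
    print(f"Jaccard({u}, {v}) = {score:.3f}")

for u, v, score in nx.adamic_adar_index(G, pairs):
    print(f"Adamic-Adar({u}, {v}) = {score:.3f}")
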
Diagram: Link Prediction Visualization

Link Prediction Visualization

Type: interactive graph

Bloom Taxonomy: Apply (L3) Bloom Verb: predict Learning Objective: Students will apply link prediction scoring to an organizational network and predict which new connections are most likely to form.

Purpose: Show an organizational graph where predicted future edges are displayed as dashed lines with confidence scores, allowing students to explore why certain connections are predicted.

Layout: - An organizational graph with 12-15 employee nodes across 3 departments (color-coded) - Existing edges shown as solid lines - Predicted edges shown as dashed amber (#D4880F) lines with a probability score label - Control panel with method selector and threshold slider

Nodes: Employees colored by department: - Engineering (indigo #303F9F) — 5 nodes - Product (amber #D4880F) — 4 nodes - Marketing (gold #FFD700) — 4 nodes

Interactive elements: - Dropdown to select prediction method: "Common Neighbors," "Jaccard Coefficient," "Embedding Similarity" - Threshold slider (0.0 to 1.0) to show/hide predictions by confidence - Hover over a predicted edge to see: the two employees, the score, the number of common neighbors, and a brief explanation of why the connection is predicted - Hover over a node to highlight all its existing and predicted connections - Click a node to pin it

Visual style: Force-directed layout with departments loosely clustered. Predicted edges use dashed lines with varying thickness based on confidence score.

Implementation: p5.js with force-directed positioning, canvas-based controls

Graph Classification

Graph classification takes prediction to a higher level of abstraction. Instead of classifying individual nodes (employees) or predicting individual edges (relationships), graph classification assigns labels to entire graphs or subgraphs.

In organizational analytics, this means classifying groups — teams, departments, project groups, or organizational units — based on their internal network structure:

  • Team effectiveness prediction — Given the communication network of a team, classify it as high-performing, average, or underperforming. High-performing teams tend to have specific structural signatures: high internal density, strong connections to other teams, distributed rather than centralized communication, and the presence of both boundary spanners and integrators.
  • Organizational health assessment — Classify a department's network as healthy, at-risk, or siloed based on structural metrics. Healthy departments show balanced communication, redundant paths, and no single points of failure.
  • Project success prediction — Given the collaboration graph of a project team at the start of a project, predict whether the project will meet its goals. Research has shown that the structural diversity of a project team's external connections is a strong predictor of innovation outcomes.

Graph classification typically works by:

  1. Computing a graph-level representation — This can be done by aggregating node embeddings (mean, max, or attention-weighted pooling) or by using a dedicated graph pooling layer in a GNN.
  2. Feeding that representation into a classifier — A standard neural network or even a simpler model like logistic regression.

The loss function for graph classification follows the standard cross-entropy form:

\[ \mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N} \sum_{c=1}^{C} y_{ic} \log(\hat{y}_{ic}) \]

where \( N \) is the number of graphs (teams), \( C \) is the number of classes, \( y_{ic} \) is the true label, and \( \hat{y}_{ic} \) is the predicted probability.
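
A minimal sketch of the two-step recipe (mean-pool node embeddings into one vector per team, then fit an ordinary classifier); the embeddings and labels are random placeholders:

# A hedged sketch: classify whole teams from pooled node embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Each "team" is a set of node embeddings (varying team sizes, 64 dims).
teams = [rng.normal(size=(n, 64)) for n in (5, 8, 6, 9)]
labels = [1, 0, 1, 0]  # placeholder: 1 = high-performing

X = np.stack([team.mean(axis=0) for team in teams])  # mean pooling

clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))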

The following table summarizes the four main graph ML task types and their organizational applications:

| Task Type | Input | Output | Organizational Example |
|---|---|---|---|
| Node classification | A node and its neighborhood | Label for that node | Flight risk score for an employee |
| Edge classification | A pair of connected nodes | Label for that edge | Communication type (formal/informal) |
| Link prediction | A pair of unconnected nodes | Probability of connection | Future collaboration likelihood |
| Graph classification | An entire subgraph | Label for the graph | Team performance category |

Bias in Analytics

"This is where I get serious for a moment. When we point machine learning at people data, we're not just building models — we're building systems that influence careers, opportunities, and lives. A biased model isn't a neutral tool with a math problem. It's an engine that can automate and amplify the very inequities we should be working to dismantle. Follow the trail carefully here — this matters more than any algorithm." — Aria

Bias in analytics is the systematic distortion of analytical results in ways that unfairly advantage or disadvantage particular groups. When machine learning models are trained on organizational data, they can absorb, perpetuate, and even amplify existing patterns of inequality. This isn't a theoretical concern — it's a documented reality in HR analytics.

Data Bias

The training data itself carries the fingerprints of every historical inequity in the organization. If women have historically been promoted at lower rates than equally qualified men, a model trained on that data will learn that being female is a negative signal for promotion readiness — not because it should be, but because that's what the data shows. The model doesn't know the data reflects bias; it just finds patterns.

Sources of data bias in organizational analytics include:

  • Historical discrimination — Past decisions about hiring, promotion, compensation, and project assignments reflect conscious and unconscious biases. Models trained on these outcomes inherit those biases as learned patterns.
  • Representation gaps — If certain groups are underrepresented in leadership positions, the model has fewer positive examples to learn from, leading to systematically lower predictions for those groups.
  • Measurement bias — Communication data may not capture all forms of contribution equally. Employees who contribute through informal mentoring, emotional support, or behind-the-scenes problem-solving may appear less connected in email and chat logs than those who communicate visibly.
  • Network structure bias — Graph-based features can encode structural inequities. If members of a particular demographic group have historically been excluded from informal networks (the "old boys' club"), their lower centrality scores reflect systemic exclusion, not individual capability.

Algorithmic Bias

Even with unbiased data (which, in practice, doesn't exist), algorithms can introduce their own biases:

  • Feature selection bias — Choosing features that correlate with protected characteristics (like "golf club membership" or "fraternity affiliation") introduces proxy discrimination even when protected attributes are excluded from the model.
  • Optimization bias — ML algorithms optimize for overall accuracy, which means they perform best on the majority group. A flight risk model that's 90% accurate overall might be 95% accurate for the majority demographic and only 70% accurate for underrepresented groups.
  • Embedding bias — Node embeddings learned from biased network structures will encode those biases. If a GNN learns that certain positions in the network predict success, and those positions are disproportionately occupied by one demographic group, the embeddings carry that bias forward.

Feedback Loops

Perhaps the most insidious form of bias in organizational ML is the feedback loop. When a biased model's predictions influence real-world decisions, and those decisions generate the training data for future models, bias compounds over time:

  1. A flight risk model predicts that employees in a certain demographic are higher risk (based on biased historical data).
  2. Managers, acting on those predictions, invest less in those employees' development (conscious or unconscious response to the prediction).
  3. Those employees, receiving less development, actually do leave at higher rates.
  4. The new data confirms the model's original (biased) prediction, and the next version of the model becomes even more biased.

This cycle can be extremely difficult to detect because the model's predictions appear to be validated by outcomes — outcomes that the model itself helped create.

Diagram: Bias Feedback Loop

Bias Feedback Loop

Type: cycle diagram

Bloom Taxonomy: Evaluate (L5) Bloom Verb: critique Learning Objective: Students will critique how biased ML predictions can create self-reinforcing feedback loops in HR decision-making and evaluate strategies for breaking the cycle.

Purpose: Illustrate the four-stage feedback loop where biased predictions influence decisions, which generate biased outcomes, which reinforce the biased model.

Layout: Four stages arranged in a clockwise circle with arrows connecting them: 1. "Biased Training Data" (top, indigo #303F9F) — "Historical patterns reflect systemic inequities" 2. "Biased Model Predictions" (right, amber #D4880F) — "Model learns and reproduces biased patterns" 3. "Biased Decisions" (bottom, red #D32F2F) — "Predictions influence management actions" 4. "Biased Outcomes" (left, amber-dark #B06D0B) — "Actions create data that confirms the bias"

A large circular arrow connects all four stages. In the center: "Self-Reinforcing Cycle" with a warning icon.

An additional element: A "Break the Cycle" intervention box (green) connected to the arrow between stages 2 and 3, showing mitigation strategies: - "Fairness-aware algorithms" - "Human review of predictions" - "Disparate impact testing" - "Regular bias audits"

Interactive elements: - Click each stage to see a detailed organizational example - Click "Break the Cycle" to see mitigation strategies expand - Hover over arrows to see how each transition works

Visual style: Clean cycle diagram with bold colors. Warning/serious tone. Aria color scheme with red for the "Biased Decisions" stage to signal danger.

Implementation: p5.js with canvas-based click and hover interactions

Mitigating Bias

Addressing bias in organizational ML requires action at every stage of the pipeline (a small disparate-impact check is sketched after these lists):

Before training:

  • Audit training data for representation and outcome disparities across demographic groups
  • Apply resampling or reweighting to correct historical imbalances
  • Remove or transform features that serve as proxies for protected characteristics
  • Document data lineage — know where your training data came from and what decisions shaped it

During training:

  • Use fairness-aware algorithms that incorporate equity constraints into the optimization objective
  • Apply adversarial debiasing — train a secondary model to predict protected attributes from the primary model's outputs, and penalize the primary model when the adversary succeeds
  • Test multiple model architectures and select based on fairness metrics, not just overall accuracy

After deployment:

  • Monitor model predictions for disparate impact across demographic groups
  • Implement human-in-the-loop review for high-stakes predictions (promotions, terminations, performance ratings)
  • Conduct regular bias audits and retrain models with updated, audited data
  • Establish clear governance — who is responsible when a model produces biased outcomes?
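
As one concrete monitoring example, here is a minimal sketch of a disparate-impact check using the common four-fifths rule of thumb; the groups and flags are invented for illustration:

# A hedged sketch: compare model flag rates across demographic groups.
import pandas as pd

preds = pd.DataFrame({
    "group":   ["A", "A", "A", "A", "B", "B", "B", "B"],
    "flagged": [1, 0, 0, 0, 1, 1, 1, 0],
})

rates = preds.groupby("group")["flagged"].mean()  # flag rate per group
ratio = rates.min() / rates.max()                 # disparate impact ratio
print(rates.to_dict(), "impact ratio:", round(ratio, 2))
# Ratios below ~0.8 warrant investigation under the four-fifths rule.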

The Stakes Are Real

Bias in organizational ML isn't an abstract ethical concern. It can violate employment law (disparate impact under Title VII in the U.S.), expose organizations to litigation, harm individuals' careers and livelihoods, and erode trust in analytics programs. If your flight risk model systematically over-predicts departure for a protected group, and management acts on those predictions by withholding development opportunities, you have created a legally actionable discriminatory system — even if no one intended to discriminate. Build fairness testing into your ML pipeline from day one, not as an afterthought.

Putting It Together: A Graph ML Pipeline for Flight Risk

Let's walk through a complete example that ties together everything in this chapter. You're building a flight risk prediction system for a 5,000-person organization.

Step 1: Feature Engineering. You extract graph metrics from the organizational communication graph — degree centrality, betweenness centrality, PageRank, clustering coefficient, community membership, and cross-department edge count for each employee. You combine these with HR features (tenure, time since promotion, performance rating) and NLP features (sentiment trend from communications).

Step 2: Node Embeddings. You run Node2Vec on the communication graph to generate 128-dimensional embeddings for every employee. These embeddings capture each person's structural role in the network in ways that hand-crafted features can't fully represent.

Step 3: Training. You combine the engineered features and node embeddings into a single feature vector for each employee. Using historical data (employees who left vs. stayed over the past two years), you train a gradient boosted tree model. You also train a GNN-based model that learns directly from the graph structure and node features.

Step 4: Evaluation. You evaluate both models on a held-out test set from the most recent six months. You check precision, recall, F1, and AUC-ROC. You also conduct a fairness audit — checking whether the model's accuracy and error rates are consistent across gender, race, age, and other protected characteristics.

Step 5: Deployment. The model scores all current employees monthly. Predictions are reviewed by HR business partners before any action is taken. A quarterly bias audit checks for disparate impact. The model is retrained every six months with fresh data.

Chapter Summary

Let's stash the big ideas before we move on:

  • Machine learning enables organizational analytics to move from description to prediction — learning patterns from historical data to forecast future outcomes like flight risk, collaboration potential, and team performance.

  • Supervised learning requires labeled historical data and powers high-value prediction tasks: flight risk, performance classification, and promotion readiness. Gradient boosted trees are typically the strongest approach for tabular organizational data.

  • Unsupervised learning discovers hidden structure without labels — finding informal teams, behavioral archetypes, and communication anomalies that don't appear on any org chart.

  • Feature engineering transforms graph metrics (centrality, clustering coefficient, community membership) into powerful predictive features. The graph skills you've already learned become inputs to ML models.

  • Training and evaluation requires temporal splitting (train on the past, test on the future), and careful attention to precision-recall tradeoffs that have real consequences for employees.

  • Graph machine learning learns directly from network structure, capturing relational context that traditional tabular ML cannot. It represents the next generation of organizational analytics capability.

  • Graph neural networks use message passing to aggregate neighborhood information, allowing each node to learn from its local graph context across multiple hops.

  • Node embeddings compress a node's structural role into compact vectors (typically 64-128 dimensions) that can feed any downstream ML algorithm. Node2Vec's biased random walks are particularly well-suited to organizational networks.

  • Link prediction forecasts where new connections will form — predicting future collaborations, mentoring relationships, and cross-department bridges before they emerge naturally.

  • Graph classification evaluates entire teams or subgraphs, enabling predictions about team effectiveness, organizational health, and project success based on network structure.

  • Bias in analytics is not optional reading — it's the difference between building systems that help people and systems that harm them. Data bias, algorithmic bias, and feedback loops can automate discrimination at scale. Fairness testing belongs in your ML pipeline from day one.

You've just added the predictive layer to your organizational analytics toolkit. In Chapter 11, you'll apply these techniques to real organizational insights — the specific patterns, signals, and interventions that make organizations healthier, more connected, and more resilient.

Six legs, one insight at a time. And this time, those insights can see around corners.

See Annotated References