Human Security: Identity, Authentication, and Social Engineering¶
Summary¶
Centers on the people in the system: identity and access management, single sign-on, federated identity, password authentication and policy, multi-factor authentication, biometrics, FIDO2 and passkeys, OAuth 2.0, OIDC, SAML, social engineering and its variants (phishing, spear phishing, pretexting, baiting), usable security, awareness training, insider threat, just-in-time access, and privacy engineering.
Concepts Covered¶
This chapter covers the following 28 concepts from the learning graph:
- Identity Management
- Access Management
- IAM System
- Single Sign-On
- Federated Identity
- Password Authentication
- Password Policy
- Multi-Factor Authentication
- Biometric Authentication
- FIDO2
- Passkey
- OAuth 2.0
- OIDC
- SAML
- Social Engineering
- Phishing
- Spear Phishing
- Pretexting
- Baiting
- Usable Security
- Security Awareness Training
- Insider Threat
- Privilege Creep
- Just-in-Time Access
- Privacy Engineering
- Privacy by Design
- Data Minimization
- User Behavior Analytics
Prerequisites¶
This chapter builds on concepts from:
- Chapter 1: Security Foundations: Properties, Mindset, and Risk
- Chapter 2: Threats, Vulnerabilities, and Security Controls
- Chapter 5: Software Vulnerabilities and Secure Coding
- Chapter 8: Network Security Foundations: Protocols, Firewalls, and Detection
- Chapter 11: Cloud Security and Operations Monitoring
Welcome to the Human Layer
Welcome. Up to now we have hardened code, networks, and systems against adversaries who attack machines. This chapter turns to the layer those machines exist to serve — the people. Identity, authentication, and the social games adversaries play to bypass technical controls live here. Trust, but verify, and remember that the verifier is usually a tired human at a screen.
1. Why Human Security Is Its Own Discipline¶
Most of this textbook treats security as a property of code, networks, and systems. Real attackers, however, do not always have to break the technical stack. They often go around it, by convincing a human to do the breaking for them. The 2020 Twitter incident — in which a teenager talked his way into internal admin tools and hijacked celebrity accounts — was not a software vulnerability. The 2022 Uber breach began with a contractor whose multi-factor prompts were spammed until he tapped "approve." The 2023 MGM Resorts outage began with a 10-minute LinkedIn-and-helpdesk phone call. These are human-layer failures, and they require their own threat models, controls, and habits of mind.
This chapter studies that layer in three movements. The first is identity and access management — how systems decide who someone is and what they may do. The second is authentication — the specific mechanisms (passwords, MFA, biometrics, passkeys) by which a human or machine proves a claim of identity. The third is adversarial human behavior — social engineering attacks, the cognitive biases they exploit, and the principles of usable security and privacy engineering that make attacks harder and good behavior easier.
The pervasive theme is that humans are not the weakest link to be blamed and trained around. They are participants in a sociotechnical system. When users make insecure choices, the system has usually given them no easy secure choice. Defensible design fixes the system.
2. Identity and Access Management¶
Two related but distinct problems sit at the foundation of human security. Identity management is the discipline of representing, provisioning, and lifecycle-managing the digital identities of users, services, and devices in an organization. Access management is the discipline of deciding, at runtime, what those identities are allowed to do. Together they answer two questions every request must answer: who are you? and are you allowed to do this?
An IAM system is the integrated platform that handles both. A modern IAM system stores user records (usually in a directory like Active Directory, LDAP, or a cloud identity provider such as Okta, Microsoft Entra ID, or AWS IAM Identity Center), enforces authentication policies, issues tokens that carry identity claims, and integrates with downstream applications to enforce authorization decisions. The IAM system is one of the most security-critical pieces of infrastructure an organization owns: an attacker with control of the IAM is an attacker with control of every application that trusts it.
Before we examine how an IAM system fits into a request flow, let us define three terms that will recur. An identity provider (IdP) is the service that authenticates a user and issues a token attesting to the result. A service provider (SP), sometimes called a relying party (RP), is the application that consumes the token and decides what the user may do. A claim is a statement inside the token — this user is alice@example.com, this user is in the engineering group, this token was issued at 14:02 UTC.
Diagram: IAM Request Flow¶
End-to-end flow showing user, identity provider, and service provider exchanging an identity token
Type: workflow-diagram
sim-id: iam-request-flow
Library: Mermaid
Status: Specified
A horizontal sequence diagram with four lanes: User, Browser/Client, Identity Provider (IdP), Service Provider (SP).
- User → Browser: Visits SP application URL.
- Browser → SP: GET /protected-resource (no session).
- SP → Browser: 302 redirect to IdP with an authentication request.
- Browser → IdP: Authentication request.
- IdP ↔ User: Prompts for credentials (password + MFA).
- IdP → Browser: Issues signed identity token; redirects back to SP.
- Browser → SP: Presents identity token.
- SP: Verifies token signature, extracts claims (user id, groups, expiry).
- SP → Browser: Returns the protected resource.
Annotations on the side: "IdP authenticates. SP authorizes. The token is the only thing the SP trusts about who the user is — so its signature, expiry, and audience all matter."
Color: cybersecurity blue (#1565c0) for IdP lane, slate (#455a64) for SP lane, white for user/browser. Responsive: vertical stacking under 700px viewport.
Implementation: Mermaid sequenceDiagram with custom theme colors.
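The annotation in the diagram — that the token's signature, expiry, and audience all matter — can be made concrete. The sketch below shows the SP-side claim checks that follow signature verification; the claim names (`aud`, `exp`, `sub`) follow common JWT usage, and a real IdP's token format may differ.

```python
import time

# Minimal sketch of the SP-side checks performed AFTER the token's
# signature has been verified against the IdP's public key.
def accept_token(claims: dict, expected_audience: str, now=None) -> bool:
    now = now if now is not None else time.time()
    return (claims.get("aud") == expected_audience   # minted for *this* SP, not another
            and claims.get("exp", 0) > now           # not expired
            and "sub" in claims)                     # names a subject (the user)
```

Skipping any one of these checks re-creates a well-known bug class: a missing audience check lets a token issued for one application be replayed against another.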
The IAM system is also responsible for the lifecycle of an identity: provisioning a new account when an employee joins, updating group memberships when they change roles, and — most importantly — deprovisioning the account when they leave. The phrase privilege creep describes what happens when this lifecycle is broken. Over time, a long-tenured employee accumulates entitlements from every project they touched, and nobody removes the ones they no longer need. After ten years, the employee has standing access to systems and data far beyond what their current role requires. If their account is compromised, the blast radius is the union of everything they have ever worked on. The structural fix is attestation (periodic review where managers confirm or remove each entitlement) plus just-in-time access (granting privilege only for the duration of a specific task), which we revisit later.
3. Single Sign-On and Federated Identity¶
If every application maintained its own user database and password file, two things would happen. Users would face the password fatigue problem — too many passwords to remember, leading to reuse across sites, which is the most reliable way for a breach at one service to propagate to another. And administrators would face an offboarding nightmare — a departing employee's account has to be disabled in every application separately, with high probability that one is missed.
Single sign-on (SSO) is the solution: a user authenticates once to a central identity provider, and that authentication is carried (via short-lived signed tokens) to every application the user accesses. The user types one password and one MFA code per workday; the applications never see the password at all. From a security perspective, SSO is a major win because it concentrates authentication policy in one place where strong controls (MFA, anomaly detection, session monitoring) can be applied uniformly.
Federated identity generalizes SSO across organizational boundaries. Instead of just sharing identity within one company's apps, federation lets an identity issued by one organization be trusted by another — Sign in with Google on a third-party site, or a partner company's employees accessing your contractor portal using their employer's identity. Federation runs on cryptographically signed tokens (and on the careful definition of who trusts whom for what). The dominant federation protocols are SAML, OAuth 2.0, and OIDC, which we examine in Section 5.
The benefits of SSO are real, but the model has one structural property worth naming: SSO concentrates risk. A successful attack on the IdP — credential stuffing the admin account, exfiltrating signing keys, exploiting a federation bug — gives the attacker access to every downstream application. SSO is not a reason to weaken IdP security; it is a reason to invest disproportionately in it.
Concentration as a Design Choice
Notice that SSO is a deliberate concentration of trust. We exchange "many weak doors" for "one strong door." That is a defensible choice when the strong door is genuinely strong and continuously monitored. It is a catastrophic choice when the IdP is treated as a commodity service. The question is not "should we do SSO?" but "is the IdP getting the security investment a single point of trust deserves?"
4. Authentication Mechanisms¶
Authentication mechanisms answer the question how does the verifier convince itself that the claimant is who they say they are? The classic taxonomy lists three authentication factors:
- Something you know — a password, PIN, or answer to a security question.
- Something you have — a hardware token, a phone, a smart card, a cryptographic key on a device.
- Something you are — a biometric: fingerprint, face, voice, iris.
A fourth category — something you do (typing rhythm, gait) — is sometimes added under behavioral biometrics. Multi-factor authentication (MFA) requires evidence from two or more different categories. Two passwords are not MFA. A password plus a hardware key is.
4.1 Password Authentication¶
Password authentication is the oldest and still most pervasive mechanism. The user supplies a secret string; the verifier checks it against a stored value. The verifier should never store the password itself. Instead, it stores a salted, slow hash (using a memory-hard function such as Argon2id, scrypt, or bcrypt — never plain SHA-256, and never MD5). On login, the verifier hashes the submitted password with the stored salt and compares. This pattern is covered in Chapter 3 and Chapter 5; here we focus on the policy and human-factor questions.
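The store-and-verify pattern can be sketched with the standard library's scrypt, a memory-hard KDF. This is an illustrative sketch: production systems typically prefer Argon2id via a maintained library, and the cost parameters here are examples, not a tuned recommendation.

```python
import hashlib
import hmac
import secrets

# Sketch of salted, memory-hard password storage using stdlib scrypt.
# Parameters (n, r, p) are illustrative; tune them for your hardware.
def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = secrets.token_bytes(16)                        # unique random salt per user
    digest = hashlib.scrypt(password.encode(), salt=salt,
                            n=2**14, r=8, p=1)            # memory-hard: ~16 MiB per hash
    return salt, digest

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, stored)         # constant-time comparison
```

Note the two details attackers rely on defenders getting wrong: the salt must be unique per user (so identical passwords hash differently), and the comparison must be constant-time (so response timing leaks nothing).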
A password policy is the set of rules the system enforces about what passwords are acceptable: length, character composition, rotation frequency, history (no reuse of recent passwords), and account-lockout behavior on failed attempts. For decades, conventional wisdom was that strong policies meant complex composition rules ("must include uppercase, lowercase, digit, symbol") and frequent rotation ("change every 90 days"). Modern guidance, codified in NIST SP 800-63B, has reversed much of this.
Before we examine a comparison table, let us define the modern recommendations we will compare against. NIST SP 800-63B (Revision 3, 2017, with subsequent updates) recommends: a minimum of 8 characters and a maximum of at least 64, no composition rules, no scheduled rotation, screening against a list of known-breached passwords, and rate-limiting failed attempts rather than locking accounts.
| Policy element | Old (NIST 800-63 pre-2017) | Modern (NIST 800-63B) | Why the change |
|---|---|---|---|
| Minimum length | 8 characters | 8+ (encourage 15+) | Length is the dominant strength factor |
| Composition rules | Require upper/lower/digit/symbol | None; allow any printable characters | Complexity rules push users to predictable patterns (Passw0rd!) |
| Rotation | Every 60–90 days | Only on evidence of compromise | Forced rotation produces weaker, transformation-based passwords |
| Hint / security questions | Allowed | Disallowed | Easily guessable from public information |
| Breached-password screening | Not required | Required | Stops credential stuffing from public dumps |
| Account lockout | Hard lockout after N tries | Rate-limit; optionally CAPTCHA | Hard lockout is a denial-of-service vector |
The deeper lesson is that password policies should be optimized for resistance to the attacks that actually happen — credential stuffing from breach dumps and online guessing — not for theoretical entropy. A 16-character passphrase the user can remember beats an 8-character random string the user writes on a sticky note.
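A policy in the modern style is short enough to sketch directly. The breached-password set below is a tiny illustrative stand-in for a real breach corpus (in practice you would query a k-anonymity service rather than ship a list); the function and its limits are hypothetical, not from NIST's text.

```python
# Hypothetical NIST-800-63B-style acceptability check:
# length bounds plus breach screening, and deliberately NO composition rules.
BREACHED = {"password", "123456", "qwerty", "passw0rd!"}  # stand-in for a breach corpus

def acceptable(password: str, min_len: int = 8, max_len: int = 64) -> tuple[bool, str]:
    if len(password) < min_len:
        return False, "too short"
    if len(password) > max_len:
        return False, "too long"
    if password.lower() in BREACHED:
        return False, "found in breach corpus"
    return True, "ok"       # note: spaces and passphrases are welcome
```

Everything the old policies obsessed over (symbols, digits, rotation schedules) is absent, and the two checks that actually stop real attacks (length, breach screening) are present.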
MFA Fatigue Is The Modern Footgun
Push-notification MFA looks safe but isn't: an attacker who already has the password can spam approval prompts until a tired user taps "yes." The structural fix is number matching (the user must type a code shown on the login screen) or moving to phishing-resistant MFA (FIDO2/passkeys), where there is no prompt to mistakenly approve.
4.2 Multi-Factor Authentication¶
Multi-factor authentication raises the bar by requiring evidence from at least two factor categories. The factors should be independent: an attacker who phishes the password should not, in the same step, also get the second factor. The MFA mechanisms in widespread use today are not equally strong.
The main MFA mechanisms, in roughly increasing order of phishing resistance, are:
- SMS one-time codes — a six-digit code texted to the user's phone. Easy to use, phishable (the user types the code into an attacker's fake site), and vulnerable to SIM-swap attacks where the attacker convinces the carrier to port the number to their device.
- TOTP authenticator apps (Google Authenticator, Authy, 1Password) — a six-digit code generated on the device from a shared secret and the current time, per RFC 6238. Not susceptible to SIM swap, but still phishable.
- Push-notification MFA (Duo, Microsoft Authenticator) — the IdP sends an approval prompt to the user's enrolled device. The user taps yes/no. Vulnerable to MFA fatigue (also called prompt bombing) where attackers spam prompts until the user taps yes; mitigated by number matching.
- FIDO2 / WebAuthn — a hardware key or device-bound key signs a challenge that includes the relying party's origin. The signature only validates against the legitimate site, so a phishing site that proxies the challenge cannot get a usable signature. This is phishing-resistant MFA.
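The TOTP mechanism from the list above fits in a few lines, which is worth seeing because it makes the phishability obvious: the output is just a short code derived from a shared secret and the clock, and nothing in it binds the code to the site the user types it into. This is a sketch of the RFC 6238 algorithm (HMAC over a time-step counter, with RFC 4226 dynamic truncation).

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HMAC-SHA1 over the current time-step counter."""
    counter = int((at if at is not None else time.time()) // step)
    msg = struct.pack(">Q", counter)                   # 8-byte big-endian counter (RFC 4226)
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                         # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)

# RFC 6238 Appendix B test vector: secret "1234...7890", T=59s -> "94287082"
assert totp(b"12345678901234567890", at=59, digits=8) == "94287082"
```

Contrast this with FIDO2 in the next section: there, the signed message includes the site's origin, so the credential cannot be relayed to a lookalike domain.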
4.3 Biometric Authentication¶
Biometric authentication uses something you are: a fingerprint, face geometry, voice, iris, or behavioral pattern. Biometrics are convenient and increasingly accurate, but they have properties no other factor shares. Biometrics cannot be reset — if your fingerprint hash is leaked from a database, you cannot pick a new fingerprint. They are not perfectly secret — your face and fingerprints are on every surface you touch and every photo you appear in. And biometric systems have false-accept and false-reject rates that the operator must tune; the right tradeoff depends on the threat model.
Modern biometric implementations (Apple Face ID, Windows Hello, Android fingerprint) avoid most of the historical pitfalls by keeping the biometric template inside a secure hardware element on the user's device and using it only as a local unlock for a cryptographic key — the biometric itself never leaves the device, and the server never sees it. This pattern is the foundation of FIDO2 and passkeys.
4.4 FIDO2 and Passkeys¶
FIDO2 is the family of standards (W3C WebAuthn for the browser API, FIDO CTAP for the device protocol) that defines public-key authentication for the web. The flow is straightforward: when a user registers with a site, the user's device generates a fresh public/private key pair, sends the public key to the site, and stores the private key in a hardware-protected element (a USB security key like YubiKey, or the secure enclave of a phone or laptop). On future logins, the site sends a random challenge; the device signs the challenge (after a local user-presence test, often a fingerprint or PIN); the site verifies the signature against the stored public key.
A passkey is a FIDO2 credential that is synced across the user's devices by the platform's password manager (iCloud Keychain, Google Password Manager, 1Password, etc.) so the user can log in from any of their devices without re-enrolling. Passkeys are the consumer-friendly packaging of FIDO2 — same cryptography, with credential portability built in.
Two properties make FIDO2/passkeys phishing-resistant. First, the signed challenge includes the relying party's origin (the actual domain of the site the user is on), so a signature obtained on attacker.example is useless against bank.example. Second, the private key never leaves the device, so an attacker who compromises the server only obtains public keys, which are useless for impersonation.
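The first property — origin binding — can be demonstrated in miniature. The sketch below uses HMAC purely as a stand-in for the authenticator's asymmetric signature (real FIDO2 uses public-key cryptography, and the server never holds the device key); the point it illustrates is that because the signed message includes the origin the browser reports, a signature harvested on a phishing domain never validates at the real site.

```python
import hashlib
import hmac
import secrets

# Illustrative stand-in: HMAC plays the role of the device's signature.
def device_sign(device_key: bytes, challenge: bytes, origin: str) -> bytes:
    # The signed message binds the challenge to the browser-reported origin.
    return hmac.new(device_key, challenge + origin.encode(), hashlib.sha256).digest()

def server_verify(device_key: bytes, challenge: bytes,
                  expected_origin: str, sig: bytes) -> bool:
    expected = hmac.new(device_key, challenge + expected_origin.encode(),
                        hashlib.sha256).digest()
    return hmac.compare_digest(expected, sig)

device_key = secrets.token_bytes(32)
challenge = secrets.token_bytes(32)

# Legitimate login: the browser reports the real origin.
good = device_sign(device_key, challenge, "https://bank.example")
assert server_verify(device_key, challenge, "https://bank.example", good)

# Phishing proxy: the browser reports the attacker's origin, so the
# resulting signature is useless against the real site.
bad = device_sign(device_key, challenge, "https://attacker.example")
assert not server_verify(device_key, challenge, "https://bank.example", bad)
```

Note what is absent: no code for the user to copy, and no prompt to approve — the two channels that SMS, TOTP, and push MFA leave open to attack.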
FIDO2's Mental Model Takes A Few Passes
The FIDO2 flow can feel abstract on first read — keys generated where, signed by what, verified against which origin. Run through the registration and login steps once on a real site (most major providers support it now), and the protocol clicks. The payoff is the strongest authentication primitive currently deployed at consumer scale.
Diagram: Authentication Mechanism Strength Comparison¶
Interactive infographic comparing authentication mechanisms across phishing resistance, usability, and recoverability
Type: interactive-infographic
sim-id: auth-mechanism-comparison
Library: p5.js
Status: Specified
A horizontal bar-chart-style infographic with seven authentication mechanisms (rows): Password only, Password + SMS, Password + TOTP, Password + Push MFA, Password + Push MFA with number match, FIDO2 hardware key, Passkey (synced).
For each row, three colored bars (segments) show: phishing resistance (red→green), usability (red→green), and account-recovery cost (green→red, where green means easy recovery and red means hard).
Hovering on a row reveals a tooltip with a one-paragraph summary of the mechanism's strengths, weaknesses, and a representative deployment story (e.g., "TOTP defeats SMS-swap attacks but is still phishable — the user types the code into whatever site asks").
Controls: - Toggle: "Threat model" — switches between opportunistic credential stuffing and targeted spear-phishing views; bar values update. - Toggle: "Show recommended for: consumers / enterprise / high-value accounts."
Learning objective (Bloom level: Evaluating): Students judge the appropriate authentication mechanism for a given threat model and user population, and justify the choice with criteria from the diagram.
Canvas: 800×500 default, responsive. Color palette uses cybersecurity blue (#1565c0) for primary, slate (#455a64) for borders, traffic-light red/yellow/green for the strength bars.
Implementation: p5.js with hover-detection on row regions; data driven by a small JS object so values can be updated without redrawing logic. updateCanvasSize() called first in setup().
5. Federation Protocols: SAML, OAuth 2.0, and OIDC¶
Federation runs on three protocols you will see constantly: SAML, OAuth 2.0, and OIDC. They solve different (overlapping) problems, and confusing them is one of the most common sources of broken IAM designs.
SAML (Security Assertion Markup Language, currently version 2.0) is an XML-based federation protocol that originated in the early 2000s. SAML's primary use case is enterprise web SSO: a user authenticates to their organization's IdP, and the IdP issues a signed XML assertion that the SP verifies. SAML is verbose, hard to debug, and its XML signature handling has been a source of historical vulnerabilities (XML Signature Wrapping, XML External Entity attacks). It is, however, deeply entrenched in enterprise software and remains the default for many B2B integrations.
OAuth 2.0 (RFC 6749 and friends) is an authorization protocol, not an authentication protocol. OAuth 2.0 lets a user grant a third-party application limited access to a resource on their behalf — Spotify wants to read your Google Calendar; do you allow it? — by issuing the third party a short-lived access token scoped to specific permissions. OAuth 2.0 deliberately does not specify how the user's identity is conveyed to the application; it specifies how delegated authorization is conveyed.
OIDC (OpenID Connect) is the identity layer built on top of OAuth 2.0. OIDC adds an ID token — a signed JWT containing claims about the authenticated user (subject, email, issuance time, audience) — to the standard OAuth 2.0 flow, turning OAuth into an authentication protocol. When you click Sign in with Google on a third-party site, you are using OIDC. OIDC is the modern default for new federated authentication designs because it is both well-specified and operationally simpler than SAML.
Before we examine the comparison table, let us anchor each protocol to a typical use case: SAML for enterprise SSO into legacy SaaS apps; OAuth 2.0 for "let this app access my Calendar without giving it my Google password"; OIDC for "let this app log me in using my Google account."
| Protocol | Year | Encoding | Purpose | Common use today |
|---|---|---|---|---|
| SAML 2.0 | 2005 | XML | Authentication + attribute exchange | Enterprise SSO into SaaS apps |
| OAuth 2.0 | 2012 | JSON | Delegated authorization | API access by third-party apps |
| OIDC | 2014 | JSON (JWT) | Authentication on top of OAuth 2.0 | Consumer "Sign in with X"; new SSO designs |
Notice that OAuth 2.0 alone does not tell the application who the user is — only what the application is permitted to do. Many real breaches have come from developers using an OAuth access token as if it were proof of identity. OIDC fixes that by adding the ID token. This distinction is one of the most important things to get right in a modern authentication design.
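The distinction is easiest to see side by side. The claim sets below are illustrative, not from any specific provider, and the check deliberately ignores signatures (a real verifier must first validate the JWT signature against the IdP's published keys).

```python
# Why an OAuth access token is not proof of identity: it says what is
# ALLOWED, while an OIDC ID token says WHO the user is and for WHICH client.
access_token_claims = {"scope": "calendar.read", "exp": 1999999999}
id_token_claims = {"iss": "https://accounts.example", "sub": "user-123",
                   "aud": "my-client-id", "exp": 1999999999}

def is_identity_proof(claims: dict, my_client_id: str) -> bool:
    # An ID token names a subject AND is addressed to this specific client.
    return "sub" in claims and claims.get("aud") == my_client_id

assert not is_identity_proof(access_token_claims, "my-client-id")
assert is_identity_proof(id_token_claims, "my-client-id")
```

The audience check matters as much as the subject: accepting an ID token minted for a different client re-opens the confused-deputy hole OIDC was designed to close.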
6. Social Engineering¶
Defensive engineering can produce a hardened technical stack and still lose to a five-minute phone call. Social engineering is the discipline (from the attacker's perspective) of manipulating people into taking actions or disclosing information that compromise security. Attackers exploit cognitive shortcuts: authority bias (the caller claims to be IT), urgency (the deadline is in 10 minutes), reciprocity (a small favor first), and the natural human desire to be helpful.
The major variants of social engineering you must recognize by name are:
- Phishing — bulk email or messaging that impersonates a trusted entity and lures the recipient to click a link, open an attachment, or enter credentials on a fake site. Bulk phishing trades depth for breadth: the attacker sends millions of messages, accepting a low click-through rate.
- Spear phishing — targeted phishing aimed at a specific person or small group, using research from LinkedIn, company websites, and prior breaches to make the message highly plausible. A spear-phishing email might reference the target's actual project, manager, or recent travel.
- Pretexting — establishing a believable false context (the pretext) to extract information or action over multiple interactions, often by phone. The attacker might pose as an auditor, a help-desk technician, or a vendor's account manager and patiently build rapport before asking for the thing they came for.
- Baiting — leaving a tempting artifact in the target's path: a USB drive in a parking lot, a "free music" download, a too-good-to-miss attachment. Curiosity does the rest.
- Vishing (voice phishing) and smishing (SMS phishing) — the same playbook delivered by phone or text.
- Business Email Compromise (BEC) — a senior-executive impersonation pattern in which the attacker, often after compromising a real account, instructs accounts payable to wire funds to a new vendor or change a payroll deposit account. BEC is one of the most financially damaging attack categories in current FBI IC3 reporting.
Diagram: Phishing Email Anatomy¶
Annotated diagram of a typical spear-phishing email with hover tooltips on each indicator
Type: interactive-infographic
sim-id: phishing-email-anatomy
Library: p5.js
Status: Specified
A rendered mock-up of a phishing email shown in a stylized email-client window. The email impersonates a finance team request to review an invoice. Several hotspots are highlighted with subtle red glow:
- Sender display name — "Finance Department" but the actual address is `finance-dept@finance-securemail.co`. Tooltip explains display-name spoofing.
- Greeting — generic "Dear Colleague" instead of the recipient's name. Tooltip notes that bulk phishing skips personalization while spear phishing nails it.
- Urgency trigger — "Please review by end of business today to avoid late fee." Tooltip explains how urgency suppresses careful thinking.
- Hyperlink — visible text reads `https://intranet.example.com/invoices/1042` but on hover the actual URL is `https://intranet-example.invoice-portal.ru/login`. Tooltip explains the lookalike-domain pattern.
- Attachment — `Invoice_Q3.html` (an HTML file that opens a credential prompt). Tooltip explains why HTML attachments are a common credential-harvesting vector.
- Footer — corporate-looking signature copied from a real company website. Tooltip explains how attackers source these.
A toggle at the top switches between "Spot the indicators" mode (hotspots invisible, user must click to find them; score shown) and "Annotated" mode (all hotspots and explanations visible).
Learning objective (Bloom level: Analyzing): Students decompose a realistic spear-phishing message into its component manipulation techniques and identify the specific signals that should trigger suspicion.
Canvas: 900×600 default, responsive. Soft drop-shadow window styling; cream background for the email body.
Implementation: p5.js with rectangular hit-region detection and a small data file mapping hotspot coordinates to explanation text.
7. Usable Security and Awareness Training¶
If users keep clicking the phishing links, the question is not "how do we punish the users?" but "why is the secure path so much harder than the insecure path?" Usable security is the engineering discipline of designing security controls so that the easy thing to do is also the secure thing to do. The phrase comes from a foundational paper by Whitten and Tygar — "Why Johnny Can't Encrypt" — that documented users' inability to use PGP correctly even with motivation and training. The conclusion was not that users are stupid; it was that PGP's interface forced too many security-critical decisions on people without the context to make them.
Usable security is not a soft topic. It is responsible for some of the largest measurable improvements in real-world security outcomes:
- Browser address-bar redesign with prominent HTTPS / lock-icon indicators reduced the success rate of basic credential phishing.
- Password managers that auto-fill only on the legitimate domain reduce reuse and break a class of credential-phishing attacks structurally.
- Phishing-resistant MFA (FIDO2) eliminates the user-as-decision-maker for "is this site real?" by tying signatures to origin in the protocol.
- Default-deny patterns (default-private repositories, default-internal calendar visibility) prevent the most common configuration mistakes by making the safe choice the path of least resistance.
Security awareness training is the organizational practice of teaching users to recognize and resist social-engineering attacks. Modern programs are short (5–10 minutes), frequent (monthly), interactive, and paired with simulated phishing campaigns where employees who click receive immediate just-in-time coaching. Training works best when it is calibrated to the actual phishing patterns the organization sees, treats employees as intelligent adults, and is paired with a no-blame, fast-reporting culture — the goal is for an employee who clicks a suspicious link to feel safe reporting it within minutes, not to feel shamed into hiding it.
Train For Reporting, Not Just Recognition
The single most useful behavior you can train for is fast, blameless reporting. The first user who clicks the phishing link is rarely the only one who got the email; if they report within five minutes, you can pull the message from every other inbox. If they hide it for fear of being blamed, you find out from the SOC three days later. Design the report button to be one click and the response to be a thank-you.
8. Insider Threat and Privilege Lifecycle¶
Not all attackers are external. An insider threat is a security risk arising from someone with legitimate access — an employee, contractor, business partner, or anyone whose credentials work by design. Insider threats fall into three rough categories: the malicious insider (deliberate theft or sabotage), the negligent insider (lost laptops, mishandled data), and the compromised insider (a real employee whose account is now in the attacker's hands). The technical controls for the three categories overlap heavily, even though the human dynamics differ.
The pervasive structural problem is that legitimate access tends to grow. Privilege creep, introduced earlier, is the slow accumulation of unneeded entitlements over a career. The combination of long-tenured employees and broad standing access means that any single account compromise can have an enormous blast radius.
Two structural controls reduce this. The first is periodic access review and attestation — every quarter (or more often for sensitive systems), the IAM system produces a list of each user's entitlements, and managers must explicitly confirm each one is still required. Anything not affirmed is removed. The second is just-in-time access — instead of granting standing access to a sensitive resource, the user requests access for a specific task and a specific time window, the request is approved (sometimes automatically, sometimes by a human approver), and the access is automatically revoked when the window expires.
Just-in-time access is the structural answer to "what is the minimum standing privilege everyone needs?" The answer should be as little as possible, with the rest granted on demand. This is least privilege expressed in the time dimension — and it dramatically shrinks the blast radius of any compromise, because most accounts at most times have only the entitlements needed for routine work.
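The mechanics of just-in-time access are simple enough to sketch: an entitlement is a record with an expiry, and revocation is automatic because expired grants are simply ignored at check time. The class and field names below are illustrative, not from any particular IAM product.

```python
import time
from dataclasses import dataclass

@dataclass
class Grant:
    user: str
    resource: str
    expires_at: float           # grant exists only until this moment

class JITAccess:
    """Minimal sketch: entitlements exist only for an approved time window."""

    def __init__(self):
        self._grants: list[Grant] = []

    def approve(self, user: str, resource: str, duration_s: float) -> None:
        # In practice this step sits behind a request/approval workflow.
        self._grants.append(Grant(user, resource, time.time() + duration_s))

    def is_allowed(self, user: str, resource: str, now=None) -> bool:
        now = now if now is not None else time.time()
        # Expired grants are ignored -- revocation requires no cleanup job.
        return any(g.user == user and g.resource == resource and g.expires_at > now
                   for g in self._grants)
```

Contrast this with standing access, where revocation is a manual act that someone must remember to perform; here, forgetting is safe by construction.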
User behavior analytics (UBA), sometimes called UEBA when it includes entities (servers, service accounts), is a complementary detective control. UBA systems build behavioral baselines for each user — typical login hours, typical resources accessed, typical data volumes — and surface deviations. A finance user who suddenly downloads the entire customer database at 2 AM from a country they have never logged in from looks anomalous against their own baseline, even if every individual action is technically permitted. UBA is most useful for catching the compromised-insider case, where the attacker is using legitimate credentials but with abnormal patterns.
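The core of a UBA signal is a per-user baseline and a deviation measure. Real products use far richer models, so the z-score sketch below is only a minimal illustration of the idea that an action is judged against the user's own history, not against a global rule.

```python
import statistics

def anomaly_score(baseline: list[float], observed: float) -> float:
    """Hypothetical sketch: standard deviations the observed value (say,
    daily download volume in GB) sits above the user's own baseline."""
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline) or 1.0   # guard against a flat baseline
    return (observed - mean) / stdev

# A finance user who normally moves ~11 GB/day suddenly moves 500 GB:
daily_gb = [10, 12, 11, 10, 12]
assert anomaly_score(daily_gb, 500) > 3          # far outside their own baseline
```

Every individual download in the 500 GB may be individually permitted; it is the deviation from the baseline, not any single action, that raises the alert.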
9. Privacy Engineering¶
Security and privacy overlap heavily but are not identical. Security asks who can access this data? Privacy asks should we have collected this data at all, and do the people it concerns understand and consent to how it is used? Privacy engineering is the discipline of building systems that handle personal data in ways that respect those questions, by default and at every stage of the data lifecycle.
The foundational stance is privacy by design, articulated by Ann Cavoukian and now codified in regulations such as the EU GDPR. Privacy by design holds that privacy protections should be built into systems from the earliest design phase rather than bolted on after launch, that privacy should be the default setting, and that the system should work for the user without forcing them to actively defend their own privacy.
The most powerful single practice within privacy by design is data minimization. Data minimization means collecting and retaining only the data that is genuinely required for the system's stated purpose, and only for as long as it is needed. The logic is plain: data you do not collect cannot be breached, cannot be subpoenaed, cannot be misused by a future feature, and cannot be fed into a downstream analytics pipeline you have not yet built. Every field on a registration form, every log line written to disk, every metric exported to a third-party analytics service is a privacy decision.
Privacy engineering, like usable security, is most effective when it is structural. Some examples:
- Ask only for the data the feature requires (no shipping address on a digital-only purchase).
- Default to short retention windows; require an explicit business justification for longer.
- Hash, salt, or tokenize identifiers used for analytics so they cannot be linked back to the individual.
- Architect data access through APIs that return only the fields the caller is authorized to see, rather than handing out full records.
- Prefer client-side processing over server-side when feasible (the data never leaves the user's device).
- Distinguish between purpose limitation (data collected for X is not silently reused for Y) and consent (the user agreed to X in the first place); both are required.
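The tokenization practice in the list above is worth making concrete. A sketch using a keyed hash (HMAC) follows; the function name and key handling are illustrative, not a prescribed API. The keyed construction matters because a plain unsalted hash of a low-entropy identifier (an email, a small integer ID) can be reversed by simply hashing all candidates:

```python
import hashlib
import hmac

def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Keyed tokenization for analytics: the same user maps to a
    stable token (so joins and counts still work), but without the
    key the token cannot be brute-forced back to the raw ID."""
    return hmac.new(secret_key, user_id.encode(), hashlib.sha256).hexdigest()

# Hypothetical key; in practice it lives in a secrets manager and rotates.
key = b"example-key-not-for-production"
token = pseudonymize("user-12345", key)
# Rotating the key unlinks new tokens from old ones, which also gives
# you a built-in mechanism for honoring deletion and retention limits.
```

The stable-but-unlinkable property is the point: analytics pipelines get consistent identifiers, while the mapping back to a person is gated on a secret the analytics tier never sees.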
Diagram: Privacy Engineering Decision Tree¶
Decision flow for evaluating whether to collect, retain, or share a piece of personal data
Type: workflow-diagram
sim-id: privacy-decision-tree
Library: Mermaid
Status: Specified
A top-down decision tree that an engineer or privacy reviewer walks through for each proposed data field.
Root: "Do we need this data for a specific stated purpose?"
- No → "Do not collect."
- Yes → "Can we satisfy the purpose with less identifying data (aggregated, hashed, derived)?"
- Yes → "Use the less-identifying form."
- No → "What is the minimum retention period to satisfy the purpose?"
- Define retention window → "Is the data shared with third parties?"
- No → "Document purpose, retention, and access controls. Approve."
- Yes → "Is there a data-processing agreement and a lawful basis for sharing?"
- No → "Stop. Do not share."
- Yes → "Document third-party scope, purpose, and DPA. Approve with annual review."
Each leaf node has a small annotation linking to the relevant GDPR/CCPA principle (data minimization, purpose limitation, lawful basis, retention limits).
Color: cybersecurity blue for decision nodes, slate for outcome leaves, alert orange for "Do not collect" / "Stop" leaves. Responsive: collapses to a sequential checklist on narrow viewports.
Implementation: Mermaid graph TD with custom node classes.
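The same decision flow can be expressed as a short function, useful if the review is automated as part of a schema-change checklist. This is a toy encoding of the tree above; the parameter names are hypothetical, and real reviews involve judgment (and lawyers), not booleans:

```python
def review_data_field(
    needed_for_stated_purpose: bool,
    less_identifying_form_suffices: bool,
    shared_with_third_party: bool,
    has_dpa_and_lawful_basis: bool,
) -> str:
    """Walk the privacy decision tree for one proposed data field.
    Return values mirror the diagram's outcome leaves."""
    if not needed_for_stated_purpose:
        return "do-not-collect"
    if less_identifying_form_suffices:
        return "use-less-identifying-form"
    # At this point a minimum retention window is defined, then sharing
    # is evaluated.
    if not shared_with_third_party:
        return "approve-with-documentation"
    if not has_dpa_and_lawful_basis:
        return "stop-do-not-share"
    return "approve-with-annual-review"
```

Note the ordering: "do not collect" is evaluated first, so every later control applies only to data that has already cleared the purpose test. That ordering is the data-minimization principle made executable.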
The convergence of privacy and security engineering is one of the major shifts in the field over the last decade. Regulations such as GDPR (2018), CCPA (2020), and the growing patchwork of state laws have made privacy a board-level concern with material penalties. The good news is that the technical controls — strong access management, least privilege, encryption in transit and at rest, careful logging — are largely the same controls security teams already build. The discipline that privacy engineering adds is the relentless habit of asking "do we need this data at all?" before any of the other controls become relevant.
What You Can Now Do
You can now reason about the human layer as a designed system: identity lifecycles, authentication mechanisms ranked by phishing resistance, federation protocols and which one to reach for, the social-engineering playbook adversaries actually use, usable security and privacy engineering as structural answers rather than training answers, and the lifecycle controls that shrink blast radius. That is the foundation on which we will build the organizational-security chapter next.
10. Key Takeaways¶
The themes of this chapter compose into a small set of operational principles you can carry into design reviews and incident analyses:
- Identity is infrastructure. The IAM system is the most security-critical service in the organization. Treat it accordingly.
- Concentrate trust deliberately. SSO and federation concentrate risk in the IdP; that is the right tradeoff if and only if the IdP gets the security investment it deserves.
- Authentication strength is a spectrum, not a checkbox. Password + SMS is not the same as password + FIDO2. Match the mechanism to the threat model and the value of the asset.
- Phishing-resistant MFA is the structural answer. FIDO2 / passkeys remove the user from the "is this site real?" decision by tying signatures to origin in the protocol.
- Social engineering exploits cognition, not technology. The defense is structural — usable security, default-deny, fast blameless reporting — not just training.
- Privilege creeps; design for it. Just-in-time access, periodic attestation, and UBA shrink blast radius even when accounts are compromised.
- Privacy by design is data minimization first. Data you never collected cannot be breached, leaked, or subpoenaed.
In the next chapter we move from the human layer to the organizational layer: governance, risk, compliance, and the management systems that put the controls of the last twelve chapters into a defensible program.