Chapter 10: Learning and Conditioning
Summary
This chapter surveys the major mechanisms by which organisms learn from experience. Students begin with classical conditioning: Pavlov's original procedure, the key elements (UCS, UCR, CS, CR), and extensions to emotional learning and taste aversion. Operant conditioning follows: Thorndike's law of effect and Skinner's extension of it, positive and negative reinforcement, positive and negative punishment, shaping, and schedules of reinforcement. The chapter then covers social learning theory (Bandura's observational learning and Bobo doll study), cognitive maps and latent learning, self-efficacy, learned helplessness, and the fixed vs. growth mindset. Throughout, neuroplasticity provides the biological foundation, and the interplay between conditioning principles and cognition updates the purely behaviorist picture.
Concepts Covered
This chapter covers the following 28 concepts from the learning graph:
- Classical Conditioning
- Unconditioned Stimulus
- Operant Conditioning
- Neuroplasticity
- Taste Aversion
- Conditioned Stimulus
- Positive Reinforcement
- Negative Reinforcement
- Positive Punishment
- Negative Punishment
- Social Learning Theory
- Conditioned Response
- Acquisition
- Shaping Behavior
- Schedules of Reinforcement
- Positive and Negative Reinforcement Contrast
- Learned Helplessness
- Observational Learning
- Cognitive Maps
- Conditioned Emotional Response
- Extinction
- Stimulus Generalization
- Bobo Doll Study
- Self-Efficacy
- Latent Learning
- Spontaneous Recovery
- Stimulus Discrimination
- Fixed vs. Growth Mindset
Prerequisites
This chapter builds on concepts from:
- Chapter 1: Foundations of Psychology and Research Methods
- Chapter 2: Biological Bases of Behavior
- Chapter 9: Development: Adolescence Through Adulthood
10.1 Classical Conditioning
Welcome to Chapter 10, where we learn how learning works!
Here's something remarkable: every time you feel your stomach drop when you hear a specific ringtone associated with a stressful call, every time the smell of sunscreen sends you back to a childhood beach, and every time your pulse quickens before a big exam, that's learning at work. You've been conditioned, in the technical sense of the word.
This chapter covers the three major frameworks psychologists use to explain learning: classical conditioning, operant conditioning, and social learning. By the end, you'll understand not just how Pavlov trained dogs to salivate on cue, but how the same principles shape fear, reward-seeking, addiction, and even belief in your own potential.
Let's think about that!
Learning, in psychology, is defined as a relatively permanent change in behavior or knowledge that results from experience. This definition distinguishes learning from temporary states (hunger, fatigue) and from genetic maturation (puberty, walking). The three major traditions for studying learning each emphasize different mechanisms.
Classical conditioning, first systematically studied by Ivan Pavlov, is a learning process in which a neutral stimulus comes to elicit a response after it is repeatedly paired with a stimulus that already elicits that response.
Pavlov's Fundamental Procedure
Pavlov was studying digestive physiology in dogs when he noticed that his dogs began salivating not just at the sight of food but at the sight of the laboratory assistant who brought the food, a "psychic secretion" he eventually made the subject of systematic study.
The classical conditioning procedure involves four elements:
| Term | Abbreviation | Definition | Pavlov's Example |
|---|---|---|---|
| Unconditioned Stimulus | UCS | A stimulus that naturally and automatically elicits a response without prior learning | Food |
| Unconditioned Response | UCR | The unlearned, automatic response to the UCS | Salivation to food |
| Conditioned Stimulus | CS | A neutral stimulus that, after pairing with the UCS, comes to elicit a response | Bell |
| Conditioned Response | CR | The learned response to the CS (usually similar to, but not identical to, the UCR) | Salivation to bell |
Acquisition is the initial learning phase in which the CS and UCS are repeatedly paired and the conditioned response gradually strengthens. The most effective pairing has the CS appearing just before the UCS (forward conditioning), with a short time interval between them.
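The gradual strengthening described above can be illustrated with a toy simulation. This is only a sketch: it assumes a simple error-correction update (in the spirit of the Rescorla-Wagner model, which this chapter does not cover), and the function name, learning rate, and maximum strength are illustrative choices rather than values from any study.

```python
def simulate_acquisition(n_trials, learning_rate=0.3, max_strength=1.0):
    """Return the CS's associative strength after each CS-UCS pairing.

    Assumption (not from the chapter): every pairing closes a fixed
    fraction of the gap between the current strength and the maximum.
    """
    strength = 0.0
    history = []
    for _ in range(n_trials):
        strength += learning_rate * (max_strength - strength)
        history.append(strength)
    return history


for trial, value in enumerate(simulate_acquisition(8), start=1):
    print(f"pairing {trial}: strength = {value:.2f}")
```

Under these assumptions the curve rises quickly at first and then levels off, which is the typical shape of an acquisition curve.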
Stimulus Generalization and Discrimination
Once a CS has been conditioned, stimulus generalization occurs when stimuli similar to the original CS also elicit the conditioned response. If a dog is conditioned to salivate at a 1,000 Hz tone, it will also salivate to tones of 900 Hz or 1,100 Hz, just less strongly as the similarity decreases. This gradient of responding reflects the organism's limited ability to precisely distinguish similar stimuli.
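A generalization gradient like the one in the tone example can be sketched numerically. The Gaussian fall-off and the `width_hz` parameter below are assumptions chosen for illustration; real gradients are measured empirically and need not have this shape.

```python
import math


def generalized_response(test_hz, trained_hz=1000.0, peak=1.0, width_hz=100.0):
    """Toy generalization gradient around the trained CS frequency.

    Assumption: response strength falls off as a Gaussian of the
    distance (in Hz) between the test tone and the trained tone.
    """
    distance = test_hz - trained_hz
    return peak * math.exp(-(distance ** 2) / (2 * width_hz ** 2))


for hz in (800, 900, 1000, 1100, 1200):
    print(f"{hz} Hz -> relative response {generalized_response(hz):.2f}")
```

In this sketch the response is strongest at 1,000 Hz and weakens symmetrically as the test tone moves away from it.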
Stimulus discrimination is the complementary process in which the organism learns to respond to the specific CS but not to other similar stimuli. This is trained by continuing to pair the target CS with the UCS while presenting similar stimuli without the UCS. Discrimination training refines the conditioned response.
Extinction and Spontaneous Recovery
Extinction in classical conditioning occurs when the CS is repeatedly presented without the UCS, causing the conditioned response to decrease and eventually disappear. The dog hears the bell repeatedly with no food and gradually stops salivating. Extinction does not destroy the original learning; it overlays it with new inhibitory learning.
Evidence for this comes from spontaneous recovery: after extinction and a rest period, the extinguished CR reappears (often weaker than the original) when the CS is presented again. This shows the original CS-UCS association was not erased but suppressed.
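The idea that extinction overlays inhibition on intact learning, rather than erasing it, can be made concrete with a toy two-process sketch. Everything numeric here is an assumption for illustration: the update rule, the rates, and the notion that a rest period dissipates a fixed fraction of inhibition are not taken from the chapter.

```python
def run_phases(acq_trials=10, ext_trials=10, rate=0.3, rest_decay=0.5):
    """Toy two-process model of extinction and spontaneous recovery.

    Assumptions (illustrative only): the CR equals excitation minus
    inhibition; extinction builds inhibition and leaves excitation
    intact; a rest period dissipates part of the inhibition.
    """
    excitation, inhibition = 0.0, 0.0

    # Acquisition: CS paired with UCS, so excitation grows toward 1.0.
    for _ in range(acq_trials):
        excitation += rate * (1.0 - excitation)
    cr_after_acquisition = excitation - inhibition

    # Extinction: CS alone, so inhibition grows until it cancels excitation.
    for _ in range(ext_trials):
        inhibition += rate * (excitation - inhibition)
    cr_after_extinction = excitation - inhibition

    # Rest period: some inhibition fades, so the CR partly returns.
    inhibition *= (1.0 - rest_decay)
    cr_after_rest = excitation - inhibition

    return cr_after_acquisition, cr_after_extinction, cr_after_rest


acq, ext, rest = run_phases()
print(f"after acquisition: {acq:.2f}")
print(f"after extinction:  {ext:.2f}")
print(f"after rest:        {rest:.2f}")
```

With these settings the CR is strong after acquisition, near zero after extinction, and partly restored after rest, which mirrors the pattern of a recovered but weaker response.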
Psy's Note: Spontaneous recovery has clinical implications. A phobia that seems to be extinguished in therapy (through exposure) can "spontaneously recover" when the patient encounters the feared stimulus after a gap. This is why single rounds of exposure therapy are often not sufficient: repeated exposures across different contexts are needed to prevent return of fear.
Conditioned Emotional Responses
John B. Watson and Rosalie Rayner extended classical conditioning to human emotional reactions in the famous (and ethically troubling) "Little Albert" study. They conditioned a nine-month-old infant to fear a white rat by pairing it with a sudden loud noise (UCS for fear/startle). The infant eventually showed fear (CR) to the rat alone and, via generalization, to other white, fluffy stimuli.
A conditioned emotional response is an emotional reaction (fear, pleasure, disgust, nostalgia) that is acquired through classical conditioning. Emotional conditioning is often rapid, powerful, and long-lasting, particularly for fear responses. The amygdala (Chapter 2) is the neural locus of conditioned fear.
Taste Aversion
Taste aversion is a specialized form of classical conditioning in which an organism associates the taste of a food (CS) with illness (UCS) and subsequently avoids that food. What makes taste aversion distinctive is that it can occur after a single pairing (violating the usual requirement for repeated pairings) and even when the time interval between eating and illness is several hours (violating the usual contiguity requirement).
John Garcia and Robert Koelling's research showed that rats readily form taste-illness associations but do not readily form light/sound-illness associations, even with the same training procedure. Conversely, light and sound pair readily with shock, while taste does not. This biological preparedness suggests that evolution has pre-wired certain CS-UCS associations to be especially easy to learn because they were adaptive.
10.2 Operant Conditioning
Operant conditioning, systematically developed by B.F. Skinner and building on Thorndike's Law of Effect, is a learning process in which the frequency of a behavior is modified by its consequences. Unlike classical conditioning (which involves reflexive responses), operant conditioning applies to voluntary behavior.
The Law of Effect and Skinner's Extension
Edward Thorndike's Law of Effect states: behaviors followed by satisfying consequences are strengthened (stamped in); behaviors followed by unsatisfying consequences are weakened (stamped out). Skinner operationalized this in the Skinner box, an operant chamber in which an animal (rat or pigeon) could perform a measurable behavior (pressing a lever, pecking a key) to receive a consequence (food pellet or shock).
Reinforcement and Punishment
Operant consequences are classified along two dimensions: whether they increase or decrease the behavior, and whether a stimulus is added or removed.
| | Stimulus Added (+) | Stimulus Removed (−) |
|---|---|---|
| Increases behavior | Positive Reinforcement | Negative Reinforcement |
| Decreases behavior | Positive Punishment | Negative Punishment |
Positive reinforcement: Adding a desirable stimulus following a behavior, increasing the likelihood of that behavior. A rat receives a food pellet when it presses a lever → lever pressing increases.
Negative reinforcement: Removing an aversive stimulus following a behavior, increasing the likelihood of that behavior. A rat escapes a mild shock by pressing a lever → lever pressing increases. Note: "negative" refers to removal, not to anything bad; negative reinforcement always increases behavior.
Positive punishment: Adding an aversive stimulus following a behavior, decreasing the likelihood of that behavior. Touching a hot stove → burn → stove-touching decreases.
Negative punishment (response cost): Removing a desirable stimulus following a behavior, decreasing the likelihood of that behavior. A teenager's phone is taken away for missing curfew → curfew-missing decreases.
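Because the four contingencies are just the combinations of two yes/no questions, the classification in the table above can be written as a small lookup. The function name and string labels below are hypothetical, chosen only for this sketch.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Name the operant contingency from the table's two dimensions.

    stimulus_change: "added" or "removed"
    behavior_change: "increases" or "decreases"
    """
    sign = {"added": "Positive", "removed": "Negative"}[stimulus_change]
    kind = {"increases": "Reinforcement", "decreases": "Punishment"}[behavior_change]
    return f"{sign} {kind}"


# A rat presses a lever to end a shock, and lever pressing goes up.
print(classify_consequence("removed", "increases"))  # Negative Reinforcement
```

Note that the first word depends only on whether a stimulus was added or removed, and the second only on what happens to the behavior, which is why negative reinforcement can never be a form of punishment.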
Diagram: Reinforcement vs. Punishment Overview
Explore: How do the four operant contingencies work?
The key to remembering these four types:
- Positive = Adding something
- Negative = Removing something
- Reinforcement = Behavior increases
- Punishment = Behavior decreases
So the four combinations:
- Positive Reinforcement (+R): Add something good → behavior increases
    - Example: Praise after a student answers a question → student participates more
    - The most powerful method for building new behaviors
- Negative Reinforcement (−R): Remove something bad → behavior increases
    - Example: Taking ibuprofen removes headache pain → taking ibuprofen increases
    - Often confused with punishment; remember it always increases behavior
- Positive Punishment (+P): Add something bad → behavior decreases
    - Example: A speeding ticket after driving too fast → speeding decreases
    - Effective but has side effects: fear, aggression, avoidance of the punisher
- Negative Punishment (−P): Remove something good → behavior decreases
    - Example: Taking away screen time when a child misbehaves → misbehavior decreases
    - Generally preferred over positive punishment due to fewer side effects
Critical distinction for the AP exam: Negative reinforcement is not punishment. It increases behavior by removing something aversive. Students who confuse this lose points consistently.
Shaping
Shaping behavior is the operant conditioning technique of reinforcing successive approximations toward a desired target behavior. Because organisms cannot be directly reinforced for a complex behavior they have never performed, shaping breaks the target into a sequence of increasingly accurate approximations and reinforces each step.
To train a rat to press a lever, a trainer might first reinforce any movement toward the lever, then movement to within one foot of it, then touching it, and finally pressing it with enough force. Each successive approximation replaces the last as the criterion for reinforcement. Shaping is the mechanism behind animal training, language acquisition support, rehabilitation therapy, and many educational interventions.
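The logic of successive approximations (reinforce anything that meets the current criterion, then raise the criterion) can be sketched as a loop. This is a toy model under loud assumptions: behavior is reduced to a single number from 0 to 1, variability is Gaussian, and a reinforced response simply becomes the new typical behavior. None of these values come from the chapter.

```python
import random


def shape(criteria, rng, max_attempts=10_000):
    """Toy shaping loop that reinforces successive approximations.

    Assumptions (illustrative only): behavior is a number from 0 to 1
    measuring closeness to the target, and each reinforced response
    becomes the new center of the animal's typical behavior.
    """
    typical = 0.1  # starting point: behavior far from the target
    log = []       # (criterion, attempts needed) for each step
    for criterion in criteria:  # each step demands a closer approximation
        for attempt in range(1, max_attempts + 1):
            behavior = min(1.0, max(0.0, rng.gauss(typical, 0.15)))
            if behavior >= criterion:
                typical = behavior  # reinforcement shifts typical behavior
                log.append((criterion, attempt))
                break
        else:
            raise RuntimeError(f"criterion {criterion} never met")
    return typical, log


final, steps = shape([0.25, 0.5, 0.75, 0.9], random.Random(0))
for criterion, attempts in steps:
    print(f"criterion {criterion}: reinforced after {attempts} attempt(s)")
```

The design point is that the final criterion would almost never be met from the starting behavior, yet each intermediate step is reachable from the one before it.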
Schedules of Reinforcement
In real environments, behaviors are rarely reinforced every single time they occur. Schedules of reinforcement describe the pattern that determines when a behavior is reinforced.
Continuous reinforcement (every occurrence rewarded) produces the fastest acquisition but also the fastest extinction.
Partial (intermittent) reinforcement schedules produce slower acquisition but much greater resistance to extinction. Because reinforcement was never delivered on every response, the organism cannot easily tell when it has stopped altogether, so it keeps responding longer before giving up.
Four basic partial schedules:
| Schedule | Reinforcement Is Delivered... | Response Pattern | Example |
|---|---|---|---|
| Fixed Ratio (FR) | After a fixed number of responses | High, steady rate; pause after each reinforcement | Getting paid per piece of work completed |
| Variable Ratio (VR) | After a variable number of responses (average) | Highest and most persistent rate; no post-reinforcement pause | Slot machines; social media likes |
| Fixed Interval (FI) | After a fixed time period | Low rate early, "scallop" shape: acceleration near interval end | Checking for mail that arrives at a fixed time |
| Variable Interval (VI) | After a variable time period (average) | Steady, moderate rate | Checking social media when posts appear at random times |
Variable ratio schedules produce the most behavior and are most resistant to extinction, which is why gambling can be so compelling and why intermittent social media reinforcement drives constant checking.
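The delivery rules in the table can be expressed as short functions that decide which responses earn reinforcement. This is a sketch of the rules only, not of how animals respond to them. The function names are hypothetical, and the way the variable schedules draw their requirements (uniformly around the mean) is an assumption made for simplicity.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Response numbers that earn reinforcement on an FR schedule."""
    return [r for r in range(1, n_responses + 1) if r % ratio == 0]


def variable_ratio(n_responses, mean_ratio, rng):
    """Response numbers reinforced on a VR schedule.

    Assumption: each requirement is drawn uniformly from
    1 .. 2 * mean_ratio - 1, so requirements average mean_ratio.
    """
    reinforced, count = [], 0
    needed = rng.randint(1, 2 * mean_ratio - 1)
    for r in range(1, n_responses + 1):
        count += 1
        if count == needed:
            reinforced.append(r)
            count, needed = 0, rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


def fixed_interval(response_times, interval):
    """Times of reinforced responses on an FI schedule: the first response
    at least `interval` time units after the last reinforcer pays off."""
    reinforced, last = [], 0.0
    for t in sorted(response_times):
        if t - last >= interval:
            reinforced.append(t)
            last = t
    return reinforced


def variable_interval(response_times, mean_interval, rng):
    """Like FI, but the required wait is redrawn after each reinforcer
    (assumption: uniform between 0 and 2 * mean_interval)."""
    reinforced, last = [], 0.0
    wait = rng.uniform(0, 2 * mean_interval)
    for t in sorted(response_times):
        if t - last >= wait:
            reinforced.append(t)
            last, wait = t, rng.uniform(0, 2 * mean_interval)
    return reinforced


print(fixed_ratio(20, 5))                          # [5, 10, 15, 20]
print(variable_ratio(20, 5, random.Random(1)))     # unpredictable spacing
print(fixed_interval([1, 5, 10, 11, 19, 20, 25, 31], 10))  # [10, 20, 31]
```

On the ratio schedules only the number of responses matters, while on the interval schedules extra responses before the interval has elapsed earn nothing.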
Psy's AP Exam Tip: Be able to identify each schedule from a description and explain why variable ratio schedules produce the highest, most persistent response rates. The slot machine analogy is the classic example, so use it on the exam.
10.3 Beyond Behaviorism: Cognitive Factors in Learning
Classical and operant conditioning were developed within the behaviorist tradition, which deliberately excluded unobservable mental states. By the mid-20th century, however, evidence accumulated that mental representations play an important role in learning.
Latent Learning
Edward Tolman's experiments with rats in mazes challenged strict behaviorism. In a key experiment, one group of rats ran a maze daily for a reward; another group ran the maze without reward for many days, then started receiving reward. The unrewarded rats had not been reinforced for any correct turns, yet when reward was introduced, they quickly matched the performance of the consistently-rewarded group. This demonstrated latent learning: learning that occurs without reinforcement and is not immediately expressed in behavior, only revealed when a reason to perform arises.
Cognitive Maps
Tolman used the term cognitive map to describe the internal mental representation of a spatial environment that the rats had apparently built during their unrewarded explorations. Cognitive maps are not limited to spatial layouts; they extend to mental representations of cause-effect relationships, social networks, and abstract conceptual spaces. The existence of cognitive maps implies that learning involves forming internal representations, not just S-R (stimulus-response) connections.
Learned Helplessness
Martin Seligman and Steven Maier's research with dogs illuminated a phenomenon with profound implications for depression. In the original experiments, dogs were exposed to inescapable electric shocks: shocks that continued regardless of any action the dog took. When these dogs were later placed in a different chamber where they could easily escape shocks by jumping over a barrier, most did not even try; they lay passively and accepted the shock.
Learned helplessness is the passive, unresponsive behavior that results from repeated experience with uncontrollable aversive events. The organism "learns" that its responses have no effect on outcomes and generalizes this perception of uncontrollability to new situations. Seligman drew an explicit parallel to human depression: people who experience repeated uncontrollable negative events (job loss, abusive relationships, chronic illness) may develop the cognitive style of attributing negative events to global, stable, internal causes ("I can't do anything right, it's always like this, it's my fault"), which perpetuates helplessness and depression.
10.4 Social Learning Theory
Observational Learning
Albert Bandura proposed that much of human learning occurs not through direct reinforcement but through observational learning (also called vicarious learning or modeling): learning by observing the behavior and consequences of others.
Bandura identified four processes required for effective observational learning:
- Attention: The learner must attend to the model's behavior.
- Retention: The learner must encode and remember what was observed.
- Reproduction: The learner must have the physical and cognitive ability to reproduce the behavior.
- Motivation: The learner must want to perform the behavior, a desire often influenced by observing whether the model was reinforced or punished (vicarious reinforcement/punishment).
The Bobo Doll Study
Bandura's most famous research is the Bobo doll study (1961). Children watched an adult model either behave aggressively toward an inflatable "Bobo" doll (punching, kicking, using a mallet) or play non-aggressively. Children who observed the aggressive model were significantly more likely to display similar aggression toward the doll when given the opportunity, and they invented novel forms of aggression the model had not demonstrated, showing that observation produced learned behavioral scripts, not just mimicry.
Psy's Note: The Bobo doll study has been enormously influential in debates about media violence. Critics point out that punching an inflatable doll is not the same as harming a person, and that demand characteristics (the laboratory setting "said" it was okay to be aggressive) may have inflated the effect. The relationship between media violence and real-world aggression remains contested in the research literature and is more nuanced than either "media causes violence" or "media has no effect."
Self-Efficacy
One of Bandura's most influential concepts, self-efficacy is one's belief in one's own capacity to successfully execute a specific behavior or achieve a particular outcome. Unlike self-esteem (a global evaluation), self-efficacy is domain-specific: a person can have high self-efficacy for mathematics and low self-efficacy for drawing.
Self-efficacy is built through four sources:
- Mastery experiences: The strongest source. Successfully performing a task increases self-efficacy; repeated failure decreases it.
- Vicarious experiences: Watching similar others succeed ("If they can do it, I can too") raises self-efficacy.
- Social persuasion: Encouragement from credible others can temporarily boost self-efficacy.
- Physiological and affective states: Interpreting arousal (rapid heartbeat before a presentation) as excitement rather than fear supports self-efficacy.
High self-efficacy predicts persistence in the face of obstacles, higher goal-setting, less anxiety, and greater achievement across academic, athletic, health behavior, and occupational domains.
10.5 Neuroplasticity and Learning
All learning is, at its foundation, a biological change in the nervous system. Neuroplasticity is the brain's lifelong capacity to change its structure and function in response to experience.
At the cellular level, learning involves long-term potentiation (LTP): repeated activation of a synapse makes that synapse more sensitive and efficient, which is the cellular mechanism underlying the formation of memory traces (covered in Chapters 6-7). Donald Hebb's principle, often summarized as "cells that fire together, wire together," captures this mechanism.
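The "fire together, wire together" idea can be written as a one-line update rule. The sketch below is a deliberately minimal version: coding activity as 0 or 1 and letting the weight grow without limit are simplifying assumptions, and real LTP involves saturation and additional cellular mechanisms not modeled here.

```python
def hebbian_update(weight, pre_active, post_active, learning_rate=0.1):
    """Strengthen a synapse only when both neurons are active together.

    Assumption: activity is coded as 0 or 1 and the weight can grow
    without bound, which real synapses cannot do.
    """
    return weight + learning_rate * pre_active * post_active


w = 0.2
# Only the first and last pairs have both neurons active.
for pre, post in [(1, 1), (1, 0), (0, 1), (1, 1)]:
    w = hebbian_update(w, pre, post)
print(round(w, 2))  # 0.4
```

Only the two trials on which both neurons fire change the weight; activity in one neuron alone leaves the synapse untouched.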
At the structural level:
- Synaptic pruning: Unused synaptic connections are eliminated, streamlining neural circuits.
- Synaptogenesis: New synaptic connections form in response to learning and novel experience.
- Neurogenesis: New neurons are generated in specific areas (notably the hippocampus) in response to learning and aerobic exercise.
- Cortical reorganization: Following skill acquisition or after injury, the cortical representation of body parts and functions can be dramatically remapped.
Classic examples of plasticity include: the enlarged posterior hippocampi of London taxi drivers who memorize the city's vast street network; the enlarged cortical representation of the fingers in skilled violinists (left-hand fingers especially); and the recovery of language function after left-hemisphere stroke, which can be mediated by right-hemisphere recruitment.
10.6 Fixed vs. Growth Mindset
Carol Dweck's research on achievement motivation has yielded one of the most application-rich frameworks in contemporary psychology: the fixed vs. growth mindset distinction.
A fixed mindset is the implicit theory that ability, intelligence, and talent are fixed, innate traits β you either have them or you don't. Fixed-mindset individuals interpret effort as evidence of inadequacy ("If you have to work hard, you don't have natural ability"), avoid challenges where they might fail, give up quickly after setbacks, and experience failure as threatening to their sense of self.
A growth mindset is the implicit theory that ability can be developed through dedication, effort, and effective learning strategies. Growth-mindset individuals interpret effort as the mechanism of development, seek challenging tasks, persist through setbacks, and experience failure as information rather than threat.
Mindset is not a stable personality trait but a learnable framework. Dweck's intervention research has shown that brief interventions teaching students that the brain is like a muscle, changing with practice, can improve academic outcomes, particularly for students from groups under stereotype threat.
Psy's Encouragement: The growth mindset framework is not just pop psychology; it has real experimental support. But it is most powerful when combined with concrete learning strategies (spaced practice, retrieval practice, interleaving) rather than delivered as a standalone pep talk. "Try harder" without "here's how to practice effectively" is not enough.
The connection to neuroplasticity is explicit: a growth mindset is, in a sense, a psychologically accurate understanding of how learning works. The brain really does change with practice. Fixed-mindset beliefs are, in part, empirically incorrect beliefs about neuroscience.
10.7 Chapter Review
Brilliant work completing Chapter 10!
You've covered an enormous amount of ground, from Pavlov's dogs salivating at bells, to Skinner's pigeons on variable ratio schedules, to Bandura's children modeling aggression, to Seligman's dogs lying passive in the face of escapable shock. These aren't just historical anecdotes; they're the foundational building blocks of behavior change, education, clinical psychology, and our understanding of how experience rewires the brain.
Chapter 11 moves from individual learning to social psychology: how the presence of others changes how we think, feel, and act. Get ready for some of the most famous (and most disturbing) experiments in psychology.
Let's think about that!
Key Terms
- Classical conditioning: Learning in which a neutral stimulus comes to elicit a response through pairing with a stimulus that already elicits that response.
- Unconditioned stimulus (UCS): A stimulus that naturally elicits a response without prior learning.
- Conditioned stimulus (CS): A previously neutral stimulus that elicits a conditioned response after pairing with the UCS.
- Conditioned response (CR): The learned response to the CS.
- Acquisition: The initial learning phase during CS-UCS pairings.
- Extinction: Decrease of CR when CS is presented repeatedly without UCS.
- Spontaneous recovery: Reappearance of an extinguished CR after a rest period.
- Stimulus generalization: Responding to stimuli similar to the original CS.
- Stimulus discrimination: Responding to the specific CS but not to similar stimuli.
- Taste aversion: Single-trial conditioning of food avoidance following illness; demonstrates biological preparedness.
- Conditioned emotional response: Emotional reactions (e.g., fear) acquired through classical conditioning.
- Operant conditioning: Learning in which behavior is modified by its consequences.
- Positive reinforcement: Adding a desirable stimulus to increase behavior.
- Negative reinforcement: Removing an aversive stimulus to increase behavior.
- Positive punishment: Adding an aversive stimulus to decrease behavior.
- Negative punishment: Removing a desirable stimulus to decrease behavior.
- Shaping: Reinforcing successive approximations toward a target behavior.
- Schedules of reinforcement: Patterns determining when reinforcement is delivered; variable ratio produces highest rates.
- Latent learning: Learning that occurs without reinforcement, revealed only when motivation arises.
- Cognitive maps: Mental representations of spatial or conceptual environments.
- Learned helplessness: Passive behavior resulting from repeated exposure to uncontrollable aversive events.
- Observational learning: Learning by observing others' behavior and consequences.
- Bobo doll study: Bandura's experiment demonstrating that children model adult aggression.
- Self-efficacy: Belief in one's capacity to successfully perform a specific behavior.
- Neuroplasticity: The brain's lifelong ability to change structure and function through experience.
- Fixed mindset: Belief that ability is innate and fixed.
- Growth mindset: Belief that ability can be developed through effort and effective strategies.
Practice Questions
1. A child who was bitten by a dog now feels fear (CR) around all large animals, not just dogs. This illustrates ____.
2. A factory worker is paid for every 50 units produced. This is a ____ schedule of reinforcement and produces a ____ rate of responding.
3. A rat that explored a maze without reinforcement for many days quickly learned the correct path when food was introduced. Tolman called the mental representation it had formed a ____.
4. In a follow-up variation of the Bobo doll study, children who observed an adult being ____ for aggression were less likely to imitate the aggressive behavior than children who saw no consequence.
5. According to Dweck, a student who believes that intelligence is fixed and interprets needing to study hard as evidence of low ability has a ____ mindset.
Answers
1. Stimulus generalization
2. Fixed ratio (FR); high and steady rate (with post-reinforcement pauses)
3. Cognitive map
4. Punished (vicarious punishment reduced imitation)
5. Fixed mindset