Skip to content

Genetic Mapping and Linkage Analysis

Summary

This chapter introduces the principles and techniques of genetic mapping, from basic linkage concepts to constructing genetic maps using recombination frequencies. Students learn to perform two-point and three-point crosses, use molecular markers, and understand recombination hotspots and gene conversion. After completing this chapter, students will be able to calculate map distances and construct genetic maps from crossing data.

Concepts Covered

This chapter covers the following 22 concepts from the learning graph:

  1. Genetic Linkage
  2. Recombination
  3. Crossing Over
  4. Recombination Frequency
  5. Genetic Map
  6. Map Distance
  7. Centimorgan
  8. Two-Point Cross
  9. Three-Point Cross
  10. Interference
  11. Coefficient of Coincidence
  12. Gene Order Determination
  13. Genetic Markers
  14. Molecular Markers
  15. Restriction Fragment Length
  16. Microsatellite Markers
  17. SNP Markers
  18. Recombination Hotspots
  19. Sex Differences in Mapping
  20. Mitotic Recombination
  21. Gene Conversion
  22. Intragenic Recombination

Prerequisites

This chapter builds on concepts from:


Welcome to Genetic Mapping!

Dottie waving welcome Welcome, fellow investigators! In this chapter we explore one of genetics' greatest achievements — determining where genes sit along chromosomes. Let's look at the evidence!

Why Genes Don't Always Sort Independently

Mendel's principle of independent assortment predicts that alleles at different loci segregate independently during gamete formation. This prediction holds beautifully when genes reside on different chromosomes. However, organisms have far more genes than they have chromosomes. Humans, for example, carry roughly 20,000 protein-coding genes distributed across only 23 pairs of chromosomes. Genes located on the same chromosome tend to be inherited together — a phenomenon called genetic linkage.

Genetic linkage occurs when two or more loci are physically close together on the same chromosome. Linked genes do not assort independently because meiotic segregation moves entire chromosomes — and the genes they carry — into gametes as a unit. The closer two genes are to each other, the more strongly they are linked, and the more frequently their alleles travel together from parent to offspring.

The departure from independent assortment was first documented systematically in Drosophila melanogaster by Thomas Hunt Morgan and his students in the early 1910s. They noticed that certain pairs of traits were inherited together far more often than the 50:50 ratio expected under independent assortment. This observation formed the basis of chromosome theory and opened the door to genetic mapping.

Recombination and Crossing Over

Although linked genes tend to travel together, they do not always do so. During prophase I of meiosis, homologous chromosomes pair up and exchange segments of DNA through a process called crossing over. Crossing over is the physical breakage and rejoining of chromatid segments between homologous chromosomes. Each crossover event creates a new combination of alleles on the affected chromatids.

Recombination is the broader term for any process that produces offspring with allele combinations different from those found in either parent. Crossing over is the primary mechanism of recombination in eukaryotes. When a crossover occurs between two linked loci, the resulting gametes carry recombinant chromosomes — chromosomes that have a mix of alleles from both parental homologs.

The frequency of crossing over between two loci is directly related to the physical distance separating them. Loci that are far apart on the same chromosome experience crossovers so frequently that they appear to assort independently, with a recombination frequency approaching 50%. Loci that are very close together rarely experience a crossover between them and are almost always inherited together.

Relationship Between Loci Recombination Frequency Observed Inheritance Pattern
Different chromosomes 50% Independent assortment
Same chromosome, far apart Approaches 50% Near-independent assortment
Same chromosome, moderately linked 5%–45% Partial linkage
Same chromosome, tightly linked < 5% Strong co-inheritance

Recombination Frequency and Map Distance

The recombination frequency (RF) between two loci is the proportion of recombinant offspring out of total offspring in a genetic cross. It is calculated as:

\[ \text{RF} = \frac{\text{Number of recombinant offspring}}{\text{Total offspring}} \times 100\% \]

Recombination frequency provides a measure of the genetic distance between two loci. Geneticists convert RF values into map distance, expressed in units called centimorgans (cM). One centimorgan equals a 1% recombination frequency. The unit is named after Thomas Hunt Morgan, who pioneered genetic mapping in Drosophila.

A genetic map is a diagram showing the linear arrangement of genes along a chromosome, with distances between genes expressed in centimorgans. Unlike a physical map that measures distance in base pairs, a genetic map reflects the probability of recombination. Regions with high crossover rates appear "stretched" on genetic maps, while regions with low crossover rates appear compressed.

RF Has a Ceiling

Dottie thinking Remember, recombination frequency can never exceed 50%, even for genes very far apart on the same chromosome. Multiple crossovers between distant loci can cancel each other out, making distant linked genes look unlinked.

Two-Point Crosses

A two-point cross is the simplest method for measuring recombination frequency between two loci. To perform a two-point cross, a researcher crosses an organism heterozygous at both loci (a dihybrid) with a homozygous recessive tester. This testcross reveals the gametes produced by the dihybrid parent because the tester contributes only recessive alleles.

Consider a cross involving two linked genes in Drosophila: body color (b+ = wild-type gray, b = black) and wing shape (vg+ = wild-type long, vg = vestigial). A fly heterozygous for both loci in cis configuration (b+ vg+ / b vg) is crossed to a homozygous recessive fly (b vg / b vg).

Example Two-Point Cross Data:

Phenotype Class Genotype Count Type
Gray body, long wings b+ vg+ / b vg 812 Parental
Black body, vestigial wings b vg / b vg 788 Parental
Gray body, vestigial wings b+ vg / b vg 97 Recombinant
Black body, long wings b vg+ / b vg 103 Recombinant
Total 1800

The recombination frequency is:

\[ \text{RF} = \frac{97 + 103}{1800} \times 100\% = \frac{200}{1800} \times 100\% = 11.1\% \]

The map distance between b and vg is therefore approximately 11.1 cM.

Two-point crosses are straightforward, but they have limitations. When loci are far apart, double crossovers between them can go undetected because the two crossover events cancel each other out, restoring the parental arrangement. This leads to an underestimate of the true map distance.

Three-Point Crosses and Gene Order Determination

A three-point cross (also called a three-factor cross) simultaneously maps three linked loci and is far more informative than a series of two-point crosses. It involves a testcross of a trihybrid organism and allows detection of double crossovers that two-point crosses miss.

In a three-point cross, the offspring are classified into eight phenotypic classes. These classes can be grouped into four categories based on where crossovers occurred.

The key to gene order determination in a three-point cross is identifying the double crossover class. The double crossover class is always the least frequent because it requires two independent crossover events. By comparing the double crossover class with the parental classes, the gene in the middle can be identified — it is the gene whose alleles are switched relative to the parental arrangement.

Three-Point Cross Procedure:

  1. Identify the two most frequent classes — these are the parental classes
  2. Identify the two least frequent classes — these are the double crossover classes
  3. Compare the double crossover classes to the parental classes to find which gene has switched — that gene is in the middle
  4. Calculate single crossover frequencies for each interval
  5. Calculate the double crossover frequency
  6. Add the double crossover frequency to each interval to get corrected map distances

Diagram: Three-Point Cross Analysis

Three-Point Cross Analysis

Type: Interactive educational diagram sim-id: three-point-cross
Library: p5.js
Status: Specified

An interactive diagram showing a three-point cross with three linked genes. The user can toggle crossover events in region I and region II to see how different crossover combinations produce the eight offspring classes. Color-coded chromatids show parental vs. recombinant segments. A data table updates dynamically to show expected offspring counts and how map distances are calculated from the results.

Interference and the Coefficient of Coincidence

When we calculate expected double crossover frequency by multiplying the two single-interval recombination frequencies, the observed number of double crossovers is often lower than expected. This phenomenon is called interference — one crossover event inhibits additional crossovers in nearby regions.

The coefficient of coincidence (C.O.C.) quantifies interference by comparing observed to expected double crossover frequencies:

\[ \text{C.O.C.} = \frac{\text{Observed double crossover frequency}}{\text{Expected double crossover frequency}} \]

Interference (I) is then defined as:

\[ I = 1 - \text{C.O.C.} \]
  • If \( I = 1 \): Complete interference — no double crossovers occur
  • If \( I = 0 \): No interference — crossovers occur independently
  • If \( 0 < I < 1 \): Partial interference — fewer double crossovers than expected
  • If \( I < 0 \): Negative interference — more double crossovers than expected (rare)

Calculating Interference

Dottie giving a tip A C.O.C. of 0.4 means you observed only 40% of the double crossovers you expected. The interference is 1 - 0.4 = 0.6, meaning 60% of expected double crossovers were suppressed.

Genetic Markers: Classical and Molecular

A genetic marker is any identifiable locus in the genome whose inheritance can be tracked through crosses or pedigrees. Classical genetic markers include visible phenotypic traits such as eye color, wing shape, or seed texture. These markers are limited by the number of known phenotypic variants and the requirement that the organism express the trait visibly.

Molecular markers revolutionized genetic mapping by providing thousands of trackable loci across any genome, regardless of whether those loci produce visible phenotypes. Molecular markers detect variation at the DNA level directly. Three major classes of molecular markers have transformed genetic mapping.

Restriction Fragment Length Polymorphisms (RFLPs) were among the first molecular markers used for mapping. A restriction fragment length polymorphism arises when a mutation creates or destroys a recognition site for a restriction enzyme, changing the sizes of DNA fragments produced by digestion. RFLPs are detected by Southern blotting and hybridization with labeled probes. Although reliable, they require relatively large amounts of DNA and are labor-intensive.

Microsatellite markers (also called simple sequence repeats or SSRs) consist of short tandem repeats of 1–6 base pairs (e.g., CACACACA). The number of repeat units varies among individuals, creating length polymorphisms that are easily detected by PCR and gel electrophoresis. Microsatellites are highly polymorphic, codominant, and abundant throughout genomes, making them excellent mapping tools.

SNP markers (single nucleotide polymorphisms) are the most abundant class of genetic variation. An SNP is a single base-pair position where two or more alleles exist at appreciable frequency in a population. SNPs occur roughly every 300–1,000 base pairs in the human genome. High-throughput genotyping arrays can type millions of SNPs simultaneously, making them the workhorse of modern genetic mapping.

Marker Type Detection Method Polymorphism Level Throughput Key Advantage
RFLP Southern blot Moderate Low First molecular marker
Microsatellite PCR + electrophoresis High Medium Very informative per locus
SNP Microarray or sequencing Low per locus Very high Genome-wide coverage

Recombination Hotspots and Coldspots

Crossover events are not distributed uniformly across chromosomes. Certain regions, called recombination hotspots, experience crossing over at rates 10 to 1,000 times higher than the genome average. In humans, the protein PRDM9 binds to specific DNA motifs and recruits the recombination machinery, creating hotspots approximately 1–2 kilobases wide.

Conversely, recombination coldspots — regions with very low crossover rates — are found near centromeres and in large blocks of heterochromatin. These regions appear compressed on genetic maps relative to their physical size.

The nonuniform distribution of recombination means that genetic map distance and physical distance are not linearly proportional. A region spanning 1 Mb of DNA might correspond to 5 cM in a recombination hotspot but only 0.1 cM near a centromere. This is an important caveat when interpreting genetic maps.

Sex Differences in Mapping

Sex differences in mapping are well documented across many species. In humans, the female genetic map is approximately 1.5 to 2 times longer than the male genetic map, meaning that women experience more crossover events per meiosis than men. The total female genetic map length is roughly 4,400 cM, compared to approximately 2,700 cM in males.

These differences may reflect fundamental biological distinctions between oogenesis and spermatogenesis. Oocytes spend extended periods arrested in prophase I, potentially allowing more time for crossover formation. Additionally, the distribution of crossovers differs between sexes — males tend to have higher recombination rates near telomeres, while females have more uniform crossover distributions.

Sex-specific recombination rates must be accounted for in genetic mapping studies. A map distance calculated from male meioses will differ from one calculated from female meioses. Many mapping studies report sex-averaged map distances or present male and female maps separately.

Diagram: Sex Differences in Recombination

Sex Differences in Recombination

Type: Comparative visualization sim-id: sex-recombination-differences
Library: Chart.js
Status: Specified

A side-by-side bar chart comparing genetic map distances (in cM) along a human chromosome for males and females. Each bar pair represents a chromosomal interval. A toggle allows users to switch between chromosomes. The total map length is displayed for each sex, highlighting the consistent female-greater-than-male pattern.

Mitotic Recombination

Most recombination in eukaryotes occurs during meiosis, but crossing over can also happen during mitosis. Mitotic recombination is a rare event in which homologous chromosomes exchange segments during the mitotic cell cycle. Although infrequent, mitotic recombination has significant genetic consequences.

When mitotic recombination occurs in a cell heterozygous for a gene, it can produce daughter cells that are homozygous for one allele or the other — a process called loss of heterozygosity (LOH). In genetics, this phenomenon is visible as mosaic patches of tissue with different genotypes. In Drosophila, mitotic recombination can produce twin spots — adjacent patches of mutant tissue on an otherwise wild-type fly.

Mitotic recombination has medical importance because it can uncover recessive tumor suppressor mutations. If a cell is heterozygous for a tumor suppressor gene (one wild-type allele, one mutant), mitotic recombination can produce a daughter cell homozygous for the mutant allele, potentially initiating tumor development. This is one mechanism underlying Knudson's two-hit hypothesis for cancer.

Mitotic vs. Meiotic Recombination

Dottie giving a warning Don't confuse mitotic recombination with meiotic crossing over. Mitotic recombination is rare and produces somatic mosaics. Meiotic crossing over is routine and generates genetic diversity in gametes.

Gene Conversion

Gene conversion is a nonreciprocal transfer of genetic information between homologous sequences during recombination. In a typical reciprocal crossover, both recombinant products are recovered. In gene conversion, one allele is "converted" to match the other, resulting in a non-Mendelian ratio.

Gene conversion was first detected in fungi where all four products of a single meiosis can be recovered (tetrad analysis). In a normal meiosis involving a heterozygous locus (A/a), the expected ratio of alleles in the four spores is 2A:2a. Gene conversion produces aberrant ratios such as 3A:1a or 1A:3a.

The molecular basis of gene conversion involves the mismatch repair system. During recombination, heteroduplex DNA forms — a stretch of double-stranded DNA where the two strands come from different homologs. If the two strands carry different alleles, mismatch repair corrects one strand to match the other. Which allele "wins" the conversion depends on which strand the repair machinery corrects.

Gene conversion has evolutionary significance because it can homogenize sequences among gene family members, spread beneficial alleles, and contribute to GC-biased gene conversion in mammalian genomes.

Intragenic Recombination

Early geneticists assumed that the gene was the smallest unit of recombination — that crossing over could occur between genes but not within them. Seymour Benzer's work on the rII locus of bacteriophage T4 in the 1950s disproved this assumption. Intragenic recombination is recombination that occurs within a single gene, between different mutant sites in the same coding sequence.

Benzer showed that different rII mutations could recombine with each other at extremely low frequencies, demonstrating that genes have internal structure. His fine-structure genetic map of the rII region resolved mutations down to individual nucleotide positions, establishing that the gene is not an indivisible unit.

Intragenic recombination can produce wild-type recombinants from two different mutant alleles of the same gene, as long as the mutations affect different nucleotide positions. This phenomenon is important for understanding complementation tests, where intragenic recombination can occasionally produce wild-type progeny from two mutations in the same gene — a potential source of false negatives.

Diagram: Crossover Types and Scale

Crossover Types and Scale

Type: Hierarchical comparison diagram sim-id: crossover-types-scale
Library: p5.js
Status: Specified

An interactive visualization showing three scales of recombination: (1) intergenic recombination between genes on a chromosome, (2) intragenic recombination within a gene between mutation sites, and (3) gene conversion as a nonreciprocal event. Users click tabs to zoom between scales. Color-coded segments show parental vs. recombinant DNA, with animated crossover events.

Building a Genetic Map: Putting It All Together

Constructing a comprehensive genetic map requires integrating data from many crosses and marker types. The general workflow involves these steps:

  1. Select informative crosses — choose parents that differ at many marker loci
  2. Genotype offspring — determine the alleles at each marker locus in a large mapping population
  3. Calculate pairwise recombination frequencies — compute RF for every pair of markers
  4. Assign linkage groups — markers with RF significantly below 50% belong to the same linkage group (chromosome)
  5. Order markers within linkage groups — use three-point cross logic or computational algorithms to determine marker order
  6. Estimate map distances — convert RF to cM, applying mapping functions (such as the Haldane or Kosambi function) to correct for multiple crossovers

The Haldane mapping function corrects for undetected double crossovers by assuming no interference:

\[ d = -\frac{1}{2} \ln(1 - 2r) \]

where \( d \) is the corrected map distance and \( r \) is the observed recombination frequency.

The Kosambi mapping function incorporates interference:

\[ d = \frac{1}{4} \ln\left(\frac{1 + 2r}{1 - 2r}\right) \]

Both functions give distances in Morgans (multiply by 100 for centimorgans). For small recombination frequencies (r < 0.1), the corrected and uncorrected distances are nearly identical.

Diagram: Genetic Map Construction Workflow

Genetic Map Construction Workflow

Type: Interactive flowchart with calculation sim-id: map-construction-workflow
Library: p5.js
Status: Specified

A step-by-step interactive diagram where users input recombination data for three loci (A, B, C). The visualization walks through calculating pairwise RF values, determining gene order, correcting for double crossovers, applying mapping functions, and displaying the final genetic map. Sliders allow users to adjust offspring counts and see how the map changes in real time.

Map Building Complete!

Dottie celebrating Excellent work, researchers! You can now calculate recombination frequencies, determine gene order from three-point crosses, and interpret molecular markers. These skills form the foundation for the advanced mapping techniques in the next chapter.

Key Takeaways

  • Genetic linkage keeps alleles on the same chromosome together during inheritance; crossing over disrupts this linkage at rates proportional to physical distance.
  • Recombination frequency between two loci, measured as a percentage, directly converts to map distance in centimorgans.
  • Two-point crosses measure RF between two loci, while three-point crosses simultaneously map three loci and detect double crossovers.
  • Interference reduces double crossovers below expected levels; the coefficient of coincidence quantifies this reduction.
  • Molecular markers (RFLPs, microsatellites, SNPs) provide dense, genome-wide landmarks for constructing high-resolution genetic maps.
  • Recombination hotspots create nonuniform relationships between genetic and physical distances.
  • Sex differences in recombination rates produce different map distances in male vs. female meioses.
  • Mitotic recombination, gene conversion, and intragenic recombination extend recombination beyond the standard meiotic model.