Genetic Algorithms

A Genetic Algorithm (GA) is a population-based stochastic search method that imitates natural selection: maintain a population of candidate solutions (“chromosomes”), score each by a fitness function, then evolve the next generation through three operators — selection, crossover, and mutation. Over generations the average fitness rises and (with luck) the best individual approaches the optimum.

GAs were developed in the 1970s (Holland, 1975) and dominated optimization research for decades. Today they’re niche — but they survive in problems where gradients aren’t available and where creative recombination matters (see Where GAs are still used today).

When to reach for a GA

A GA is worth considering when:

The search space is huge and discrete (combinatorial)
The objective is non-differentiable or expensive to evaluate
You can score a candidate but can’t compute the gradient
You don’t need the optimum — a good solution is enough
You can encode candidate solutions as fixed-length strings (chromosomes)

If you have gradients → use gradient descent. If you have a probabilistic model → Bayesian Optimization. If linear/integer → Gurobi/CPLEX. A GA is the “no math required” default for ugly discrete problems.

Worked example: the knapsack problem

You’re packing for a mountain trip. Your backpack holds 40 L. You have 8 items, each with a volume and a value (how useful it would be on the trip):

Item	Volume (L)	Value
Tent	18	10
Sleeping bag	12	9
Stove	4	6
Food (3 days)	8	8
Water filter	2	5
Camera	3	4
Book	1	2
First-aid kit	2	7

Goal: maximize total value while keeping total volume ≤ 40 L.

Encoding as a chromosome

Represent each candidate packing list as an 8-bit binary string, one bit per item:

1 = take this item
0 = leave it behind

[1, 1, 1, 1, 1, 0, 0, 1] → tent + sleeping bag + stove + food + water filter + first-aid kit. Volume = 18+12+4+8+2+2 = 46 L → invalid (over 40 L).

[1, 1, 0, 1, 1, 0, 0, 1] → 18+12+8+2+2 = 42 L → still over.

[1, 1, 0, 1, 0, 0, 0, 1] → 18+12+8+2 = 40 L ✓, value = 10+9+8+7 = 34.

The full search space has 2⁸ = 256 candidates — small enough to brute-force, but the GA scales to 2⁵⁰ items where brute force fails.

Encoding the fitness function

🐍 Code anzeigen / ausblenden

volumes = [18, 12, 4, 8, 2, 3, 1, 2]
values  = [10,  9, 6, 8, 5, 4, 2, 7]
 
def fitness(chromosome, capacity=40):
    vol = sum(g * v for g, v in zip(chromosome, volumes))
    val = sum(g * v for g, v in zip(chromosome, values))
    if vol > capacity:
        return 0          # penalize invalid: simplest valid solution
    return val

Now any standard GA loop works on it. This is what makes GAs flexible: swap the fitness function, keep the algorithm.

The algorithm in 5 steps

1. INITIALIZE — generate a random population of N chromosomes
2. EVALUATE   — compute fitness of every chromosome
3. SELECT     — pick parents proportional to fitness
4. RECOMBINE  — apply crossover + mutation to produce children
5. REPLACE    — children become the new population
   → repeat 2–5 until termination (time / generations / plateau)

That’s it. The genius is that no part requires gradients or domain knowledge — only a way to score candidates.

The three operators

Selection — who gets to reproduce

Fitness-proportional (roulette wheel) — probability ∝ fitness. Simple but breaks when fitness values are very close (no selection pressure) or when one “freak” dominates.
Tournament selection — pick k chromosomes at random, keep the best. More robust to fitness scaling. Most common in practice.
Rank selection — assign probabilities by rank (not raw fitness). Avoids domination by outliers.
Elitism — unconditionally keep the top e chromosomes. Guarantees best fitness is monotone non-decreasing.

Crossover — combining good chunks

Takes two parents → produces two children. Variants:

Single-point — pick a random index, swap suffixes between parents.
Multi-point — multiple swap points; more mixing.
Uniform — for each gene independently, copy from either parent (50/50). Maximum mixing.

Crossover does the heavy lifting — it recombines good chunks from different solutions into novel combinations. But it can only mix genes that already exist in the parent pool.

Mutation — introducing genuinely new genes

For bit strings: flip each bit with probability μ.

Too low → population gets stuck (no new genetic material)
Too high → algorithm degenerates into random search
Typical: μ ≈ 1 / chromosome_length (i.e. expect ~1 mutation per chromosome)

Mutation is the only mechanism that can introduce a gene value not currently present anywhere in the population. Without it, evolution halts at whatever gene combinations existed at initialization (see also Q108 in Questions for Methods of AI).

Demo: 6 GA variants side-by-side on the OneMax problem

Instead of one run we sweep 6 parameter configurations and plot them on a 2×3 grid + a comparison overlay. Each curve is averaged across 5 random seeds (shaded band = ±1 std) so you see the characteristic behavior of each setting, not a single anecdotal run.

Problem: OneMax with chrom_len = 20 (count the 1s in a 20-bit string; max fitness = 20). Harder than the chrom_len=8 toy — differences between configurations become visible.

🐍 Code anzeigen / ausblenden

# Pyodide environment (Obsidian Execute Code plugin) — install matplotlib first.
# If you run in normal Python, delete the next 2 lines.
import micropip
await micropip.install("matplotlib")
 
import random
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
 
# ════════════════════════════════════════════════════════════════
# 🎛️  Try changing these
# ════════════════════════════════════════════════════════════════
CHROM_LEN   = 20                # harder than 8 → variant differences are visible
GENERATIONS = 80
SEEDS       = [0, 1, 2, 3, 4]   # average over 5 runs per variant
# ════════════════════════════════════════════════════════════════
 
# --- GA operators ---
def fitness(chromosome):
    return sum(chromosome)                                      # OneMax
 
def select(population, k=2):
    # +0.5 makes weights non-zero even when a chromosome has fitness 0
    weights = [fitness(c) + 0.5 for c in population]
    return random.choices(population, weights=weights, k=k)
 
def crossover(p1, p2):
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]
 
def mutate(chromosome, rate):
    return [1 - g if random.random() < rate else g for g in chromosome]
 
def run_ga(pop_size, mutation_rate, elitism, chrom_len=CHROM_LEN,
           generations=GENERATIONS, seed=0):
    random.seed(seed)
    pop = [[random.randint(0, 1) for _ in range(chrom_len)] for _ in range(pop_size)]
    history = []
    for gen in range(generations + 1):
        fits = [fitness(c) for c in pop]
        history.append((gen, max(fits), sum(fits) / len(fits), min(fits)))
        ranked = sorted(pop, key=fitness, reverse=True)
        new_pop = ranked[:elitism]                              # elitism: keep top-N untouched
        while len(new_pop) < pop_size:
            p1, p2 = select(pop)
            c1, c2 = crossover(p1, p2)
            new_pop.extend([mutate(c1, mutation_rate), mutate(c2, mutation_rate)])
        pop = new_pop[:pop_size]
    return history
 
# --- 6 variants — each isolates one trade-off ---
VARIANTS = [
    ("① Baseline\npop=30, elitism=2, μ=0.05",     dict(pop_size=30,  mutation_rate=0.05, elitism=2)),
    ("② No elitism\npop=30, μ=0.05",               dict(pop_size=30,  mutation_rate=0.05, elitism=0)),
    ("③ μ=0.00 (no mutation)\npop=30, elitism=2",  dict(pop_size=30,  mutation_rate=0.00, elitism=2)),
    ("④ μ=0.30 (too high)\npop=30, elitism=2",     dict(pop_size=30,  mutation_rate=0.30, elitism=2)),
    ("⑤ Tiny pop=5\nelitism=1, μ=0.05",            dict(pop_size=5,   mutation_rate=0.05, elitism=1)),
    ("⑥ Large pop=200\nelitism=10, μ=0.05",        dict(pop_size=200, mutation_rate=0.05, elitism=10)),
]
 
# Run all variants × all seeds
all_results = {}
for label, params in VARIANTS:
    runs = []
    for seed in SEEDS:
        hist = run_ga(seed=seed, **params)
        g, b, m, w = zip(*hist)
        runs.append((np.array(g), np.array(b), np.array(m), np.array(w)))
    all_results[label] = runs
 
# --- Plot: 2×3 grid of variants + wide comparison panel below ---
fig = plt.figure(figsize=(16, 11))
gs = gridspec.GridSpec(3, 3, height_ratios=[1, 1, 1.3], hspace=0.6, wspace=0.3)
 
for idx, (label, _) in enumerate(VARIANTS):
    row, col = idx // 3, idx % 3
    ax = fig.add_subplot(gs[row, col])
    runs   = all_results[label]
    gens   = runs[0][0]
    bests  = np.array([r[1] for r in runs])
    means  = np.array([r[2] for r in runs])
    worsts = np.array([r[3] for r in runs])
 
    # Shaded ±std bands
    ax.fill_between(gens, bests.mean(0) - bests.std(0),  bests.mean(0) + bests.std(0),
                    color='crimson', alpha=0.15)
    ax.fill_between(gens, means.mean(0) - means.std(0),  means.mean(0) + means.std(0),
                    color='steelblue', alpha=0.15)
    ax.fill_between(gens, worsts.mean(0) - worsts.std(0), worsts.mean(0) + worsts.std(0),
                    color='gray', alpha=0.12)
 
    ax.plot(gens, bests.mean(0),  color='crimson',   lw=2.0, label='best')
    ax.plot(gens, means.mean(0),  color='steelblue', lw=1.5, label='mean')
    ax.plot(gens, worsts.mean(0), color='gray',      lw=1.0, ls='--', label='worst')
    ax.axhline(CHROM_LEN, color='green', ls=':', alpha=0.6)
    ax.set_title(label, fontsize=9.5)
    ax.set_xlabel('generation', fontsize=8.5)
    ax.set_ylabel('fitness',    fontsize=8.5)
    ax.set_ylim(0, CHROM_LEN + 1)
    ax.grid(alpha=0.3)
    if idx == 0:
        ax.legend(loc='lower right', fontsize=7.5, framealpha=0.95)
 
# Bottom: best-fitness overlay for all 6 variants
ax_cmp  = fig.add_subplot(gs[2, :])
palette = plt.cm.tab10(np.linspace(0, 0.9, len(VARIANTS)))
for (label, _), color in zip(VARIANTS, palette):
    runs  = all_results[label]
    gens  = runs[0][0]
    bests = np.array([r[1] for r in runs])
    short = label.split('\n')[0]                              # ①, ②, …
    ax_cmp.plot(gens, bests.mean(0), color=color, lw=2.4, label=short)
    ax_cmp.fill_between(gens, bests.mean(0) - bests.std(0),
                        bests.mean(0) + bests.std(0), color=color, alpha=0.12)
ax_cmp.axhline(CHROM_LEN, color='green', ls=':', alpha=0.6, label=f'max = {CHROM_LEN}')
ax_cmp.set_xlabel('generation')
ax_cmp.set_ylabel('best-in-population fitness (mean ± std, 5 seeds)')
ax_cmp.set_title('Comparison overlay — best-fitness trajectory across all 6 variants', fontsize=11.5)
ax_cmp.set_ylim(0, CHROM_LEN + 1)
ax_cmp.legend(loc='lower right', ncol=2, fontsize=9, framealpha=0.95)
ax_cmp.grid(alpha=0.3)
 
plt.suptitle(f'Genetic Algorithm — 6 parameter variants on OneMax (chrom_len={CHROM_LEN})',
             fontsize=13.5, y=0.995)
plt.show()
 
# --- Numeric summary ---
print(f"\n📊 FINAL BEST FITNESS (averaged across {len(SEEDS)} seeds — max possible = {CHROM_LEN})\n")
for label, _ in VARIANTS:
    runs   = all_results[label]
    finals = np.array([r[1][-1] for r in runs])
    print(f"  {label.splitlines()[0]:32s}: {finals.mean():5.2f} ± {finals.std():.2f}")

What each variant teaches

#	Variant	Characteristic behavior	Lesson
①	Baseline — pop=30, elitism=2, μ=0.05	Best reaches max quickly and stays; mean plateaus 1–2 below max	Healthy GA: selection lifts mean, elitism locks the best, mutation prevents stagnation. The 1–2 gap between best and mean is the mutation–selection equilibrium.
②	No elitism	Red curve wobbles — best can drop between generations	Elitism is the only thing that guarantees monotone improvement. Without it, a great chromosome can be destroyed by one bad crossover or mutation.
③	μ = 0.00 (no mutation)	Best plateaus wherever the initial population happened to sit; never reaches max	Mutation is the only operator that can introduce a gene value not present in the initial population. Without it, evolution = pure recombination of the founder gene pool.
④	μ = 0.30 (too high)	Best occasionally reaches max but unstable; mean stuck around ~10–12 (random-bitstring expectation)	Too much mutation overwrites everything selection just built. The GA degenerates into random search.
⑤	Tiny pop = 5	Curves very noisy across seeds (wide std band); often fails to converge	Small populations have high stochasticity — selection signal drowns in noise. Roulette wheel can repeatedly pick poor chromosomes by luck.
⑥	Large pop = 200	Smoothest curves; mean climbs highest; converges most reliably to max	Big populations = stable statistics + more diversity = nearly perfect convergence. Cost: 200/5 = 40× more fitness evaluations per generation.

How to read the overlay (bottom panel)

Variants ①, ⑥ climb fastest and stay at max — properly balanced GAs do their job.
Variant ⑤ (tiny pop) has a wide std band → some seeds get lucky, others don’t.
Variant ③ (no mutation) plateaus visibly below the max, no matter how long you run it.
Variant ④ (μ=0.30) has the most chaotic trajectory — too much randomness is fundamentally different from too little.

The take-away in one sentence

A good GA balances three forces: selection pressure (push toward fit), mutation (inject novelty), diversity preservation (avoid collapse). Variants ②–⑤ each break one of these and show what happens.

Why a small population fails (the classic Max-result trap)

Run with pop_size=6, generations=10, elitism=0 (the parameters of an earlier toy version) and you typically see:

gen 0:  [3, 3, 4, 4, 5, 7]   mean 4.33, max 7
gen 10: [3, 5, 6, 6, 7, 7]   mean 5.67, max 7  ← max did NOT improve!

Three failure modes visible at once:

No elitism → the best chromosome (fitness 7) can be destroyed by a single bad crossover. Best fitness flat-lines or even drops.
Tiny population (6) → fitness-proportional selection is highly stochastic. A garbage chromosome (fitness 3) can survive 10 generations purely by luck.
Too few mutation events → with pop=6, gens=10, rate=5%, you expect only ~5 mutation flips per generation across the whole population. Half are 1→0 (counterproductive). Not enough material to introduce missing 1s.

Fix recipe (in order of impact):

Add elitism → guarantee best fitness is monotone
Increase population to ~20–50 → less stochastic selection, more diversity
Increase generations to ~50+ → enough time to converge
Tune mutation rate — too low = stuck; too high = random search

Key parameters and their trade-offs

Parameter	Too low	Too high	Typical
Population size	High stochasticity; diversity collapses fast	Slow per generation; many wasted evaluations	20–200
Generations	Doesn’t converge	Wasted compute after plateau	50–500
Mutation rate	Population freezes (no new alleles)	Becomes random search	≈ 1 / chromosome_length
Elitism count	Best can be lost	Premature convergence	1–5 % of population
Selection pressure (tournament k)	Slow convergence	Diversity collapses	k = 2–5

Where GAs are still used today

GAs from the 1970s (Holland) are alive where mutation + crossover deliver something gradient-based methods cannot.

NASA ST5 antenna (2006) — the iconic example. Lohn et al. evolved a tiny S-band antenna for the ST5 satellite using a GA. The result was a bizarrely bent wire shape no human would design — and it outperformed conventional designs.
Neural Architecture Search (NAS) — Google’s AmoebaNet (Real et al., 2018) evolved CNN architectures that beat human-designed ones on ImageNet. An entire research direction in AutoML.
Game AI / NEAT — NeuroEvolution of Augmenting Topologies evolves neural networks for game agents. Famous via “MarI/O” (Super Mario), used in research and indie games.
Drug discovery — molecular structure generation uses GAs to evolve candidate compounds with desired pharmacological properties.
Industrial scheduling — job-shop scheduling in semiconductor fabs and other manufacturing lines uses GA hybrids.
Wind-farm layout — turbine placement to maximize energy while minimizing wake interference is a GA classic.
Quant trading — hedge funds use GAs to tune parameter sets across thousands of trading strategies.

Why GAs survive: when the search space is huge and discrete and no useful gradient exists — crossover provides a combination operator that can recombine partial solutions. Especially useful when “creativity” matters (antenna design, neural architectures, molecular structures).

Where GAs were replaced — and by what

Domain	Was GA, now …	Why
Neural network training (Edelman 1980s, NEAT until ~2010)	Backpropagation + SGD / Adam	Once the function is differentiable, gradients are exponentially more informative than blind mutation. GA-as-NN-trainer is now nostalgia.
Hyperparameter tuning	Bayesian Optimization → Optuna / Ray Tune	BayesOpt is more sample-efficient; GAs often need 100× more evaluations
Reinforcement learning policy search	PPO, SAC, A3C (Policy Gradients)	Policy-gradient RL uses reward signals directly, while GAs see only the final fitness
Neural Architecture Search	DARTS (Differentiable Architecture Search, Liu 2019)	DARTS makes architecture search differentiable → 1000× faster than evolutionary NAS
Drug discovery (molecular generation)	Diffusion models & SMILES Transformers	Generative models learn the distribution of “good” molecules from data instead of mutating blindly
Chess / Go strategy	AlphaZero (MCTS + DNN)	Self-play with a neural policy function beats evolved strategies by light-years

Where GAs still stand alone: when (a) the function is not differentiable, (b) domain knowledge is hard to encode in a generative model, and (c) the problem is genuinely discrete-combinatorial (e.g. antenna geometry, layout problems). The NASA ST5 antenna is exactly such a case — no differentiable formulation possible.

Brain Online

Explorer

Genetic Algorithms

Genetic Algorithms

When to reach for a GA

Worked example: the knapsack problem

Encoding as a chromosome

Encoding the fitness function

The algorithm in 5 steps

The three operators

Selection — who gets to reproduce

Crossover — combining good chunks

Mutation — introducing genuinely new genes

Demo: 6 GA variants side-by-side on the OneMax problem

What each variant teaches

How to read the overlay (bottom panel)

Why a small population fails (the classic Max-result trap)

Key parameters and their trade-offs

Where GAs are still used today

Where GAs were replaced — and by what

See also

Backlinks

Mika

✨ Features

⚙️ Einstellungen

📚 Chat-Verlauf

📖 Citation Manager

✍️ Writing Assistant

Inhaltsverzeichnis