Hopfield networks are recurrent neural networks that model associative memory: they store patterns as stable states (attractors) and can complete partial or noisy inputs to the stored pattern.
Basic principle
Fully connected recurrent network with symmetric weights (w_ij = w_ji) and no self-connections (w_ii = 0)
State of each neuron: ±1 (classical) or continuous
Energy function: E = −½ Σ_ij w_ij s_i s_j; asynchronous updates never increase this energy, so the network descends to a local minimum
Stable state (attractor): a local minimum of the energy function. Stored patterns are meant to be such minima, but spurious minima can also occur
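Why a single-neuron update cannot raise the energy can be sketched in a few lines. This is a sketch under the usual assumptions (symmetric weights, zero diagonal); the local field h_i and the updated state s_i' are symbols introduced here for illustration and do not appear in the notes above.

```latex
\begin{aligned}
h_i &= \sum_{j \neq i} w_{ij}\, s_j, \qquad s_i' = \operatorname{sign}(h_i), \\
E &= -\tfrac{1}{2} \sum_{i,j} w_{ij}\, s_i s_j
   \;\;\Rightarrow\;\; \text{the terms containing } s_i \text{ sum to } -s_i h_i, \\
\Delta E &= E(s') - E(s) = -(s_i' - s_i)\, h_i =
\begin{cases}
0 & \text{if } s_i' = s_i, \\
-2\,\lvert h_i \rvert \le 0 & \text{if } s_i' \neq s_i .
\end{cases}
\end{aligned}
```

Since the number of states is finite and the energy never goes up, asynchronous updating has to stop in a local minimum. The argument relies on only one neuron changing per step.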
Learning rule (Hebbian learning)
w_ij = Σ_μ x_i^μ · x_j^μ for i ≠ j (sum over all stored patterns μ), often scaled by 1/N
Neurons that are frequently co-active across the stored patterns get strong connections: a direct implementation of Hebb's rule
Capacity limit: about 0.14 × N random patterns (N = number of neurons) → beyond that: spurious attractors and retrieval errors
→ Model for human memory recall: hippocampus as an attractor network (CA3 region)
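The capacity limit can be probed numerically. The following is a minimal sketch, assuming random ±1 patterns and the 1/N-scaled Hebbian rule with zero diagonal; the helper names `hebbian_weights` and `stable_fraction` are made up for this illustration. It measures how many stored bits survive one sign update as the load P/N grows.

```python
import numpy as np

def hebbian_weights(patterns):
    """W = (1/N) * sum of outer products, with the diagonal set to zero."""
    n = patterns.shape[1]
    w = (patterns.T @ patterns).astype(float) / n
    np.fill_diagonal(w, 0.0)
    return w

def stable_fraction(n_neurons, n_patterns, rng):
    """Fraction of stored bits left unchanged by one synchronous sign update."""
    patterns = rng.choice([-1, 1], size=(n_patterns, n_neurons))
    w = hebbian_weights(patterns)
    fields = patterns @ w                    # row mu holds the local fields of pattern mu
    updated = np.where(fields >= 0, 1, -1)   # sign with ties broken toward +1
    return float(np.mean(updated == patterns))

rng = np.random.default_rng(0)
N = 200
for P in (5, 20, 28, 60, 100):
    frac = stable_fraction(N, P, rng)
    print(f"P/N = {P / N:.2f}  stable bits: {frac:.4f}")
```

At low load essentially every stored bit is stable; as P/N grows, more and more bits flip away from the stored pattern. A single update step only shows the onset of the problem. The 0.14·N figure refers to full retrieval dynamics with random patterns.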
Modern Hopfield Networks (Ramsauer et al., 2020)
Continuous version with exponential interaction
Exponentially larger capacity: the number of storable patterns grows exponentially with N (on the order of 2^(N/2) for the exponential interaction function)
Equivalence to Transformer self-attention: the update step of a modern Hopfield network has the same mathematical form as the softmax attention mechanism
This lets Transformers be read as systems that implement associative memory
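The correspondence with attention can be made concrete with one retrieval step. This is a minimal sketch, assuming the softmax update form ξ_new = Xᵀ softmax(β X ξ) with the stored patterns as rows of X; the function name `modern_hopfield_update`, the sizes M and d, and β = 1 are choices made here for illustration. In attention terms the rows of `memory` act as both keys and values, and the learned projection matrices of a real Transformer layer are left out.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def modern_hopfield_update(memory, query, beta=1.0):
    """One retrieval step: memory.T @ softmax(beta * memory @ query).

    memory: (M, d) array, one stored pattern per row
    query:  (d,) state vector
    """
    attention = softmax(beta * (memory @ query))   # (M,) weights over stored patterns
    return memory.T @ attention, attention

rng = np.random.default_rng(0)
M, d = 10, 64
memory = rng.choice([-1.0, 1.0], size=(M, d))

# Corrupt stored pattern 3 by flipping 10 of its 64 entries
query = memory[3].copy()
flip = rng.choice(d, size=10, replace=False)
query[flip] *= -1

retrieved, attention = modern_hopfield_update(memory, query, beta=1.0)
print("most attended pattern:", int(np.argmax(attention)))
print("sign pattern recovered:", bool(np.array_equal(np.sign(retrieved), memory[3])))
```

Because the softmax concentrates almost all weight on the best-matching row, a single step is typically enough here, in contrast to the repeated sign updates of the classical network.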
Neuroscientific relevance
Hippocampal CA3: recurrent collateral connections enable pattern completion
CA3 vs. DG/CA1: the hippocampus distinguishes pattern completion (CA3) from pattern separation (DG/CA1; in the dentate gyrus it is carried by the granule cells)
Simulation: pattern completion and energy descent
Figure: Hopfield recall of a noisy 5×5 glyph and energy descent

```python
# In a Pyodide (browser) environment, install the dependencies first:
#   import micropip
#   await micropip.install("numpy")
#   await micropip.install("matplotlib")
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(3)

# ---- Three 5x5 binary glyphs (+1 / -1) ----
def glyph(rows):
    a = np.array([[1 if c == "#" else -1 for c in r] for r in rows], dtype=float)
    return a.flatten()

P_X = glyph(["#...#", ".#.#.", "..#..", ".#.#.", "#...#"])
P_T = glyph(["#####", "..#..", "..#..", "..#..", "..#.."])
P_O = glyph([".###.", "#...#", "#...#", "#...#", ".###."])
patterns = np.vstack([P_X, P_T, P_O])
N = patterns.shape[1]

# ---- Hebbian weights: W = (1/N) Σ_μ x^μ (x^μ)^T, diagonal zeroed ----
W = np.zeros((N, N))
for p in patterns:
    W += np.outer(p, p)
W /= N
np.fill_diagonal(W, 0)  # no self-connections

def energy(y):
    return -0.5 * y @ W @ y  # E = -0.5 yᵀWy

# ---- Corrupt pattern X with noise (flip 8 of 25 bits) ----
target = P_X.copy()
noisy = target.copy()
flip_idx = rng.choice(N, size=8, replace=False)
noisy[flip_idx] *= -1

# ---- Asynchronous update until convergence, track energy ----
def recall_async(W, x, max_sweeps=20, seed=1):
    r = np.random.default_rng(seed)
    y = x.copy()
    energies = [energy(y)]
    for _ in range(max_sweeps):
        changed = False
        for i in r.permutation(N):
            s = np.sign(W[i] @ y)  # async update: sign of the local field
            if s == 0:
                s = 1              # tie-break toward +1
            if s != y[i]:
                y[i] = s
                changed = True
                energies.append(energy(y))  # record energy after each flip
        if not changed:
            break
    return y, np.array(energies)

recalled, energies = recall_async(W, noisy)

# ---- Figure ----
fig = plt.figure(figsize=(13, 6))
gs = fig.add_gridspec(2, 4, width_ratios=[1, 1, 1, 1.6])

def show(ax, vec, title):
    ax.imshow(vec.reshape(5, 5), cmap="gray_r", vmin=-1, vmax=1)
    ax.set_title(title, fontsize=10)
    ax.set_xticks([])
    ax.set_yticks([])

show(fig.add_subplot(gs[0, 0]), P_X, "stored: X")
show(fig.add_subplot(gs[0, 1]), P_T, "stored: T")
show(fig.add_subplot(gs[0, 2]), P_O, "stored: O")
show(fig.add_subplot(gs[1, 0]), noisy, "noisy input\n(8/25 flipped)")
show(fig.add_subplot(gs[1, 1]), recalled, "recalled")
show(fig.add_subplot(gs[1, 2]), target, "original X")

ax_e = fig.add_subplot(gs[:, 3])
ax_e.plot(energies, "o-", color="#8e44ad", ms=4, lw=1.6)
ax_e.set_title("Network energy E = −½ yᵀWy\n(monotonically non-increasing)", fontsize=10)
ax_e.set_xlabel("async update step (per flipped neuron)")
ax_e.set_ylabel("energy E")
ax_e.grid(alpha=0.3)

fig.suptitle("Hopfield associative memory: recall a stored pattern from noise", fontsize=12, y=1.00)
plt.tight_layout(rect=[0, 0, 1, 0.97])
plt.subplots_adjust(hspace=0.45)
plt.show()
```
What this shows. Three 5×5 binary glyphs (X, T, O) are stored as attractors via a single Hebbian outer-product sum W = Σ_μ x^μ(x^μ)ᵀ (scaled by 1/N, diagonal zeroed): no iterative training, just one matrix operation. Pattern X is then corrupted by flipping 8 of its 25 bits, and the network runs asynchronous sign(Wy) updates (one neuron at a time, in random order) until no neuron flips. In this run the recalled image is bit-for-bit identical to the original X: this is content-addressable memory, or pattern completion, the behaviour the CA3-hippocampus analogy refers to. The right panel tracks the energy E = −½ yᵀWy after every neuron flip. It never increases, which is the lecture's guarantee E(y^(t+1)) ≤ E(y^(t)) and the reason the dynamics must settle in a local minimum. That minimum is a stored pattern only if the input starts inside its basin of attraction; otherwise the network can end in a spurious state. The same retrieval-by-energy-minimisation, in its continuous form, corresponds to the softmax update of Transformer self-attention (Modern Hopfield Networks, Ramsauer et al. 2020).
Mini demo: storing and retrieving a pattern
Hebbian learning is a single line (W = Σ x·xᵀ), and retrieval is a sign iteration repeated until the state stops changing. Here the network is given a corrupted version of a stored pattern and should settle back on the stored original:
```python
import numpy as np

def train(patterns):
    """Hebbian learning: weights = sum of outer products. Diagonal zeroed."""
    N = patterns.shape[1]
    W = (patterns.T @ patterns).astype(float)  # Σ_μ x^μ (x^μ)^T
    np.fill_diagonal(W, 0)                     # no self-connections
    return W / N

def recall(W, x, max_iter=20):
    """Synchronous update: x_i ← sign(Σ_j w_ij · x_j) until stable.

    Synchronous updates can oscillate, so the state returned after
    max_iter steps is not guaranteed to be a fixed point.
    """
    for _ in range(max_iter):
        x_new = np.sign(W @ x).astype(int)
        x_new[x_new == 0] = 1  # tie-break
        if np.array_equal(x_new, x):
            return x
        x = x_new
    return x

# Store two 6-bit patterns
patterns = np.array([
    [ 1,  1,  1, -1, -1, -1],  # "left-half on"
    [-1,  1, -1,  1, -1,  1],  # "alternating"
])
W = train(patterns)

# Present a corrupted version of pattern 1 (last bit flipped).
# With two bits flipped, e.g. [1, -1, 1, 1, -1, -1], this synchronous
# rule alternates between two states and never reaches pattern 1.
noisy = np.array([1, 1, 1, -1, -1, 1])
print("input:    ", noisy)
print("retrieved:", recall(W, noisy))
# → [ 1  1  1 -1 -1 -1]  converged to stored pattern 1
```
What the code shows:
Storage is one matrix operation, with no training loop. That is both the strength (one-shot learning) and the weakness (the weights are never tuned to the statistics of the data).
Retrieval is energy minimization: every asynchronous update step leaves E unchanged or lowers it → convergence to an attractor (a stored pattern, or a spurious "ghost" minimum under overload). Synchronous updates lack this guarantee and can oscillate between two states.
With more than about 0.14·N patterns, the attractors start to interfere and retrievals go wrong: the well-known Hopfield capacity limit.
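The difference between update schedules shows up already with two neurons. The following is a minimal sketch with hypothetical toy weights (not taken from the demo above); the helpers `sync_step` and `async_sweep` are named here for illustration only.

```python
import numpy as np

# Two neurons, symmetric coupling, zero diagonal
W = np.array([[0.0, 1.0],
              [1.0, 0.0]])

def energy(W, s):
    return -0.5 * s @ W @ s

def sync_step(W, s):
    """Update all neurons at once from the same old state (ties go to +1)."""
    return np.where(W @ s >= 0, 1, -1)

def async_sweep(W, s):
    """Update neurons one at a time, each seeing the latest state."""
    s = s.copy()
    for i in range(len(s)):
        s[i] = 1 if W[i] @ s >= 0 else -1
    return s

start = np.array([1, -1])

# Synchronous: the two neurons keep swapping values (a 2-cycle)
s = start
sync_states = [s]
for _ in range(4):
    s = sync_step(W, s)
    sync_states.append(s)
print("sync :", [st.tolist() for st in sync_states])

# Asynchronous: the energy drops and the state settles
a = async_sweep(W, start)
print("async:", a.tolist(), "energy", energy(W, start), "->", energy(W, a))
```

The synchronous run alternates between [1, -1] and [-1, 1] forever, while the asynchronous sweep settles in [-1, -1], with the energy falling from 1.0 to -1.0. This is why the energy argument is stated for one-neuron-at-a-time updates.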
Where Hopfield Networks are still used today
Classical Hopfield Networks (Hopfield 1982) are mostly historical, but the idea of energy-based associative memory has had a major revival.
Modern Hopfield Networks (Ramsauer et al., 2020) — continuous version with exponential capacity. Mathematically equivalent to Transformer self-attention — every attention layer can be read as one step of Hopfield retrieval over a continuous memory.
Biologically inspired memory models — used in computational neuroscience to model hippocampal CA3 pattern completion (the “remember a face from a glimpse” mechanism).
A³-Bench (2024) — benchmark that explicitly tests LLMs for Hopfield-style attractor reasoning.
Content-addressable memory in robotics & sensor fusion — small-scale systems that need to retrieve patterns from noisy partial cues.
Energy-based models (EBMs) revival: LeCun's recent work on Joint-Embedding Predictive Architectures (JEPA) is framed in terms of energy functions, a conceptual relative of Hopfield-style energy landscapes.
Where Hopfield Networks were replaced — and by what
| Domain | Was Hopfield, now … | Why |
| --- | --- | --- |
| General associative memory in ML | Transformer self-attention | Modern Hopfield Networks use the same math as attention; Transformers won because they integrate cleanly with backprop and scale to billions of parameters |
| Pattern recognition (early 1980s use) | CNNs, MLPs trained with backprop | Hopfield's storage capacity is ~0.14·N, with catastrophic interference past that; deep nets trained with backprop add capacity by adding parameters and learn from data statistics |
| Combinatorial optimization (Hopfield-Tank) | Simulated annealing (SA), ILP solvers | Hopfield-Tank optimization was unreliable; SA and ILP solvers dominate combinatorial optimization |
Where Hopfield ideas still stand: when you need a retrieval mechanism that can be analyzed in terms of energy minimization and attractor dynamics — particularly in neuroscience-inspired AI and EBM research. Pure classical Hopfield Networks are rarely deployed.