Consistent assignment — no constraint is currently violated. Can be partial or complete.
A partial assignment is consistent if every constraint between already-assigned variables holds. It says nothing about unassigned ones.
Example consistent partial:{WA = R, NT = G} — WA ≠ NT holds ✓. The fact that we haven’t yet decided SA is irrelevant.
Example inconsistent partial:{WA = R, NT = R} — violates WA ≠ NT ✗.
Solution — an assignment that is complete AND consistent. Both conditions are required.
⚠️ Common confusion: “consistent” alone is weaker than “solution”. You can have a consistent partial assignment that cannot be extended to a solution — and that is exactly when backtracking happens.
Properties of the whole CSP
Consistent CSP — at least one solution exists. The CSP as a whole is solvable.
Inconsistent CSP — no solution exists at all. Every complete assignment violates something.
Example: Australia with only 2 colors {R, G} is inconsistent — the subgraph around SA contains an odd cycle (NT–Q–NSW–V–SA–NT, length 5) and odd cycles are not 2-colorable.
⚠️ Note the double use of “consistent”: about an assignment (no violation now) vs. about a CSP (a solution exists). Context disambiguates.
Search-tree vocabulary
Search-tree node — one partial assignment reached during search. Every node represents the decisions made on the path from ROOT.
Branching — at each node we choose one variable and try its values; each value becomes a child node.
Backtracking — when a branch turns out to lead nowhere, we undo the last assignment and try the next value. The visual sign: the search “goes back up” the tree.
Dead end (✗) — a node where a constraint is violated or a forward-checked domain became empty. Search cannot continue here — backtrack.
Solution leaf (✓) — a leaf node where the assignment is both complete and consistent.
Pruning — cutting branches we know cannot lead to a solution, without exploring them. The cheaper the pruning rule, the more often we afford to apply it.
Pruning techniques (cheap → expensive)
Node Consistency (NC) — 1-consistency, the cheapest level of all. For each variable Xᵢ, remove from D(Xᵢ) every value that violates a unary constraint Pᵢ on Xᵢ (e.g. “WA ≠ R”). A single pass over the domains; run once before everything else — AC-3 and the k-consistency definitions assume it already holds (slides: PC = 3-consistency provided node consistency holds). If any domain empties → CSP inconsistent. Cost: O(n · d) (= O(Σᵢ |Dᵢ|)).
Forward Checking (FC) — after assigning X = v, remove v from the domain of every unassigned neighbor of X. If some neighbor’s domain becomes empty (domain wipeout), the branch is dead → backtrack one step earlier than naive BT. Cheapest pruning during search. Cost per step: O(d · |neighbours|).
Arc Consistency (AC) / AC-3 — preprocessing. For every binary constraint between Xᵢ and Xⱼ, ensure that every value in D(Xᵢ) has at least one supporting value in D(Xⱼ). AC-3 propagates this until no more values can be removed. Can detect inconsistency before search even starts, but is still only 2-consistency — misses triangle-style failures. Cost: O(c · d³).
Path Consistency (PC) / PC-2 — one level up from arc consistency. For every triple (Xᵢ, Xⱼ, Xₘ): for every arc-consistent assignment of (Xᵢ, Xⱼ), there must exist a value for Xₘ that is arc-consistent with both. Catches inconsistencies AC misses — e.g. the classic triangle with 2 colours {R, B} is arc-consistent but path-inconsistent (no value for the third node works once two pairwise-≠ constraints are committed). Cost: O(n³ · d⁵) — too expensive to run as standard preprocessing; reserved for tight, small CSPs.
k-Consistency (general form) — any consistent assignment to (k − 1) variables can be extended to a k-th. AC = 2-consistency. PC = 3-consistency (slides explicitly add: provided node consistency holds). Higher k → catches deeper interactions, but cost grows exponentially.
Strong k-consistency — j-consistent for all j ≤ k. Strictly stronger than plain k-consistency. Strong n-consistency (n = number of variables) ⇒ greedy assignment finds a solution without backtracking — but the cost of enforcing it is typically larger than just searching.
MAC (Maintaining Arc Consistency) — search-time variant: run a full AC-3 pass on the remaining sub-CSP after each assignment. More aggressive than FC; less aggressive than enforcing path consistency at every step.
Summary of the hierarchy.
Technique
Consistency level
When run
Cost
Catches
Node Consistency
1-consistency
preprocessing
O(n·d)
unary-constraint violations
Forward Checking
local 1-step
during search
very cheap
pairwise wipeouts after each commit
AC-3
2-consistency
preprocessing
O(c·d³)
pairwise unsupported values
PC-2
3-consistency
preprocessing
O(n³·d⁵)
triangle / 3-variable interactions (e.g. odd cycles in 2-colouring)
k-Consistency
k-consistency
preprocessing
exponential in k
k-variable interactions
Strong n-consistency
full
preprocessing
exponential
everything ⇒ no backtracking needed
Variable & value ordering heuristics
MRV (Minimum Remaining Values) — choose the next variable with the fewest values left. Purpose: fail-fast — if a subtree is doomed, find out near the root. ⚠ Also called “First-Fail” in the WiSe 2024_25 slides (slide deck Session 03 explicitly says “First-Fail Variable Selection Heuristic (Minimum Remaining Values)”) and “Most-Constrained Variable” in some textbooks — three names, same algorithm.
Degree heuristic — picks the variable involved in the most constraints with unassigned variables. ⚠️ The slides (Session 03, slide 19) present this as a standalone alternative to MRV, not as a tiebreaker. R&N additionally use it as MRV’s tiebreaker, which is the convention used in the quiz.
LCV (Least Constraining Value) — choose the next value (for the variable picked by MRV) that rules out the fewest options for neighbors. Purpose: fail-slow within a variable — keep our future options open.
⚠️ MRV is for variable selection, LCV for value selection — they do not conflict and are typically used together.
Key Algorithms / Concepts
CSP Formal Definition
Tuple ({X₁, …, Xₙ}, {D₁, …, Dₙ}, R)
Variables Xᵢ, domains Dᵢ (possible values), constraints R (relations over variables)
How naive BT actually works (common misconception cleared up):
The variable order is FIXED, not random — usually the declaration order (e.g. WA → NT → Q → NSW → V → SA → T). Naive BT always processes variables in this same sequence. (Picking the variable smartly is exactly what MRV adds on top.)
For each variable, try values in domain order, one at a time.
For each value, check whether it violates any constraint with already-assigned variables:
Violated → skip, try next value.
OK → commit it, recurse to the next variable.
If no value works for the current variable → backtrack: undo the last commitment in the caller and continue with its next value.
Stop at the first complete consistent assignment (= solution). Naive BT does not enumerate all solutions — it returns the first one it finds.
When does the tree get fully exhausted? Only when no solution exists (e.g. 2-colour Australia). Then BT has to walk every branch to prove infeasibility — that’s why the 2-colour tree has 22 nodes vs. the 3-colour tree’s 11.
What never gets tried: any partial assignment that already violates a constraint, and any subtree pruned because an ancestor decision led to a dead end. In the 3-colour tree, naive BT visits 11 nodes — it never even explores the WA = G or WA = B subtrees because WA = R succeeded.
Heuristics that change this behaviour:
MRV (Minimum Remaining Values) — replace “fixed declaration order” with “smallest current domain” → fail early.
Degree heuristic — tiebreaker for MRV: among ties, pick the variable with the most constraints on unassigned variables.
LCV (Least Constraining Value) — replace “domain order” with “value that rules out fewest neighbour options” → keep options open.
Forward Checking / MAC — after committing, prune neighbour domains so that future dead ends get caught earlier or never reached.
When are neighbour constraints actually checked? ⭐
A natural confusion: when WA gets assigned, WA has neighbours NT and SA — but only the fixed variable order tells us when those will be assigned. Where do we verify that WA’s choice is compatible with them?
Rule. Each binary constraint is checked exactly once: at the moment the second of its two variables gets assigned. The first one to be assigned simply commits — it has no information yet about its still-unassigned neighbours.
Trace for 3-colour Australia, order WA → NT → Q → NSW → V → SA → T:
Step
Var assigned
Already-assigned neighbours
Constraints checked NOW
1
WA = R
(none yet)
— nothing to check
2
NT = G
WA
NT ≠ WA
3
Q = R
NT
Q ≠ NT
4
NSW = G
Q
NSW ≠ Q
5
V = R
NSW
V ≠ NSW
6
SA = B
WA, NT, Q, NSW, V
SA ≠ WA, SA ≠ NT, SA ≠ Q, SA ≠ NSW, SA ≠ V (5 at once!)
7
T = R
(none — T isolated)
—
So:
WA ≠ NT is checked at step 2 (when NT lands).
WA ≠ SA is checked at step 6 (when SA lands).
NT ≠ SA is also checked at step 6.
WA itself never has to “look ahead” at SA — it just sits there with value R. When SA finally gets its turn, SA looks back at WA (and NT, Q, NSW, V) and must respect all of them.
Why SA is the bottleneck. By the time we reach SA in the fixed order, all 5 of its neighbours are locked in. SA gets very little freedom: it must satisfy 5 constraints simultaneously, with only 3 colours to choose from. This is exactly why MRV would have picked SA first instead of last — discover the bottleneck early, fail fast if needed.
This is also why forward checking helps. Instead of waiting until SA’s turn to discover conflicts, FC removes WA’s value from D(SA) the moment WA is assigned — so SA never even tries WA’s colour and a wipeout is detected immediately if SA’s domain runs out.
Same problem, every algorithm — side by side 🔍
Australia map coloring with 3 colours (R, G, B). Below is the exact same problem solved by each algorithm, in the same table format, so you can see what changes per technique.
Adjacency reminder
WA — NT — Q
\ / \ /
SA — NSW
/ \
V (T isolated)
SA touches 5 neighbours (WA, NT, Q, NSW, V). T touches 0.
Same variable / value order as Algorithm 1. After each assignment, remove the just-chosen value from every unassigned neighbour’s domain. If a neighbour’s domain becomes empty (wipeout) → backtrack immediately.
Step
Var = val
Domain pruning after FC
Wipeout?
1
WA = R
D(NT) = {G, B}, D(SA) = {G, B}
no
2
NT = G (NT’s domain is now {G,B} — G is first)
D(SA) = {B}, D(Q) = {R, B}
no
3
Q = R (Q’s domain {R,B})
D(NSW) = {G, B}, D(SA) = {B} unchanged
no
4
NSW = G (NSW’s domain {G,B})
D(V) = {R, B}, D(SA) = {B} unchanged
no
5
V = R (V’s domain {R,B})
D(SA) = {B} unchanged
no
6
SA = B (SA’s domain {B} — only choice)
(no unassigned neighbour left)
no
7
T = R (unchanged domain {R,G,B})
—
no
Result: solution, 7 nodes expanded. Steps 2, 4, 6 had no failed value attempts because FC had already pruned the conflicting options before search reached them.
Algorithm 3 — Backtracking + MRV + Degree (no FC)
MRV picks the variable with the smallest current domain; degree tiebreaks. Without FC, all domains stay {R, G, B} → every step ties on size, so degree to unassigned neighbours decides.
Step
Var picked by MRV (why)
Already-assigned neighbours
Value chosen (domain order)
Constraints checked NOW
1
SA (degree 5: WA, NT, Q, NSW, V)
(none yet)
R
—
2
NT (degree 2: WA, Q; tied with Q, NSW, NSW — first by decl)
SA
G (R fails: NT=R conflicts with SA=R)
NT ≠ SA
3
Q (degree 2: NSW; tied with NSW)
SA, NT
B (R fails: Q=R conflicts with SA; G fails: NT=G)
Q ≠ SA, Q ≠ NT
4
NSW (degree 1: V)
SA, Q
G (R fails: SA; B fails: Q)
NSW ≠ SA, NSW ≠ Q
5
V (degree 0)
SA, NSW
B (R fails: SA; G fails: NSW)
V ≠ SA, V ≠ NSW
6
WA (no unassigned neighbours)
SA, NT
B (R fails: SA; G fails: NT)
WA ≠ SA, WA ≠ NT
7
T (isolated)
—
R
—
Result: solution, 11 nodes expanded (similar to naïve here because no FC compounding). MRV’s value comes from order — SA’s bottleneck is dealt with first, so the rest of the search has full information about the most constrained variable from step 1.
Algorithm 4 — Backtracking + MRV + LCV + FC (the standard industrial recipe)
Step
Var by MRV (why)
Value by LCV (why)
Domain pruning after FC
Constraints checked NOW
1
SA (size 3, degree 5 wins tie)
R (all 3 values rule out the same options → tie, pick R)
D(WA)=D(NT)=D(Q)=D(NSW)=D(V) = {G, B}
—
2
NT (size 2; degree 2 to {WA, Q})
G (G and B equally constrain neighbours → tie)
D(WA) = {B}, D(Q) = {B}
NT ≠ SA
3
Q (size 1; degree 1 to NSW; beats WA which has degree 0)
B (only choice)
D(NSW) = {G} (B removed)
Q ≠ SA, Q ≠ NT
4
NSW (size 1)
G (only choice)
D(V) = {B} (G removed)
NSW ≠ SA, NSW ≠ Q
5
WA (size 1) or V (size 1) — declaration order picks WA
B
(no unassigned neighbour)
WA ≠ SA, WA ≠ NT
6
V (size 1)
B
—
V ≠ SA, V ≠ NSW
7
T (size 3)
R (LCV: T has no neighbours, tie)
—
—
Result: solution {SA=R, NT=G, Q=B, NSW=G, WA=B, V=B, T=R}, 7 nodes, zero backtracking. Notice: completely different solution than Algorithm 1, but equally valid.
Algorithm 5 — AC-3 preprocessing (no search needed yet)
Run AC-3 on the initial CSP before backtracking starts.
Arc processed
Pruning
All arcs (WA,NT), (NT,WA), (WA,SA), (SA,WA), …
For every value in D(Xᵢ), check if some value in D(Xⱼ) differs. With D = {R, G, B}, every value has 2 supporters → nothing removed.
Result: domains unchanged. Initial AC-3 is useless on this CSP because each value has plenty of supporters. AC-3 only kicks in once unary constraints (e.g. “WA = R” hard-coded) or earlier assignments have shrunk some domain enough for support to disappear. Backtracking still has to do all the work.
(Contrast: on the 2-colour version of the same problem, AC-3 also doesn’t help — the CSP is arc-consistent despite being infeasible. Only path-consistency or stronger could catch that.)
Algorithm 6 — MAC (Maintaining Arc Consistency)
MAC = FC, but instead of one-step pruning it runs a full AC-3 pass on the remaining sub-CSP after each assignment. With MRV+LCV+MAC on Australia:
Step
Assignment
MAC propagation beyond FC
1
SA = R
FC: 5 neighbour domains shrink to {G, B}. AC-3 propagation: check arc (NT, Q): for NT=G, Q must have a non-G value → Q has B ✓. Symmetric for NT=B → no further pruning.
2
NT = G
FC: D(Q) = {B}, D(SA) = {B} (already). AC-3 propagation: arc (NSW, Q) — for NSW=G, Q must ≠G → Q={B} works ✓. For NSW=B, Q must ≠B → Q={B} fails → remove NSW=B, so D(NSW) = {G}. Then arc (V, NSW): V=G fails (NSW=G), V=B is fine, V=R is fine → D(V) = {R, B}. Then arc (V, SA): SA={B} forces V=R or any non-B → no further change.
3
Q = B
(forced — only value left) FC: no new removals. AC-3 nothing to propagate.
4
NSW = G
(forced) FC: D(V) loses G → D(V) = {R, B}.
5
V = R
(LCV picks R since SA=R already not blocking V — actually V≠SA so V≠R → V must be B) Correction: D(V) became {R,B} but V≠SA constraint means V≠R, so D(V) = {B}. V = B forced.
6
WA
(only value)
7
T
(any)
Result: solution in 7 nodes, with more aggressive pruning per step. MAC’s value on small CSPs is similar to FC. On hard CSPs (e.g. tightly constrained Sudoku, NFL scheduling) MAC can shave orders of magnitude off the node count compared to FC.
Algorithm 7 — Min-Conflicts (Local Search)
Completely different paradigm: start from a random complete assignment (probably violating), then repeatedly pick a conflicted variable and reassign to the value that minimises conflicts.
After a few more flips Min-Conflicts settles on a valid solution like {WA=G, NT=B, Q=R, NSW=B, V=R, SA=G, T=R}. No tree, no backtracking — just local greedy flips.
Trade-off: super fast on huge CSPs (n-queens for n = 1 000 000 solves in milliseconds), but incomplete — can get stuck on plateaus and cannot prove a CSP unsolvable.
Summary — same problem, different algorithm cost
Algorithm
Nodes expanded
Backtracking?
Solution found
Naïve Backtracking
11
Yes (steps 2, 4, 6)
{WA=R, NT=G, Q=R, NSW=G, V=R, SA=B, T=R}
BT + FC
7
No
same as above
BT + MRV + Degree (no FC)
11
A bit (forced fails)
varies by order
BT + MRV + LCV + FC ⭐
7
No
{SA=R, NT=G, Q=B, NSW=G, WA=B, V=B, T=R}
AC-3 only (preprocessing)
0 reductions
—
nothing — domains untouched
MAC + MRV + LCV
7
No
same as MRV+LCV+FC, smarter pruning
Min-Conflicts
~4–8 flips (not nodes)
—
a different valid solution
Big takeaway. On this tiny CSP all good combinations end at ~7 nodes — but the paths are very different. On large CSPs:
FC alone shrinks the tree by a constant factor.
MRV+LCV+FC shrinks the order of magnitude.
MAC adds yet more savings on hard instances.
AC-3 preprocessing only helps when domains can actually be pruned (unary constraints, tight constraint chains like A<B<C).
Min-Conflicts wins when the CSP is huge but easy somewhere — wrong tool for proving infeasibility.
Consistency Concepts
Type
Definition
Algorithm
Node consistency
Every value in Dᵢ satisfies unary constraints on Xᵢ
Remove invalid values
Arc consistency
For every Xᵢ, every value has a support in Xⱼ (AC-1, AC-3)
AC-3
Path consistency
For consistent (Xᵢ, Xⱼ) assignment, ∃ Xₘ consistent with both
Hard to compute
k-consistency
Any consistent (k-1)-assignment extends to k variables
Strong k-consistency
AC-3 Algorithm (Arc Consistency)
Put all arcs (Xᵢ, Xⱼ) in a queue.
For each arc: remove values from Dᵢ that have no support in Dⱼ.
If Dᵢ shrinks, re-add all arcs pointing to Xᵢ.
Repeat until queue empty. If any domain becomes empty → inconsistent.
Time complexity: O(cd³) where c = constraints (arcs), d = domain size. (from R&N textbook — not stated explicitly in slides)
Local Search for CSPs
State space: all complete assignments (even violated ones).
Neighborhood: change one variable’s value.
Objective: maximize number of satisfied constraints.
If globally optimal: CSP is consistent (solution found). If locally optimal but f(s) < constraints: can’t conclude (may be stuck or CSP inconsistent).
RCC-8 (Spatial Relations) ⚠️ (not in Session 03 slides — covered in KR I/II)
Based on connection predicate C(x,y). No need for metric — topology-based.
Formulas & Notation
Symbol
Meaning
Xᵢ
Variable i
Dᵢ
Domain of variable i
R
Set of constraints
AC-3 complexity
O(cd³)
MRV
Minimum Remaining Values heuristic
LCV
Least Constraining Value heuristic
Common Exam Traps ⚠️
AC-3 removes values from domains — it changes the CSP structure, not just tests it.
Arc consistency ≠ path consistency: 2-consistent doesn’t mean 3-consistent.
LCV is for value ordering (keep options open); MRV is for variable ordering (fail early). Different goals, don’t mix.
If local search finds global optimum on CSP objective → CSP is consistent. If local optimum with violations → you can’t tell.
Strong k-consistency means j-consistent for all j ≤ k. Regular k-consistency only guarantees at level k.
Backtracking is not local search — it uses partial assignments; local search uses complete (possibly invalid) assignments.
Quick Comparison Table
Approach
Assignment type
Completeness
When to use
Backtracking
Partial
Complete
General CSPs
Local Search
Complete (may violate)
Incomplete
Large CSPs, soft constraints OK
AC-3 preprocessing
N/A
—
Before any search (reduce domains)
MRV
Variable selection heuristic
—
Always good to combine with BT
Visualisations (Python)
These three figures used to live in the lernzettel. They illustrate the Australia map-coloring CSP and what each solving strategy does to the search tree. Render each toggle and read the explanation underneath.
🐍 Figure — Australia constraint graph with solution coloring
# Pyodide / Obsidian Execute Code: install matplotlib first.import micropipawait micropip.install("matplotlib")import matplotlib.pyplot as pltfrom matplotlib.patches import Circle, FancyBboxPatch# Adjacency: "must differ" constraint between neighborsNEIGHBORS = { "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"], "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"], "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],}# A valid 3-coloring (found by the solver below)SOLUTION = {"WA": "R", "NT": "G", "Q": "R", "NSW": "G", "V": "R", "SA": "B", "T": "R"}# Rough geographic layout of Australian territoriesPOS = { "WA": (0.5, 2.5), "NT": (2.5, 4.0), "Q": (4.5, 3.5), "SA": (2.5, 2.0), "NSW": (4.5, 1.5), "V": (4.0, 0.3), "T": (5.0, -0.8),}COLOR_MAP = {"R": "#e74c3c", "G": "#27ae60", "B": "#3498db"}fig, ax = plt.subplots(figsize=(8, 6))# Draw constraint edges (each = "≠" between neighbors)drawn = set()for v, nbs in NEIGHBORS.items(): for n in nbs: if (n, v) in drawn: continue x1, y1 = POS[v]; x2, y2 = POS[n] ax.plot([x1, x2], [y1, y2], "-", color="#999", lw=2, zorder=1) drawn.add((v, n))# Draw nodes coloured by the solutionfor v, (x, y) in POS.items(): col = COLOR_MAP[SOLUTION[v]] ax.add_patch(Circle((x, y), 0.42, facecolor=col, edgecolor="black", lw=2, zorder=2)) ax.text(x, y, v, ha="center", va="center", fontsize=12, fontweight="bold", color="white", zorder=3)# Legendfor label, col in COLOR_MAP.items(): ax.plot([], [], "o", color=col, markersize=14, label=label)ax.legend(loc="upper left", fontsize=11, framealpha=0.9)ax.set_title("Australia Map Coloring — constraint graph with a valid 3-colouring\n" "(edges = 'must differ' constraints; T is isolated → free choice)", fontsize=11)ax.set_xlim(-0.5, 6.5); ax.set_ylim(-1.6, 5.0)ax.set_aspect("equal"); ax.axis("off")plt.tight_layout(); plt.show()
What this graph shows. The 7 nodes are the 7 Australian territories (WA, NT, SA, Q, NSW, V, T). Every grey edge represents one binary constraint “these two neighbours must have different colours.” The fill colour of each node is one valid 3-colouring (a solution): WA=R, NT=G, SA=B, Q=R, NSW=G, V=R, T=R.
Things to notice.
SA is the central hub with 5 neighbours (WA, NT, Q, NSW, V). It is the most constrained variable — which is why MRV+degree picks it first during search.
T is isolated (Tasmania has no land borders). It can take any colour.
Every edge connects two differently-coloured nodes ⇒ the colouring is consistent, and since all 7 variables are filled, it is a complete & consistent assignment = solution.
# Pyodide / Obsidian Execute Code: install matplotlib first.import micropipawait micropip.install("matplotlib")from collections import dequefrom dataclasses import dataclass, fieldimport matplotlib.pyplot as pltfrom matplotlib.patches import FancyBboxPatchVARIABLES = ["WA", "NT", "Q", "NSW", "V", "SA", "T"]NEIGHBORS = { "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"], "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"], "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],}def consistent(var, value, assignment): return all(assignment.get(n) != value for n in NEIGHBORS[var])@dataclassclass Node: var: str = None value: str = None children: list = field(default_factory=list) status: str = "ok"def solve(colors, use_mrv=False, use_lcv=False, forward_check=False): domains = {v: list(colors) for v in VARIABLES} root = Node() def select_var(asgn, dom): un = [v for v in VARIABLES if v not in asgn] if use_mrv: return min(un, key=lambda v: (len(dom[v]), -sum(1 for n in NEIGHBORS[v] if n not in asgn))) return un[0] def order_vals(var, asgn, dom): if not use_lcv: return list(dom[var]) return sorted(dom[var], key=lambda val: sum( 1 for nb in NEIGHBORS[var] if nb not in asgn and val in dom[nb])) def bt(asgn, dom, parent): if len(asgn) == len(VARIABLES): parent.status = "solution"; return True var = select_var(asgn, dom) for val in order_vals(var, asgn, dom): child = Node(var=var, value=val) parent.children.append(child) if not consistent(var, val, asgn): child.status = "fail"; continue asgn[var] = val saved = {}; wipeout = False if forward_check: for n in NEIGHBORS[var]: if n not in asgn and val in dom[n]: saved.setdefault(n, list(dom[n])) dom[n] = [d for d in dom[n] if d != val] if not dom[n]: wipeout = True if not wipeout and bt(asgn, dom, child): return True for n, d in saved.items(): dom[n] = d del asgn[var] if child.status == "ok": child.status = "fail" return False bt({}, domains, root) return root# --- tree layout: assign x via leaf positions, y via depth ---def layout(root): positions = {} leaf_counter = [0] def visit(node, depth): if not node.children: positions[id(node)] = (leaf_counter[0], -depth) leaf_counter[0] += 1 return xs = [] for c in node.children: visit(c, depth + 1) xs.append(positions[id(c)][0]) positions[id(node)] = ((min(xs) + max(xs)) / 2, -depth) visit(root, 0) return positionsCOLOR_MAP = {"R": "#e74c3c", "G": "#27ae60", "B": "#3498db"}EDGE_COLOR = {"ok": "#888", "fail": "#c0392b", "solution": "#27ae60"}def draw_tree(ax, root, title): pos = layout(root) def edges(node): for c in node.children: yield (node, c); yield from edges(c) # edges for parent, child in edges(root): x1, y1 = pos[id(parent)]; x2, y2 = pos[id(child)] ax.plot([x1, x2], [y1, y2], "-", color=EDGE_COLOR[child.status], lw=1.5, alpha=0.7, zorder=1) # nodes for node, (x, y) in [(n, pos[id(n)]) for n in _all(root)]: if node.var is None: ax.text(x, y, "ROOT", ha="center", va="center", fontsize=8, fontweight="bold", bbox=dict(boxstyle="round,pad=0.3", fc="#34495e", ec="black"), color="white", zorder=3) else: fc = COLOR_MAP[node.value] ec = {"ok": "black", "fail": "#c0392b", "solution": "#27ae60"}[node.status] lw = 2.5 if node.status in ("fail", "solution") else 1 mark = {"ok": "", "fail": " ✗", "solution": " ✓"}[node.status] ax.text(x, y, f"{node.var}={node.value}{mark}", ha="center", va="center", fontsize=7, color="white", fontweight="bold", bbox=dict(boxstyle="round,pad=0.25", fc=fc, ec=ec, lw=lw), zorder=3) ax.set_title(title, fontsize=10) ax.axis("off")def _all(node): yield node for c in node.children: yield from _all(c)# --- build trees ---t1 = solve(["R","G","B"], use_mrv=False, use_lcv=False, forward_check=False)t2 = solve(["R","G"], use_mrv=False, use_lcv=False, forward_check=False)def count(n): return 1 + sum(count(c) for c in n.children)fig, axes = plt.subplots(1, 2, figsize=(14, 7))draw_tree(axes[0], t1, f"3 colours · naive BT — {count(t1)-1} nodes · SOLUTION FOUND ✓")draw_tree(axes[1], t2, f"2 colours · naive BT — {count(t2)-1} nodes · NO SOLUTION (exhaustive ✗)")plt.tight_layout(); plt.show()print("\nLeft: lucky linear descent — 11 nodes, one ✓ leaf.")print("Right: exhaustive failure — 22 nodes, every leaf ✗.")print("Every red branch = a constraint violation forcing backtracking.")
What these trees show. Two side-by-side search trees from naive backtracking (no heuristics, no forward checking). Each node label VAR=val is one decision the algorithm made. Path from ROOT to a leaf = the current partial assignment.
Left panel (3 colours, feasible). A nearly linear descent of 11 nodes. A few dead-end siblings (e.g. NT=R ✗, NSW=R ✗) get cut off because they immediately violate a constraint with an already-assigned variable. Eventually the algorithm reaches T=R ✓ — a complete consistent assignment, found.
Right panel (2 colours, infeasible). 22 nodes, every leaf is red ✗. With only {R, G} the graph contains an odd cycle (NT–Q–NSW–V–SA–NT, length 5) so no 2-colouring exists. Naive BT has to exhaust both halves of the tree (WA=R subtree + WA=G subtree) to prove this — 22 wasted node expansions where there’s nothing to find.
Why this matters. Naive BT is OK when you’re lucky and the first variable order works, but it falls apart on infeasible or tightly-constrained problems. The next figure shows what heuristics do.
# Pyodide / Obsidian Execute Code: install matplotlib first.import micropipawait micropip.install("matplotlib")from collections import dequefrom dataclasses import dataclass, fieldimport matplotlib.pyplot as pltVARIABLES = ["WA", "NT", "Q", "NSW", "V", "SA", "T"]NEIGHBORS = { "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"], "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"], "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],}def consistent(var, value, asgn): return all(asgn.get(n) != value for n in NEIGHBORS[var])@dataclassclass Node: var: str = None value: str = None children: list = field(default_factory=list) status: str = "ok" # "ok" | "fail" | "solution"# --- AC-3 ---def ac3(domains): queue = deque((xi, xj) for xi in VARIABLES for xj in NEIGHBORS[xi]) def revise(xi, xj): revised = False for x in list(domains[xi]): if not any(x != y for y in domains[xj]): domains[xi].remove(x); revised = True return revised while queue: xi, xj = queue.popleft() if revise(xi, xj): if not domains[xi]: return False for xk in NEIGHBORS[xi]: if xk != xj: queue.append((xk, xi)) return True# --- backtracking ---def solve(colors, use_mrv=False, use_lcv=False, fc=False, ac3_pre=False): domains = {v: list(colors) for v in VARIABLES} root = Node() nodes = [0] if ac3_pre and not ac3(domains): return root, 0, False def select_var(asgn, dom): un = [v for v in VARIABLES if v not in asgn] if use_mrv: return min(un, key=lambda v: (len(dom[v]), -sum(1 for n in NEIGHBORS[v] if n not in asgn))) return un[0] def order_vals(var, asgn, dom): if not use_lcv: return list(dom[var]) return sorted(dom[var], key=lambda val: sum( 1 for nb in NEIGHBORS[var] if nb not in asgn and val in dom[nb])) def bt(asgn, dom, parent): if len(asgn) == len(VARIABLES): parent.status = "solution"; return True var = select_var(asgn, dom) for val in order_vals(var, asgn, dom): nodes[0] += 1 child = Node(var=var, value=val) parent.children.append(child) if not consistent(var, val, asgn): child.status = "fail"; continue asgn[var] = val saved = {}; wipeout = False if fc: for n in NEIGHBORS[var]: if n not in asgn and val in dom[n]: saved.setdefault(n, list(dom[n])) dom[n] = [d for d in dom[n] if d != val] if not dom[n]: wipeout = True if not wipeout and bt(asgn, dom, child): return True for n, d in saved.items(): dom[n] = d del asgn[var] if child.status == "ok": child.status = "fail" return False ok = bt({}, domains, root) return root, nodes[0], ok# --- tree layout & draw ---def layout(root): pos = {}; leaf = [0] def visit(n, d): if not n.children: pos[id(n)] = (leaf[0], -d); leaf[0] += 1; return xs = [] for c in n.children: visit(c, d+1); xs.append(pos[id(c)][0]) pos[id(n)] = ((min(xs)+max(xs))/2, -d) visit(root, 0); return posdef all_nodes(n): yield n for c in n.children: yield from all_nodes(c)COLOR_MAP = {"R": "#e74c3c", "G": "#27ae60", "B": "#3498db"}EDGE_COLOR = {"ok": "#888", "fail": "#c0392b", "solution": "#27ae60"}def draw_tree(ax, root, title, fontsize=6): if not root.children: ax.text(0.5, 0.5, "AC-3 wipeout\n(no search needed)", ha="center", va="center", fontsize=10, transform=ax.transAxes, bbox=dict(boxstyle="round,pad=0.5", fc="#f39c12", ec="black")) ax.set_title(title, fontsize=9); ax.axis("off"); return pos = layout(root) for parent in all_nodes(root): for child in parent.children: x1, y1 = pos[id(parent)]; x2, y2 = pos[id(child)] ax.plot([x1, x2], [y1, y2], "-", color=EDGE_COLOR[child.status], lw=1, alpha=0.6, zorder=1) for n in all_nodes(root): x, y = pos[id(n)] if n.var is None: ax.text(x, y, "R", ha="center", va="center", fontsize=fontsize+1, fontweight="bold", bbox=dict(boxstyle="circle,pad=0.2", fc="#34495e", ec="black"), color="white", zorder=3) else: fc = COLOR_MAP[n.value] ec = {"ok": "black", "fail": "#c0392b", "solution": "#27ae60"}[n.status] lw = 2 if n.status in ("fail", "solution") else 0.7 mk = {"ok": "", "fail": "✗", "solution": "✓"}[n.status] ax.text(x, y, f"{n.var}={n.value}{mk}", ha="center", va="center", fontsize=fontsize, color="white", fontweight="bold", bbox=dict(boxstyle="round,pad=0.15", fc=fc, ec=ec, lw=lw), zorder=3) ax.set_title(title, fontsize=9); ax.axis("off")STRATEGIES = [ ("Naive BT", dict(use_mrv=False, use_lcv=False, fc=False, ac3_pre=False)), ("+ Forward Check", dict(use_mrv=False, use_lcv=False, fc=True, ac3_pre=False)), ("+ MRV + LCV + FC", dict(use_mrv=True, use_lcv=True, fc=True, ac3_pre=False)), ("+ AC-3 preproc.", dict(use_mrv=True, use_lcv=True, fc=True, ac3_pre=True)),]# === Figure: 2 rows (3-col / 2-col) × 4 cols (strategies) ===fig, axes = plt.subplots(2, 4, figsize=(18, 10))for col, (label, opts) in enumerate(STRATEGIES): # row 0: 3 colors (feasible) root, n, ok = solve(["R","G","B"], **opts) marker = "✓" if ok else "✗" draw_tree(axes[0, col], root, f"3 colours · {label}\n{n} nodes · {marker}", fontsize=6) # row 1: 2 colors (infeasible) root, n, ok = solve(["R","G"], **opts) marker = "✓" if ok else "✗" draw_tree(axes[1, col], root, f"2 colours · {label}\n{n} nodes · {marker}", fontsize=6)fig.suptitle("CSP Search Trees — 4 strategies × 2 problem variants", fontsize=13, fontweight="bold")plt.tight_layout(rect=[0, 0, 1, 0.96]); plt.show()# Compact numerical summaryprint("\n=== NODES EXPANDED ===")print(f"{'Strategy':<25} {'3 colors':>10} {'2 colors':>10}")print("-" * 47)for label, opts in STRATEGIES: _, n3, _ = solve(["R","G","B"], **opts) _, n2, _ = solve(["R","G"], **opts) print(f"{label:<25} {n3:>10} {n2:>10}")print("\n⚠️ Observation: row 1 (2 colours) is INFEASIBLE — every tree ends")print(" in ✗ leaves only. AC-3 alone does NOT detect it (the CSP is")print(" arc-consistent — odd cycle needs path-consistency / 3-consistency).")
What this grid shows. Each row is one CSP variant; each column is one solving strategy. Cell title gives node count and outcome.
Row 1 (3 colours, feasible). All four strategies find a solution. Naive BT uses 11 nodes; adding Forward Checking alone drops it to 7. MRV+LCV+FC also gives 7, but with a different variable order — SA is picked first (most constrained), so the solution found has SA=R instead of WA=R. AC-3 preprocessing changes nothing here because every domain still has all 3 values after AC-3.
Row 2 (2 colours, infeasible). Naive BT explodes to 22 nodes. As soon as FC is enabled the search collapses to just 4 nodes — the second assignment in each branch immediately empties a domain (wipeout) and the branch dies. MRV+LCV adds nothing on top of that here. The rightmost cell makes the AC-3 trap visible: even AC-3 preprocessing does NOT detect the infeasibility — the 2-colour CSP IS arc-consistent, the failure comes from an odd cycle which needs 3-consistency (path consistency) to spot.
Key takeaways.
Forward Checking is the single biggest jump (22 → 4).
MRV+LCV mainly changes the order of variables, not always the node count on tiny problems — but on large CSPs they matter a lot.
AC-3 is not a silver bullet. Arc consistency ≠ global consistency.
ALGORITHMS (full reference) ⭐
Pseudocode + intuition + complexity for every algorithm you might need to write or trace by hand in the exam. Organised by role:
Preprocessing (run once before search): #1 Node Consistency · #2 AC-1 · #3 AC-3 · #4 PC-2 · #5 k-Consistency
Search-time pruning (run during backtracking): #6 Forward Checking · #7 MAC
Variable & value ordering heuristics (plug into backtracking): #10 MRV ⭐ · #11 Degree · #12 LCV
Node Consistency
Goal. Remove from every Dᵢ the values that violate the unary constraint Pᵢ.
for each variable Xᵢ: Dᵢ ← { x ∈ Dᵢ | Pᵢ(x) holds } if Dᵢ = ∅: return INCONSISTENTreturn OK
Cost. O(Σᵢ |Dᵢ|) — one pass. Use. Always run first. It’s free and shrinks domains before AC-3 even starts.
AC-1
Goal. Make every binary constraint arc-consistent. Naïve loop: keep iterating over all arcs until nothing changes.
repeat: changed ← false for each arc (Xᵢ, Xⱼ): if REVISE(Xᵢ, Xⱼ): changed ← trueuntil not changedREVISE(Xᵢ, Xⱼ): removed ← false for each x ∈ Dᵢ: if no y ∈ Dⱼ satisfies Pᵢⱼ(x, y): Dᵢ ← Dᵢ \ {x} removed ← true return removed
Cost. O(n²·d³·c) where n = variables, d = max domain size, c = constraints. Wasteful — re-checks arcs that haven’t been affected.
AC-3 ⭐
Goal. Same as AC-1 but with a worklist of arcs that might still need revising.
queue ← all arcs (Xᵢ, Xⱼ) of the CSPwhile queue not empty: (Xᵢ, Xⱼ) ← queue.pop() if REVISE(Xᵢ, Xⱼ): if Dᵢ = ∅: return INCONSISTENT for each Xₖ ∈ neighbours(Xᵢ) \ {Xⱼ}: queue.push((Xₖ, Xᵢ)) # supports for Xₖ may now be gonereturn OK
Key idea. Only re-check arcs pointing at Xᵢ when Dᵢ shrinks, because only those arcs could have lost their support.
Cost. O(c · d³) where c = arcs, d = max domain size. (Each arc can be processed at most d times — once per value removed from its source domain — and each REVISE call costs O(d²).)
What it can detect. Domain wipeout ⇒ CSP is inconsistent. What it cannot detect. Path-level / global inconsistency (see Q14: the triangle with 2 colours is arc-consistent but unsolvable).
Path Consistency (PC-2)
Goal. For every triple (i, j, m), tighten the binary constraint Pᵢⱼ so that every consistent (i, j) assignment has some compatible value for m.
queue ← all triples (i, j, m)while queue not empty: (i, j, m) ← queue.pop() if REVISE-3(i, j, m): # tightens Pᵢⱼ for each (k, l) ∈ affected_pairs: queue.push relevant triplesREVISE-3(i, j, m): removed ← false for each (x, z) ∈ Pᵢⱼ: # currently allowed pair if no y ∈ Dₘ with Pᵢₘ(x, y) ∧ Pₘⱼ(y, z): remove (x, z) from Pᵢⱼ removed ← true return removed
Cost. O(n³ · d⁵). Why we rarely run it. Expensive; in many cases the preprocessing eats more time than the search would have. Used only for tightly constrained problems.
k-Consistency
The one idea. Node / arc / path consistency are all the same template at different sizes — k-consistency just generalises it:
A CSP is k-consistent if: for any consistent assignment of (k−1) variables, and any k-th variable, there always exists a value for that k-th variable consistent with all (k−1).
Short form: “(k−1) successfully assigned ⇒ one more is guaranteed to fit.”
Plug in k = 1, 2, 3 and you recover what you already know:
k
(k−1) already assigned
The question
= the level you know as
1
0 vars (trivial)
Does each variable have a value satisfying its unary constraints?
Node Consistency
2
1 var
Does a 2nd variable have a value compatible with the 1st? (support!)
Arc Consistency
3
2 vars
Does a 3rd variable have a value compatible with both?
Path Consistency
k
(k−1) vars
Does a k-th variable have a value compatible with all (k−1)?
k-consistency
repeat: for each (k-1)-tuple of variables (X₁, …, Xₖ₋₁): for each consistent assignment a₁, …, aₖ₋₁: if ∄ variable Xₖ and value v with all constraints satisfied: tighten the (k-1)-ary constraint to forbid (a₁,…,aₖ₋₁)until nothing changes
Cost. Exponential in k. Practical only for k ≤ 3.
⚠️ k-consistency does NOT imply (k−1)-consistency
This is the classic trap. The definition only talks about extending (k−1)-sized consistent assignments — it says nothing about whether the levels below hold. So a CSP can be 3-consistent but not 2-consistent.
Party intuition. 2-consistency = “every single person finds at least one partner.” 3-consistency = “every pair that formed can recruit a third.” An outsider who fits with nobody breaks 2-consistency (no partner) — but never breaks 3-consistency, because they never form a pair, so they never block a trio. The flaw is invisible at level 3.
Concrete counterexample. Three variables A, B, C, each domain {0, 1}. Binary constraints:
Constraint
Allowed pairs
Meaning
A–B
(0,0), (0,1)
A must be 0
A–C
(0,0), (0,1)
A must be 0
B–C
all 4
no restriction
The value A=1 is the outsider — incompatible with every value of B and of C. A=0 is compatible with everything.
Not 2-consistent ✗ — A=1 is a legal single-variable assignment, but there is no pair (A=1, B=?) in the A–B table → it cannot extend to B → no support.
3-consistent ✓ — every consistent pair extends to the third variable. Note every pair involving A has A=0 (A=1 never makes it into a consistent pair):
Consistent pair
extend to
value that fits
(A=0, B=0)
C
C=0 ✓
(A=0, B=1)
C
C=0 ✓
(A=0, C=0)
B
B=0 ✓
(A=0, C=1)
B
B=0 ✓
(B=, C=) — any of 4
A
A=0 ✓ (compatible with all)
A=1 never appears in any consistent pair, so the 3-consistency check never even looks at it → the defect is missed at level 3 but caught at level 2.
Strong k-consistency. Run the enforcement for j = 1, 2, …, k in order (i.e. j-consistent for all j ≤ k). This closes the gap above — strong 3-consistency would remove A=1 via arc consistency first. If you achieve strong n-consistency, you can find a solution by greedy assignment — no backtracking needed (but enforcing it usually costs more than just searching).
Forward Checking (FC) ⭐
Goal. Each time you assign Xᵢ = v, prune v from every unassigned neighbour’s domain. Detect wipeouts immediately.
ASSIGN(Xᵢ, v): assignment[Xᵢ] ← v saved ← {} for each Xⱼ ∈ neighbours(Xᵢ) where Xⱼ is unassigned: saved[Xⱼ] ← copy of Dⱼ Dⱼ ← { y ∈ Dⱼ | Pᵢⱼ(v, y) } # remove conflicting values if Dⱼ = ∅: UNDO(saved); return WIPEOUT return OK
Cost per assignment. O(d · |neighbours(Xᵢ)|). Pruning power. Strictly less than AC-3 (only propagates one step from the just-assigned variable), but much cheaper and applied at every search node. The most common pruning method during backtracking search.
MAC (Maintaining Arc Consistency)
Goal. Like Forward Checking, but after each assignment run a full AC-3 pass on the remaining unassigned subproblem.
ASSIGN(Xᵢ, v): assignment[Xᵢ] ← v saved ← snapshot of domains Dᵢ ← {v} if AC-3(remaining CSP) returns INCONSISTENT: UNDO(saved); return FAILURE return OK
Trade-off. More expensive per assignment than FC, but prunes much more — often pays for itself on hard CSPs.
Backtracking Search
Goal. Find a complete consistent assignment by depth-first variable assignment with chronological backtracking.
Common misconception ⚠️. Naïve BT does not start at a random variable, and it does not try every possible assignment. It uses a fixed variable order (typically declaration order) and stops at the first complete consistent assignment. The tree is only fully exhausted if no solution exists — that’s how BT proves a CSP infeasible. Partial assignments that already violate a constraint are skipped, and entire subtrees go unexplored once an ancestor commitment succeeds.
BACKTRACK(assignment): if assignment is complete: return assignment Xᵢ ← SELECT-VAR(unassigned) # MRV + degree tiebreak for each v in ORDER-VALUES(Xᵢ): # LCV if v is consistent with assignment: assignment[Xᵢ] ← v inferences ← INFERENCE(Xᵢ, v) # NC / FC / MAC / nothing if inferences ≠ FAILURE: add inferences to assignment / domain bookkeeping result ← BACKTRACK(assignment) if result ≠ FAILURE: return result remove Xᵢ = v and any inferences return FAILURE
The INFERENCE step is where you plug in the consistency algorithm:
nothing → naïve backtracking
forward checking → 1-step propagation
MAC → full AC-3 propagation after each assignment
Min-Conflicts (Local Search)
Goal. Start from a complete (possibly violating) assignment, repeatedly pick a violated variable and reassign it to the value that minimises the number of remaining conflicts.
MIN-CONFLICTS(csp, max_steps): current ← random complete assignment for step in 1..max_steps: if current satisfies all constraints: return current Xᵢ ← randomly chosen conflicted variable v ← value in Dᵢ minimising #conflicts(Xᵢ = v, others) current[Xᵢ] ← v return FAILURE
When to use. Huge CSPs where backtracking is too slow (e.g. n-queens for n = 1 000 000). Very effective in practice but incomplete — cannot prove a CSP unsolvable.
MRV (Minimum Remaining Values) ⭐
Goal. Inside backtracking, pick the next variable with the fewest legal values left. Idea: fail-fast — if a subtree is doomed, discover it near the root, not after locking in 5 variables.
SELECT-VAR-MRV(unassigned, domains): return argmin over X in unassigned of |D(X)| # tiebreak: degree heuristic (see #11) — pick X with most constraints # on still-unassigned variables
Australia example. Initially every domain is {R, G, B} → all variables tied at size 3. The degree tiebreaker picks SA first because SA has 5 unassigned neighbours (more than any other). MRV thus inverts the naïve declaration order from WA → NT → Q → NSW → V → SA → T to SA → NT → Q → NSW → WA → V → T.
Cost. O(n) per call (linear scan of unassigned variables). Pros. ✅ Massive search-tree pruning on tightly constrained CSPs. ✅ Pairs naturally with FC/MAC. Cons. ⚠️ Needs an up-to-date view of current domains — works best alongside FC or MAC.
Degree Heuristic
Goal. Pick the variable with the most constraints involving unassigned variables.
SELECT-VAR-DEGREE(unassigned, neighbors): return argmax over X in unassigned of |{ Y in neighbors(X) : Y is unassigned }|
Two valid framings:
Slides (Session 03, slide 19): present Degree as a standalone alternative to MRV.
R&N textbook: use Degree as MRV’s tiebreaker (when several variables tie on smallest domain size).
Cost. O(n + total constraints) per call. Pros. ✅ Cheap. ✅ Excellent at flagging hub variables (like SA in Australia) before the search commits to “easy” leaves. Cons. ⚠️ On its own (without MRV) it ignores current domain pruning — weaker than MRV.
LCV (Least Constraining Value)
Goal. Once a variable X is chosen (by MRV+degree), order X’s domain by how few neighbour options each value rules out — try the least-disruptive value first. Idea: fail-slow within a variable; keep our future options open so that if a solution exists down this branch, we hit it early.
ORDER-VALUES-LCV(X, assignment, domains): return D(X) sorted ascending by sum over Y in unassigned neighbors(X) of |{ y in D(Y) : assigning X = v would forbid y }|
Australia example. After assigning SA = R, suppose we now branch on NT with D(NT) = {G, B}.
NT = G would remove G from neighbours WA (∈{G,B}) and Q (∈{G,B}) → conflicts with 2 values.
NT = B would remove B from the same two neighbours → conflicts with 2 values.
It’s a tie, pick the first listed.
Cost. O(d × |neighbours(X)|) per variable. Pros. ✅ Often finds a solution faster (you trip into the right branch first). Cons. ⚠️ The bookkeeping is not free. ⚠️ Useless if the goal is to prove infeasibility — you’ll have to exhaust the whole subtree anyway, so ordering doesn’t help.
How MRV / Degree / LCV plug into backtracking
BACKTRACK(assignment, domains): if assignment complete: return assignment X ← SELECT-VAR-MRV(unassigned, domains) # #10 (+ #11 tiebreak) for v in ORDER-VALUES-LCV(X, assignment, domains): # #12 if v consistent with assignment: assignment[X] ← v inferences ← FORWARD-CHECK(X, v) # #6 if inferences ≠ FAILURE: result ← BACKTRACK(assignment, domains) if result ≠ FAILURE: return result undo v and inferences return FAILURE
This is the standard BT + MRV + LCV + FC recipe — the industrial-strength CSP solver.
Algorithm comparison table — pros & cons
Side-by-side comparison of every algorithm in the reference. Read horizontally to pick the right one, vertically to compare a single dimension.
Preprocessing algorithms (run before search)
Algorithm
What it does
Cost
Pros
Cons
When to use
Node Consistency (1-cons.)
Removes values violating unary constraints
O(Σ|Dᵢ|) — one pass
✅ Trivial to implement ✅ Always pays off ✅ No propagation needed
⚠️ Only handles unary constraints — useless without them
Always, if any unary constraint exists
AC-1 (arc consistency, naïve)
Loops over ALL arcs until nothing changes
O(n²·d³·c)
✅ Simple to write
❌ Re-checks arcs that haven’t been affected — wasteful
Never in practice — superseded by AC-3
AC-3 ⭐
Worklist: re-checks only arcs that might have lost support
O(c·d³)
✅ Best practical AC algorithm ✅ Detects domain wipeout → proves inconsistency ✅ Big domain reduction on tight CSPs
❌ Only 2-consistency — misses odd-cycle infeasibilities ❌ Can be expensive on huge CSPs
Default preprocessing step in almost every CSP solver
Path Consistency (PC-2)
Tightens binary constraints w.r.t. every third variable
O(n³·d⁵)
✅ Catches dead ends AC-3 misses (e.g. odd cycles in 2-colourable graphs)
❌ Very expensive ❌ Often slower than just letting backtracking handle it
Tightly constrained small CSPs only
k-Consistency (general)
Every (k−1)-assignment extends to k-th
Exponential in k
✅ k=n ⇒ greedy assignment, no backtracking ever
❌ Cost grows explosively ❌ Impractical for k > 3
Mostly theoretical
Search-time pruning (during backtracking)
Algorithm
What it does at each assignment
Pros
Cons
When to use
No pruning (pure naïve BT)
Only check the new assignment against already-assigned vars
✅ Simplest, no bookkeeping
❌ Wastes huge effort on infeasible CSPs ❌ Discovers dead ends late
Toy problems only
Forward Checking (FC) ⭐
Remove the just-chosen value from every unassigned neighbour’s domain
✅ Cheap (one neighbour scan) ✅ Catches wipeouts one level earlier than naïve BT ✅ The single biggest practical win
❌ Only propagates one step from the just-assigned variable
The standard pruning during search
MAC (Maintain Arc Consistency)
Run a full AC-3 pass over the remaining unassigned subproblem
✅ Maximum pruning ✅ Often visits dramatically fewer nodes than FC
❌ Expensive per assignment ❌ Bookkeeping for undo is heavier
Hard CSPs where FC isn’t strong enough
Variable & value ordering heuristics
Heuristic
Purpose
Pros
Cons
MRV (Min Remaining Values)
Pick the variable with smallest current domain → fail-fast
✅ Discovers dead branches near the root → massively prunes search
❌ Requires current-domain info (works best with FC/MAC)
Degree heuristic
Tiebreak MRV by picking variable with most constraints on unassigned vars
✅ Free tiebreaker, no extra cost
❌ Standalone (without MRV) is weaker than MRV alone
LCV (Least Constraining Value)
Pick the value that rules out fewest neighbour options → fail-slow
✅ Improves first-found-solution time ✅ Pairs well with MRV (fail-fast on vars + fail-slow on values)
❌ Computing LCV costs O(d × neighbours) per variable ❌ Useless if the goal is to prove infeasibility (we have to exhaust anyway)
Search strategy comparison
Algorithm
Completeness
Optimality
Memory
Strength
Backtracking (naïve)
✅ Complete (will find a solution if one exists, or prove none)
n/a
O(n·d) — partial assignment + backtracking stack
Baseline
BT + FC
✅ Complete
n/a
O(n·d)
Practical default
BT + MAC + MRV + LCV
✅ Complete
n/a
O(n·d)
Industrial strength
Min-Conflicts (local search)
❌ Incomplete (cannot prove infeasibility)
n/a
O(n) — just the current assignment
Brilliant on huge CSPs (n-queens for n=1 M)
Three big trade-off axes
Axis
Cheap end
Expensive end
Rule of thumb
Pruning depth per node
naïve BT
MAC
Use FC by default; switch to MAC only if FC is too weak
Preprocessing strength
nothing
strong k-consistency
Node + AC-3 once. Stronger preprocessing rarely pays off.
Completeness
local search (incomplete)
systematic BT (complete)
Local search for huge “find one good solution” problems; BT when you need a guarantee
AC-3 — smart-queue arc consistency. The standard preprocessing.
PC-2 — path consistency. Use only on tight small CSPs.
k-Consistency — theoretical generalisation; rarely run in practice.
Forward Checking — prune the just-chosen value from neighbour domains. Cheap, huge win.
MAC — run AC-3 after every assignment. More pruning, more cost.
Backtracking — systematic depth-first variable assignment with undo. The search backbone.
MRV / Degree / LCV — pick next variable / value smartly.
Min-Conflicts — local search; flips one violating variable per step. Incomplete but blazing fast on huge problems.
Famous real CSP systems
CSPs are used everywhere “many variables have to satisfy many rules simultaneously” — usually hidden under industry names like “Optimization”, “Configurator”, or “Scheduling System”.
Airline crew scheduling — Sabre and IBM systems assign pilots and crew to flights, respecting union contracts, rest requirements, and contractual constraints. One of the largest CSPs in production (millions of variables).
NFL season scheduling — every NFL season is built as a CSP: thousands of constraints (no team plays 3 away games in a row, divisional games must happen, TV slots are fixed). IBM and Optym have built dedicated solvers for it.
University timetabling — UniTime (open-source, CSP-based) is deployed at hundreds of universities worldwide, including Purdue and Pittsburgh.
Sudoku solvers — the canonical pedagogical CSP. AC-3 + backtracking solves most puzzles in milliseconds.
AWS / cloud VM placement — AWS solves CSPs to place virtual machines on physical servers, respecting affinity rules, resource limits, and failure domains.
Configurator systems — when you spec a car on a manufacturer website, the “which options coexist” logic is a CSP. Renault and Volvo use IBM ILOG CPLEX-CP and SAP Configurator.
Sports scheduling in general — MLB, NBA, Tour de France, Bundesliga: all CSP under the hood.
Why CSPs survive: the formalism (variables, domains, constraints) intuitively maps every problem where rules must hold simultaneously. Plus: AC-3 preprocessing can shrink the search space by orders of magnitude before backtracking even starts.
Where CSP solvers were replaced — and by what
Domain
Was CSP solver, now …
Why
Pure boolean constraints (circuit verification, configuration)
Modern SAT solvers (MiniSat, Glucose, CaDiCaL) with CDCL (Conflict-Driven Clause Learning)
CDCL “learns” a new clause from every backtrack → shrinks the search space, while classical CSP backtracking is blind
CSPs with arithmetic or theories (bitvectors, strings)
SMT solvers: Z3, CVC5
SMT = SAT + theory solvers. Microsoft, Amazon, Google use Z3 for software verification, compiler optimization, smart-contract analysis
CSPs with linear constraints + optimization objective
MILP solvers: Gurobi, CPLEX, HiGHS
If your CSP has an objective to maximize, Integer Linear Programming is usually the better formulation
Early expert systems with “if-then” rules
Neural networks / LLMs
Where rule-based CSP-style systems were once used (classification, decision systems), ML models take over — they learn the rules from data instead of being specified
Some pure scheduling problems
Reinforcement Learning + heuristics
Google e.g. optimized data-center cooling with RL instead of classical CSP schedulers
Where classical CSP solvers remain: when the problem (a) is genuinely discrete and non-numerical, (b) a declarative specification is clearer than a learned model, and (c) explainability matters (NFL scheduling must be auditable — a neural network would be unacceptable). The boundary between “CSP”, “SAT”, and “SMT” is fluid today — many industry tools offer all three.