Quiz: Machine Learning I & II

Methods of AI — SoSe 2026

Q2 — ML

Question: How does the ID3 algorithm build a decision tree? What does it optimize?

Answer

ID3 greedily selects the attribute with highest Information Gain at each node.

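1. Calculate the entropy H(S) = −Σ pᵢ log₂(pᵢ) of the current set S.

2. For each attribute A, calculate Gain(S,A) = H(S) − Σ_{v∈Values(A)} (|Sᵥ|/|S|)·H(Sᵥ).

3. Choose the attribute with the highest gain and create one branch per value of that attribute.

4. Recurse on each subset Sᵥ until all examples have the same class (entropy = 0) or no attributes are left (then label the leaf with the majority class).
What it optimizes: information gain, locally at each node. Putting the highest-gain attribute first biases ID3 towards short trees, but because the search is greedy and never backtracks, the result is not guaranteed to be the shortest tree. It can also overfit: deep trees memorize training noise.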
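The procedure can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes categorical attributes stored in dicts, and the names `entropy`, `information_gain`, `id3` and the toy data are made up for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """H(S) = -sum(p_i * log2(p_i)) over the class frequencies in labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Gain(S, A) = H(S) - sum over values v of |S_v|/|S| * H(S_v)."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def id3(rows, labels, attributes):
    """Return a class label (leaf) or a nested dict {attribute: {value: subtree}}."""
    if len(set(labels)) == 1:          # pure node, entropy = 0
        return labels[0]
    if not attributes:                 # no attributes left: use the majority class
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: pick the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in {row[best] for row in rows}:
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        tree[best][value] = id3(sub_rows, sub_labels, remaining)
    return tree


# Hypothetical toy data: "outlook" separates the classes perfectly, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

tree = id3(rows, labels, ["outlook", "windy"])
print(tree)
```

On this toy set the gain of "outlook" is 1 bit and the gain of "windy" is 0, so the tree splits once on "outlook" and stops.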

Max’s answer:
Result:

Q3 — ML

Question: Why are Random Forests often better than a single decision tree? What is bagging?

Answer

Random Forests train many decision trees, each on a bootstrap sample of the training data and with a random subset of features considered at each split. Prediction = majority vote (classification) or average (regression).
Why better: a single deep tree has low bias but high variance, and trees trained on the same data are strongly correlated. Bootstrap samples and random feature subsets make the trees diverse, so they make different errors. Averaging diverse errors reduces variance (overfitting) while keeping bias low.
Bagging: bootstrap aggregation — each tree is trained on a random sample (with replacement) of the training data. Trees train independently and vote.
Random Forest = Bagging + random feature selection at each split.
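The two sources of randomness and the voting step can be illustrated with a short Python sketch. It only shows the sampling and aggregation logic, not tree training; the helper names `bootstrap_sample`, `random_feature_subset` and `majority_vote` are hypothetical and chosen for this example.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Bagging: draw len(data) examples from data with replacement."""
    return [rng.choice(data) for _ in data]


def random_feature_subset(features, k, rng):
    """Random Forest addition: consider only k randomly chosen features at a split."""
    return rng.sample(features, k)


def majority_vote(predictions):
    """Aggregate the class predictions of all trees into one label."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)                      # fixed seed so the sketch is repeatable
data = list(range(10))                      # stand-in for 10 training examples
sample = bootstrap_sample(data, rng)        # same size, duplicates are likely
split_features = random_feature_subset(["a", "b", "c", "d"], 2, rng)
vote = majority_vote(["cat", "dog", "cat"])
print(sample, split_features, vote)
```

Because sampling is done with replacement, each bootstrap sample usually contains some examples several times and omits others, which is what makes the individual trees differ.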

Max’s answer:
Result:

Q4 — ML

Question: What is entropy in the context of decision trees? When is entropy maximized and minimized?

Answer

Entropy H(S) = −Σ pᵢ log₂(pᵢ) measures the impurity of a set S.

Minimum (= 0): all examples in S have the same class label → pᵢ = 1 for one class, 0 for all others → H = −1·log₂(1) = 0, using the convention 0·log₂(0) = 0. Pure node.

Maximum: classes are equally distributed → most uncertainty. For binary case: H = 1 when p = 0.5 (50/50 split). For k classes: max H = log₂(k).
Information gain = reduction in entropy after splitting on an attribute. Higher gain = better split.
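The boundary cases can be checked numerically. A small sketch, assuming the class distribution is given directly as a list of probabilities (the function name `entropy` and the example distributions are illustrative):

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; terms with p = 0 contribute 0 by convention."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


pure = entropy([1.0, 0.0])       # minimum: 0 bits
balanced = entropy([0.5, 0.5])   # binary maximum: 1 bit
uniform4 = entropy([0.25] * 4)   # k = 4 classes: log2(4) = 2 bits
skewed = entropy([0.9, 0.1])     # in between: about 0.469 bits

print(pure, balanced, uniform4, round(skewed, 3))
```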

Max’s answer:
Result:

Q5 — ML

Question: What are the three main types of learning? Give one concrete algorithm for each.

Answer

Supervised learning: labeled training data (input → correct output). Goal: learn a function to predict output for new inputs.
Algorithm: decision tree (ID3), SVM, perceptron.

Unsupervised learning: no labels. Goal: find hidden structure/patterns in data.
Algorithm: k-means clustering, hierarchical clustering.

Reinforcement learning: agent interacts with environment, receives rewards/penalties. Goal: learn policy to maximize long-term reward.
Algorithm: Q-learning, policy iteration for MDPs.
(Also: semi-supervised, self-supervised — but these 3 are the classical trio.)

Max’s answer:
Result:

Q6 — ML

Question: What is the inductive bias of a learning algorithm? Why is it necessary?

Answer

The inductive bias is the set of prior assumptions a learner makes to generalize from training examples to unseen data.
Why necessary: without any assumptions, a learner could only “memorize” the examples, so no generalization is possible. Any generalization beyond the training set requires assuming that observed patterns will continue to hold.
Formally: B is the inductive bias of learner L if, for all x ∈ U, B ∧ D ∧ x ⊨ L(x,D), where U is the instance space, D the training data, and L(x,D) the label L assigns to x after training on D. The bias plus the training data logically entails the prediction.
Example: Occam’s razor (prefer simpler models) is an inductive bias. It assumes that simpler hypotheses tend to generalize better.

Max’s answer:
Result:

Q7 — ML

Question: Describe k-means clustering. What does it optimize, and what is its main limitation?

Answer

K-means:

1. Initialize k cluster centroids (randomly or with k-means++).

2. Assign each data point to its nearest centroid.

3. Recompute each centroid as the mean of its assigned points.

4. Repeat steps 2-3 until the assignments no longer change.
Optimizes: minimizes the mean within-cluster squared distance E = (1/|D|) Σⱼ ‖xⱼ − w_{m(xⱼ)}‖², where w_{m(xⱼ)} is the centroid of the cluster that xⱼ is assigned to.
Each step guarantees E doesn’t increase. Converges to a local minimum.
Main limitation: depends heavily on initialization, so different random starts can give different results. Only finds a local optimum, not necessarily the global minimum. Also assumes roughly spherical clusters (Euclidean distance).
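The loop can be sketched in plain Python. This is a minimal version under stated assumptions: points are tuples of floats, initialization is plain random sampling (not k-means++), and an empty cluster simply keeps its old centroid. The names `k_means` and `mean_squared_error` and the toy points are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)            # step 1: random initialization
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:         # step 4: stop when nothing changes
            break
        assignment = new_assignment
        # Step 3: recompute each centroid as the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:                          # assumption: empty cluster keeps its centroid
                centroids[j] = mean_point(members)
    return centroids, assignment


def mean_squared_error(points, centroids, assignment):
    """E = (1/|D|) * sum_j ||x_j - w_m(x_j)||^2"""
    total = sum(squared_distance(p, centroids[a]) for p, a in zip(points, assignment))
    return total / len(points)


# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, assignment = k_means(points, k=2)
error = mean_squared_error(points, centroids, assignment)
print(centroids, assignment, error)
```

On data this cleanly separated every start ends in the same clustering. With overlapping or elongated clusters, rerunning with different `seed` values and keeping the run with the lowest E is a common way to reduce the dependence on initialization.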

Max’s answer:
Result:

Beyond the lecture (optional)

These questions go beyond the SoSe 2026 lecture slides (textbook / external additions). Kept for depth, not exam-critical.

Q1 — ML

Question: What is the bias-variance tradeoff? How does model complexity affect each?

Answer

Bias: systematic error from wrong assumptions. High-bias models are too simple (underfitting), e.g. fitting a straight line to curved data. The error is systematic: it stays roughly the same across different training sets.

Variance: sensitivity to small fluctuations in the training data. High-variance models overfit: they memorize noise. The error changes a lot between different training sets.
For squared loss: expected error = bias² + variance + irreducible noise.

Increasing complexity → decreases bias, increases variance.

Decreasing complexity → increases bias, decreases variance.
Goal: find the sweet spot. Cross-validation helps estimate it.
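For reference, the decomposition can be written out for squared loss. The notation is an assumption of this sketch, not taken from the lecture: the data follow y = f(x) + ε with E[ε] = 0 and Var(ε) = σ², the learned model is f̂, and the expectation is over training sets and noise.

```latex
\mathbb{E}\left[\left(y - \hat{f}(x)\right)^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The first term shrinks as the model becomes more flexible, the second grows, and the third does not depend on the model at all.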

Max’s answer:
Result:

Score

Total: / 7
