
Bias-Variance Tradeoff in Machine Learning (with Python Examples)



What is the Bias-Variance Tradeoff?

In machine learning, we aim to build models that generalize well, that is, perform well on new, unseen data. However, every model is a balancing act between two competing sources of error:

➤ Bias: error from overly simple assumptions. A high-bias model systematically misses the real pattern in the data.

➤ Variance: error from excessive sensitivity to the training sample. A high-variance model changes drastically with each new dataset and mistakes noise for signal.

The Bias-Variance Tradeoff is the balance we must strike between underfitting (high bias) and overfitting (high variance).
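To make these two error sources concrete, here is a minimal sketch (an addition, not part of the original examples) that refits a simple linear model on many freshly noised samples of a sine curve, the same kind of data used below, and measures its average error (bias) and prediction spread (variance) at a single probe point:


import numpy as np
from sklearn.linear_model import LinearRegression

# Estimate bias and variance empirically: refit the same simple model on
# many datasets that share a sine-shaped truth but carry fresh noise.
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 100).reshape(-1, 1)
true_y = np.sin(X).ravel()
x_probe = np.array([[5.0]])  # single point where we inspect predictions

preds = []
for _ in range(200):
    y_noisy = true_y + rng.normal(0, 0.1, size=100)  # new noise each round
    preds.append(LinearRegression().fit(X, y_noisy).predict(x_probe)[0])

preds = np.array(preds)
print("bias     :", preds.mean() - np.sin(5.0))  # systematic error of the average model
print("variance :", preds.var())                 # sensitivity to the training sample

A straight line is a poor match for a sine curve, so the bias term dominates here; a very flexible model would show the opposite pattern.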


Example 1: Underfitting (High Bias)

Imagine trying to fit a straight line to data that actually forms a curve. The model is too simple to capture the real pattern.


import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Generating curved data
np.random.seed(1)
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = np.sin(X).ravel() + np.random.normal(0, 0.1, X.shape[0])

# Fit a linear regression model
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)

# Plotting
plt.scatter(X, y, label="True Data", s=10)
plt.plot(X, y_pred, color='red', label="Linear Fit")
plt.title("Underfitting: High Bias")
plt.legend()
plt.show()
This plot will show a straight red line trying to approximate sinusoidal data — clearly a bad fit.

Explanation:

Linear regression can't capture the curve, so the predictions are far off. The model is underfitting — it has high bias and can't capture the underlying data pattern.

Why does this happen?

➤ The model is too simple for the data. It assumes a straight-line relationship, which doesn’t exist here.

What will happen if we try this model on test data?

➤ It will perform poorly both on training and test data, because it never learned the true pattern in the first place.
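To check this numerically, here is a short sketch (reusing X, y, and LinearRegression from the code above) that holds out a test split and scores both halves; with a straight line on sinusoidal data, both R² values should come out near zero:


from sklearn.model_selection import train_test_split

# Hold out a test split and score the linear model on both halves.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
lin = LinearRegression().fit(X_train, y_train)
print("train R^2:", round(lin.score(X_train, y_train), 3))
print("test  R^2:", round(lin.score(X_test, y_test), 3))
# Both scores are low: the model misses the pattern even on data it trained on.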


Example 2: Overfitting (High Variance)

Now let’s fit a model that’s too complex — like a polynomial of degree 15 — to the same data.


from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# High-degree polynomial regression (degree 15)
high_variance_model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
high_variance_model.fit(X, y)
y_pred_poly = high_variance_model.predict(X)

# Plotting
plt.scatter(X, y, label="True Data", s=10)
plt.plot(X, y_pred_poly, color='green', label="Polynomial Degree 15")
plt.title("Overfitting: High Variance")
plt.legend()
plt.show()
This plot shows a wiggly green line that tries too hard to match every point — even the noise.

Explanation:

This model performs extremely well on training data but fails on test data. It memorizes the noise, which leads to poor generalization — this is high variance.

Why is overfitting dangerous?

➤ Because your model learns noise as if it were signal, which reduces accuracy on real-world data.

How can we detect overfitting?

➤ A large gap between training and test accuracy: the training score is high, but the test score is low.
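Here is a minimal sketch of that diagnostic (reusing X, y, make_pipeline, and PolynomialFeatures from above): refit the degree-15 model on a training split only, then score both splits:


from sklearn.model_selection import train_test_split

# Fit on the training split only, then compare scores on both splits.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
overfit = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
overfit.fit(X_train, y_train)
print("train R^2:", round(overfit.score(X_train, y_train), 3))
print("test  R^2:", round(overfit.score(X_test, y_test), 3))
# A training score well above the test score is the classic overfitting signal.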


Example 3: Just Right (Balanced Bias and Variance)

Let’s now try a moderately complex model — a polynomial of degree 3 — and see how it performs.


# Polynomial regression with degree 3
just_right_model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
just_right_model.fit(X, y)
y_pred_just_right = just_right_model.predict(X)

# Plotting
plt.scatter(X, y, label="True Data", s=10)
plt.plot(X, y_pred_just_right, color='purple', label="Polynomial Degree 3")
plt.title("Balanced Fit")
plt.legend()
plt.show()
This plot shows a smooth purple curve that captures the main pattern of the data without overreacting to noise.

Explanation:

A degree-3 polynomial captures the overall shape of the data without chasing the noise. This is an example of a model with low bias and low variance, the sweet spot.
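As a quick sanity check (same held-out-split idea as in the earlier examples), the balanced model's training and test scores should land close to each other:


from sklearn.model_selection import train_test_split

# Score the degree-3 model on a held-out split.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
balanced = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
balanced.fit(X_train, y_train)
print("train R^2:", round(balanced.score(X_train, y_train), 3))
print("test  R^2:", round(balanced.score(X_test, y_test), 3))
# Train and test scores staying close together is the mark of low variance.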


Bias-Variance Tradeoff Curve

Visually, the tradeoff is a U-shaped curve of total error against model complexity: bias falls as the model becomes more flexible, variance rises, and the expected error decomposes as

Total Error = Bias² + Variance + Irreducible Error

Too simple = high bias; too complex = high variance. The minimum total error lies between the two extremes.
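We can trace this curve empirically. The sketch below (reusing X and y from the examples above) cross-validates one polynomial model per degree; the folds are shuffled because X is sorted:


import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Cross-validate one model per polynomial degree to trace the tradeoff curve.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
degrees = range(1, 16)
scores = [cross_val_score(make_pipeline(PolynomialFeatures(degree=d),
                                        LinearRegression()),
                          X, y, cv=cv).mean()
          for d in degrees]

plt.plot(degrees, scores, marker="o")
plt.xlabel("Polynomial degree (model complexity)")
plt.ylabel("Mean cross-validated R^2")
plt.title("Bias-Variance Tradeoff")
plt.show()
# Scores typically climb as bias falls, then flatten or drop as variance grows.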


Summary Table

Type           Bias   Variance   Example
Underfitting   High   Low        Linear regression on curved data
Overfitting    Low    High       Polynomial (degree 15) regression
Balanced       Low    Low        Polynomial (degree 3) regression

Final Thought

Your goal is to minimize total error, which includes both bias and variance. This often requires:

➤ Choosing a model complexity that matches the data

➤ Measuring generalization with a held-out test set or cross-validation

➤ Applying regularization when a flexible model starts to overfit

In short: not too simple, not too complex, just right.
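One way to automate that search, offered here as a sketch rather than the only option, is a scikit-learn GridSearchCV over the polynomial degree (again reusing X and y from the examples above):


from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Grid-search the degree with the best cross-validated score.
pipe = Pipeline([("poly", PolynomialFeatures()), ("reg", LinearRegression())])
param_grid = {"poly__degree": list(range(1, 16))}
search = GridSearchCV(pipe, param_grid,
                      cv=KFold(n_splits=5, shuffle=True, random_state=0))
search.fit(X, y)
print("best degree:", search.best_params_["poly__degree"])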


