What is Hyperparameter Tuning?
In machine learning, a hyperparameter is a configuration that is set before the training process begins. These are not learned from the data but control the learning process itself. Examples include:
- Number of neighbors in KNeighborsClassifier
- Maximum depth of a decision tree
- Learning rate in gradient boosting models
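For example, a hyperparameter such as max_depth is chosen before fit() is called, while quantities such as feature importances are learned from the data during training. A minimal sketch of that distinction:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
X, y = load_iris(return_X_y=True)
# max_depth is a hyperparameter: we choose it before training.
tree = DecisionTreeClassifier(max_depth=3)
tree.fit(X, y)
# feature_importances_ are learned from the data during fit().
print(tree.feature_importances_)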
Hyperparameter tuning is the process of choosing the best combination of these settings to improve the performance of a model.
🔸 Why can't we just use default hyperparameters?
Default values may work fine, but they’re generic. Tuning helps you squeeze more accuracy from your model for your specific dataset.
🔹 Example 1: Tuning KNeighborsClassifier using GridSearchCV
Let’s tune the number of neighbors (n_neighbors) in a K-Nearest Neighbors classifier.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
# Load dataset
iris = load_iris()
X, y = iris.data, iris.target
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# KNN with GridSearch
param_grid = {
'n_neighbors': [1, 3, 5, 7, 9]
}
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
grid_search.fit(X_train, y_train)
print("Best parameters:", grid_search.best_params_)
print("Best score:", grid_search.best_score_)
Best parameters: {'n_neighbors': 3}
Best score: 0.9619
Explanation
- GridSearchCV tries every combination of the parameters you define.
- cv=5 means 5-fold cross-validation: the training data is split into 5 parts, and each parameter combination is evaluated across those folds.
- best_params_ gives you the best value of n_neighbors.
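If you want to see how every candidate performed, not just the winner, the fitted search object exposes a cv_results_ dictionary. A minimal sketch, assuming grid_search has already been fitted as above:
import pandas as pd
# Each row is one parameter combination; mean_test_score is the
# average accuracy across the 5 cross-validation folds.
results = pd.DataFrame(grid_search.cv_results_)
print(results[['params', 'mean_test_score', 'rank_test_score']])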
✦ Question: Why do we use cross-validation instead of testing on test data directly?
✧ Answer: Because test data should be untouched until final evaluation. Cross-validation ensures the model generalizes well before we use the test set.
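As a standalone illustration, cross_val_score evaluates a single configuration on the training data only; a minimal sketch reusing X_train and y_train from the split above:
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
# 5-fold cross-validation on the training data only;
# the test set stays untouched for the final evaluation.
scores = cross_val_score(KNeighborsClassifier(n_neighbors=3), X_train, y_train, cv=5)
print("Fold accuracies:", scores)
print("Mean CV accuracy:", scores.mean())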
🔹 Example 2: Tuning a Decision Tree using multiple hyperparameters
In a decision tree, two important hyperparameters are max_depth and min_samples_split.
from sklearn.tree import DecisionTreeClassifier
param_grid = {
'max_depth': [3, 5, 7, None],
'min_samples_split': [2, 5, 10]
}
grid_search = GridSearchCV(DecisionTreeClassifier(), param_grid, cv=5)
grid_search.fit(X_train, y_train)
print("Best parameters:", grid_search.best_params_)
print("Best score:", grid_search.best_score_)
Best parameters: {'max_depth': 3, 'min_samples_split': 2}
Best score: 0.9428
Explanation
- max_depth controls how deep the tree can go. Deeper trees may overfit.
- min_samples_split controls the minimum number of samples required to split a node.
- The best combination is selected based on cross-validation performance.
✦ Question: What happens if we don’t restrict the depth of a decision tree?
✧ Answer: The tree will grow deep and might overfit the training data, performing poorly on unseen data.
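To see this effect, you can compare an unrestricted tree with a depth-limited one on the same split. A minimal sketch, reusing the train/test split from above (exact scores will vary with the dataset; the telltale sign of overfitting is a large gap between training and test accuracy):
from sklearn.tree import DecisionTreeClassifier
for depth in [None, 3]:
    # max_depth=None lets the tree grow until every leaf is pure.
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    tree.fit(X_train, y_train)
    print(f"max_depth={depth}: "
          f"train accuracy={tree.score(X_train, y_train):.3f}, "
          f"test accuracy={tree.score(X_test, y_test):.3f}")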
🔹 RandomizedSearchCV vs GridSearchCV
GridSearchCV tries all combinations exhaustively. This is great for small search spaces but becomes slow with many parameters.
RandomizedSearchCV tries only a fixed number of random combinations, making it faster.
🔸 Use Case: RandomizedSearchCV for Random Forest
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
import numpy as np
param_dist = {
'n_estimators': [50, 100, 150, 200],
'max_depth': [None, 10, 20, 30],
'min_samples_split': [2, 5, 10]
}
random_search = RandomizedSearchCV(RandomForestClassifier(), param_distributions=param_dist, n_iter=5, cv=3, random_state=42)
random_search.fit(X_train, y_train)
print("Best parameters:", random_search.best_params_)
print("Best score:", random_search.best_score_)
Best parameters: {'n_estimators': 200, 'min_samples_split': 2, 'max_depth': 30}
Best score: 0.9619
Explanation
- n_iter=5 means only 5 random combinations are tried.
- Faster for large search spaces, with only a slight risk of missing the exact best combination.
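RandomizedSearchCV also accepts distributions instead of fixed lists, which is useful when a parameter has many plausible values. A minimal sketch using scipy.stats.randint (chosen here purely for illustration; any distribution with an rvs method works):
from scipy.stats import randint
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
# Integer-valued hyperparameters are sampled from these ranges
# instead of being enumerated as explicit lists.
param_dist = {
    'n_estimators': randint(50, 300),
    'max_depth': randint(3, 30),
    'min_samples_split': randint(2, 11)
}
random_search = RandomizedSearchCV(RandomForestClassifier(random_state=42),
                                   param_distributions=param_dist,
                                   n_iter=10, cv=3, random_state=42)
random_search.fit(X_train, y_train)
print("Best parameters:", random_search.best_params_)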
✦ Question: Should we always use RandomizedSearch for big models?
✧ Answer: Usually, yes. When there are many hyperparameters, it is efficient and often reaches near-optimal results much faster than an exhaustive grid search.
Final Tips for Hyperparameter Tuning
- Start with Grid Search for small models like KNN or Decision Trees
- Use Randomized Search for complex models like Random Forest, XGBoost, etc.
- Always combine with cross-validation to avoid overfitting
- After tuning, evaluate the best model on the unseen test set, as sketched below
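A minimal sketch of that last step, assuming the grid_search object from Example 1 and the train/test split defined earlier:
from sklearn.metrics import accuracy_score
# best_estimator_ is the model refitted on the full training set
# with the best parameters found during the search.
best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, y_pred))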
Summary
Hyperparameter tuning is a critical step to improve your machine learning model’s performance. It helps find the optimal settings that generalize well on new data.
Mastering tools like GridSearchCV and RandomizedSearchCV will make your ML workflow robust and production-ready.