4.1 Stacking (Stacked Generalization)
Stacking combines multiple models by training a meta-learner on the predictions of the base models.
Stacking Process:
Level 0: Base models $\{f_1, f_2, ..., f_n\}$
Level 1: Meta-model $g$ learns from base predictions
$$\hat{y} = g(f_1(x), f_2(x), ..., f_n(x))$$
Code Example 1: Stacking Implementation
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
# - scikit-learn>=1.0
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.metrics import accuracy_score
class StackingEnsemble:
    """Stacking ensemble for binary classification using base model probabilities."""

    def __init__(self, base_models, meta_model, cv=5):
        self.base_models = base_models
        self.meta_model = meta_model
        self.cv = cv

    def fit(self, X_train, y_train):
        """Train the stacking ensemble."""
        # Generate out-of-fold predictions for training the meta-model
        meta_features = np.zeros((X_train.shape[0], len(self.base_models)))
        for i, model in enumerate(self.base_models):
            # Cross-validated predictions
            predictions = cross_val_predict(
                model, X_train, y_train,
                cv=self.cv, method='predict_proba'
            )
            meta_features[:, i] = predictions[:, 1]  # Probability of positive class
            # Refit this base model on the full training set for prediction time
            model.fit(X_train, y_train)
        # Train the meta-model on the out-of-fold features
        self.meta_model.fit(meta_features, y_train)
        return self

    def predict(self, X_test):
        """Make predictions using the stacking ensemble."""
        # Get base model predictions
        meta_features = np.zeros((X_test.shape[0], len(self.base_models)))
        for i, model in enumerate(self.base_models):
            predictions = model.predict_proba(X_test)
            meta_features[:, i] = predictions[:, 1]
        # Meta-model prediction
        return self.meta_model.predict(meta_features)
# Example usage
from sklearn.datasets import load_breast_cancer
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Define base models
base_models = [
RandomForestClassifier(n_estimators=100, random_state=42),
GradientBoostingClassifier(n_estimators=100, random_state=42),
SVC(probability=True, random_state=42)
]
# Define meta-model
meta_model = LogisticRegression()
# Train stacking ensemble
stacking = StackingEnsemble(base_models, meta_model, cv=5)
stacking.fit(X_train, y_train)
# Evaluate
y_pred = stacking.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Stacking Ensemble Accuracy: {accuracy:.4f}")4.2-4.7 More Advanced Topics
These sections cover blending, voting classifiers, model diversity, hyperparameter optimization, and AutoML ensembles.
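As a brief preview of blending, the sketch below trains the meta-model on a single hold-out split instead of out-of-fold predictions. It is a minimal sketch, not the chapter's reference implementation: it assumes the variables from Code Example 1 (X_train, X_test, y_train, y_test, base_models, meta_model) are in scope, and the 25% hold-out fraction and helper names (X_holdout, blend_meta, and so on) are illustrative choices.

# Minimal blending sketch (assumes X_train, X_test, y_train, y_test,
# base_models, and meta_model from Code Example 1 are in scope)
import numpy as np
from sklearn.base import clone
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Carve a hold-out (blending) set out of the original training data
X_blend_train, X_holdout, y_blend_train, y_holdout = train_test_split(
    X_train, y_train, test_size=0.25, random_state=42
)

blend_models = [clone(m) for m in base_models]  # fresh copies of the level-0 models
holdout_features = np.zeros((X_holdout.shape[0], len(blend_models)))
test_features = np.zeros((X_test.shape[0], len(blend_models)))

for i, model in enumerate(blend_models):
    # Each base model is fit once, on the reduced training split only
    model.fit(X_blend_train, y_blend_train)
    holdout_features[:, i] = model.predict_proba(X_holdout)[:, 1]
    test_features[:, i] = model.predict_proba(X_test)[:, 1]

# The meta-model is trained only on hold-out predictions the base models never saw during fitting
blend_meta = clone(meta_model)
blend_meta.fit(holdout_features, y_holdout)
blend_pred = blend_meta.predict(test_features)
print(f"Blending Accuracy: {accuracy_score(y_test, blend_pred):.4f}")

Because each base model is fit only once, blending is cheaper than stacking, but the meta-model sees only the hold-out labels, which can make it noisier on small datasets.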
Code Examples 2-7
# Blending implementation
# Soft and hard voting
# Measuring model diversity
# Multi-level stacking
# AutoML ensemble strategies
# Production deployment
# See full implementations in the complete chapter

Exercises
- Implement 2-level stacking with diverse base models.
- Compare stacking and blending on the same dataset.
- Create a voting ensemble and analyze soft vs. hard voting.
- Measure the correlation between base model predictions.
- Build an AutoML-style ensemble with automated model selection.
Summary
- Stacking: meta-model learns from base model predictions
- Blending: simpler alternative to stacking with hold-out set
- Voting: majority vote (hard) or averaged probabilities (soft); see the sketch after this summary
- Model diversity is crucial for ensemble performance
- Advanced ensembles often win Kaggle competitions
- Trade-off: performance vs complexity and interpretability
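To make the voting bullet concrete, the following sketch contrasts hard and soft voting using scikit-learn's VotingClassifier and finishes with a rough diversity check based on the correlation between base model probabilities. It is a minimal sketch under stated assumptions: the train/test split from Code Example 1 is still in scope, and the estimator choices, max_iter value, and variable names are illustrative.

# Hard vs. soft voting sketch (assumes X_train, X_test, y_train, y_test
# from Code Example 1; estimator choices are illustrative)
import numpy as np
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

estimators = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
    ("gb", GradientBoostingClassifier(n_estimators=100, random_state=42)),
    ("lr", LogisticRegression(max_iter=5000)),
]

# Hard voting: each model casts one vote for a class label
hard_vote = VotingClassifier(estimators=estimators, voting="hard")
hard_vote.fit(X_train, y_train)
hard_acc = accuracy_score(y_test, hard_vote.predict(X_test))

# Soft voting: average predicted class probabilities, then take the argmax
soft_vote = VotingClassifier(estimators=estimators, voting="soft")
soft_vote.fit(X_train, y_train)
soft_acc = accuracy_score(y_test, soft_vote.predict(X_test))

print(f"Hard voting accuracy: {hard_acc:.4f}")
print(f"Soft voting accuracy: {soft_acc:.4f}")

# Rough diversity check: correlation between base model probability outputs
probas = np.column_stack(
    [est.predict_proba(X_test)[:, 1] for est in soft_vote.estimators_]
)
print("Prediction correlations:\n", np.corrcoef(probas.T).round(3))

Soft voting tends to help when the base models produce well-calibrated probabilities, while hard voting is more robust when probability estimates are unreliable; highly correlated base predictions suggest the ensemble has little diversity to exploit.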