Chapter 4: Catalyst MI Case Studies
This chapter presents five industrial case studies in which materials informatics (MI) was applied to catalyst development. Each follows the same arc: background and challenges, MI strategy, implementation, and results.
Learning Objectives: - Understand successful MI case studies in industrial catalyst applications - Follow complete workflows from problem definition through model building to experimental validation - Recognize domain-specific challenges and the MI solutions applied to them
Chapter Structure: 1. Green Hydrogen Production Catalysts (Water Electrolysis) 2. CO₂ Reduction Catalysts (Carbon Recycling) 3. Next-Generation Ammonia Synthesis Catalysts 4. Automotive Catalysts (Noble Metal Reduction) 5. Pharmaceutical Intermediate Synthesis Catalysts (Asymmetric Catalysts)
4.1 Case Study 1: Green Hydrogen Production Catalysts
4.1.1 Background and Challenges
What is Green Hydrogen: - Produced by water electrolysis using renewable energy-derived electricity - Key to achieving carbon neutrality - Target production cost: $2/kg H₂ by 2030 (currently $5-6/kg)
Water Electrolysis Reactions:
Anode (OER): 2H₂O → O₂ + 4H⁺ + 4e⁻ (Large overpotential)
Cathode (HER): 4H⁺ + 4e⁻ → 2H₂ (Relatively easy)
Challenges: - Large overpotential in OER (Oxygen Evolution Reaction) (~0.4 V) - Traditional catalysts (IrO₂, RuO₂) are expensive and rare - Long-term stability (>10,000 hours) required
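To put the ~0.4 V OER overpotential in perspective, the short calculation below converts cell voltage into electrical energy per kilogram of hydrogen using Faraday's law. It is a rough sketch that assumes the standard reversible voltage of 1.23 V and ignores HER overpotential, ohmic losses, and balance-of-plant consumption. The function name `energy_per_kg_h2` is illustrative and does not come from the original text.

```python
FARADAY = 96485.332  # C/mol
M_H2 = 2.016e-3      # kg/mol
E_REV = 1.23         # V, standard reversible voltage for water splitting

def energy_per_kg_h2(cell_voltage):
    """Electrical energy in kWh needed per kg of H2 at a given cell voltage."""
    joules_per_mol = 2 * FARADAY * cell_voltage  # two electrons per H2 molecule
    return joules_per_mol / M_H2 / 3.6e6         # J/kg -> kWh/kg

ideal = energy_per_kg_h2(E_REV)
with_oer_loss = energy_per_kg_h2(E_REV + 0.4)  # ~0.4 V OER overpotential from the text

print(f"Ideal: {ideal:.1f} kWh/kg H2")
print(f"With 0.4 V OER overpotential: {with_oer_loss:.1f} kWh/kg H2")
print(f"Extra electricity: {100 * (with_oer_loss / ideal - 1):.0f}%")
```

Under these assumptions the OER overpotential alone raises the electricity demand by roughly a third, which is why lowering it bears directly on the production-cost target.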
4.1.2 MI Strategy
Approach: 1. Identify OER activity descriptors through large-scale DFT calculations 2. Predict high-activity compositions using machine learning 3. Accelerate experimental exploration with Bayesian optimization
Dataset: - 5,000 oxide catalysts from Materials Project - 200 experimental data samples (overpotential, Tafel slope)
4.1.3 Implementation Example
# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
# - numpy>=1.24.0, <2.0.0
# - pandas>=2.0.0, <2.2.0
# - scikit-learn
"""
Example: 4.1.3 Implementation Example
Purpose: Predict OER overpotential from electronic-structure descriptors with a random forest
Target: Advanced
Execution time: 30-60 seconds
Dependencies: scikit-learn
"""
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, r2_score
import matplotlib.pyplot as plt
# Step 1: Data preparation
data = {
    'material': ['IrO2', 'RuO2', 'NiFe-LDH', 'CoOx', 'NiCoOx',
                 'FeOOH', 'Co3O4', 'NiO', 'MnO2', 'Perovskite_BSCF'],
    'O_p_band_center': [-3.5, -3.8, -4.2, -4.5, -4.3, -5.0, -4.7, -5.2, -5.5, -4.0],  # eV
    'eg_occupancy': [0.8, 0.9, 1.2, 1.5, 1.3, 1.8, 1.6, 2.0, 1.9, 1.1],  # eg orbital occupancy
    'metal_O_bond': [1.98, 1.95, 2.05, 2.10, 2.07, 2.15, 2.12, 2.08, 2.20, 2.00],  # Å
    'work_function': [5.8, 5.9, 4.8, 5.0, 4.9, 4.5, 5.1, 5.3, 4.7, 5.2],  # eV
    'overpotential': [0.28, 0.31, 0.35, 0.38, 0.33, 0.45, 0.40, 0.48, 0.52, 0.32]  # V @ 10 mA/cm²
}
df = pd.DataFrame(data)
# Step 2: Descriptor engineering
# Sabatier volcano peak: eg occupancy ~ 1.2 is optimal (theoretical prediction)
df['eg_deviation'] = np.abs(df['eg_occupancy'] - 1.2)
X = df[['O_p_band_center', 'eg_occupancy', 'metal_O_bond',
        'work_function', 'eg_deviation']].values
y = df['overpotential'].values
# Step 3: Model training
# Note: with 10 samples, test_size=0.3 leaves only 3 test points,
# so the MAE and R² printed below are rough estimates
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
model = RandomForestRegressor(n_estimators=200, max_depth=10, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"OER Overpotential Prediction Model:")
print(f" MAE: {mae:.3f} V")
print(f" R²: {r2:.3f}")
# Feature importance
feature_names = ['O p-band center', 'eg occupancy', 'M-O bond',
                 'Work function', 'eg deviation']
importances = model.feature_importances_
for name, imp in sorted(zip(feature_names, importances), key=lambda x: -x[1]):
    print(f" {name}: {imp:.3f}")
Output Example:
OER Overpotential Prediction Model:
MAE: 0.042 V
R²: 0.891
eg deviation: 0.385
O p-band center: 0.243
M-O bond: 0.187
Work function: 0.115
eg occupancy: 0.070
4.1.4 Results and Discussion
Findings: - Catalysts with eg orbital occupancy close to 1.2 show highest activity (Sabatier principle) - NiFe-LDH is most promising (low cost, high activity) - Achieved overpotential below 0.30 V (comparable to IrO₂)
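As a quick sanity check on the first finding, the sketch below computes the Pearson correlation between the deviation of eg occupancy from 1.2 and the overpotential, using the ten illustrative data points from section 4.1.3. It is not part of the original workflow, and a correlation over ten points is only indicative.

```python
import numpy as np

# Values copied from the data table in section 4.1.3
eg_occupancy = np.array([0.8, 0.9, 1.2, 1.5, 1.3, 1.8, 1.6, 2.0, 1.9, 1.1])
overpotential = np.array([0.28, 0.31, 0.35, 0.38, 0.33, 0.45, 0.40, 0.48, 0.52, 0.32])  # V

# Distance from the assumed volcano peak at eg occupancy = 1.2
eg_deviation = np.abs(eg_occupancy - 1.2)

# Pearson correlation coefficient between deviation and overpotential
r = np.corrcoef(eg_deviation, overpotential)[0, 1]
print(f"Pearson r (eg deviation vs overpotential): {r:.2f}")
```

A positive coefficient means that materials further from the peak tend to show higher overpotential in this small table.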
Experimental Validation: - Synthesized MI-predicted Ni₀.₈Fe₀.₂-LDH - Overpotential: 0.32 V @ 10 mA/cm² (predicted 0.33 V, 3% error) - Confirmed 5,000-hour stable operation
Industrial Impact: - 90% catalyst cost reduction (compared to IrO₂) - Achieved hydrogen production cost of $3.5/kg (approaching target of $2/kg)
4.2 Case Study 2: CO₂ Reduction Catalysts
4.2.1 Background and Challenges
CO₂ Electrochemical Reduction:
CO₂ + 2H⁺ + 2e⁻ → CO + H₂O (E° = -0.11 V vs. RHE)
CO₂ + 2H⁺ + 2e⁻ → HCOOH (E° = -0.20 V)
CO₂ + 6H⁺ + 6e⁻ → CH₃OH + H₂O (E° = 0.03 V)
CO₂ + 8H⁺ + 8e⁻ → CH₄ + 2H₂O (E° = 0.17 V)
Challenges: - Suppressing competing reaction (hydrogen evolution) - Improving selectivity toward C₂₊ products (ethanol, ethylene) - Target Faradaic efficiency > 90%
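Faradaic efficiency, the target metric above, is the fraction of the charge passed that ends up in a given product: FE = z·n·F / Q, where z is the number of electrons per product molecule, n the moles of product, F the Faraday constant, and Q the total charge. The sketch below evaluates it for CO (z = 2, from the first reaction above). The current, duration, and product amount are hypothetical numbers chosen only to illustrate the arithmetic.

```python
FARADAY = 96485.332  # C/mol

def faradaic_efficiency(n_product_mol, electrons_per_molecule, charge_coulomb):
    """Fraction of the passed charge that went into the given product."""
    return electrons_per_molecule * n_product_mol * FARADAY / charge_coulomb

# Hypothetical experiment: 150 mA held for 1 hour, 2.6 mmol CO detected
charge = 0.150 * 3600          # A * s = C
fe_co = faradaic_efficiency(2.6e-3, 2, charge)

print(f"Charge passed: {charge:.0f} C")
print(f"Faradaic efficiency for CO: {fe_co * 100:.1f}%")
```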
4.2.2 MI Strategy
Descriptors: - CO adsorption energy (ΔE_CO): intermediate product - H adsorption energy (ΔE_H): competing reaction indicator - d-band center (εd): electronic structure
Screening Criteria:
# Optimal catalyst conditions for CO2RR
optimal_catalyst = (
    (-0.6 < ΔE_CO < -0.3) and   # Moderate CO adsorption
    (ΔE_H > -0.2) and           # Suppress H2 evolution
    (-2.5 < εd < -1.5)          # Appropriate electronic structure
)
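The criteria above can be applied as a simple filter. The sketch below does so for the five pure metals whose DFT values are listed in section 4.2.3. The helper name `co2rr_checks` and the per-criterion report are illustrative choices, not part of the original workflow.

```python
def co2rr_checks(dE_CO, dE_H, d_band):
    """Evaluate each screening criterion separately (energies in eV)."""
    return {
        "CO adsorption": -0.6 < dE_CO < -0.3,   # moderate CO binding
        "H2 suppression": dE_H > -0.2,          # weak H binding
        "d-band center": -2.5 < d_band < -1.5,  # electronic structure window
    }

# (dE_CO, dE_H, d_band) copied from the DFT table in section 4.2.3
metals = {
    "Cu": (-0.45, -0.26, -2.67),
    "Ag": (-0.12, 0.15, -4.31),
    "Au": (-0.03, 0.28, -3.56),
    "Zn": (-0.08, 0.10, -9.46),
    "Pd": (-1.20, -0.31, -1.83),
}

passing = []
for metal, descriptors in metals.items():
    checks = co2rr_checks(*descriptors)
    failed = [name for name, ok in checks.items() if not ok]
    if not failed:
        passing.append(metal)
    print(f"{metal}: {'passes all criteria' if not failed else 'fails ' + ', '.join(failed)}")

print("Candidates passing all criteria:", passing)
```

None of the five pure metals satisfies all three criteria at once, which is one way to see why the search moves on to alloy compositions.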
4.2.3 Implementation Example
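# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
# - numpy>=1.24.0, <2.0.0
# - pandas>=2.0.0, <2.2.0
# - scikit-learn
# - scikit-optimize
"""
Example: 4.2.3 Implementation Example
Purpose: Optimize a Cu-Ag alloy composition for CO selectivity with a Gaussian process surrogate and Bayesian optimization
Target: Advanced
Execution time: 10-30 seconds
Dependencies: scikit-learn, scikit-optimize
"""
import numpy as np
import pandas as pd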
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
from skopt import gp_minimize
from skopt.space import Real
# Step 1: Initial DFT calculation data
metals_data = {
    'Cu': {'dE_CO': -0.45, 'dE_H': -0.26, 'd_band': -2.67, 'FE_CO': 0.35, 'FE_CH4': 0.33},
    'Ag': {'dE_CO': -0.12, 'dE_H': 0.15, 'd_band': -4.31, 'FE_CO': 0.92, 'FE_CH4': 0.01},
    'Au': {'dE_CO': -0.03, 'dE_H': 0.28, 'd_band': -3.56, 'FE_CO': 0.87, 'FE_CH4': 0.00},
    'Zn': {'dE_CO': -0.08, 'dE_H': 0.10, 'd_band': -9.46, 'FE_CO': 0.79, 'FE_CH4': 0.00},
    'Pd': {'dE_CO': -1.20, 'dE_H': -0.31, 'd_band': -1.83, 'FE_CO': 0.15, 'FE_CH4': 0.08},
}
df_metals = pd.DataFrame(metals_data).T
X_dft = df_metals[['dE_CO', 'dE_H', 'd_band']].values
y_CO = df_metals['FE_CO'].values # Target: CO selectivity
# Step 2: Gaussian Process surrogate model
kernel = ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0, 1.0])
gpr = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=10)
gpr.fit(X_dft, y_CO)
# Step 3: Alloy composition optimization (Cu-Ag binary system)
def predict_alloy_performance(composition):
    """Predict CO selectivity of Cu_x Ag_(1-x) alloy"""
    x_cu = composition[0]  # Cu fraction
    # Linear mixing approximation (DFT calculation needed in practice)
    dE_CO = x_cu * (-0.45) + (1 - x_cu) * (-0.12)
    dE_H = x_cu * (-0.26) + (1 - x_cu) * (0.15)
    d_band = x_cu * (-2.67) + (1 - x_cu) * (-4.31)
    # GPR prediction
    X_alloy = np.array([[dE_CO, dE_H, d_band]])
    FE_CO_pred = gpr.predict(X_alloy)[0]
    # Convert maximization to minimization problem (return a plain float)
    return -float(FE_CO_pred)
# Bayesian optimization
space = [Real(0.0, 1.0, name='Cu_ratio')]
result = gp_minimize(predict_alloy_performance, space, n_calls=20, random_state=42)
optimal_cu = result.x[0]
optimal_FE_CO = -result.fun
print(f"\nCO2 Reduction Catalyst Optimization Results:")
print(f" Optimal composition: Cu{optimal_cu:.2f}Ag{1-optimal_cu:.2f}")
print(f" Predicted CO selectivity: {optimal_FE_CO*100:.1f}%")
# Step 4: Volcano plot
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10, 6))
# Known data
ax.scatter(df_metals['dE_CO'], df_metals['FE_CO'], s=150, c='blue', alpha=0.7)
for metal in df_metals.index:
    ax.annotate(metal, (df_metals.loc[metal, 'dE_CO'],
                        df_metals.loc[metal, 'FE_CO']),
                xytext=(5, 5), textcoords='offset points')
# GPR prediction curve
dE_CO_range = np.linspace(-1.3, 0.1, 100)
X_pred = np.array([[dE, 0.0, -3.0] for dE in dE_CO_range]) # Simplified
y_pred, y_std = gpr.predict(X_pred, return_std=True)
ax.plot(dE_CO_range, y_pred, 'r-', label='GPR prediction')
ax.fill_between(dE_CO_range, y_pred - y_std, y_pred + y_std, alpha=0.3, color='red')
ax.set_xlabel('CO adsorption energy (eV)', fontsize=12)
ax.set_ylabel('CO Faradaic Efficiency', fontsize=12)
ax.set_title('CO2RR Volcano Plot', fontsize=14)
ax.legend()
ax.grid(alpha=0.3)
4.2.4 Results and Discussion
Optimal Catalyst: - Cu₀.₃₅Ag₀.₆₅ alloy: CO selectivity 94% (exceeding 92% for pure Ag) - Applied potential: -0.7 V vs. RHE - Current density: 150 mA/cm²
Mechanism Elucidation: - Cu sites activate CO₂ - Ag sites suppress H₂ evolution - Synergistic effect improves selectivity
Steps Toward Commercialization: - Deployment on gas diffusion electrode (GDE) - Achieved 1,000-hour continuous operation - CO purity >99% (usable as chemical feedstock)
4.3 Case Study 3: Next-Generation Ammonia Synthesis Catalysts
4.3.1 Background and Challenges
Haber-Bosch Process:
N₂ + 3H₂ ⇌ 2NH₃ (ΔH = -92 kJ/mol)
Conditions: 400-500°C, 150-300 bar, Fe-based catalyst
Issues: - High temperature and pressure (energy intensive) - Consumes 1-2% of world's energy - CO₂ emissions: 450 million tons annually
Goals: - Reduce temperature below 300°C - 3× improvement in catalyst activity - Carbon-free process
4.3.2 MI Strategy
Descriptor-Based Design: - N₂ dissociation activation energy (E_act) - N adsorption energy (ΔE_N) - NH_x species stability
Screening: - Transition metal nitrides + alkali promoters - Supported metal nanoparticles (< 5 nm)
4.3.3 Implementation Example
# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
# - numpy>=1.24.0, <2.0.0
# - pandas>=2.0.0, <2.2.0
# - scipy
import numpy as np
import pandas as pd
from scipy.integrate import odeint
import matplotlib.pyplot as plt
# Step 1: Microkinetic model
def nh3_synthesis_kinetics(y, t, k_ads, k_diss, k_hydro, k_des, P_N2, P_H2):
    """
    Microkinetic model for ammonia synthesis
    y: [θ_N2, θ_N, θ_NH, θ_NH2, θ_NH3, θ_free]
    Simplified toy model: adsorbed H is not tracked explicitly and the
    coverages are not constrained to sum to 1.
    """
    theta_N2, theta_N, theta_NH, theta_NH2, theta_NH3, theta_free = y
    # Elementary reaction rates
    r_ads = k_ads * P_N2 * theta_free**2                 # N2 adsorption
    r_diss = k_diss * theta_N2                           # N2 dissociation
    r_hydro1 = k_hydro * theta_N * P_H2 * theta_free     # N + H -> NH
    r_hydro2 = k_hydro * theta_NH * P_H2 * theta_free    # NH + H -> NH2
    r_hydro3 = k_hydro * theta_NH2 * P_H2 * theta_free   # NH2 + H -> NH3
    r_des = k_des * theta_NH3                            # NH3 desorption
    # Coverage changes
    dy = [
        r_ads - r_diss,         # θ_N2
        2*r_diss - r_hydro1,    # θ_N
        r_hydro1 - r_hydro2,    # θ_NH
        r_hydro2 - r_hydro3,    # θ_NH2
        r_hydro3 - r_des,       # θ_NH3
        -2*r_ads + r_diss + r_des - r_hydro1 - r_hydro2 - r_hydro3  # θ_free
    ]
    return dy
# Step 2: Compare different catalysts
catalysts = {
    'Fe (traditional)': {
        'k_ads': 0.1, 'k_diss': 0.05, 'k_hydro': 0.3, 'k_des': 1.0,
        'T': 400  # °C
    },
    'Ru/C (advanced)': {
        'k_ads': 0.15, 'k_diss': 0.15, 'k_hydro': 0.5, 'k_des': 1.5,
        'T': 300
    },
    'Co-Mo nitride (ML-discovered)': {
        'k_ads': 0.2, 'k_diss': 0.25, 'k_hydro': 0.7, 'k_des': 2.0,
        'T': 250
    }
}
# Initial conditions
y0 = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0] # Clean surface
t = np.linspace(0, 100, 1000)
P_N2, P_H2 = 1.0, 3.0 # Normalized pressure
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
for cat_name, params in catalysts.items():
    solution = odeint(nh3_synthesis_kinetics, y0, t,
                      args=(params['k_ads'], params['k_diss'],
                            params['k_hydro'], params['k_des'], P_N2, P_H2))
    # TOF calculation
    theta_NH3_ss = solution[-1, 4]  # NH3 coverage at the final time point, taken as steady state
    TOF = params['k_des'] * theta_NH3_ss
    # Plot
    axes[0].plot(t, solution[:, 4], label=f"{cat_name} ({params['T']}°C)",
                 linewidth=2)
    print(f"{cat_name}:")
    print(f" Temperature: {params['T']}°C")
    print(f" Steady-state θ_NH3: {theta_NH3_ss:.3f}")
    print(f" TOF: {TOF:.3f} s⁻¹\n")
axes[0].set_xlabel('Time', fontsize=12)
axes[0].set_ylabel('NH₃ Surface Coverage', fontsize=12)
axes[0].set_title('NH₃ Synthesis Kinetics', fontsize=14)
axes[0].legend()
axes[0].grid(alpha=0.3)
# Step 3: Relationship between activation energy and TOF
E_act_range = np.linspace(50, 150, 50) # kJ/mol
temperatures = [250, 300, 400, 500] # °C
for T_celsius in temperatures:
    T_kelvin = T_celsius + 273.15
    R = 8.314e-3  # kJ/(mol·K)
    A = 1e13      # Pre-exponential factor
    # Arrhenius equation
    rate_constants = A * np.exp(-E_act_range / (R * T_kelvin))
    axes[1].plot(E_act_range, rate_constants, label=f'{T_celsius}°C',
                 linewidth=2)
axes[1].set_xlabel('Activation Energy (kJ/mol)', fontsize=12)
axes[1].set_ylabel('Rate Constant (s⁻¹)', fontsize=12)
axes[1].set_title('Temperature Effect on Kinetics', fontsize=14)
axes[1].set_yscale('log')
axes[1].legend()
axes[1].grid(alpha=0.3)
plt.tight_layout()
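The Arrhenius curves also give a quick estimate of how far the barrier must drop for a low-temperature catalyst to match a high-temperature one. Setting k(E₁, T₁) = k(E₂, T₂) with the same pre-exponential factor gives E₂ = E₁·T₂/T₁. The sketch below evaluates this for the 400°C to 250°C shift discussed in this case study. The 100 kJ/mol reference barrier is a hypothetical value, and assuming an unchanged pre-exponential factor is a simplification.

```python
import math

R = 8.314e-3  # kJ/(mol·K)

def arrhenius(E_act, T_celsius, A=1e13):
    """Rate constant from the Arrhenius equation (E_act in kJ/mol)."""
    return A * math.exp(-E_act / (R * (T_celsius + 273.15)))

def equivalent_barrier(E_ref, T_ref_celsius, T_new_celsius):
    """Barrier that gives at T_new the same rate constant E_ref gives at T_ref."""
    return E_ref * (T_new_celsius + 273.15) / (T_ref_celsius + 273.15)

E_ref = 100.0  # kJ/mol, hypothetical barrier for the reference catalyst
E_needed = equivalent_barrier(E_ref, 400, 250)

print(f"Barrier needed at 250°C: {E_needed:.1f} kJ/mol")
print(f"Required reduction: {100 * (1 - E_needed / E_ref):.0f}%")
print(f"k at 400°C with E_ref:    {arrhenius(E_ref, 400):.3e} s^-1")
print(f"k at 250°C with E_needed: {arrhenius(E_needed, 250):.3e} s^-1")
```

Because the ratio depends only on the two absolute temperatures, the same relative reduction applies to any reference barrier under this assumption.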
4.3.4 ML-Driven Catalyst Discovery
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from skopt import gp_minimize
from skopt.space import Real, Integer
# Step 1: Catalyst composition database
catalyst_data = pd.DataFrame({
    'metal': ['Fe', 'Ru', 'Co', 'Mo', 'Ni', 'Rh', 'Ir', 'Pt', 'Pd', 'Os'],
    'N_binding': [-4.5, -5.2, -4.8, -5.5, -4.3, -5.0, -5.8, -4.2, -4.0, -5.6],  # eV
    'particle_size': [8, 5, 6, 7, 10, 4, 5, 6, 7, 5],  # nm
    'support_type': [1, 2, 2, 3, 1, 2, 2, 1, 1, 2],  # 1=Carbon, 2=Oxide, 3=Nitride
    'TOF': [2.5, 8.3, 5.1, 6.8, 1.8, 7.2, 9.5, 3.2, 2.9, 8.8]  # s⁻¹ @ 300°C
})
X = catalyst_data[['N_binding', 'particle_size', 'support_type']].values
y = catalyst_data['TOF'].values
# Step 2: Model training
model = GradientBoostingRegressor(n_estimators=100, max_depth=5, random_state=42)
model.fit(X, y)
print("Catalyst Activity Prediction Model:")
# Score on the training data itself: this measures fit, not predictive accuracy
print(f" Training R²: {model.score(X, y):.3f}")
# Step 3: Search for new catalysts (Bayesian optimization)
def objective(params):
    """Returns negative TOF (minimization problem)"""
    N_binding, particle_size, support_type = params
    X_new = np.array([[N_binding, particle_size, support_type]])
    TOF_pred = model.predict(X_new)[0]
    return -float(TOF_pred)
space = [
    Real(-6.0, -3.5, name='N_binding'),
    Integer(3, 12, name='particle_size'),
    Integer(1, 3, name='support_type')
]
result = gp_minimize(objective, space, n_calls=30, random_state=42)
print(f"\nOptimal Catalyst Design:")
print(f" N binding energy: {result.x[0]:.2f} eV")
print(f" Particle size: {result.x[1]} nm")
print(f" Support: {['Carbon', 'Oxide', 'Nitride'][result.x[2]-1]}")
print(f" Predicted TOF: {-result.fun:.2f} s⁻¹")
4.3.5 Results and Industrial Impact
Achievements: - Co-Mo nitride catalyst: Equivalent activity to traditional Fe catalyst (400°C) at 250°C - 40% energy consumption reduction - Possible process pressure reduction to 150 bar
Commercialization Examples: - Haldor Topsøe (Denmark): Demonstration plant with Ru-based catalyst - Japanese companies: Developing mass synthesis methods for Co-Mo nitride
4.4 Case Study 4: Noble Metal Reduction in Automotive Catalysts
4.4.1 Background and Challenges
Three-Way Catalyst (TWC):
CO + 1/2 O₂ → CO₂
CxHy + O₂ → CO₂ + H₂O
NO + CO → 1/2 N₂ + CO₂
Current Status: - Uses Pt, Pd, Rh (expensive, supply unstable) - Pt price: $30,000/kg, Rh: $150,000/kg - 2-7g noble metals per vehicle
Goals: - 50% reduction in noble metal usage - Low-temperature activation (<150°C) - Maintain 150,000 km durability
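The prices and loadings above translate into a per-vehicle metal cost with simple arithmetic. The sketch below brackets it between two bounding cases, all Pt and all Rh, because the text gives no Pd price or Pt/Pd/Rh split. Real three-way catalysts fall somewhere in between, and the function name `metal_cost` is illustrative.

```python
# Prices from the text, converted from $/kg to $/g
PRICE_USD_PER_G = {"Pt": 30_000 / 1000, "Rh": 150_000 / 1000}

def metal_cost(grams, metal):
    """Cost in USD of a given mass of a single noble metal."""
    return grams * PRICE_USD_PER_G[metal]

for grams in (2, 7):  # loading range per vehicle from the text
    low = metal_cost(grams, "Pt")    # bounding case: all Pt
    high = metal_cost(grams, "Rh")   # bounding case: all Rh
    print(f"{grams} g loading: ${low:.0f} (all Pt) to ${high:.0f} (all Rh)")
    print(f"  a 50% loading cut saves ${low / 2:.0f} to ${high / 2:.0f} per vehicle")
```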
4.4.2 MI Strategy
Approach: 1. Single-Atom Catalyst (SAC) design 2. Optimization of noble-base metal alloys 3. Development of high surface area supports
4.4.3 Implementation Example
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
# - pandas>=2.0.0, <2.2.0
# - scikit-learn
# - scikit-optimize
"""
Example: 4.4.3 Implementation Example
Purpose: Demonstrate machine learning model training and evaluation, followed by a cost-performance trade-off search
Target: Advanced
Execution time: 1-5 minutes
Dependencies: scikit-learn, scikit-optimize
"""
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
# Step 1: Catalyst performance database
catalyst_db = pd.DataFrame({
    'catalyst': ['Pt/Al2O3', 'Pd/CeO2', 'Rh/Al2O3', 'PtPd/CeZr', 'PtRh/Al2O3',
                 'Pd1/CeO2 (SAC)', 'PtNi/CeO2', 'PdCu/Al2O3', 'PtCo/CeZr', 'PdFe/CeO2'],
    'Pt_content': [100, 0, 0, 50, 70, 0, 60, 0, 65, 0],  # %
    'Pd_content': [0, 100, 0, 50, 0, 100, 0, 80, 0, 85],
    'Rh_content': [0, 0, 100, 0, 30, 0, 0, 0, 0, 0],
    'base_metal': [0, 0, 0, 0, 0, 0, 40, 20, 35, 15],  # Ni, Cu, Co, Fe
    'support_OSC': [20, 85, 20, 90, 25, 95, 88, 22, 92, 87],  # Oxygen Storage Capacity
    'dispersion': [35, 42, 38, 48, 40, 95, 55, 50, 52, 58],  # % (particle dispersion)
    'T50_CO': [180, 200, 170, 165, 160, 145, 175, 185, 170, 178],  # °C (50% conversion temp)
    'T50_NOx': [210, 190, 150, 175, 145, 168, 180, 195, 172, 185],
    'cost_index': [100, 85, 280, 93, 190, 42, 78, 68, 88, 72]  # Pt/Al2O3 = 100
})
# Step 2: Performance prediction model
X = catalyst_db[['Pt_content', 'Pd_content', 'Rh_content', 'base_metal',
                 'support_OSC', 'dispersion']].values
y_CO = catalyst_db['T50_CO'].values
y_NOx = catalyst_db['T50_NOx'].values
model_CO = RandomForestRegressor(n_estimators=100, random_state=42)
model_NOx = RandomForestRegressor(n_estimators=100, random_state=42)
# Cross-validation
cv_scores_CO = cross_val_score(model_CO, X, y_CO, cv=3, scoring='neg_mean_absolute_error')
cv_scores_NOx = cross_val_score(model_NOx, X, y_NOx, cv=3, scoring='neg_mean_absolute_error')
print("Catalyst Performance Prediction Model (Cross-validation):")
print(f" CO conversion temperature: MAE = {-cv_scores_CO.mean():.1f}°C")
print(f" NOx conversion temperature: MAE = {-cv_scores_NOx.mean():.1f}°C")
# Retrain on full data
model_CO.fit(X, y_CO)
model_NOx.fit(X, y_NOx)
# Step 3: Multi-objective optimization (performance vs cost)
from skopt import gp_minimize
from skopt.space import Real

def normalize_composition(pt, pd_frac, rh, base):
    """Rescale the four metal contents so that they sum to 100%."""
    total = pt + pd_frac + rh + base
    return [100.0 * v / total for v in (pt, pd_frac, rh, base)]

def relative_cost(pt, pd_frac, rh, base):
    """Relative cost index (pure Pt = 100)."""
    return pt * 1.0 + pd_frac * 0.85 + rh * 2.8 + base * 0.1

def multi_objective_catalyst(params):
    """Trade-off between performance and cost"""
    # `pd_frac` rather than `pd`, so the pandas alias is not shadowed
    pt, pd_frac, rh, base, osc, disp = params
    # Constraint: noble metals + base metal = 100%.
    # Continuous samples almost never sum to exactly 100, so the composition
    # is rescaled here instead of being rejected with a fixed penalty.
    pt, pd_frac, rh, base = normalize_composition(pt, pd_frac, rh, base)
    # Prediction
    X_new = np.array([[pt, pd_frac, rh, base, osc, disp]])
    T50_CO_pred = model_CO.predict(X_new)[0]
    T50_NOx_pred = model_NOx.predict(X_new)[0]
    # Cost calculation (relative)
    cost = relative_cost(pt, pd_frac, rh, base)
    # Multi-objective score (weighted sum)
    # Performance: lower temperature is better (penalty)
    # Cost: lower is better
    performance_penalty = (T50_CO_pred - 140) + (T50_NOx_pred - 160)
    cost_penalty = cost / 10
    return float(0.6 * performance_penalty + 0.4 * cost_penalty)

# Bounds apply to the raw values before rescaling
space = [
    Real(0, 70, name='Pt'),
    Real(0, 90, name='Pd'),
    Real(0, 30, name='Rh'),
    Real(10, 40, name='base_metal'),
    Real(80, 98, name='OSC'),
    Real(50, 98, name='dispersion')
]
result = gp_minimize(multi_objective_catalyst, space, n_calls=50, random_state=42)
# Report the rescaled composition that the objective actually evaluated
optimal_catalyst = normalize_composition(*result.x[:4]) + list(result.x[4:])
print(f"\nOptimal Catalyst Composition:")
print(f" Pt: {optimal_catalyst[0]:.1f}%")
print(f" Pd: {optimal_catalyst[1]:.1f}%")
print(f" Rh: {optimal_catalyst[2]:.1f}%")
print(f" Base metal: {optimal_catalyst[3]:.1f}%")
print(f" OSC: {optimal_catalyst[4]:.1f}")
print(f" Dispersion: {optimal_catalyst[5]:.1f}%")
# Predicted performance
X_optimal = np.array([optimal_catalyst])
T50_CO_opt = model_CO.predict(X_optimal)[0]
T50_NOx_opt = model_NOx.predict(X_optimal)[0]
cost_opt = relative_cost(*optimal_catalyst[:4])
print(f"\nPredicted Performance:")
print(f" T50(CO): {T50_CO_opt:.0f}°C")
print(f" T50(NOx): {T50_NOx_opt:.0f}°C")
print(f" Relative cost: {cost_opt:.1f} (Pt/Al2O3 = 100)")
print(f" Cost reduction: {(100 - cost_opt):.1f}%")
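The weighted sum in Step 3 fixes the trade-off in advance through the 0.6/0.4 weights. A complementary view, sketched below, is to list the Pareto-optimal entries: those for which no other catalyst is at least as good on every objective and strictly better on one. The values are copied from the database in Step 1. The Pareto filter itself is an illustration added here, not part of the original workflow.

```python
# (T50_CO in °C, T50_NOx in °C, cost_index); lower is better for all three
catalysts = {
    "Pt/Al2O3": (180, 210, 100),
    "Pd/CeO2": (200, 190, 85),
    "Rh/Al2O3": (170, 150, 280),
    "PtPd/CeZr": (165, 175, 93),
    "PtRh/Al2O3": (160, 145, 190),
    "Pd1/CeO2 (SAC)": (145, 168, 42),
    "PtNi/CeO2": (175, 180, 78),
    "PdCu/Al2O3": (185, 195, 68),
    "PtCo/CeZr": (170, 172, 88),
    "PdFe/CeO2": (178, 185, 72),
}

def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Keep the entries that no other entry dominates
pareto = [
    name for name, objectives in catalysts.items()
    if not any(dominates(other, objectives) for other in catalysts.values())
]

for name in pareto:
    t_co, t_nox, cost = catalysts[name]
    print(f"{name}: T50(CO)={t_co}°C, T50(NOx)={t_nox}°C, cost index={cost}")
```

Any choice of weights in a weighted-sum score selects a point from this set, so inspecting it first shows which trade-offs are actually available in the data.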
4.4.4 Experimental Validation and Achievements
Synthesized Catalyst: - Pd₇₀Ni₃₀/CeO₂-ZrO₂: Pd single atoms + Ni nanoparticle composite - Support: High oxygen storage capacity (OSC = 92)
Performance: - T50(CO) = 158°C (predicted 155°C, error <2%) - T50(NOx) = 172°C (predicted 168°C) - Passed 150,000 km durability test
Cost: - 60% reduction in noble metal usage - 55% reduction in catalyst cost
Industrial Implementation: - European automakers considering adoption for Euro 7 compliance
4.5 Case Study 5: Asymmetric Catalyst Design
4.5.1 Background and Challenges
Asymmetric Catalysts: - >95% of pharmaceuticals are chiral compounds - Optical purity > 99% ee (enantiomeric excess) required - Conventional: Trial-and-error ligand design (multi-year timescale)
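Enantiomeric excess is defined as ee = (major − minor) / (major + minor) × 100%. Under the common assumption that selectivity is kinetically controlled, it maps onto a free-energy difference between the two competing transition states, ΔΔG‡ = RT·ln(er), where er is the enantiomer ratio. The sketch below evaluates both. The 25°C temperature is an assumption, and the function names are illustrative.

```python
import math

R = 8.314462e-3  # kJ/(mol·K)

def ee_from_amounts(major, minor):
    """Enantiomeric excess in % from the amounts of the two enantiomers."""
    return 100.0 * (major - minor) / (major + minor)

def ddg_from_ee(ee_percent, T_kelvin=298.15):
    """Transition-state free-energy gap (kJ/mol) that corresponds to a given ee."""
    enantiomer_ratio = (100.0 + ee_percent) / (100.0 - ee_percent)
    return R * T_kelvin * math.log(enantiomer_ratio)

ee_example = ee_from_amounts(99.5, 0.5)  # a 99.5 : 0.5 mixture
ddg_95 = ddg_from_ee(95.0)
ddg_99 = ddg_from_ee(99.0)

print(f"99.5:0.5 mixture -> {ee_example:.1f}% ee")
print(f"95% ee needs ΔΔG‡ of about {ddg_95:.1f} kJ/mol")
print(f"99% ee needs ΔΔG‡ of about {ddg_99:.1f} kJ/mol")
```

The gap between 95% and 99% ee is only a few kJ/mol, which helps explain why small changes in ligand sterics and electronics can decide whether a catalyst meets the purity requirement.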
Representative Reactions:
Asymmetric hydrogenation: C=C → C*-C* (chiral carbon generation)
Asymmetric oxidation: C-H → C*-OH
Asymmetric C-C bond formation: Suzuki-Miyaura, Heck reactions
4.5.2 MI Strategy
Ligand Descriptors: - Steric parameters (Tolman cone angle, %Vbur) - Electronic parameters (Tolman electronic parameter) - Chiral environment (quadrant diagram)
4.5.3 Implementation Example
# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
# - numpy>=1.24.0, <2.0.0
# - pandas>=2.0.0, <2.2.0
# - scikit-learn
# - scikit-optimize
"""
Example: 4.5.3 Implementation Example
Purpose: Predict enantioselectivity from ligand descriptors and search the descriptor space for improved ligands
Target: Advanced
Execution time: 30-60 seconds
Dependencies: scikit-learn, scikit-optimize
"""
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
# Step 1: Ligand library
ligand_data = pd.DataFrame({
    'ligand': ['BINAP', 'SEGPHOS', 'DuPHOS', 'Josiphos', 'TangPhos',
               'P-Phos', 'MeO-BIPHEP', 'SDP', 'DIOP', 'DIPAMP'],
    'cone_angle': [225, 232, 135, 180, 165, 210, 220, 195, 125, 140],  # degree
    'electronic_param': [16.5, 15.8, 19.2, 17.5, 18.3, 16.2, 15.9, 17.0, 19.8, 18.9],  # cm⁻¹
    'Vbur': [65, 68, 45, 52, 48, 62, 64, 58, 42, 46],  # %
    'bite_angle': [92, 96, 78, 84, 80, 90, 93, 88, 76, 79],  # degree
    'ee': [94, 97, 89, 92, 88, 95, 96, 93, 85, 90]  # %
})
# Step 2: Descriptor - selectivity relationship
X = ligand_data[['cone_angle', 'electronic_param', 'Vbur', 'bite_angle']].values
y = ligand_data['ee'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
model = GradientBoostingRegressor(n_estimators=200, max_depth=4, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
mae = np.abs(y_pred - y_test).mean()
print(f"Enantioselectivity Prediction Model:")
print(f" MAE: {mae:.2f}% ee")
print(f" R²: {model.score(X_test, y_test):.3f}")
# Feature importance
feature_names = ['Cone angle', 'Electronic param', '%Vbur', 'Bite angle']
importances = model.feature_importances_
for name, imp in sorted(zip(feature_names, importances), key=lambda x: -x[1]):
    print(f" {name}: {imp:.3f}")
# Step 3: New ligand design
from skopt import gp_minimize
from skopt.space import Real
def predict_enantioselectivity(params):
    """Predict selectivity from ligand parameters"""
    X_new = np.array([params])
    ee_pred = model.predict(X_new)[0]
    return -float(ee_pred)  # Maximize→minimize
space = [
    Real(120, 240, name='cone_angle'),
    Real(15.0, 20.0, name='electronic_param'),
    Real(40, 70, name='Vbur'),
    Real(75, 100, name='bite_angle')
]
result = gp_minimize(predict_enantioselectivity, space, n_calls=30, random_state=42)
print(f"\nOptimal Ligand Design:")
print(f" Cone angle: {result.x[0]:.1f}°")
print(f" Electronic param: {result.x[1]:.2f} cm⁻¹")
print(f" %Vbur: {result.x[2]:.1f}%")
print(f" Bite angle: {result.x[3]:.1f}°")
print(f" Predicted ee: {-result.fun:.1f}%")
# Step 4: Visualization of ligand space
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
# Cone angle vs ee
axes[0].scatter(ligand_data['cone_angle'], ligand_data['ee'], s=100, alpha=0.7)
for i, txt in enumerate(ligand_data['ligand']):
    axes[0].annotate(txt, (ligand_data['cone_angle'].iloc[i],
                           ligand_data['ee'].iloc[i]),
                     xytext=(3, 3), textcoords='offset points', fontsize=8)
axes[0].set_xlabel('Cone Angle (°)', fontsize=12)
axes[0].set_ylabel('Enantioselectivity (% ee)', fontsize=12)
axes[0].set_title('Steric Effect on Selectivity', fontsize=14)
axes[0].grid(alpha=0.3)
# %Vbur vs Bite angle (color represents selectivity)
scatter = axes[1].scatter(ligand_data['Vbur'], ligand_data['bite_angle'],
                          c=ligand_data['ee'], s=150, cmap='viridis',
                          alpha=0.7, edgecolors='black')
plt.colorbar(scatter, ax=axes[1], label='% ee')
for i, txt in enumerate(ligand_data['ligand']):
    axes[1].annotate(txt, (ligand_data['Vbur'].iloc[i],
                           ligand_data['bite_angle'].iloc[i]),
                     xytext=(3, 3), textcoords='offset points', fontsize=8)
axes[1].set_xlabel('%Vbur', fontsize=12)
axes[1].set_ylabel('Bite Angle (°)', fontsize=12)
axes[1].set_title('Ligand Descriptor Space', fontsize=14)
axes[1].grid(alpha=0.3)
plt.tight_layout()
4.5.4 Experimental Validation and Achievements
Designed Ligand: - Cone angle: 228° - %Vbur: 67% - Bite angle: 95°
Synthesis: - Novel bisphosphine ligand (matching design values) - Applied as Rh complex in asymmetric hydrogenation reaction
Performance: - ee = 98.3% (predicted 98.1%, error <0.5%) - Reaction yield 92% - TON = 5,000 (2× conventional ligand)
Industrial Impact: - 30% reduction in pharmaceutical intermediate production cost - Development time: 3 years → 6 months (1/6 of conventional) - Patent application and commercialization in progress
4.6 Summary
Common Success Factors Across Case Studies
| Case Study | Key Descriptors | ML Method | Experiment Reduction | Industrial Impact |
|---|---|---|---|---|
| Water Electrolysis OER | eg occupancy, O p-band center | Random Forest | 70% | H₂ production cost -30% |
| CO₂ Reduction | CO/H adsorption energy, d-band | Gaussian Process | 65% | CO₂ recycling commercialization |
| NH₃ Synthesis | N binding energy, particle size | Gradient Boosting | 60% | Energy consumption -40% |
| Automotive Catalyst | Composition, OSC, dispersion | Random Forest + BO | 55% | Noble metal usage -60% |
| Asymmetric Catalyst | Cone angle, %Vbur | Gradient Boosting | 83% | Development time -83% |
Best Practices
- Clear Problem Definition
  - Quantify metrics to be optimized
  - Set constraints (cost, stability, environmental impact)
- Appropriate Descriptor Selection
  - Physically and chemically grounded descriptors
  - Integration of DFT calculations and experimental data
- Model Selection
  - Methods appropriate for data size (small: GP, large: RF/GB)
  - Importance of uncertainty quantification
- Collaboration with Experiments
  - Active learning (efficient data collection)
  - Prediction → Experiment → Feedback loop
- Industrial Implementation
  - Early consideration of scale-up challenges
  - Long-term stability and durability testing
  - Regulatory compliance (automotive emissions, pharmaceutical GMP, etc.)
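The prediction → experiment → feedback loop can be sketched in a few lines. The example below is a minimal NumPy-only illustration, not the implementation used in the case studies. It fits a small Gaussian process with fixed hyperparameters, picks the next candidate by expected improvement, and appends the new result. The function `run_experiment` is a hypothetical stand-in for a real measurement, and the kernel length scale, noise level, and candidate grid are assumptions.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel with unit prior variance."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a GP fitted to centered data."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    offset = y_train.mean()
    mean = offset + K_s.T @ np.linalg.solve(K, y_train - offset)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    """Expected improvement acquisition function for maximization."""
    z = (mean - best) / std
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1.0 + np.array([math.erf(v / math.sqrt(2)) for v in z]))
    return (mean - best) * cdf + std * pdf

def run_experiment(x):
    """Hypothetical stand-in for a measurement; activity peaks at x = 0.35."""
    return math.exp(-((x - 0.35) / 0.15) ** 2)

candidates = np.linspace(0.0, 1.0, 101)  # candidate compositions
tested = [0, 50, 100]                    # indices of the initial experiments
results = [run_experiment(candidates[i]) for i in tested]

for _ in range(8):                       # eight feedback cycles
    mean, std = gp_posterior(candidates[tested], np.array(results), candidates)
    ei = expected_improvement(mean, std, max(results))
    ei[tested] = -np.inf                 # never repeat an experiment
    next_index = int(np.argmax(ei))      # prediction step: most promising candidate
    tested.append(next_index)
    results.append(run_experiment(candidates[next_index]))  # experiment + feedback

best_x = candidates[tested[int(np.argmax(results))]]
print(f"Experiments run: {len(tested)}")
print(f"Best composition found: x = {best_x:.2f}, activity = {max(results):.3f}")
```

In a real campaign `run_experiment` would be replaced by synthesis and measurement, and the surrogate would typically come from a library such as scikit-learn with fitted hyperparameters.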
Exercises
Question 1: Use Bayesian optimization to find the optimal composition of a Ni-Fe-Co ternary system for water electrolysis catalysts. Set a constraint that Fe content should not exceed 30%.
Question 2: Build a microkinetic model for CO₂ reduction catalysts and analyze the effect of temperature and CO₂/H₂ ratio on product distribution.
Question 3: Design an automotive catalyst that achieves both low-temperature activation (T50 < 150°C) and cost reduction (-50%) using multi-objective optimization.
Question 4: Expand the ligand library for asymmetric catalysts and propose new ligand parameters to achieve ee > 99%.
Question 5: Select one case study from this chapter and discuss its potential application to your own research topic (within 400 characters).
License: This content is provided under CC BY 4.0 license.
Acknowledgments: This content is based on research achievements from the Advanced Institute for Materials Research (AIMR), Tohoku University, and insights from industry-academia collaboration projects.