Chapter 3: Catalyst MI Implementation Hands-On
This chapter covers hands-on implementation of catalyst materials informatics (MI). You will learn the essential concepts and techniques.
Learning Objectives:
- Catalyst structure manipulation and calculations using ASE
- Building and evaluating catalyst activity prediction models
- Catalyst composition exploration using Bayesian optimization
- Integrated workflows with DFT calculations
Prerequisites:
- Python basics (NumPy, Pandas, Matplotlib)
- Machine learning fundamentals (scikit-learn)
- Understanding of Chapters 1 and 2
Execution Environment:
pip install ase numpy pandas scikit-learn scikit-optimize matplotlib seaborn
3.1 ASE (Atomic Simulation Environment) Basics
Example 1: ASE Installation and Basic Operations
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
"""
Example 1: ASE Installation and Basic Operations
Purpose: Demonstrate structure creation and inspection with ASE
Target: Beginner to Intermediate
Execution time: ~5 seconds
Dependencies: None
"""
from ase import Atoms
from ase.visualize import view
import numpy as np
# Create metal surface (Pt(111) surface)
from ase.build import fcc111
slab = fcc111('Pt', size=(4, 4, 3), vacuum=10.0)
print(f"Number of atoms: {len(slab)}")
print(f"Cell size: {slab.get_cell()}")
print(f"Chemical formula: {slab.get_chemical_formula()}")
# Check atomic positions
positions = slab.get_positions()
print(f"Top layer Z coordinate: {positions[:, 2].max():.3f} Å")
Output:
Number of atoms: 48
Cell size: Cell([[11.122, 0.0, 0.0], [-5.561, 9.632, 0.0], [0.0, 0.0, 27.713]])
Chemical formula: Pt48
Top layer Z coordinate: 7.848 Å
Example 2: Creating Adsorption Structures
from ase.build import fcc111, add_adsorbate
from ase import Atoms
# Adsorb CO on Pt(111) surface
slab = fcc111('Pt', size=(3, 3, 3), vacuum=10.0)
co = Atoms('CO', positions=[(0, 0, 0), (0, 0, 1.15)])
# Adsorb at top site
add_adsorbate(slab, co, height=2.0, position='ontop')
print(f"Chemical formula after adsorption: {slab.get_chemical_formula()}")
print(f"Total number of atoms: {len(slab)}")
# Check CO molecule position
co_indices = [i for i, sym in enumerate(slab.get_chemical_symbols())
if sym in ['C', 'O']]
print(f"CO molecule indices: {co_indices}")
Example 3: Structure Optimization (Computational Chemistry)
from ase.build import fcc111, add_adsorbate
from ase.calculators.emt import EMT # Empirical potential
from ase.optimize import BFGS
# Adsorb H atom on Pt surface
slab = fcc111('Pt', size=(3, 3, 3), vacuum=10.0)
from ase import Atoms
h_atom = Atoms('H')
add_adsorbate(slab, h_atom, height=1.5, position='fcc')
# Calculator setup (EMT: fast but low accuracy)
slab.calc = EMT()
# Structure optimization
opt = BFGS(slab, trajectory='opt.traj')
opt.run(fmax=0.05) # Optimize until forces are below 0.05 eV/Å
# Results
final_energy = slab.get_potential_energy()
print(f"Optimized energy: {final_energy:.3f} eV")
Example 4: Adsorption Energy Calculation
from ase.build import fcc111, add_adsorbate
from ase.calculators.emt import EMT
from ase.optimize import BFGS
from ase import Atoms
def calculate_adsorption_energy(metal='Pt', adsorbate='H'):
"""Calculate adsorption energy"""
# Clean surface
slab_clean = fcc111(metal, size=(3, 3, 3), vacuum=10.0)
slab_clean.calc = EMT()
E_slab = slab_clean.get_potential_energy()
# Adsorbed system
slab_ads = fcc111(metal, size=(3, 3, 3), vacuum=10.0)
atom = Atoms(adsorbate)
add_adsorbate(slab_ads, atom, height=1.5, position='fcc')
slab_ads.calc = EMT()
opt = BFGS(slab_ads, logfile=None)
opt.run(fmax=0.05)
E_slab_ads = slab_ads.get_potential_energy()
# Gas phase molecule
molecule = Atoms(adsorbate)
molecule.center(vacuum=10.0)
molecule.calc = EMT()
E_mol = molecule.get_potential_energy()
# Adsorption energy
E_ads = E_slab_ads - E_slab - E_mol
return E_ads
# Comparison across multiple metals
metals = ['Pt', 'Pd', 'Ni', 'Cu']
for metal in metals:
E_ads = calculate_adsorption_energy(metal, 'H')
print(f"H adsorption energy on {metal} surface: {E_ads:.3f} eV")
Example Output:
H adsorption energy on Pt surface: -0.534 eV
H adsorption energy on Pd surface: -0.621 eV
H adsorption energy on Ni surface: -0.482 eV
H adsorption energy on Cu surface: -0.213 eV
Example 5: d-band Center Calculation
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
import numpy as np
from ase.build import bulk
def calculate_d_band_center(metal):
    """Look up tabulated d-band center in eV (a real calculation requires DFT)"""
# Approximation based on experimental values
d_band_centers = {
'Pt': -2.25, # eV (Fermi level reference)
'Pd': -1.83,
'Ni': -1.29,
'Cu': -2.67,
'Au': -3.56,
'Ag': -4.31
}
return d_band_centers.get(metal, None)
# d-band centers for multiple metals
metals = ['Cu', 'Ni', 'Pd', 'Pt', 'Au']
for metal in metals:
eps_d = calculate_d_band_center(metal)
print(f"{metal}: εd = {eps_d:.2f} eV")
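The tabulated d-band centers connect directly to adsorption strength through the Hammer–Nørskov d-band model. A minimal sketch correlating these values with the EMT H adsorption energies from Example 4 (four metals is far too small a sample for a meaningful fit; illustrative only):

```python
import numpy as np

# Values quoted earlier in this chapter (Examples 4-5); illustrative, not DFT-derived
d_band_center = np.array([-2.25, -1.83, -1.29, -2.67])     # Pt, Pd, Ni, Cu (eV)
h_ads_energy = np.array([-0.534, -0.621, -0.482, -0.213])  # eV (EMT, Example 4)

# Linear fit of E_ads against the d-band center εd
slope, intercept = np.polyfit(d_band_center, h_ads_energy, 1)
corr = np.corrcoef(d_band_center, h_ads_energy)[0, 1]
print(f"E_ads ≈ {slope:.3f} * εd + {intercept:.3f}")
print(f"Correlation coefficient: {corr:.3f}")
```

With only four points and an empirical potential, the numerical result should be read as a workflow demonstration, not a validation of the d-band model.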
Example 6: Creating Alloy Surfaces
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
from ase.build import fcc111
import numpy as np
def create_alloy_surface(metal1='Pt', metal2='Ni', ratio=0.5, size=(4, 4, 3)):
"""Create alloy surface"""
slab = fcc111(metal1, size=size, vacuum=10.0)
# Randomly substitute metals
n_atoms = len(slab)
n_metal2 = int(n_atoms * ratio)
indices = np.random.choice(n_atoms, n_metal2, replace=False)
symbols = slab.get_chemical_symbols()
for idx in indices:
symbols[idx] = metal2
slab.set_chemical_symbols(symbols)
return slab
# Create PtNi alloy surface
alloy = create_alloy_surface('Pt', 'Ni', ratio=0.3, size=(5, 5, 3))
print(f"Composition: {alloy.get_chemical_formula()}")
# Check composition ratio
symbols = alloy.get_chemical_symbols()
pt_count = symbols.count('Pt')
ni_count = symbols.count('Ni')
print(f"Pt: {pt_count} atoms ({pt_count/(pt_count+ni_count)*100:.1f}%)")
print(f"Ni: {ni_count} atoms ({ni_count/(pt_count+ni_count)*100:.1f}%)")
Example 7: Coordination Number Calculation
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
from ase.build import fcc111
from ase.neighborlist import NeighborList
import numpy as np
def calculate_coordination_numbers(atoms, cutoff=3.0):
"""Calculate coordination number for each atom"""
nl = NeighborList([cutoff/2]*len(atoms), self_interaction=False, bothways=True)
nl.update(atoms)
coord_numbers = []
for i in range(len(atoms)):
indices, offsets = nl.get_neighbors(i)
coord_numbers.append(len(indices))
return np.array(coord_numbers)
# Coordination number distribution for Pt(111) surface
slab = fcc111('Pt', size=(4, 4, 3), vacuum=10.0)
coord_nums = calculate_coordination_numbers(slab, cutoff=3.0)
print(f"Coordination number distribution:")
unique, counts = np.unique(coord_nums, return_counts=True)
for cn, count in zip(unique, counts):
print(f" CN={cn}: {count} atoms")
# Surface atoms: coordination below the fcc bulk value of 12
# (this selects both the top and bottom layers of the slab)
surface_indices = np.where(coord_nums < 12)[0]
print(f"Number of surface atoms: {len(surface_indices)}")
3.2 Building Catalyst Activity Prediction Models
Example 8: Creating Descriptor Dataset
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
# - pandas>=2.0.0, <2.2.0
"""
Example 8: Creating Descriptor Dataset
Purpose: Demonstrate data manipulation and preprocessing
Target: Beginner to Intermediate
Execution time: ~5 seconds
Dependencies: None
"""
import pandas as pd
import numpy as np
# Metal catalyst descriptor data
data = {
'metal': ['Pt', 'Pd', 'Ni', 'Cu', 'Au', 'Ag', 'Rh', 'Ir', 'Fe', 'Co'],
'd_band_center': [-2.25, -1.83, -1.29, -2.67, -3.56, -4.31, -1.73, -2.12, -1.34, -1.41], # eV
'work_function': [5.65, 5.12, 5.15, 4.65, 5.1, 4.26, 4.98, 5.27, 4.5, 5.0], # eV
'surface_energy': [2.48, 2.00, 2.38, 1.79, 1.50, 1.25, 2.66, 3.05, 2.90, 2.52], # J/m²
'lattice_constant': [3.92, 3.89, 3.52, 3.61, 4.08, 4.09, 3.80, 3.84, 2.87, 3.54], # Å
'H_ads_energy': [-0.53, -0.62, -0.48, -0.21, -0.15, 0.12, -0.68, -0.71, -0.87, -0.74], # eV
'HER_activity': [8.2, 7.5, 6.8, 4.2, 3.8, 2.1, 7.9, 8.5, 5.3, 6.1] # relative HER activity index (illustrative values)
}
df = pd.DataFrame(data)
print(df.head())
print(f"\nDataset shape: {df.shape}")
print(f"Descriptors: {df.columns.tolist()[1:-1]}")
Example 9: Data Preprocessing and Splitting
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Features and target
X = df[['d_band_center', 'work_function', 'surface_energy',
'lattice_constant', 'H_ads_energy']].values
y = df['HER_activity'].values
# Split into training and test data
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.3, random_state=42
)
# Standardization
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
print(f"Training data: {X_train.shape}")
print(f"Test data: {X_test.shape}")
print(f"\nBefore scaling: mean={X_train.mean(axis=0)}, std={X_train.std(axis=0)}")
print(f"After scaling: mean={X_train_scaled.mean(axis=0)}, std={X_train_scaled.std(axis=0)}")
Example 10: Linear Regression Model
# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
"""
Example 10: Linear Regression Model
Purpose: Train and evaluate a linear regression baseline
Target: Beginner to Intermediate
Execution time: 1-5 minutes
Dependencies: None
"""
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
import matplotlib.pyplot as plt
# Model training
model_lr = LinearRegression()
model_lr.fit(X_train_scaled, y_train)
# Prediction
y_pred_train = model_lr.predict(X_train_scaled)
y_pred_test = model_lr.predict(X_test_scaled)
# Evaluation
mae_train = mean_absolute_error(y_train, y_pred_train)
mae_test = mean_absolute_error(y_test, y_pred_test)
r2_train = r2_score(y_train, y_pred_train)
r2_test = r2_score(y_test, y_pred_test)
print(f"Training data: MAE={mae_train:.3f}, R²={r2_train:.3f}")
print(f"Test data: MAE={mae_test:.3f}, R²={r2_test:.3f}")
# Check coefficients
feature_names = ['d-band center', 'work function', 'surface energy',
'lattice constant', 'H_ads_energy']
for name, coef in zip(feature_names, model_lr.coef_):
print(f"{name}: {coef:.3f}")
Example 11: Random Forest Regression
from sklearn.ensemble import RandomForestRegressor
# Model training
model_rf = RandomForestRegressor(n_estimators=100, max_depth=5, random_state=42)
model_rf.fit(X_train_scaled, y_train)
# Prediction and evaluation
y_pred_test_rf = model_rf.predict(X_test_scaled)
mae_rf = mean_absolute_error(y_test, y_pred_test_rf)
r2_rf = r2_score(y_test, y_pred_test_rf)
print(f"Random Forest: MAE={mae_rf:.3f}, R²={r2_rf:.3f}")
# Feature importance
importances = model_rf.feature_importances_
for name, imp in zip(feature_names, importances):
print(f"{name}: {imp:.3f}")
Example 12: Cross-Validation
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import Ridge
# Ridge model with cross-validation
model_ridge = Ridge(alpha=1.0)
scores = cross_val_score(model_ridge, X_train_scaled, y_train,
cv=5, scoring='neg_mean_absolute_error')
print(f"5-fold CV MAE: {-scores.mean():.3f} ± {scores.std():.3f}")
print(f"Each fold: {-scores}")
# Hyperparameter tuning
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
for alpha in alphas:
model = Ridge(alpha=alpha)
scores = cross_val_score(model, X_train_scaled, y_train,
cv=5, scoring='neg_mean_absolute_error')
print(f"alpha={alpha:6.2f}: MAE={-scores.mean():.3f}")
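The manual alpha loop above is exactly what scikit-learn's GridSearchCV automates; a self-contained sketch on synthetic stand-in data (substitute the chapter's descriptor matrix in practice):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the descriptor data (illustrative only)
rng = np.random.default_rng(42)
X = rng.normal(size=(30, 5))
y = X @ np.array([1.5, -0.8, 0.3, 0.0, 2.0]) + rng.normal(0, 0.1, 30)

# Exhaustive search over alpha with 5-fold CV, scored by MAE
grid = GridSearchCV(
    Ridge(),
    param_grid={'alpha': [0.01, 0.1, 1.0, 10.0, 100.0]},
    cv=5,
    scoring='neg_mean_absolute_error'
)
grid.fit(X, y)
print(f"Best alpha: {grid.best_params_['alpha']}")
print(f"Best CV MAE: {-grid.best_score_:.3f}")
```

GridSearchCV also refits the best model on the full training set, so `grid.predict` is immediately usable afterwards.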
Example 13: Gaussian Process Regression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
# Kernel definition
kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)
# GPR model
model_gpr = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=10,
random_state=42)
model_gpr.fit(X_train_scaled, y_train)
# Prediction (with uncertainty)
y_pred_gpr, y_std = model_gpr.predict(X_test_scaled, return_std=True)
mae_gpr = mean_absolute_error(y_test, y_pred_gpr)
r2_gpr = r2_score(y_test, y_pred_gpr)
print(f"GPR: MAE={mae_gpr:.3f}, R²={r2_gpr:.3f}")
print(f"\nPredictions with uncertainty:")
for true, pred, std in zip(y_test, y_pred_gpr, y_std):
print(f"True: {true:.2f}, Pred: {pred:.2f} ± {std:.2f}")
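The predictive standard deviation returned here is the raw ingredient of the acquisition functions used in Section 3.3. A minimal sketch of Expected Improvement computed by hand from a GPR posterior, on a 1D toy objective (all data and names are illustrative):

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Toy objective (to be minimized), observed at a few points
X_obs = np.array([[0.1], [0.4], [0.6], [0.9]])
y_obs = np.sin(3 * X_obs).ravel()

gpr = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(0.2),
                               normalize_y=True, random_state=0)
gpr.fit(X_obs, y_obs)

# Expected Improvement over a candidate grid (minimization convention)
X_grid = np.linspace(0, 1, 200).reshape(-1, 1)
mu, sigma = gpr.predict(X_grid, return_std=True)
y_best = y_obs.min()
with np.errstate(divide='ignore', invalid='ignore'):
    z = (y_best - mu) / sigma
    ei = (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    ei[sigma == 0] = 0.0  # no improvement expected where the model is certain
x_next = X_grid[np.argmax(ei)][0]
print(f"Next point to evaluate: x = {x_next:.3f}")
```

This is the same quantity skopt's `acq_func='EI'` maximizes internally in the next section.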
Example 14: Creating Volcano Plot
# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
# - numpy>=1.24.0, <2.0.0
"""
Example 14: Creating Volcano Plot
Purpose: Demonstrate data visualization techniques
Target: Beginner to Intermediate
Execution time: 2-5 seconds
Dependencies: None
"""
import matplotlib.pyplot as plt
import numpy as np
# Relationship between H adsorption energy and HER activity
fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(df['H_ads_energy'], df['HER_activity'], s=100, alpha=0.7)
# Display metal names
for i, txt in enumerate(df['metal']):
ax.annotate(txt, (df['H_ads_energy'].iloc[i], df['HER_activity'].iloc[i]),
xytext=(5, 5), textcoords='offset points')
# Volcano fit (2nd order polynomial)
x_fit = np.linspace(df['H_ads_energy'].min(), df['H_ads_energy'].max(), 100)
coeffs = np.polyfit(df['H_ads_energy'], df['HER_activity'], 2)
y_fit = np.polyval(coeffs, x_fit)
ax.plot(x_fit, y_fit, 'r--', label='Volcano fit')
ax.set_xlabel('H adsorption energy (eV)', fontsize=12)
ax.set_ylabel('log₁₀(HER activity)', fontsize=12)
ax.set_title('Volcano Plot for HER Activity', fontsize=14)
ax.legend()
ax.grid(alpha=0.3)
print(f"Optimal H adsorption energy: {-coeffs[1]/(2*coeffs[0]):.3f} eV")
Example 15: Saving and Loading Prediction Models
import pickle
# Save model
with open('catalyst_model.pkl', 'wb') as f:
pickle.dump({'model': model_rf, 'scaler': scaler}, f)
print("Model saved: catalyst_model.pkl")
# Load model
with open('catalyst_model.pkl', 'rb') as f:
loaded = pickle.load(f)
loaded_model = loaded['model']
loaded_scaler = loaded['scaler']
# Predict on new data
new_catalyst = np.array([[-2.0, 5.0, 2.0, 3.8, -0.4]]) # Virtual catalyst
new_catalyst_scaled = loaded_scaler.transform(new_catalyst)
prediction = loaded_model.predict(new_catalyst_scaled)
print(f"Predicted HER activity for new catalyst: {prediction[0]:.2f}")
3.3 Catalyst Composition Exploration Using Bayesian Optimization
Example 16: Bayesian Optimization Basics
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
from skopt import gp_minimize
from skopt.space import Real
import numpy as np
# Objective function (catalyst activity simulation)
def catalyst_performance(x):
"""
x[0]: Pt ratio (0-1)
x[1]: Calcination temperature (300-800 K)
"""
pt_ratio, temp = x
# Virtual activity function (actual experiment or DFT calculation)
activity = -((pt_ratio - 0.6)**2 * 10 + (temp - 600)**2 / 10000)
noise = np.random.normal(0, 0.1) # Experimental noise
return -activity + noise # Convert to minimization problem
# Search space
space = [
Real(0.0, 1.0, name='pt_ratio'),
Real(300, 800, name='temperature')
]
# Execute Bayesian optimization
result = gp_minimize(
catalyst_performance,
space,
n_calls=20, # Number of experiments
random_state=42,
verbose=True
)
print(f"\nOptimal parameters:")
print(f" Pt ratio: {result.x[0]:.3f}")
print(f" Calcination temperature: {result.x[1]:.1f} K")
print(f" Maximum activity: {-result.fun:.3f}")
Example 17: Comparing Acquisition Functions
from skopt import gp_minimize
from skopt.space import Real
# Compare three acquisition functions
acq_funcs = ['EI', 'PI', 'LCB']
results = {}
for acq in acq_funcs:
result = gp_minimize(
catalyst_performance,
space,
n_calls=15,
acq_func=acq,
random_state=42,
verbose=False
)
results[acq] = result
print(f"{acq}: Optimal value = {-result.fun:.3f}, "
f"Pt ratio = {result.x[0]:.3f}, Temp = {result.x[1]:.1f}K")
Example 18: Multi-Objective Optimization (Activity vs Cost)
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
from skopt import gp_minimize
import numpy as np
def multi_objective(x):
"""Activity and cost trade-off"""
pt_ratio = x[0]
temp = x[1]
# Activity (higher is better)
activity = -((pt_ratio - 0.6)**2 * 10 + (temp - 600)**2 / 10000)
# Cost (depends on Pt usage, lower is better)
cost = pt_ratio * 100 + (temp - 300) / 10
# Weighted sum (scalarization)
weight_activity = 0.7
weight_cost = 0.3
return -(weight_activity * activity - weight_cost * cost)
result_mo = gp_minimize(multi_objective, space, n_calls=25, random_state=42)
print(f"Optimal parameters (multi-objective):")
print(f" Pt ratio: {result_mo.x[0]:.3f}")
print(f" Temperature: {result_mo.x[1]:.1f} K")
Example 19: Visualizing Experiment History
# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
"""
Example 19: Visualizing Experiment History
Purpose: Demonstrate data visualization techniques
Target: Beginner to Intermediate
Execution time: 2-5 seconds
Dependencies: None
"""
from skopt.plots import plot_convergence, plot_evaluations
import matplotlib.pyplot as plt
# Convergence plot
fig, ax = plt.subplots(figsize=(7, 5))
plot_convergence(result, ax=ax)
ax.set_title('Convergence Plot')
ax.set_ylabel('Objective Value')
# Evaluation plot (plot_evaluations creates its own figure and axes;
# it does not accept an ax argument)
plot_evaluations(result, dimensions=['pt_ratio', 'temperature'])
plt.tight_layout()
# Output experiment history
print("\nExperiment history:")
for i, (params, value) in enumerate(zip(result.x_iters, result.func_vals)):
print(f"Exp {i+1}: Pt={params[0]:.3f}, T={params[1]:.1f}K, "
f"Activity={-value:.3f}")
Example 20: Constrained Optimization
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
from skopt import gp_minimize
import numpy as np
def constrained_objective(x):
"""Constrained catalyst optimization"""
pt_ratio = x[0]
ni_ratio = x[1]
# Constraint: Pt + Ni ≤ 0.8 (remainder is inexpensive support)
if pt_ratio + ni_ratio > 0.8:
return 1e6 # Penalty
# Activity prediction
activity = -(pt_ratio * 8 + ni_ratio * 5 -
(pt_ratio - 0.5)**2 * 10 - (ni_ratio - 0.2)**2 * 10)
return -activity
space_alloy = [
Real(0.0, 0.8, name='pt_ratio'),
Real(0.0, 0.8, name='ni_ratio')
]
result_const = gp_minimize(constrained_objective, space_alloy,
n_calls=30, random_state=42)
print(f"Optimal composition:")
print(f" Pt: {result_const.x[0]:.3f}")
print(f" Ni: {result_const.x[1]:.3f}")
print(f" Others: {1 - result_const.x[0] - result_const.x[1]:.3f}")
print(f" Predicted activity: {-result_const.fun:.3f}")
Example 21: Batch Bayesian Optimization
from skopt.optimizer import Optimizer
import numpy as np
# Simulate batch experiments
optimizer = Optimizer(space, acq_func='EI', random_state=42)
# Initial samples
n_initial = 5
X_init = [[np.random.uniform(0, 1), np.random.uniform(300, 800)]
for _ in range(n_initial)]
y_init = [catalyst_performance(x) for x in X_init]
optimizer.tell(X_init, y_init)
# Batch experiments (3 experiments in parallel)
batch_size = 3
n_batches = 5
for batch in range(n_batches):
# Generate next batch of experimental candidates
X_next_batch = []
for _ in range(batch_size):
x_next = optimizer.ask()
X_next_batch.append(x_next)
# Execute experiments (in parallel)
y_next_batch = [catalyst_performance(x) for x in X_next_batch]
# Update results
optimizer.tell(X_next_batch, y_next_batch)
print(f"Batch {batch+1}: Best so far = {-min(optimizer.yi):.3f}")
print(f"\nFinal optimal value: {-min(optimizer.yi):.3f}")
print(f"Total experiments: {len(optimizer.Xi)}")
Example 22: Transfer Learning Bayesian Optimization
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
from skopt import gp_minimize
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
import numpy as np
# Past data from a similar catalyst system (transfer learning source)
X_source = np.array([[0.3, 400], [0.5, 500], [0.7, 600], [0.9, 700]])
y_source = np.array([3.4, 0.6, 0.9, 4.1])  # Objective values (lower = more active)
# New catalyst system optimization (target)
def target_catalyst(x):
    """New catalyst system (similar but different)"""
    pt_ratio, temp = x
    activity = -((pt_ratio - 0.55)**2 * 12 + (temp - 550)**2 / 8000)
    return -activity + np.random.normal(0, 0.1)
# Warm-start Bayesian optimization with the source observations via skopt's
# x0/y0 arguments (only justified when source and target systems are closely related)
result_tl = gp_minimize(
    target_catalyst,
    space,
    x0=X_source.tolist(),  # Prior observations, not re-evaluated
    y0=y_source.tolist(),
    n_calls=10,  # Few new experiments
    random_state=42
)
print(f"With transfer learning: Optimal value = {-result_tl.fun:.3f} (10 experiments)")
# Comparison: cold start without prior data
result_no_tl = gp_minimize(target_catalyst, space, n_calls=10, random_state=42)
print(f"Without transfer learning: Optimal value = {-result_no_tl.fun:.3f} (10 experiments)")
3.4 Integration with DFT Calculations
Example 23: DFT Calculation Setup with ASE
from ase.build import fcc111, add_adsorbate
from ase import Atoms
# Actual DFT calculations require GPAW, VASP, Quantum ESPRESSO, etc.
# The following shows the calculation setup only
# CO adsorption on Pt(111) surface
slab = fcc111('Pt', size=(3, 3, 4), vacuum=15.0)
co = Atoms('CO', positions=[(0, 0, 0), (0, 0, 1.15)])
add_adsorbate(slab, co, height=2.0, position='ontop')
# DFT calculator setup (GPAW case)
# from gpaw import GPAW, PW
# calc = GPAW(
# mode=PW(500), # Plane wave cutoff
# xc='PBE',
# kpts=(4, 4, 1),
# txt='co_pt.txt'
# )
# slab.calc = calc
print("DFT calculation setup complete (requires GPAW, etc.)")
print(f"System size: {len(slab)} atoms")
print(f"Cell: {slab.get_cell()}")
Example 24: High-Accuracy Adsorption Energy Calculation
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
from ase.build import fcc111, add_adsorbate
from ase.calculators.emt import EMT # EMT instead of DFT
from ase.optimize import BFGS
import numpy as np
def dft_adsorption_energy(metal, adsorbate, site='fcc'):
    """Adsorption energy calculation (EMT stands in for DFT accuracy here)"""
# Clean surface
slab = fcc111(metal, size=(3, 3, 4), vacuum=10.0)
slab.calc = EMT()
E_slab = slab.get_potential_energy()
# Adsorbed system
slab_ads = slab.copy()
if adsorbate == 'H':
from ase import Atoms
atom = Atoms('H')
add_adsorbate(slab_ads, atom, height=1.5, position=site)
slab_ads.calc = EMT()
opt = BFGS(slab_ads, logfile=None)
opt.run(fmax=0.01) # High-accuracy optimization
E_slab_ads = slab_ads.get_potential_energy()
# Gas phase H2
from ase import Atoms
h2 = Atoms('H2', positions=[(0, 0, 0), (0, 0, 0.74)])
h2.center(vacuum=10.0)
h2.calc = EMT()
E_h2 = h2.get_potential_energy()
E_ads = E_slab_ads - E_slab - 0.5 * E_h2
return E_ads
# Adsorption energy at multiple sites
sites = ['fcc', 'hcp', 'ontop']
for site in sites:
E_ads = dft_adsorption_energy('Pt', 'H', site)
print(f"Pt(111) {site} site H adsorption: {E_ads:.3f} eV")
Example 25: Reaction Pathway Analysis (NEB Method)
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
from ase.build import fcc111
from ase.calculators.emt import EMT
from ase.neb import NEB
from ase.optimize import BFGS
import numpy as np
# Initial and final states (simplified example)
def setup_reaction_path():
"""H atom surface diffusion (fcc → hcp)"""
# Initial state: H at fcc site
slab_initial = fcc111('Pt', size=(3, 3, 3), vacuum=10.0)
from ase import Atoms
from ase.build import add_adsorbate
h_atom = Atoms('H')
add_adsorbate(slab_initial, h_atom, height=1.5, position='fcc')
slab_initial.calc = EMT()
# Final state: H at hcp site
slab_final = fcc111('Pt', size=(3, 3, 3), vacuum=10.0)
h_atom = Atoms('H')
add_adsorbate(slab_final, h_atom, height=1.5, position='hcp')
slab_final.calc = EMT()
return slab_initial, slab_final
initial, final = setup_reaction_path()
# NEB calculation (simplified version)
print("Transition state search using NEB method")
print(f"Initial state energy: {initial.get_potential_energy():.3f} eV")
print(f"Final state energy: {final.get_potential_energy():.3f} eV")
print("Actual NEB calculation requires multiple images and optimization")
Example 26: Electronic Structure Analysis
# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
# - numpy>=1.24.0, <2.0.0
import numpy as np
import matplotlib.pyplot as plt
# Simulate DOS (Density of States) from DFT calculations
def simulate_dos(metal):
"""Simulate metal DOS"""
energies = np.linspace(-10, 5, 500)
if metal == 'Pt':
# d-band: -6 to -2 eV
dos = np.exp(-((energies + 2.25)**2) / 2) * 3
# sp-band
dos += np.exp(-((energies - 0)**2) / 10) * 0.5
elif metal == 'Cu':
# d-band: -4 to -1 eV (deeper)
dos = np.exp(-((energies + 2.67)**2) / 2) * 3
dos += np.exp(-((energies - 0)**2) / 10) * 0.5
return energies, dos
# Plot
fig, ax = plt.subplots(figsize=(10, 6))
for metal in ['Pt', 'Cu']:
energies, dos = simulate_dos(metal)
ax.plot(energies, dos, label=metal, linewidth=2)
ax.axvline(0, color='k', linestyle='--', label='Fermi level')
ax.set_xlabel('Energy (eV)', fontsize=12)
ax.set_ylabel('Density of States', fontsize=12)
ax.set_title('Electronic DOS of Metal Catalysts', fontsize=14)
ax.legend()
ax.grid(alpha=0.3)
print("DOS plot created")
Example 27: DFT-ML Integration Workflow
# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
# - numpy>=1.24.0, <2.0.0
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
import numpy as np
# Step 1: Small number of DFT calculations
def expensive_dft_calculation(composition):
"""High-cost DFT calculation (simulation)"""
x = composition
# Actual adsorption energy calculation
energy = -2.0 + 3.0 * x - 2.0 * x**2 + np.random.normal(0, 0.05)
return energy
# Initial DFT calculations (5 points)
X_dft = np.array([[0.2], [0.4], [0.6], [0.8], [1.0]])
y_dft = np.array([expensive_dft_calculation(x[0]) for x in X_dft])
# Step 2: GPR surrogate model
gpr = GaussianProcessRegressor(kernel=RBF(), n_restarts_optimizer=10)
gpr.fit(X_dft, y_dft)
# Step 3: Prediction at many points (low cost)
X_pred = np.linspace(0, 1, 100).reshape(-1, 1)
y_pred, y_std = gpr.predict(X_pred, return_std=True)
print("DFT-ML integration workflow:")
print(f"Number of DFT calculations: {len(X_dft)}")
print(f"Number of ML predictions: {len(X_pred)}")
print(f"Optimal composition (predicted): {X_pred[np.argmin(y_pred)][0]:.3f}")
# Plot
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter(X_dft, y_dft, c='red', s=100, label='DFT calc', zorder=10)
ax.plot(X_pred, y_pred, 'b-', label='GPR mean')
ax.fill_between(X_pred.ravel(), y_pred - y_std, y_pred + y_std,
alpha=0.3, label='±1 std')
ax.set_xlabel('Composition')
ax.set_ylabel('Adsorption Energy (eV)')
ax.legend()
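The three-step workflow above closes naturally into an active-learning loop: refit the surrogate, run the next "DFT" calculation where the model is least certain, and repeat. A sketch reusing the same toy energy function (an illustrative stand-in for real DFT):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def toy_dft(x):
    """Stand-in for an expensive DFT call (same toy form as Example 27)."""
    return -2.0 + 3.0 * x - 2.0 * x**2

X = np.array([[0.2], [0.8]])          # two initial "DFT" points
y = np.array([toy_dft(x[0]) for x in X])
X_grid = np.linspace(0, 1, 101).reshape(-1, 1)

for step in range(4):
    gpr = GaussianProcessRegressor(kernel=RBF(0.2), normalize_y=True)
    gpr.fit(X, y)
    _, std = gpr.predict(X_grid, return_std=True)
    x_next = X_grid[np.argmax(std)]   # query where the model is least certain
    X = np.vstack([X, [x_next]])
    y = np.append(y, toy_dft(x_next[0]))
    print(f"Step {step+1}: queried x = {x_next[0]:.2f}")

print(f"Total 'DFT' calls: {len(X)}")
```

Uncertainty sampling shown here targets global model accuracy; swapping in Expected Improvement instead would bias the queries toward the predicted optimum.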
3.5 Reaction Kinetics Analysis
Example 28: Arrhenius Analysis
# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
# - numpy>=1.24.0, <2.0.0
"""
Example 28: Arrhenius Analysis
Purpose: Fit the Arrhenius equation to rate data and extract the activation energy
Target: Intermediate
Execution time: 10-30 seconds
Dependencies: None
"""
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
# Experimental data (temperature vs reaction rate)
temperatures = np.array([300, 350, 400, 450, 500, 550]) # K
rate_constants = np.array([0.01, 0.05, 0.15, 0.35, 0.70, 1.20]) # s⁻¹
# Arrhenius equation: k = A * exp(-Ea / (R*T))
def arrhenius(T, A, Ea):
R = 8.314e-3 # kJ/(mol·K)
return A * np.exp(-Ea / (R * T))
# Fitting
popt, pcov = curve_fit(arrhenius, temperatures, rate_constants, p0=[1e10, 50])
A_fit, Ea_fit = popt
print(f"Frequency factor A: {A_fit:.2e} s⁻¹")
print(f"Activation energy Ea: {Ea_fit:.2f} kJ/mol")
# Arrhenius plot
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
# Normal plot
axes[0].scatter(temperatures, rate_constants, label='Experimental', s=100)
T_fit = np.linspace(280, 570, 100)
axes[0].plot(T_fit, arrhenius(T_fit, A_fit, Ea_fit), 'r--', label='Fit')
axes[0].set_xlabel('Temperature (K)')
axes[0].set_ylabel('Rate constant (s⁻¹)')
axes[0].legend()
# Arrhenius plot (linearized)
axes[1].scatter(1000/temperatures, np.log(rate_constants), s=100)
axes[1].plot(1000/T_fit, np.log(arrhenius(T_fit, A_fit, Ea_fit)), 'r--')
axes[1].set_xlabel('1000/T (K⁻¹)')
axes[1].set_ylabel('ln(k)')
axes[1].set_title(f'Ea = {Ea_fit:.1f} kJ/mol')
Example 29: Reaction Order Analysis
# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
# - numpy>=1.24.0, <2.0.0
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
# Reaction rate equation: -dC/dt = k * C^n
def reaction_rate(C, t, k, n):
"""n-th order reaction rate equation"""
return -k * C**n
# Parameters
k = 0.1 # Rate constant
C0 = 1.0 # Initial concentration
t = np.linspace(0, 50, 100)
# Calculate for different reaction orders
fig, ax = plt.subplots(figsize=(10, 6))
for n in [0, 1, 2]:
C = odeint(reaction_rate, C0, t, args=(k, n))
ax.plot(t, C, label=f'n={n} order', linewidth=2)
ax.set_xlabel('Time', fontsize=12)
ax.set_ylabel('Concentration', fontsize=12)
ax.set_title('Reaction Order Effects', fontsize=14)
ax.legend()
ax.grid(alpha=0.3)
# Half-life calculation
def half_life(k, n, C0=1.0):
"""Half-life for n-th order reaction"""
if n == 0:
return C0 / (2 * k)
elif n == 1:
return np.log(2) / k
elif n == 2:
return 1 / (k * C0)
for n in [0, 1, 2]:
t_half = half_life(k, n, C0)
print(f"Half-life for {n}-th order reaction: {t_half:.2f}")
Example 30: Microkinetic Model
# Requirements:
# - Python 3.9+
# - matplotlib>=3.7.0
# - numpy>=1.24.0, <2.0.0
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
# Microkinetic model for CO oxidation reaction
# CO(g) + O(g) -> CO2(g) on Pt surface
def microkinetics(y, t, k1, k2, k3, k4):
    """
    y: [θ_CO, θ_O, θ_free] # Surface coverages
    k1: CO adsorption rate constant
    k2: O2 dissociative adsorption rate constant
    k3: CO oxidation rate constant
    k4: CO2 desorption rate constant (desorption assumed instantaneous; unused below)
    """
    theta_CO, theta_O, theta_free = y
    # Gas phase partial pressures (constant)
    P_CO = 0.1 # bar
    P_O2 = 0.2 # bar
    # Reaction rates
    r1 = k1 * P_CO * theta_free    # CO adsorption
    r2 = k2 * P_O2 * theta_free**2 # O2 dissociative adsorption
    r3 = k3 * theta_CO * theta_O   # CO + O -> CO2 (desorbs immediately)
    # Coverage changes; r3 frees the two sites occupied by CO and O,
    # so total coverage θ_CO + θ_O + θ_free is conserved
    dtheta_CO_dt = r1 - r3
    dtheta_O_dt = 2 * r2 - r3
    dtheta_free_dt = -r1 - 2 * r2 + 2 * r3
    return [dtheta_CO_dt, dtheta_O_dt, dtheta_free_dt]
# Initial conditions
y0 = [0.0, 0.0, 1.0] # Clean surface
# Rate constants (arbitrary units)
k1, k2, k3, k4 = 1.0, 0.5, 2.0, 2.0
# Time evolution
t = np.linspace(0, 10, 1000)
solution = odeint(microkinetics, y0, t, args=(k1, k2, k3, k4))
# Plot
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(t, solution[:, 0], label='θ_CO', linewidth=2)
ax.plot(t, solution[:, 1], label='θ_O', linewidth=2)
ax.plot(t, solution[:, 2], label='θ_free', linewidth=2)
ax.set_xlabel('Time', fontsize=12)
ax.set_ylabel('Surface Coverage', fontsize=12)
ax.set_title('Microkinetic Model: CO Oxidation on Pt', fontsize=14)
ax.legend()
ax.grid(alpha=0.3)
# Steady state
theta_CO_ss = solution[-1, 0]
theta_O_ss = solution[-1, 1]
TOF = k3 * theta_CO_ss * theta_O_ss # Turnover Frequency
print(f"Steady state:")
print(f" θ_CO: {theta_CO_ss:.3f}")
print(f" θ_O: {theta_O_ss:.3f}")
print(f" TOF: {TOF:.3f} s⁻¹")
3.6 Project Challenge
Challenge: CO2 Reduction Catalyst Optimization
Optimize a CO2→CO conversion catalyst following these steps:
- Data Collection: Calculate CO2 adsorption energies for multiple metals (Cu, Ag, Au, Pd, Pt)
- Descriptor Calculation: Obtain d-band center, work function, etc.
- Build Prediction Model: GPR model to predict activity from descriptors
- Bayesian Optimization: Optimize alloy composition (Cu-Ag binary system)
- Validation: DFT calculation (EMT approximation) at optimal composition
Evaluation Criteria:
- CO production selectivity > 80%
- Overpotential < 0.5 V
- Minimize cost (Ag usage)
Deliverables:
- Optimal composition (Cu:Ag ratio)
- Trade-off analysis of predicted activity and cost
- Complete Python code
Exercise Problems
Problem 1: Create a Pt-Ni alloy catalyst (Pt:Ni = 3:1) (111) surface and calculate the coordination number distribution.
Problem 2: Based on the Sabatier principle from Chapter 2, build a machine learning model to predict optimal H adsorption energy (refer to Examples 8-14).
Problem 3: Simultaneously optimize calcination temperature (400-900 K) and loading amount (0-50 wt%) using Bayesian optimization to achieve maximum activity.
Problem 4: Calculate the activation energy from your experimental data using an Arrhenius plot.
Problem 5: Simulate steady-state coverage and TOF when varying CO/O2 ratio in the microkinetic model.
References
- ASE Documentation: https://wiki.fysik.dtu.dk/ase/
- scikit-optimize: https://scikit-optimize.github.io/
- Nørskov, J. K. et al. "Origin of the Overpotential for Oxygen Reduction at a Fuel-Cell Cathode." J. Phys. Chem. B (2004).
- Hammer, B. & Nørskov, J. K. "Theoretical Surface Science and Catalysis." Advances in Catalysis (2000).
Next Chapter: Chapter 4: Catalyst MI Practical Case Studies
License: This content is provided under CC BY 4.0 license.