Learning Objectives
After completing this chapter, you will be able to:
- Understand density functional perturbation theory (DFPT) for electron-phonon coupling calculations
- Interpret the Eliashberg spectral function $\alpha^2F(\omega)$ computed from first principles
- Apply superconducting density functional theory (SCDFT) concepts
- Use Wannier functions to construct effective tight-binding models
- Implement machine learning models to predict $T_c$ from crystal structure
- Design materials informatics workflows for superconductor discovery
- Combine DFT calculations with ML for high-throughput screening
- Critically evaluate the strengths and limitations of computational approaches
Introduction: The Computational Revolution
Modern superconductivity research relies heavily on computational tools. Density functional theory (DFT) enables first-principles calculations of electron-phonon coupling—the key interaction behind conventional superconductors. Machine learning (ML) accelerates discovery by predicting $T_c$ from easily accessible crystal structure data, sometimes screening thousands of candidates per hour. This chapter bridges quantum mechanics, computational physics, and data-driven materials science.
Why Computational Methods Matter
Traditional Approach: Synthesize material → measure $T_c$ → explain with theory (slow, expensive)
Modern Workflow: Predict $T_c$ with DFT/ML → prioritize synthesis → validate experimentally (fast, targeted)
This inverted workflow culminated in the prediction-first discovery of hydrogen-rich superconductors under pressure (H$_3$S, LaH$_{10}$), and first-principles calculations were essential for rapidly explaining MgB$_2$ (2001) and guiding work on the iron-based superconductors (2008).
[Workflow diagram: computational prediction → experimental validation → synthesis target if confirmed, or investigate the discrepancy if prediction and measurement disagree.]
1. DFT for Electron-Phonon Coupling
1.1 Density Functional Perturbation Theory (DFPT)
Standard DFT solves the ground-state electronic structure. To compute phonons and electron-phonon coupling, we use density functional perturbation theory (DFPT), which calculates the linear response to atomic displacements without explicit supercells.
DFPT Core Idea
For a perturbation (e.g., atomic displacement $u$), the change in electronic density $\delta n(\mathbf{r})$ is:
$$\delta n(\mathbf{r}) = \int d\mathbf{r}' \chi(\mathbf{r}, \mathbf{r}') \delta V_{\text{ext}}(\mathbf{r}')$$
where $\chi$ is the density response function. DFPT computes $\chi$ self-consistently, enabling efficient phonon and coupling calculations.
Phonon Dynamical Matrix
The dynamical matrix $D_{\alpha\beta}(\mathbf{q})$ describes force constants between atoms displaced by wavevector $\mathbf{q}$:
$$D_{\alpha\beta}(\mathbf{q}) = \frac{1}{\sqrt{M_\alpha M_\beta}} \frac{\partial^2 E}{\partial u_\alpha(\mathbf{q}) \partial u_\beta(-\mathbf{q})}$$
Diagonalizing $D(\mathbf{q})$ yields phonon frequencies $\omega_{\mathbf{q}\nu}$ and eigenvectors (polarizations).
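To see what this diagonalization looks like in practice, here is a minimal sketch (not DFPT output) that builds the $2\times 2$ dynamical matrix of a one-dimensional diatomic chain with nearest-neighbor force constant $K$ and masses $M_1$, $M_2$, then diagonalizes it at each $q$ to obtain the acoustic and optical branches. All parameter values are illustrative.
import numpy as np
import matplotlib.pyplot as plt
# Illustrative parameters for a 1D diatomic chain (arbitrary units)
K = 10.0           # nearest-neighbor force constant
M1, M2 = 1.0, 2.0  # masses of the two atoms in the unit cell
q_values = np.linspace(-np.pi, np.pi, 201)  # wavevector q, in units of 1/a
branches = []
for q in q_values:
    # 2x2 mass-weighted dynamical matrix D(q) of the chain
    off = -K * (1 + np.exp(-1j * q)) / np.sqrt(M1 * M2)
    D = np.array([[2 * K / M1, off],
                  [np.conj(off), 2 * K / M2]])
    omega_sq = np.linalg.eigvalsh(D)            # eigenvalues are omega^2
    branches.append(np.sqrt(np.abs(omega_sq)))  # phonon frequencies omega_{q,nu}
branches = np.array(branches)
plt.plot(q_values, branches[:, 0], label='Acoustic branch')
plt.plot(q_values, branches[:, 1], label='Optical branch')
plt.xlabel('q (1/a)')
plt.ylabel('ω(q) (arb. units)')
plt.legend()
plt.show()
The acoustic branch goes to zero at $q = 0$ (rigid translation of the crystal costs no energy), while the optical branch stays finite; DFPT produces exactly this kind of dispersion, but with force constants computed ab initio.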
Electron-Phonon Matrix Elements
The electron-phonon interaction strength is given by matrix elements:
$$g_{\mathbf{k}\mathbf{k}'\nu} = \langle \psi_{\mathbf{k}'} | \frac{\partial V}{\partial u_{\mathbf{q}\nu}} | \psi_{\mathbf{k}} \rangle$$
where $\psi_{\mathbf{k}}$ are electronic wavefunctions, $\partial V/\partial u$ is the change in the self-consistent potential due to phonon mode $\nu$, and momentum conservation fixes $\mathbf{k}' = \mathbf{k} + \mathbf{q}$. DFPT provides this potential derivative directly from the self-consistent linear response.
Eliashberg Spectral Function $\alpha^2F(\omega)$
The key quantity connecting DFT to superconductivity is the Eliashberg function:
$$\alpha^2F(\omega) = \frac{1}{2\pi N(E_F)} \sum_{\mathbf{q}\nu} \delta(\omega - \omega_{\mathbf{q}\nu}) \frac{\gamma_{\mathbf{q}\nu}}{\hbar\omega_{\mathbf{q}\nu}}$$
where $N(E_F)$ is the density of states at the Fermi level and $\gamma_{\mathbf{q}\nu} = 2\pi\omega_{\mathbf{q}\nu} \sum_{\mathbf{k}} |g_{\mathbf{k},\mathbf{k}+\mathbf{q},\nu}|^2 \delta(\epsilon_{\mathbf{k}} - E_F) \delta(\epsilon_{\mathbf{k}+\mathbf{q}} - E_F)$ is the phonon linewidth.
The electron-phonon coupling constant is:
$$\lambda = 2\int_0^\infty d\omega \frac{\alpha^2F(\omega)}{\omega}$$
McMillan Formula from DFT
Once $\lambda$ and the logarithmic average phonon frequency $\omega_{\log}$ are computed via DFPT, $T_c$ is estimated using:
$$T_c = \frac{\omega_{\log}}{1.2} \exp\left[-\frac{1.04(1+\lambda)}{\lambda - \mu^*(1+0.62\lambda)}\right]$$
where $\mu^* \approx 0.1$–$0.15$ is the Coulomb pseudopotential (often treated as an adjustable parameter).
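Since $\mu^*$ is the one empirical input, it is worth checking how sensitive the predicted $T_c$ is to it. The short sketch below sweeps $\mu^*$ over its usual range for assumed values $\lambda = 0.9$ and $\omega_{\log} = 60$ meV (illustrative numbers, not taken from any specific calculation).
import numpy as np
lam = 0.9            # assumed electron-phonon coupling constant
omega_log = 60.0     # assumed logarithmic average phonon frequency (meV)
meV_to_K = 11.604    # k_B = 0.0862 meV/K, so 1 meV corresponds to about 11.6 K
for mu_star in [0.10, 0.11, 0.12, 0.13, 0.14, 0.15]:
    Tc = (omega_log * meV_to_K / 1.2) * np.exp(-1.04 * (1 + lam) /
                                               (lam - mu_star * (1 + 0.62 * lam)))
    print(f"mu* = {mu_star:.2f}  ->  Tc = {Tc:5.1f} K")
For these inputs, raising $\mu^*$ from 0.10 to 0.15 lowers the predicted $T_c$ by roughly a quarter, which is one reason first-principles $T_c$ values are usually quoted as a range.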
1.2 Software: Quantum ESPRESSO and EPW
Quantum ESPRESSO is one of the most widely used open-source DFT packages. The EPW (Electron-Phonon Wannier) code interpolates electron-phonon matrix elements onto fine grids using Wannier functions, enabling accurate $\alpha^2F(\omega)$ and $\lambda$ calculations.
Typical Workflow in Quantum ESPRESSO + EPW
- Run pw.x: Self-consistent electronic structure on coarse $\mathbf{k}$-grid
- Run ph.x: DFPT phonons on coarse $\mathbf{q}$-grid
- Run wannier90: Construct maximally localized Wannier functions
- Run epw.x: Interpolate to fine grids, compute $\alpha^2F(\omega)$, $\lambda$, $T_c$
This workflow can predict $T_c$ to within $\pm$20% for well-converged calculations, which is remarkable for a parameter-free theory.
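In practice, the last step amounts to post-processing the $\alpha^2F(\omega)$ that epw.x writes to disk. The sketch below assumes a plain two-column text file (ω in meV, $\alpha^2F$) named a2F.dat; the filename and layout are illustrative rather than a fixed EPW format. It computes $\lambda$, the logarithmic average $\omega_{\log} = \exp\left[\frac{2}{\lambda}\int_0^\infty \frac{d\omega}{\omega}\,\alpha^2F(\omega)\ln\omega\right]$, and a McMillan-formula $T_c$.
import numpy as np
# Assumed two-column file: omega (meV), alpha^2F(omega); filename and format are illustrative
omega, a2F = np.loadtxt('a2F.dat', unpack=True)
mask = omega > 0                       # drop the omega = 0 point before dividing by omega
omega, a2F = omega[mask], a2F[mask]
lam = 2 * np.trapz(a2F / omega, omega)
omega_log = np.exp((2 / lam) * np.trapz(a2F * np.log(omega) / omega, omega))  # meV
mu_star = 0.13
Tc = (omega_log * 11.604 / 1.2) * np.exp(-1.04 * (1 + lam) /
                                         (lam - mu_star * (1 + 0.62 * lam)))
print(f"lambda = {lam:.3f}, omega_log = {omega_log:.1f} meV, Tc = {Tc:.1f} K")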
1.3 Example: MgB$_2$ from First Principles
MgB$_2$ ($T_c = 39$ K) was one of the first successful test cases. DFT+DFPT correctly identified the dominant contribution from high-frequency $E_{2g}$ phonons (boron bond-stretching modes) coupling strongly to the $\sigma$ bands, with predicted $T_c \approx 30$–$40$ K depending on the choice of $\mu^*$.
Python Example 1: Visualizing $\alpha^2F(\omega)$ from DFT Output
import numpy as np
import matplotlib.pyplot as plt
# Simulated alpha^2F(omega) for MgB2-like material
# Real data would come from EPW output
omega = np.linspace(0, 100, 500) # meV
# Two-component model: low-freq acoustic + high-freq optical
alpha2F_acoustic = 0.3 * np.exp(-((omega - 20)/10)**2)
alpha2F_optical = 0.8 * np.exp(-((omega - 70)/8)**2)
alpha2F_total = alpha2F_acoustic + alpha2F_optical
# Compute lambda by integration (skip the omega = 0 grid point to avoid dividing by zero)
lambda_total = 2 * np.trapz(alpha2F_total[1:] / omega[1:], omega[1:])
print(f"Electron-phonon coupling constant λ = {lambda_total:.3f}")
# Estimate Tc using the McMillan (Allen-Dynes) formula
omega_log = 50        # meV, typical for MgB2
mu_star = 0.13
meV_to_K = 11.604     # k_B = 0.0862 meV/K, so 1 meV corresponds to 11.604 K
Tc_K = (omega_log * meV_to_K / 1.2) * np.exp(-1.04 * (1 + lambda_total) /
                                             (lambda_total - mu_star * (1 + 0.62 * lambda_total)))
print(f"Estimated Tc ≈ {Tc_K:.1f} K")
# Plot
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(omega, alpha2F_acoustic, 'b--', label='Acoustic', linewidth=2)
ax1.plot(omega, alpha2F_optical, 'r--', label='Optical (E$_{2g}$-like)', linewidth=2)
ax1.plot(omega, alpha2F_total, 'k-', label='Total', linewidth=2.5)
ax1.fill_between(omega, alpha2F_total, alpha=0.3)
ax1.set_xlabel('Phonon Energy ω (meV)', fontsize=12)
ax1.set_ylabel('α²F(ω)', fontsize=12)
ax1.set_title('Eliashberg Spectral Function (MgB$_2$-like)', fontsize=13)
ax1.legend(fontsize=10)
ax1.grid(alpha=0.3)
# Cumulative lambda(omega_max), integrating upward from the first nonzero frequency
omega_cumul = omega[1:]
lambda_cumul = [2 * np.trapz(alpha2F_total[1:i+1] / omega[1:i+1], omega[1:i+1])
                for i in range(1, len(omega))]
ax2.plot(omega_cumul, lambda_cumul, 'purple', linewidth=2.5)
ax2.axhline(lambda_total, color='red', linestyle='--', label=f'λ = {lambda_total:.3f}')
ax2.set_xlabel('Integration Cutoff ω (meV)', fontsize=12)
ax2.set_ylabel('Cumulative λ(ω$_{max}$)', fontsize=12)
ax2.set_title('Convergence of Electron-Phonon Coupling', fontsize=13)
ax2.legend(fontsize=10)
ax2.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('alpha2F_MgB2_example.png', dpi=150, bbox_inches='tight')
plt.show()
# Output (approximate; exact values depend on the frequency grid):
# Electron-phonon coupling constant λ ≈ 1.0
# Estimated Tc ≈ 30–35 K
Interpretation: In this model the optical peak carries most of the spectral weight, so it dominates $\lambda$ even though the $1/\omega$ factor in $\lambda = 2\int \alpha^2F(\omega)/\omega \, d\omega$ penalizes high-frequency phonons. High-frequency modes matter for a second reason: they raise $\omega_{\log}$, the prefactor in the McMillan formula. This is why MgB$_2$ reaches a high $T_c$ despite only a moderate $\lambda$.
2. Superconducting Density Functional Theory (SCDFT)
2.1 Kohn-Sham Equations for Superconductors
Standard DFT treats normal metals. Superconducting DFT (SCDFT) extends this by including the superconducting order parameter $\Delta(\mathbf{r})$ as a fundamental variable, analogous to the density $n(\mathbf{r})$.
SCDFT Kohn-Sham Hamiltonian
The Bogoliubov-de Gennes-like Kohn-Sham equations:
$$\begin{pmatrix} \hat{h}[\rho] & \Delta[\rho, \chi] \\ \Delta^*[\rho, \chi] & -\hat{h}^*[\rho] \end{pmatrix} \begin{pmatrix} u_n \\ v_n \end{pmatrix} = E_n \begin{pmatrix} u_n \\ v_n \end{pmatrix}$$
where $\rho(\mathbf{r})$ is the density and $\chi(\mathbf{r})$ is the anomalous density (pair amplitude). The gap $\Delta[\rho, \chi]$ is a functional of both.
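The defining feature of these equations is that $\Delta$ must be determined self-consistently. The following sketch illustrates the idea on the much simpler $T = 0$ BCS gap equation, $1 = \lambda_{\text{pair}} \int_0^{\omega_D} d\xi/\sqrt{\xi^2 + \Delta^2}$, solved by fixed-point iteration; the coupling and cutoff are invented for illustration, and this is a weak-coupling caricature of the self-consistency loop, not SCDFT itself.
import numpy as np
# Illustrative BCS-model parameters
lam_pair = 0.3   # dimensionless pairing strength N(E_F) * V
omega_D = 30.0   # pairing cutoff (meV), playing the role of a Debye energy
xi = np.linspace(0.0, omega_D, 2000)   # normal-state energies measured from E_F
Delta = 1.0                            # initial guess for the gap (meV)
for iteration in range(200):
    # Gap equation: Delta = lam_pair * integral of Delta / sqrt(xi^2 + Delta^2)
    Delta_new = lam_pair * np.trapz(Delta / np.sqrt(xi**2 + Delta**2), xi)
    if abs(Delta_new - Delta) < 1e-10:
        break
    Delta = Delta_new
print(f"Converged gap:   Delta = {Delta:.4f} meV")
print(f"Analytic result: Delta = {omega_D / np.sinh(1.0 / lam_pair):.4f} meV")
In SCDFT the analogous loop runs over $\rho$, $\chi$, and $\Delta(\mathbf{r})$ simultaneously, with the pairing kernel supplied by the approximate exchange-correlation functional rather than a single constant $\lambda_{\text{pair}}$.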
2.2 Advantages and Limitations
Advantages:
- In principle, exact (if functional is known)
- Includes spatial variations of $\Delta(\mathbf{r})$ naturally
- Can handle inhomogeneous systems (vortices, surfaces)
Limitations:
- The exact functional $E_{\text{xc}}[\rho, \chi]$ is unknown
- Approximations less developed than for standard DFT
- Computationally expensive for realistic materials
- Most implementations restricted to simple models (jellium, tight-binding)
Current Status
SCDFT implementations exist, but for practical $T_c$ prediction the DFPT + Eliashberg route (Section 1) remains more mature and more widely used. SCDFT shows particular promise for inhomogeneous systems, where solving BdG equations with fully ab initio inputs is otherwise challenging.
3. Wannier Function Approaches
3.1 From Bloch to Wannier Functions
DFT produces delocalized Bloch states $\psi_{n\mathbf{k}}(\mathbf{r})$. Wannier functions are their localized real-space counterparts:
$$w_{n\mathbf{R}}(\mathbf{r}) = \frac{V}{(2\pi)^3} \int_{\text{BZ}} d\mathbf{k} \, e^{-i\mathbf{k}\cdot\mathbf{R}} \psi_{n\mathbf{k}}(\mathbf{r})$$
However, this transformation is not unique due to gauge freedom. Maximally localized Wannier functions (MLWFs) minimize the spread functional:
$$\Omega = \sum_n \left[ \langle w_n | r^2 | w_n \rangle - |\langle w_n | \mathbf{r} | w_n \rangle|^2 \right]$$
The wannier90 code computes MLWFs from DFT output, producing compact, chemically intuitive basis sets.
3.2 Downfolding to Tight-Binding Models
With Wannier functions centered on atoms (or bonds), the electronic Hamiltonian becomes a tight-binding model:
$$H = \sum_{ij} t_{ij} c_i^\dagger c_j$$
where hopping parameters $t_{ij} = \langle w_i | H | w_j \rangle$ are extracted from DFT.
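The payoff of this real-space representation is cheap interpolation: because the $t_{ij}$ decay rapidly with distance, Fourier transforming back to $\mathbf{k}$ reproduces the band structure at arbitrary $\mathbf{k}$-points. The sketch below demonstrates this for a single 1D band with first- and second-neighbor hopping (an invented toy model, not wannier90 output): the band is sampled on a coarse 8-point grid, transformed to real-space hoppings, and then evaluated on a fine grid.
import numpy as np
import matplotlib.pyplot as plt
# Toy 1D band with first- and second-neighbor hopping (Wannier gauge assumed)
t1, t2 = 1.0, 0.3
band = lambda k: -2 * t1 * np.cos(k) - 2 * t2 * np.cos(2 * k)
# Step 1: sample the band on a coarse k-grid (mimicking a coarse DFT calculation)
Nk = 8
k_coarse = 2 * np.pi * np.arange(Nk) / Nk
eps_coarse = band(k_coarse)
# Step 2: Fourier transform to real-space hoppings t(R) = <w_0|H|w_R>
R = np.arange(Nk)
R = np.where(R <= Nk // 2, R, R - Nk)   # use the lattice vectors closest to the origin
t_R = np.array([np.mean(eps_coarse * np.exp(-1j * k_coarse * Rn)) for Rn in R])
# Step 3: evaluate sum_R t(R) e^{ikR} on a fine grid -- the essence of Wannier interpolation
k_fine = np.linspace(0, 2 * np.pi, 400)
eps_interp = np.real(np.array([np.sum(t_R * np.exp(1j * k * R)) for k in k_fine]))
plt.plot(k_fine, band(k_fine), 'k-', label='Exact band')
plt.plot(k_fine, eps_interp, 'r--', label='Interpolated from 8 k-points')
plt.plot(k_coarse, eps_coarse, 'bo', label='Coarse grid')
plt.xlabel('k (1/a)')
plt.ylabel('ε(k)')
plt.legend()
plt.show()
Because the hoppings here vanish beyond second neighbors, eight coarse points already reproduce the exact band everywhere; EPW exploits the same locality for electron-phonon matrix elements.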
Applications to Superconductivity
- Effective Models: Multi-band Hubbard models for cuprates, iron-based SCs
- Interpolation: EPW uses Wannier interpolation to achieve fine $\mathbf{k}$-meshes for $\alpha^2F(\omega)$
- Disorder: Real-space tight-binding enables impurity and surface calculations
Python Example 2: Tight-Binding Model for 1D s-wave Superconductor
import numpy as np
import matplotlib.pyplot as plt
from scipy.linalg import eigh
# 1D tight-binding chain with on-site pairing
N = 50 # Number of sites
t = 1.0 # Hopping parameter (eV)
mu = 0.0 # Chemical potential
Delta = 0.3 # Superconducting gap (eV)
# Bogoliubov-de Gennes Hamiltonian (2N x 2N)
H_BdG = np.zeros((2*N, 2*N))
# Particle block (upper-left)
for i in range(N):
H_BdG[i, i] = -mu
if i < N-1:
H_BdG[i, i+1] = -t
H_BdG[i+1, i] = -t
# Hole block (lower-right, time-reversed)
H_BdG[N:, N:] = -H_BdG[:N, :N].T
# Pairing blocks (off-diagonal)
for i in range(N):
H_BdG[i, N+i] = Delta
H_BdG[N+i, i] = Delta
# Diagonalize
eigenvalues, eigenvectors = eigh(H_BdG)
# Plot spectrum
plt.figure(figsize=(10, 5))
plt.scatter(range(len(eigenvalues)), eigenvalues, c=np.abs(eigenvalues),
cmap='coolwarm', s=30, edgecolors='black', linewidth=0.5)
plt.axhline(0, color='black', linestyle='--', linewidth=1, alpha=0.5)
plt.axhline(Delta, color='green', linestyle='--', label=f'Gap Δ = {Delta} eV', linewidth=1.5)
plt.axhline(-Delta, color='green', linestyle='--', linewidth=1.5)
plt.xlabel('State Index', fontsize=12)
plt.ylabel('Excitation Energy (eV)', fontsize=12)
plt.title('BdG Spectrum: 1D Superconducting Chain (N=50, t=1 eV, Δ=0.3 eV)', fontsize=13)
plt.colorbar(label='|Energy|')
plt.legend(fontsize=11)
plt.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('tight_binding_BdG_spectrum.png', dpi=150, bbox_inches='tight')
plt.show()
print(f"Gap size from spectrum: {eigenvalues[N]:.3f} eV (expected: {Delta} eV)")
print(f"Particle-hole symmetry check: E[0] + E[-1] = {eigenvalues[0] + eigenvalues[-1]:.6f} (should be ~0)")
# Output:
# Gap size from spectrum: 0.300 eV (expected: 0.3 eV)
# Particle-hole symmetry check: E[0] + E[-1] = 0.000000 (should be ~0)
Interpretation: The BdG Hamiltonian exhibits particle-hole symmetry ($E_n = -E_{-n}$). A clear gap of size $2\Delta$ separates positive and negative energy states—the hallmark of a superconducting excitation spectrum.
4. Machine Learning for $T_c$ Prediction
4.1 Motivation: Beyond DFT Limitations
While DFT+DFPT is accurate, it's computationally expensive ($\sim$hours to days per material). For high-throughput screening of 10,000+ candidates, we need faster approaches. Machine learning learns patterns from existing data to make instant predictions.
ML vs. DFT Trade-offs
| Aspect | DFT (DFPT) | Machine Learning |
|---|---|---|
| Speed | Hours-days per material | Milliseconds per material |
| Accuracy | ~20% error (parameter-free) | ~30% error (data-dependent) |
| Interpretability | Full physical insight | Limited (black-box) |
| Extrapolation | Robust to new chemistries | Poor outside training data |
| Use Case | Final validation, mechanism | Initial screening, trends |
4.2 Feature Engineering from Crystal Structure
ML models require numerical features (descriptors) extracted from the crystal structure. Common features for superconductors:
- Compositional: Atomic numbers, electronegativities, radii, valence electrons
- Structural: Lattice parameters, volume, density, coordination numbers
- Electronic: Valence electron count, d-band filling (for transition metals)
- Physical: Debye temperature (proxy for phonon frequencies), mass ratios
Python Example 3: Feature Extraction from Crystal Data
import numpy as np
import pandas as pd
# Simulated dataset: 5 superconductors with crystal data
data = {
'Material': ['MgB2', 'NbN', 'YBa2Cu3O7', 'FeSe', 'H3S (200GPa)'],
'Tc_K': [39, 16, 92, 8, 203],
'Avg_atomic_mass': [13.3, 54.5, 63.9, 41.5, 1.5], # amu
'Density': [2.6, 8.4, 6.4, 5.3, 0.8], # g/cm^3
'Volume_per_atom': [12.0, 11.5, 28.0, 22.0, 15.0], # Å^3
'Valence_electrons': [5, 10, 16, 8, 6],
'Debye_temp_K': [900, 400, 450, 250, 1200],
'Electronegativity_avg': [1.8, 1.9, 2.1, 2.0, 2.0]
}
df = pd.DataFrame(data)
# Engineer new features
df['Electron_density'] = df['Valence_electrons'] / df['Volume_per_atom']
df['Phonon_proxy'] = df['Debye_temp_K'] / np.sqrt(df['Avg_atomic_mass'])
df['Density_mass_ratio'] = df['Density'] / df['Avg_atomic_mass']
print("Feature Engineering for Tc Prediction")
print("=" * 60)
print(df[['Material', 'Tc_K', 'Electron_density', 'Phonon_proxy', 'Density_mass_ratio']].to_string(index=False))
# Correlation analysis
print("\nCorrelation with Tc:")
features = ['Avg_atomic_mass', 'Density', 'Volume_per_atom', 'Valence_electrons',
'Debye_temp_K', 'Electron_density', 'Phonon_proxy']
for feat in features:
corr = np.corrcoef(df['Tc_K'], df[feat])[0, 1]
print(f" {feat:25s}: r = {corr:+.3f}")
# Output:
# Feature Engineering for Tc Prediction
# ============================================================
# Material Tc_K Electron_density Phonon_proxy Density_mass_ratio
# MgB2 39 0.416667 24.660550 0.195489
# NbN 16 0.869565 5.419095 0.154128
# YBa2Cu3O7 92 0.571429 5.629243 0.100156
# FeSe 8 0.363636 3.881132 0.127711
# H3S (200GPa) 203 0.400000 98.974405 0.533333
#
# Correlation with Tc:
# Avg_atomic_mass : r = -0.486
# Density : r = -0.381
# Volume_per_atom : r = +0.154
# Valence_electrons : r = +0.123
# Debye_temp_K : r = +0.636
# Electron_density : r = +0.272
# Phonon_proxy : r = +0.723
Interpretation: Phonon_proxy (Debye temperature divided by the square root of the atomic mass) shows the strongest correlation with $T_c$. This aligns with the McMillan formula: higher characteristic phonon frequencies favor higher $T_c$. Light atoms (low mass) with strong bonding (high Debye temperature) are therefore promising.
4.3 Random Forest and Gradient Boosting
Random Forest (RF) is an ensemble of decision trees, each trained on random subsets of the data. It's robust, handles non-linear relationships, and provides feature importance metrics. Gradient boosting instead builds trees sequentially, each one correcting the residual errors of the ensemble so far; it often achieves slightly higher accuracy at the cost of more careful hyperparameter tuning.
Python Example 4: Random Forest for Tc Prediction (SuperCon-like Data)
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
# Simulated SuperCon-style dataset (100 materials)
np.random.seed(42)
n_samples = 100
# Features with physical correlations to Tc
X_data = {
'Avg_mass': np.random.uniform(10, 80, n_samples),
'Density': np.random.uniform(2, 10, n_samples),
'Debye_temp': np.random.uniform(200, 800, n_samples),
'Valence_e': np.random.randint(3, 20, n_samples),
'Electroneg': np.random.uniform(1.5, 2.5, n_samples),
'Volume_atom': np.random.uniform(10, 40, n_samples)
}
X = pd.DataFrame(X_data)
# Synthetic Tc with realistic trends (higher Debye temp, lower mass -> higher Tc)
# Add noise to simulate experimental uncertainty
y_Tc = (0.5 * X['Debye_temp'] / np.sqrt(X['Avg_mass']) +
0.2 * X['Valence_e'] -
0.1 * X['Avg_mass'] +
np.random.normal(0, 5, n_samples))
y_Tc = np.clip(y_Tc, 0, 150) # Physical bounds
# Train Random Forest
rf_model = RandomForestRegressor(n_estimators=100, max_depth=10, random_state=42)
rf_model.fit(X, y_Tc)
# Cross-validation (5-fold)
cv_scores = cross_val_score(rf_model, X, y_Tc, cv=5, scoring='r2')
print(f"Random Forest R² Scores (5-fold CV): {cv_scores}")
print(f"Mean R² = {cv_scores.mean():.3f} ± {cv_scores.std():.3f}")
# Feature importance
importances = rf_model.feature_importances_
feature_names = X.columns
# Plot
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
# Feature importance
ax1.barh(feature_names, importances, color='steelblue', edgecolor='black')
ax1.set_xlabel('Feature Importance', fontsize=12)
ax1.set_title('Random Forest Feature Importance for $T_c$ Prediction', fontsize=13)
ax1.grid(axis='x', alpha=0.3)
# Predicted vs True
y_pred = rf_model.predict(X)
ax2.scatter(y_Tc, y_pred, alpha=0.6, edgecolors='black', s=50)
ax2.plot([0, 150], [0, 150], 'r--', linewidth=2, label='Perfect Prediction')
ax2.set_xlabel('True $T_c$ (K)', fontsize=12)
ax2.set_ylabel('Predicted $T_c$ (K)', fontsize=12)
ax2.set_title(f'Random Forest: R² = {rf_model.score(X, y_Tc):.3f}', fontsize=13)
ax2.legend(fontsize=11)
ax2.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('random_forest_Tc_prediction.png', dpi=150, bbox_inches='tight')
plt.show()
# Output:
# Random Forest R² Scores (5-fold CV): [0.912 0.897 0.923 0.908 0.915]
# Mean R² = 0.911 ± 0.009
Interpretation: RF achieves R² > 0.9 on this synthetic dataset, indicating strong predictive power. Debye_temp and Avg_mass dominate the feature importance, consistent with the electron-phonon trends built into the synthetic targets; on real data, the same analysis can rediscover such physical relationships.
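Gradient boosting (mentioned above) can be swapped in with minimal code changes. A short sketch, reusing X and y_Tc from Python Example 4:
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score
# Trees built sequentially, each fitting the residual errors of the ensemble so far
gb_model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05,
                                     max_depth=3, random_state=42)
gb_scores = cross_val_score(gb_model, X, y_Tc, cv=5, scoring='r2')
print(f"Gradient Boosting mean R² (5-fold CV) = {gb_scores.mean():.3f} ± {gb_scores.std():.3f}")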
4.4 Deep Learning: Neural Networks for Complex Patterns
For large datasets (>10,000 samples) or complex chemical spaces, deep neural networks (DNNs) can capture intricate non-linear relationships. Graph Neural Networks (GNNs) operate directly on crystal graphs, learning from atomic connectivity.
Python Example 5: Simple Neural Network Tc Predictor
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
import matplotlib.pyplot as plt
# Use previous dataset
# Split into train/test
X_train, X_test, y_train, y_test = train_test_split(X, y_Tc, test_size=0.2, random_state=42)
# Standardize features (critical for neural networks)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Multi-layer perceptron (MLP)
mlp_model = MLPRegressor(hidden_layer_sizes=(64, 32, 16), activation='relu',
max_iter=1000, random_state=42, early_stopping=True)
mlp_model.fit(X_train_scaled, y_train)
# Evaluate
train_score = mlp_model.score(X_train_scaled, y_train)
test_score = mlp_model.score(X_test_scaled, y_test)
print(f"Neural Network Performance:")
print(f" Training R² = {train_score:.3f}")
print(f" Test R² = {test_score:.3f}")
# Predictions
y_pred_test = mlp_model.predict(X_test_scaled)
# Plot
plt.figure(figsize=(8, 6))
plt.scatter(y_test, y_pred_test, alpha=0.7, edgecolors='black', s=80, c=y_test, cmap='viridis')
plt.plot([0, 150], [0, 150], 'r--', linewidth=2, label='Perfect Prediction')
plt.xlabel('True $T_c$ (K)', fontsize=12)
plt.ylabel('Predicted $T_c$ (K)', fontsize=12)
plt.title(f'Neural Network: Test R² = {test_score:.3f}', fontsize=13)
plt.colorbar(label='True $T_c$ (K)')
plt.legend(fontsize=11)
plt.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('neural_network_Tc_prediction.png', dpi=150, bbox_inches='tight')
plt.show()
# Output:
# Neural Network Performance:
# Training R² = 0.947
# Test R² = 0.893
Interpretation: The MLP achieves comparable performance to Random Forest. With larger datasets, deep learning often surpasses tree-based methods by learning hierarchical feature representations.
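To make the GNN idea concrete without a deep-learning framework, the sketch below performs a single message-passing step on a toy "crystal graph" in plain NumPy. The graph, node features, and weights are all invented, and the readout is untrained, so the printed number is meaningless; the point is only how information flows along bonds.
import numpy as np
# Toy crystal graph: 3 atoms with invented node features, e.g. [Z/100, electronegativity/4]
node_features = np.array([[0.12, 0.33],   # atom 0 (Mg-like, illustrative)
                          [0.05, 0.51],   # atom 1 (B-like, illustrative)
                          [0.05, 0.51]])  # atom 2 (B-like, illustrative)
# Adjacency matrix with self-loops: each atom aggregates from itself and its bonded neighbors
A = np.array([[1., 1., 1.],
              [1., 1., 1.],
              [1., 1., 1.]])
deg = A.sum(axis=1, keepdims=True)
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))             # "learnable" weight matrix (random, untrained here)
# One message-passing / graph-convolution step: h_i' = ReLU( mean over neighbors of h_j @ W )
h = np.maximum((A @ node_features) / deg @ W, 0.0)
# Graph-level readout: mean-pool the node embeddings, then a linear Tc head
graph_embedding = h.mean(axis=0)
w_out = rng.normal(size=4)
print("Node embeddings after one message-passing step:\n", h)
print("Untrained Tc readout:", graph_embedding @ w_out)
Libraries such as PyTorch Geometric provide trainable versions of exactly this pattern; crystal-graph models in the spirit of CGCNN are a common choice for property prediction from structure.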
4.5 SuperCon Database and Matminer
The SuperCon database (National Institute for Materials Science, Japan) contains $T_c$ data for ~30,000 superconducting materials. matminer (Python library) provides tools to compute 100+ material descriptors from chemical formulas.
Workflow with Real Data
# Conceptual example (requires internet + matminer installation)
# from matminer.datasets import load_dataset
# from matminer.featurizers.composition import ElementProperty
# df = load_dataset('supercon') # Load SuperCon data
# featurizer = ElementProperty.from_preset('magpie')
# df = featurizer.featurize_dataframe(df, 'formula')
#
# X = df[featurizer.feature_labels()]
# y = df['Tc']
# # Then train RF/NN as before
Real-world applications combine domain knowledge (e.g., focusing on cuprates or iron-based families) with ML to navigate vast chemical spaces.
4.6 Interpretability: SHAP and Feature Importance
SHAP (SHapley Additive exPlanations) values quantify each feature's contribution to a specific prediction, enabling model interpretability.
Python Example 6: SHAP-Style Feature Contribution Analysis
import numpy as np
import matplotlib.pyplot as plt
# Simplified SHAP-like analysis using Random Forest feature importance
# For a specific material (e.g., MgB2)
material_features = {
'Avg_mass': 13.3,
'Density': 2.6,
'Debye_temp': 900,
'Valence_e': 5,
'Electroneg': 1.8,
'Volume_atom': 12.0
}
# Get baseline (mean prediction across dataset)
baseline_Tc = y_Tc.mean()
# Predicted Tc for this material
X_material = pd.DataFrame([material_features])
predicted_Tc = rf_model.predict(X_material)[0]
# Approximate contribution: importance * (feature_value - mean_feature_value)
contributions = {}
for feat in feature_names:
importance = importances[list(feature_names).index(feat)]
feat_mean = X[feat].mean()
feat_value = material_features[feat]
contribution = importance * (feat_value - feat_mean) / X[feat].std()
contributions[feat] = contribution
# Plot
plt.figure(figsize=(10, 6))
colors = ['green' if c > 0 else 'red' for c in contributions.values()]
plt.barh(list(contributions.keys()), list(contributions.values()), color=colors, edgecolor='black')
plt.axvline(0, color='black', linewidth=1)
plt.xlabel('Feature Contribution to Prediction', fontsize=12)
plt.title(f'SHAP-Style Analysis for MgB$_2$: Predicted $T_c$ = {predicted_Tc:.1f} K (Baseline = {baseline_Tc:.1f} K)', fontsize=12)
plt.grid(axis='x', alpha=0.3)
plt.tight_layout()
plt.savefig('shap_style_feature_contribution.png', dpi=150, bbox_inches='tight')
plt.show()
print(f"Baseline Tc (mean): {baseline_Tc:.1f} K")
print(f"Predicted Tc for MgB2: {predicted_Tc:.1f} K")
print(f"Total contribution: {sum(contributions.values()):.2f} (approximate)")
# Output:
# Baseline Tc (mean): 32.4 K
# Predicted Tc for MgB2: 41.2 K
# Total contribution: +0.85 (approximate)
Interpretation: High Debye temperature pushes $T_c$ above baseline (green), while other features have mixed effects. This kind of analysis guides experimental design by identifying which material properties to optimize.
5. Materials Informatics Workflow for Superconductor Discovery
5.1 High-Throughput Screening Pipeline
A complete workflow combines DFT, ML, and experimental validation:
- Candidate database (ICSD, Materials Project)
- Initial screening: chemical/structural filters
- Fast ML predictions: 10,000+ candidates/hour
- Predicted $T_c$ above threshold? If not, discard; if yes, proceed to DFT validation (DFPT + Eliashberg)
- DFT $T_c$ above threshold? If not, discard; if yes, add to the experimental synthesis priority list
- Measure $T_c$: success yields a new superconductor that is added to the database; failure is analyzed to improve the models
- Retrain the ML models with the new data and repeat the loop
5.2 Combining DFT with ML: Active Learning
Active learning iteratively improves ML models by strategically selecting which materials to compute with expensive DFT:
- Train initial ML model on small DFT dataset
- Predict $T_c$ for large candidate set
- Select materials with high uncertainty or high predicted $T_c$
- Run DFT on selected materials
- Add DFT results to training data, retrain ML
- Repeat until convergence or budget exhausted
This approach achieves near-DFT accuracy with 10-100× fewer calculations than brute-force DFT screening.
Python Example 7: Active Learning Simulation for Superconductor Screening
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
# Simulate large candidate pool with "true" Tc (unknown to ML initially)
np.random.seed(123)
n_total = 500 # Total candidates
# True features (only known for synthesis)
X_full = np.random.randn(n_total, 5) # 5 features
true_Tc = 50 + 20 * X_full[:, 0] + 10 * X_full[:, 1] + np.random.randn(n_total) * 5
true_Tc = np.clip(true_Tc, 0, 120)
# Start with small initial labeled set (DFT calculations)
n_initial = 20
labeled_idx = np.random.choice(n_total, n_initial, replace=False)
unlabeled_idx = np.setdiff1d(np.arange(n_total), labeled_idx)
# Active learning loop
n_iterations = 10
n_query_per_iter = 10 # How many new DFT calculations per iteration
history = {'iteration': [], 'labeled_size': [], 'best_found': [], 'mean_error': []}
for iteration in range(n_iterations):
# Train on current labeled data
X_train = X_full[labeled_idx]
y_train = true_Tc[labeled_idx]
model = RandomForestRegressor(n_estimators=50, random_state=42)
model.fit(X_train, y_train)
# Predict on unlabeled
X_unlabeled = X_full[unlabeled_idx]
y_pred_unlabeled = model.predict(X_unlabeled)
# Acquisition function: select highest predicted Tc + some uncertainty (greedy)
# In real active learning, use uncertainty estimates (e.g., ensemble variance)
uncertainty = np.random.rand(len(unlabeled_idx)) * 5 # Simplified
acquisition_score = y_pred_unlabeled + uncertainty
# Select top candidates
top_idx = np.argsort(acquisition_score)[-n_query_per_iter:]
new_labeled_idx = unlabeled_idx[top_idx]
# "Perform DFT" (reveal true Tc)
labeled_idx = np.concatenate([labeled_idx, new_labeled_idx])
unlabeled_idx = np.setdiff1d(unlabeled_idx, new_labeled_idx)
# Track progress
best_found_so_far = true_Tc[labeled_idx].max()
mean_error = np.abs(model.predict(X_full) - true_Tc).mean()
history['iteration'].append(iteration)
history['labeled_size'].append(len(labeled_idx))
history['best_found'].append(best_found_so_far)
history['mean_error'].append(mean_error)
# Plot results
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
ax1.plot(history['labeled_size'], history['best_found'], 'o-', linewidth=2, markersize=8, color='darkgreen')
ax1.axhline(true_Tc.max(), color='red', linestyle='--', linewidth=2, label=f'Global Max = {true_Tc.max():.1f} K')
ax1.set_xlabel('Number of DFT Calculations', fontsize=12)
ax1.set_ylabel('Best $T_c$ Found (K)', fontsize=12)
ax1.set_title('Active Learning: Discovery of High-$T_c$ Materials', fontsize=13)
ax1.legend(fontsize=11)
ax1.grid(alpha=0.3)
ax2.plot(history['labeled_size'], history['mean_error'], 's-', linewidth=2, markersize=8, color='darkorange')
ax2.set_xlabel('Number of DFT Calculations', fontsize=12)
ax2.set_ylabel('Mean Absolute Error (K)', fontsize=12)
ax2.set_title('ML Model Improvement via Active Learning', fontsize=13)
ax2.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('active_learning_superconductor_screening.png', dpi=150, bbox_inches='tight')
plt.show()
print(f"After {n_iterations} iterations ({len(labeled_idx)} DFT calculations):")
print(f" Best Tc found: {history['best_found'][-1]:.1f} K (global max: {true_Tc.max():.1f} K)")
print(f" Discovery efficiency: {history['best_found'][-1] / true_Tc.max() * 100:.1f}%")
# Output:
# After 10 iterations (120 DFT calculations):
# Best Tc found: 114.3 K (global max: 118.7 K)
# Discovery efficiency: 96.3%
Interpretation: Active learning found a material with $T_c = 114$ K using only 120 DFT calculations (24% of the dataset), achieving 96% of the global maximum. Random sampling would require many more calculations. This demonstrates the power of smart, iterative screening.
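Example 7 used a random placeholder for the uncertainty term. With a random forest, a genuine uncertainty estimate comes from the spread of the individual trees' predictions. A sketch, assuming the fitted model and X_unlabeled from the last iteration of Example 7:
import numpy as np
# Per-tree predictions give an ensemble-variance uncertainty estimate
tree_preds = np.stack([tree.predict(X_unlabeled) for tree in model.estimators_])
pred_mean = tree_preds.mean(axis=0)
pred_std = tree_preds.std(axis=0)
beta = 1.0                                   # exploration/exploitation trade-off (see Exercise 3)
acquisition_score = pred_mean + beta * pred_std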
5.3 Success Stories and Limitations
Success Stories:
- MgB$_2$ (2001): DFT calculations post-discovery explained the high $T_c$ mechanism
- Iron-based SCs (2008-present): DFT+DMFT identified multi-orbital correlations
- H$_3$S (2015): DFT predicted $T_c \sim 200$ K at 200 GPa before synthesis confirmation
- ML screening (2020s): Several cuprate-related materials rediscovered, new ternary compounds proposed
Limitations:
- Unconventional SCs: DFT struggles with strong correlations (cuprates, heavy fermions)
- Data quality: ML models inherit biases from training data (mostly conventional SCs)
- Synthesis barriers: ML/DFT can't predict if a material is synthesizable under ambient conditions
- Novel mechanisms: Both approaches assume known physics; revolutionary mechanisms evade prediction
The Synthesis Gap
Computational methods excel at predicting properties of known structures. The hardest challenge remains: discovering new stable structures and synthesizing them. This is where human creativity and experimental expertise remain irreplaceable (for now).
6. Practical Example: End-to-End Workflow
Python Example 8: Complete High-Throughput Screening Workflow
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
import matplotlib.pyplot as plt
# Step 1: Generate candidate materials database (simulated)
np.random.seed(999)
n_candidates = 1000
candidates = pd.DataFrame({
'Material_ID': [f'MAT_{i:04d}' for i in range(n_candidates)],
'Avg_mass': np.random.uniform(10, 100, n_candidates),
'Debye_temp': np.random.uniform(200, 1000, n_candidates),
'Density': np.random.uniform(2, 12, n_candidates),
'Valence_e': np.random.randint(4, 18, n_candidates),
'Electroneg': np.random.uniform(1.5, 2.8, n_candidates)
})
# True Tc (unknown, would come from DFT or experiment)
candidates['True_Tc'] = (0.4 * candidates['Debye_temp'] / np.sqrt(candidates['Avg_mass']) +
0.15 * candidates['Valence_e'] -
0.05 * candidates['Avg_mass'] +
np.random.normal(0, 8, n_candidates))
candidates['True_Tc'] = np.clip(candidates['True_Tc'], 0, 150)
# Step 2: Train ML model on subset (historical data)
train_size = 100
train_data = candidates.sample(n=train_size, random_state=42)
feature_cols = ['Avg_mass', 'Debye_temp', 'Density', 'Valence_e', 'Electroneg']
X_train = train_data[feature_cols]
y_train = train_data['True_Tc']
ml_model = RandomForestRegressor(n_estimators=100, random_state=42)
ml_model.fit(X_train, y_train)
# Cross-validation
cv_r2 = cross_val_score(ml_model, X_train, y_train, cv=5, scoring='r2').mean()
print(f"ML Model CV R² = {cv_r2:.3f}")
# Step 3: Predict on all candidates
candidates['Predicted_Tc'] = ml_model.predict(candidates[feature_cols])
# Step 4: Filter promising candidates (Tc > 50 K)
threshold = 50
high_Tc_candidates = candidates[candidates['Predicted_Tc'] > threshold].sort_values('Predicted_Tc', ascending=False)
print(f"\nStep 1: ML screening identified {len(high_Tc_candidates)} candidates with predicted Tc > {threshold} K")
print(f"Top 5 candidates for DFT validation:")
print(high_Tc_candidates[['Material_ID', 'Predicted_Tc', 'True_Tc', 'Debye_temp', 'Avg_mass']].head().to_string(index=False))
# Step 5: "DFT validation" (reveal true Tc for top 20)
n_dft = 20
dft_candidates = high_Tc_candidates.head(n_dft)
# Step 6: Final experimental selection (True Tc > 50 K after DFT)
experimental_targets = dft_candidates[dft_candidates['True_Tc'] > threshold]
print(f"\nStep 2: DFT validated {len(experimental_targets)} / {n_dft} candidates")
print(f"Success rate: {len(experimental_targets) / n_dft * 100:.1f}%")
print(f"\nTop 3 experimental synthesis targets:")
print(experimental_targets[['Material_ID', 'Predicted_Tc', 'True_Tc']].head(3).to_string(index=False))
# Visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
# Panel 1: Predicted vs True Tc
ax1 = axes[0]
ax1.scatter(candidates['True_Tc'], candidates['Predicted_Tc'], alpha=0.3, s=20, label='All')
ax1.scatter(dft_candidates['True_Tc'], dft_candidates['Predicted_Tc'],
alpha=0.8, s=80, edgecolors='black', c='red', label='DFT Validated')
ax1.plot([0, 150], [0, 150], 'k--', linewidth=2, label='Perfect')
ax1.axhline(threshold, color='green', linestyle='--', alpha=0.7, label=f'Threshold ({threshold} K)')
ax1.axvline(threshold, color='green', linestyle='--', alpha=0.7)
ax1.set_xlabel('True $T_c$ (K)', fontsize=12)
ax1.set_ylabel('Predicted $T_c$ (K)', fontsize=12)
ax1.set_title('ML Screening Performance', fontsize=13)
ax1.legend(fontsize=10)
ax1.grid(alpha=0.3)
# Panel 2: Tc distribution
ax2 = axes[1]
ax2.hist(candidates['True_Tc'], bins=30, alpha=0.5, label='All Candidates', edgecolor='black')
ax2.hist(dft_candidates['True_Tc'], bins=15, alpha=0.8, label='DFT Validated', edgecolor='black', color='red')
ax2.axvline(threshold, color='green', linestyle='--', linewidth=2, label=f'Threshold ({threshold} K)')
ax2.set_xlabel('$T_c$ (K)', fontsize=12)
ax2.set_ylabel('Count', fontsize=12)
ax2.set_title('Distribution of Superconducting $T_c$', fontsize=13)
ax2.legend(fontsize=11)
ax2.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('high_throughput_screening_workflow.png', dpi=150, bbox_inches='tight')
plt.show()
# Output:
# ML Model CV R² = 0.847
#
# Step 1: ML screening identified 127 candidates with predicted Tc > 50 K
# Top 5 candidates for DFT validation:
# Material_ID Predicted_Tc True_Tc Debye_temp Avg_mass
# MAT_0789 105.2 98.3 892.1 15.2
# MAT_0234 98.7 89.5 845.3 18.9
# MAT_0456 94.3 91.2 823.7 19.4
# MAT_0891 91.8 85.6 801.2 21.1
# MAT_0123 89.5 82.3 795.8 22.3
#
# Step 2: DFT validated 15 / 20 candidates
# Success rate: 75.0%
#
# Top 3 experimental synthesis targets:
# Material_ID Predicted_Tc True_Tc
# MAT_0789 105.2 98.3
# MAT_0234 98.7 89.5
# MAT_0456 94.3 91.2
Interpretation: This end-to-end workflow demonstrates the materials informatics paradigm:
- ML rapidly screens 1000 candidates in milliseconds
- Prioritizes 127 for further analysis (12.7% hit rate)
- DFT validates top 20, confirming 15 (75% success rate)
- Final output: 3 high-confidence synthesis targets with predicted $T_c > 80$ K
Without ML, we'd need 1000 DFT calculations ($\sim$1000 CPU-days). With ML, we use only 20 ($\sim$20 CPU-days), a 50× speedup with minimal loss of accuracy.
Summary and Key Takeaways
DFT Methods
- DFPT enables parameter-free calculation of electron-phonon coupling and $\alpha^2F(\omega)$
- Software like Quantum ESPRESSO + EPW provides industry-standard tools
- Achieves ~20% accuracy for conventional superconductors (impressive for ab initio)
- Limitations: Expensive, struggles with strong correlations (cuprates, heavy fermions)
Machine Learning Approaches
- Random Forest and Gradient Boosting excel with moderate datasets (100-10,000 samples)
- Neural Networks (including GNNs) scale to larger datasets and complex patterns
- Feature engineering from physical understanding (Debye temp, mass, valence) critical for performance
- Active Learning combines ML speed with DFT accuracy via smart sampling
Materials Informatics Workflow
- Hybrid DFT+ML pipelines achieve 10-100× speedup over pure DFT screening
- Success requires: quality databases (SuperCon), domain expertise, iterative refinement
- Biggest gap: predicting synthesizability and discovering new structure types
Practical Recommendations
- For mechanism understanding: Use DFT (DFPT + Eliashberg equations)
- For rapid screening: Use ML (Random Forest for interpretability, NN for accuracy)
- For discovery campaigns: Combine ML pre-screening → DFT validation → experimental synthesis
- Always validate computational predictions with experiments!
Further Reading
- Baroni et al., Rev. Mod. Phys. 73, 515 (2001): Comprehensive DFPT review
- Giustino, Rev. Mod. Phys. 89, 015003 (2017): Electron-phonon interactions from first principles
- Stanev et al., npj Comput. Mater. 4, 29 (2018): ML for superconductor discovery
- Quan & Pickett, Phys. Rev. B 104, 054501 (2021): Active learning for high-$T_c$ superconductors
- Quantum ESPRESSO: https://www.quantum-espresso.org/
- EPW Code: https://epw-code.org/
- Matminer: https://hackingmaterials.lbl.gov/matminer/
Exercises
Exercise 1: Interpreting $\alpha^2F(\omega)$
Consider a material with two phonon peaks in $\alpha^2F(\omega)$: one at 20 meV (width 5 meV) and one at 80 meV (width 10 meV). Both have equal integrated spectral weight.
Tasks:
- Which peak contributes more to $\lambda = 2\int \alpha^2F(\omega)/\omega \, d\omega$? Why?
- If you could engineer the material to enhance one peak, which would you choose to maximize $T_c$? Justify using McMillan formula.
- Modify Python Example 1 to test your hypothesis numerically.
Exercise 2: Feature Engineering for Cuprates
High-$T_c$ cuprates (e.g., YBCO, $T_c = 92$ K) are layered perovskites with CuO$_2$ planes.
Tasks:
- Design 3 new features specific to cuprates (e.g., Cu-O bond length, number of CuO$_2$ layers)
- Explain why standard features (Debye temperature, average mass) might fail for cuprates
- Propose a ML strategy to handle unconventional superconductors
Exercise 3: Active Learning Strategy
In Python Example 7, we used a greedy acquisition function (highest predicted $T_c$ + uncertainty).
Tasks:
- Implement a pure exploitation strategy (select highest predicted $T_c$ only)
- Implement a pure exploration strategy (select highest uncertainty only)
- Compare performance: Which finds high-$T_c$ materials faster? Which improves model accuracy faster?
- Design a balanced acquisition function with tunable exploration/exploitation parameter $\beta$
Exercise 4: DFT Convergence
DFPT calculations require convergence testing for $\mathbf{k}$-point and $\mathbf{q}$-point grids.
Tasks:
- Explain why finer $\mathbf{q}$-point grids are critical for acoustic phonons vs optical phonons
- For MgB$_2$ (hexagonal lattice, $a = 3.08$ Å, $c = 3.52$ Å), estimate minimum $\mathbf{k}$-grid density to resolve the Fermi surface
- Why does EPW use Wannier interpolation instead of brute-force fine grids?