High-Throughput Computing Introduction Series
Complete guide to accelerating materials discovery 1000x through automation and parallelizationSeries Overview
This series is a comprehensive 5-chapter educational resource for researchers and engineers who want to learn High-Throughput Computational Materials Science (HTCMS). It covers practical skills from DFT calculation automation to workflow management, parallel computing, and cloud HPC utilization, with step-by-step progression.
Features:- β Practice-oriented: Implementation examples using ASE, pymatgen, FireWorks, and AiiDA
- β Comprehensive coverage: From automation design to large-scale parallel computing
- β Industrial applications: Success stories from Materials Project, AFLOW, and more
- β Reproducibility: Fully reproducible with Docker and environment setup scripts
---
How to Study
Recommended Learning Sequence
- Chapter 1 β Chapter 2 β Chapter 3 β Chapter 4 β Chapter 5
- Time required: 110-140 minutes
- Chapter 2 β Chapter 4
- Time required: 50-60 minutes
- Chapter 1 β Chapter 5
- Time required: 35-45 minutes
---
Chapter Details
Chapter 1: The Need for High-Throughput Computing and Workflow Design
Difficulty: Intermediate Reading time: 20-30 minutesLearning Content
- Challenges in materials discovery
- Vast search space: 10^60 combinations
- Limitations of traditional methods: from 1 material/week β 1000 materials/week needed
- Materials Genome Initiative (MGI) goals
- Definition of High-Throughput Computing
- Automation
- Parallelization
- Standardization
- Data Management
- Success stories
- Materials Project: DFT calculations of 140,000 materials
- AFLOW: Automated analysis of 3,500,000 crystal structures
- OQMD: Thermodynamic data of 815,000 materials
- JARVIS: Diverse property calculations for 40,000 materials
- Workflow design principles
- Modularity
- Error Handling
- Reproducibility
- Scalability
- Cost and benefits
- Development time: 15-20 years β 3-5 years (67% reduction)
- Experimental cost: 95% reduction
- Computational cost: Initial investment vs long-term ROI
Learning Objectives
- β Explain the four elements of High-Throughput Computing
- β Analyze the success factors of Materials Project
- β Understand workflow design principles
- β Quantitatively evaluate cost reduction effects
---
Chapter 2: DFT Calculation Automation (VASP, Quantum ESPRESSO)
Difficulty: Intermediate to Advanced Reading time: 20-25 minutes Code examples: 6Learning Content
- ASE (Atomic Simulation Environment) basics
- Installation and environment setup
- Structure generation and manipulation
- Calculator interface
- Automated result analysis
- VASP automation
- Automatic INCAR file generation
- Automatic K-point settings
- POTCAR management
- Automated convergence checking
- Quantum ESPRESSO automation
- Input file templates
- Pseudopotential management
- SCF/NSCF/DOS calculation chains
- Automatic band structure plotting
- Advanced automation with pymatgen
- InputSet: Standardized input generation
- Taskflow: Calculation flow definition
- Error detection and restart
- Result database integration
- Structure optimization automation
- Relaxation calculation convergence checking
- Lattice constant optimization
- Internal coordinate optimization
- Automatic energy cutoff determination
- Troubleshooting
- Common errors and solutions
- Diagnosing non-convergent calculations
- Handling memory shortages
Learning Objectives
- β Execute basic DFT calculations automatically using ASE
- β Auto-generate VASP and Quantum ESPRESSO input files
- β Master pymatgen InputSet usage
- β Detect errors and auto-restart calculations
---
Chapter 3: Job Scheduling and Parallelization (SLURM, PBS)
Difficulty: Advanced Reading time: 25-30 minutes Code examples: 5-6Learning Content
- Job scheduler fundamentals
- SLURM vs PBS vs Torque
- Queue system mechanisms
- Resource request optimization
- Priority and fairness
- SLURM script creation
- Header: #SBATCH directives
- Node count, core count, memory requests
- Time limit settings
- Array jobs for parallel execution
- MPI parallel computing
- MPI parallelization principles
- Running VASP/QE with MPI
- Inter-node communication optimization
- Scaling efficiency evaluation
- Job management with Python
- Job submission via subprocess
- Job status monitoring
- Wait for completion and auto-resubmit
- Dependent job chains
- Large-scale parallel computing
- Simultaneous calculation of 1000 materials
- Avoiding resource contention
- Fail-safe design
- Computational cost optimization
- Benchmarking and tuning
- Measuring parallel efficiency
- Bottleneck analysis
- I/O optimization
- Memory bandwidth considerations
Learning Objectives
- β Create and submit SLURM scripts
- β Evaluate MPI parallel computing efficiency
- β Write Python job management scripts
- β Design parallel computations at 1000-material scale
---
Chapter 4: Data Management and Post-processing (FireWorks, AiiDA)
Difficulty: Advanced Reading time: 20-25 minutes Code examples: 6Learning Content
- Workflow management with FireWorks
- FireWorks architecture
- Firework (single task) definition
- Workflow (task chain) construction
- LaunchPad (database) setup
- Atomate workflows adopted by Materials Project
- Standard workflows: Structure optimization β Static calculation β Band structure
- Custom workflow creation
- Error handling and restart
- JSON output of results
- Provenance management with AiiDA
- Importance of data provenance tracking
- AiiDA data model
- WorkChain definition
- Querying and data search
- Structuring computational data
- JSON schema design
- MongoDB/SQLite selection
- Index optimization
- Version control
- Post-processing automation
- Automatic DOS/band structure plotting
- Phonon dispersion analysis
- Thermodynamic quantity calculation
- Automatic report generation
- Data sharing and archiving
- Uploading to NOMAD Repository
- Publishing on Materials Cloud
- DOI acquisition
- FAIR data principles
Learning Objectives
- β Build complex workflows with FireWorks
- β Master Atomate standard workflows
- β Record data provenance with AiiDA
- β Publish calculation results to NOMAD
---
Chapter 5: Cloud HPC Utilization and Optimization
Difficulty: Intermediate to Advanced Reading time: 15-20 minutes Code examples: 5Learning Content
- Cloud HPC options
- AWS Parallel Cluster
- Google Cloud HPC Toolkit
- Azure CycleCloud
- Dedicated HPC: TSUBAME, Fugaku
- AWS Parallel Cluster setup
- VPC/subnet configuration
- Cluster configuration file
- SLURM integration
- Storage (EFS/FSx)
- Cost optimization
- Spot instance utilization
- Auto-scaling
- Idle timeout
- Storage tiering
- Containerization with Docker/Singularity
- Complete environment reproduction
- Dockerfile best practices
- Singularity on HPC
- Image registry management
- Security and compliance
- Access control (IAM)
- Data encryption
- Log auditing
- Academic license compliance
- Case study: 10,000 materials screening
- Requirements definition
- Architecture design
- Implementation and execution
- Cost analysis (total $500-1000)
Learning Objectives
- β Build AWS Parallel Cluster
- β Reduce costs by 50% using spot instances
- β Fully reproduce computational environment with Docker
- β Design and execute 10,000-material scale projects
---
Overall Learning Outcomes
Upon completing this series, you will acquire the following skills and knowledge:
Knowledge Level (Understanding)
- β Explain the principles and necessity of High-Throughput Computing
- β Understand the Materials Project technology stack
- β Compare workflow management tools
- β Know cloud HPC options and their characteristics
Practical Skills (Doing)
- β Automate DFT calculations with ASE/pymatgen
- β Write SLURM scripts and submit parallel jobs
- β Build complex workflows with FireWorks/AiiDA
- β Execute large-scale calculations on AWS Parallel Cluster
- β Publish calculation results to NOMAD
Application (Applying)
- β Design 1000-material scale screening projects
- β Optimize computational costs to stay within budget
- β Build reproducible research workflows
- β Propose HT computing implementation in industry
---
Recommended Study Patterns
Pattern 1: Complete Mastery (for HTC beginners)
Target audience: Those with DFT calculation experience but new to automation Duration: 2 weeks Approach:Week 1:
- Day 1-2: Chapter 1 (conceptual understanding)
- Day 3-4: Chapter 2 (ASE/pymatgen implementation)
- Day 5-7: Chapter 3 (SLURM practice)
Week 2:
- Day 1-3: Chapter 4 (FireWorks/AiiDA)
- Day 4-5: Chapter 5 (Cloud HPC)
- Day 6-7: Integration project (100-material screening)
Deliverables:
- 100-material band gap prediction project
- GitHub-published workflow code
Pattern 2: Fast Track (for experienced users)
Target audience: Those with DFT automation experience who want to learn workflow management Duration: 3-5 days Approach:Day 1: Chapter 1 (can skip) + Chapter 2 (review)
Day 2-3: Chapter 4 (intensive FireWorks learning)
Day 4: Chapter 5 (cloud practice)
Day 5: Project integration
Pattern 3: Cloud Specialization
Target audience: Those considering migration from on-premise HPC to cloud Duration: 1 week Approach:Day 1-2: Chapter 1 + Chapter 5 (understand cloud options)
Day 3-4: AWS Parallel Cluster construction
Day 5-6: Porting existing workflows
Day 7: Cost optimization and benchmarking
---
Prerequisites
Required
- β DFT calculation fundamentals: Experience with VASP, Quantum ESPRESSO, or CP2K
- β Linux/UNIX commands: bash, ssh, file operations
- β Python basics: Functions, classes, modules
Recommended
- β Materials science basics: Crystal structures, band theory, density of states
- β Machine learning basics: Completed MI introduction series
- β Git/GitHub: Version control basics
Nice to have
- β Cluster computing experience: Job scheduler usage history
- β Database basics: SQL, JSON, MongoDB
- β Docker basics: Container concepts
---
Tools and Software
Required Tools
| Tool | Purpose | License | Installation |
|--------|------|-----------|-------------|
| Python 3.10+ | Script execution | Open | conda, pip |
| ASE | Structure manipulation & calculation | LGPL | pip install ase |
| pymatgen | Materials science computing | MIT | pip install pymatgen |
DFT Codes (at least one)
| Code | License | Features |
|--------|-----------|------|
| VASP | Commercial (academic license) | High accuracy, widely used |
| Quantum ESPRESSO | GPL | Open source, plane-wave basis |
| CP2K | GPL | Open source, hybrid basis |
Workflow Tools
| Tool | Adopting Project | Learning Difficulty |
|--------|----------------|-----------|
| FireWorks | Materials Project | Medium |
| AiiDA | MARVEL (Europe) | High |
| Atomate | Materials Project | Medium |
Cloud HPC
| Service | Recommended use | Initial cost |
|---------|---------|---------|
| AWS Parallel Cluster | Large-scale computing | $0 (pay-as-you-go) |
| Google Cloud HPC | ML integration | $0 (pay-as-you-go) |
| Azure CycleCloud | Windows integration | $0 (pay-as-you-go) |
---
FAQ (Frequently Asked Questions)
Q1: Can I take this course without DFT calculation experience?
A: Chapters 2 and beyond assume basic experience with VASP or Quantum ESPRESSO. If you are new to DFT, we recommend first learning the basics through a "Computational Materials Science Fundamentals" series.Q2: Do I need a commercial VASP license?
A: Most code examples can be executed with open-source Quantum ESPRESSO. VASP is included in examples because of its widespread use in industry, but if you don't have an academic license, you can use alternative codes.Q3: What if I don't have access to an HPC cluster?
A: Chapter 5 teaches how to use cloud HPC (AWS, Google Cloud). You can start without initial investmentβa budget of $100-200 is sufficient for initial learning.Q4: Should I learn FireWorks or AiiDA?
A: FireWorks is recommended if you want the same environment as Materials Project. AiiDA is popular in Europe and has strengths in data provenance tracking. Chapter 4 covers both, but we recommend starting with FireWorks.Q5: How much computational resources do I need?
A: For the learning phase:- Local PC: Chapters 1-2 (small test calculations)
- University cluster: Chapters 3-4 (parallel computing)
- Cloud: Chapter 5 (budget of $50-100)
For real projects, 1000 materials costs approximately $500-1000.
Q6: Is the Materials Project code publicly available?
A: Yes, the core technology of Materials Project is open-sourced at:- pymatgen: https://github.com/materialsproject/pymatgen
- FireWorks: https://github.com/materialsproject/fireworks
- Atomate: https://github.com/hackingmaterials/atomate
This series uses these tools.
Q7: Are the skills learned useful in industry?
A: Extremely useful. High-Throughput Computing is used in the following companies/research institutes:- Japan: Toyota, Panasonic, Mitsubishi Chemical, NIMS
- Overseas: Tesla, IBM Research, BASF, DuPont
MI engineer positions offer salaries of 7-15M JPY (Japan), $80-200K (US).
Q8: What should I be careful about when using calculation results in papers?
A: Please confirm the following:- Reproducibility: Specify calculation conditions (k-points, cutoff, etc.)
- License: Cite software used in the paper
- Data publication: Publish raw data on NOMAD etc. (recommended)
- Validation: Compare at least some results with experimental data
---
Next Steps
Recommended Actions After Series Completion
Immediate (within 1-2 weeks):- β Execute a small-scale project of 100-1000 materials
- β Publish calculation results to NOMAD
- β Publish workflow code to GitHub
- β Contribute to Materials Project codebase
- β Develop custom FireWorks workflows
- β Submit paper (including computational dataset publication)
- β Execute 10,000-material scale project
- β Accumulate cloud HPC cost optimization know-how
- β Present at international conferences (MRS, E-MRS)
- β Build HT computing system in laboratory/company
- β Publish proprietary database
- β Be recognized as an MI field expert
---
Related Series
- : Materials property prediction using machine learning
- MLP Introduction Series: Machine learning potentials
- Materials Database Utilization Introduction: Complete Materials Project guide
- GNN Introduction Series: Graph neural networks
---
Feedback and Support
Author
This series was created under Dr. Yusuke Hashimoto at Tohoku University as part of the Materials Informatics Dojo project.
Creation date: October 17, 2025 Version: 1.0Feedback
- Typos/technical errors: Report via GitHub repository Issues
- Improvement suggestions: New topics, additional code examples, etc.
- Questions: Difficult sections, areas needing additional explanation
---
License
This series is published under the CC BY 4.0 (Creative Commons Attribution 4.0 International) license.
Permitted:- β Free viewing and downloading
- β Educational use (classes, study groups, etc.)
- β Modification and derivative works (translation, summarization, etc.)
- π Author credit required
- π Modifications must be clearly indicated
---
Let's Get Started!
Are you ready? Start with Chapter 1 and begin your journey into the world of High-Throughput Computing!
Chapter 1: The Need for High-Throughput Computing and Workflow Design β---
Update History- 2025-10-17: v1.0 initial release
---
Materials discovery acceleration starts here!