🌐 EN | πŸ‡―πŸ‡΅ JP | Last sync: 2025-11-16

High-Throughput Computing Introduction Series

Accelerate materials discovery through automation and parallelization

📊 Level: Intermediate to Advanced

Complete guide to accelerating materials discovery 1000x through automation and parallelization

Series Overview

This series is a 5-chapter educational resource for researchers and engineers who want to learn High-Throughput Computational Materials Science (HTCMS). It builds practical skills step by step, from automating DFT calculations to workflow management, parallel computing, and cloud HPC.

Features: Total study time: 110-140 minutes (including code execution and exercises)

---

How to Study

Recommended Learning Sequence

flowchart TD
    A[Chapter 1: Need for HTC] --> B[Chapter 2: DFT Automation]
    B --> C[Chapter 3: Job Scheduling]
    C --> D[Chapter 4: Data Management & Workflows]
    D --> E[Chapter 5: Cloud HPC Utilization]
    style A fill:#e3f2fd
    style B fill:#fff3e0
    style C fill:#f3e5f5
    style D fill:#e8f5e9
    style E fill:#fce4ec

For DFT-experienced users (recommended): Automation focus only: Cloud utilization only:

---

Chapter Details

Chapter 1: The Need for High-Throughput Computing and Workflow Design

Difficulty: Intermediate | Reading time: 20-30 minutes

Learning Content

  1. Challenges in materials discovery
     - Vast search space: 10^60 combinations
     - Limitations of traditional methods: throughput of about 1 material/week, whereas about 1000 materials/week are needed
     - Materials Genome Initiative (MGI) goals

  2. Definition of High-Throughput Computing
     - Automation
     - Parallelization
     - Standardization
     - Data Management

  3. Success stories
     - Materials Project: DFT calculations of 140,000 materials
     - AFLOW: Automated analysis of 3,500,000 crystal structures
     - OQMD: Thermodynamic data of 815,000 materials
     - JARVIS: Diverse property calculations for 40,000 materials

  4. Workflow design principles
     - Modularity
     - Error Handling
     - Reproducibility
     - Scalability

  5. Cost and benefits
     - Development time: 15-20 years → 3-5 years (a reduction of at least 67%)
     - Experimental cost: 95% reduction
     - Computational cost: Initial investment vs long-term ROI

Learning Objectives

Read Chapter 1 →

---

Chapter 2: DFT Calculation Automation (VASP, Quantum ESPRESSO)

Difficulty: Intermediate to Advanced | Reading time: 20-25 minutes | Code examples: 6

Learning Content

  1. ASE (Atomic Simulation Environment) basics
     - Installation and environment setup
     - Structure generation and manipulation
     - Calculator interface
     - Automated result analysis

  2. VASP automation
     - Automatic INCAR file generation
     - Automatic K-point settings
     - POTCAR management
     - Automated convergence checking

  3. Quantum ESPRESSO automation
     - Input file templates
     - Pseudopotential management
     - SCF/NSCF/DOS calculation chains
     - Automatic band structure plotting

  4. Advanced automation with pymatgen
     - InputSet: Standardized input generation
     - Taskflow: Calculation flow definition
     - Error detection and restart
     - Result database integration

  5. Structure optimization automation
     - Relaxation calculation convergence checking
     - Lattice constant optimization
     - Internal coordinate optimization
     - Automatic energy cutoff determination

  6. Troubleshooting
     - Common errors and solutions
     - Diagnosing non-convergent calculations
     - Handling memory shortages
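
As a taste of the "input file templates" topic, here is a minimal sketch that generates Quantum ESPRESSO `pw.x` SCF inputs from a template using only the Python standard library. It is an illustration under stated assumptions, not the approach used in Chapter 2: the function name `make_pw_input`, the silicon example, the lattice parameter, and the pseudopotential file name `Si.UPF` are placeholders to replace with your own values.

```python
from string import Template

# Template for a pw.x SCF input (fcc cell with a two-atom basis).
PW_TEMPLATE = Template("""\
&CONTROL
  calculation = 'scf'
  prefix = '$prefix'
  pseudo_dir = '$pseudo_dir'
  outdir = './tmp'
/
&SYSTEM
  ibrav = 2
  celldm(1) = $celldm
  nat = 2
  ntyp = 1
  ecutwfc = $ecutwfc
/
&ELECTRONS
  conv_thr = 1.0d-8
/
ATOMIC_SPECIES
  Si 28.086 $pseudo_file
ATOMIC_POSITIONS alat
  Si 0.00 0.00 0.00
  Si 0.25 0.25 0.25
K_POINTS automatic
  $k $k $k 0 0 0
""")


def make_pw_input(prefix: str, ecutwfc: float, k: int,
                  celldm: float = 10.26,          # illustrative value (bohr)
                  pseudo_dir: str = "./pseudo",   # placeholder path
                  pseudo_file: str = "Si.UPF") -> str:  # placeholder file name
    """Fill the template; raises KeyError if a field is missing."""
    return PW_TEMPLATE.substitute(
        prefix=prefix, ecutwfc=ecutwfc, k=k, celldm=celldm,
        pseudo_dir=pseudo_dir, pseudo_file=pseudo_file,
    )


# One input per cutoff, e.g. as the starting point of a convergence test.
inputs = {
    f"si_ecut{int(ecut)}.in": make_pw_input("si", ecut, 4)
    for ecut in (30.0, 40.0, 50.0)
}
print(sorted(inputs))
```

Keeping the template separate from the parameter sweep means the same function can feed a cutoff scan, a k-point scan, or a batch of different prefixes.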

Learning Objectives

Read Chapter 2 →

---

Chapter 3: Job Scheduling and Parallelization (SLURM, PBS)

Difficulty: Advanced | Reading time: 25-30 minutes | Code examples: 5-6

Learning Content

  1. Job scheduler fundamentals
     - SLURM vs PBS vs Torque
     - Queue system mechanisms
     - Resource request optimization
     - Priority and fairness

  2. SLURM script creation
     - Header: #SBATCH directives
     - Node count, core count, memory requests
     - Time limit settings
     - Array jobs for parallel execution

  3. MPI parallel computing
     - MPI parallelization principles
     - Running VASP/QE with MPI
     - Inter-node communication optimization
     - Scaling efficiency evaluation

  4. Job management with Python
     - Job submission via subprocess
     - Job status monitoring
     - Wait for completion and auto-resubmit
     - Dependent job chains

  5. Large-scale parallel computing
     - Simultaneous calculation of 1000 materials
     - Avoiding resource contention
     - Fail-safe design
     - Computational cost optimization

  6. Benchmarking and tuning
     - Measuring parallel efficiency
     - Bottleneck analysis
     - I/O optimization
     - Memory bandwidth considerations
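
To make the SLURM topics above concrete, the following sketch builds an array-job script as a string from Python. It is a hedged illustration rather than the chapter's own code: the function name `make_array_script`, the resource defaults, and the `input_<index>.in` naming scheme are assumptions. The `#SBATCH` options themselves are standard, and `%` in `--array` caps how many array tasks run at once.

```python
def make_array_script(job_name: str, n_tasks: int, max_concurrent: int,
                      nodes: int = 1, ntasks_per_node: int = 16,
                      mem_gb: int = 32, walltime: str = "04:00:00",
                      command: str = "srun pw.x -in input_${SLURM_ARRAY_TASK_ID}.in",
                      ) -> str:
    """Return the text of a SLURM array-job script (nothing is submitted)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={ntasks_per_node}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={walltime}",
        # Task indices 0..n_tasks-1, at most max_concurrent running together.
        f"#SBATCH --array=0-{n_tasks - 1}%{max_concurrent}",
        # %x = job name, %A = array job ID, %a = array task index.
        "#SBATCH --output=%x_%A_%a.out",
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


script = make_array_script("htc_screen", n_tasks=1000, max_concurrent=50)
print(script)
```

On a cluster, the script would typically be written to a file and submitted with `sbatch`, for example through `subprocess.run`. Resource values should be adapted to the local queue limits.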

Learning Objectives

Read Chapter 3 →

---

Chapter 4: Data Management and Post-processing (FireWorks, AiiDA)

Difficulty: Advanced | Reading time: 20-25 minutes | Code examples: 6

Learning Content

  1. Workflow management with FireWorks
     - FireWorks architecture
     - Firework (single task) definition
     - Workflow (task chain) construction
     - LaunchPad (database) setup

  2. Atomate workflows adopted by Materials Project
     - Standard workflows: Structure optimization → Static calculation → Band structure
     - Custom workflow creation
     - Error handling and restart
     - JSON output of results

  3. Provenance management with AiiDA
     - Importance of data provenance tracking
     - AiiDA data model
     - WorkChain definition
     - Querying and data search

  4. Structuring computational data
     - JSON schema design
     - MongoDB/SQLite selection
     - Index optimization
     - Version control

  5. Post-processing automation
     - Automatic DOS/band structure plotting
     - Phonon dispersion analysis
     - Thermodynamic quantity calculation
     - Automatic report generation

  6. Data sharing and archiving
     - Uploading to NOMAD Repository
     - Publishing on Materials Cloud
     - DOI acquisition
     - FAIR data principles
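
The standard workflow named above (structure optimization, then static calculation, then band structure) is a dependency chain. The sketch below expresses that idea with Python's standard-library `graphlib` instead of FireWorks or AiiDA, so it runs without a database. It does not use or imitate either tool's API, and the task functions and their return values are hypothetical placeholders.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set, Tuple

# Each task name maps to the set of tasks it depends on.
DEPENDENCIES: Dict[str, Set[str]] = {
    "structure_optimization": set(),
    "static_calculation": {"structure_optimization"},
    "band_structure": {"static_calculation"},
}


def run_workflow(dependencies: Dict[str, Set[str]],
                 tasks: Dict[str, Callable[[dict], dict]],
                 ) -> Tuple[List[str], dict]:
    """Run tasks in dependency order; each task sees earlier results."""
    order = list(TopologicalSorter(dependencies).static_order())
    results: dict = {}
    for name in order:
        results[name] = tasks[name](results)
    return order, results


# --- Hypothetical task bodies: real ones would launch DFT jobs ---
TASKS = {
    "structure_optimization": lambda res: {"relaxed": True},
    "static_calculation": lambda res: {
        "used_relaxed": res["structure_optimization"]["relaxed"]},
    "band_structure": lambda res: {
        "based_on_static": "static_calculation" in res},
}

order, results = run_workflow(DEPENDENCIES, TASKS)
print(order)
```

Workflow managers add what this sketch leaves out: persistent state, restart after failure, and provenance records for every step.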

Learning Objectives

Read Chapter 4 →

---

Chapter 5: Cloud HPC Utilization and Optimization

Difficulty: Intermediate to Advanced | Reading time: 15-20 minutes | Code examples: 5

Learning Content

  1. Cloud HPC options
     - AWS ParallelCluster
     - Google Cloud HPC Toolkit
     - Azure CycleCloud
     - Dedicated HPC: TSUBAME, Fugaku

  2. AWS ParallelCluster setup
     - VPC/subnet configuration
     - Cluster configuration file
     - SLURM integration
     - Storage (EFS/FSx)

  3. Cost optimization
     - Spot instance utilization
     - Auto-scaling
     - Idle timeout
     - Storage tiering

  4. Containerization with Docker/Singularity
     - Complete environment reproduction
     - Dockerfile best practices
     - Singularity on HPC
     - Image registry management

  5. Security and compliance
     - Access control (IAM)
     - Data encryption
     - Log auditing
     - Academic license compliance

  6. Case study: 10,000 materials screening
     - Requirements definition
     - Architecture design
     - Implementation and execution
     - Cost analysis (total $500-1000)
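
The case-study budget above can be turned into a per-material figure with simple arithmetic, and the same arithmetic can be used to sanity-check a planned run. In the sketch below, only the 10,000 materials and the $500-1000 total come from this page. The estimator's inputs (core-hours per material, price per core-hour, spot discount) are hypothetical numbers chosen for illustration, not real cloud prices.

```python
def cost_per_material(total_cost_usd: float, n_materials: int) -> float:
    """Average cost of one material for a given total budget."""
    return total_cost_usd / n_materials


def estimate_cost(n_materials: int, core_hours_per_material: float,
                  price_per_core_hour: float, spot_discount: float = 0.0) -> float:
    """Total cost = materials x core-hours x price x (1 - discount)."""
    return (n_materials * core_hours_per_material
            * price_per_core_hour * (1.0 - spot_discount))


# Figures from the case study: 10,000 materials for $500-1000 in total.
low = cost_per_material(500, 10_000)
high = cost_per_material(1000, 10_000)
print(f"Per-material budget: ${low:.2f} to ${high:.2f}")

# Hypothetical inputs (assumptions, not real prices).
on_demand = estimate_cost(10_000, core_hours_per_material=2.0,
                          price_per_core_hour=0.04)
with_spot = estimate_cost(10_000, core_hours_per_material=2.0,
                          price_per_core_hour=0.04, spot_discount=0.5)
print(f"On-demand: ${on_demand:.0f}, with 50% spot discount: ${with_spot:.0f}")
```

An estimate like this is most useful before launching a large screening run, when the core-hours per material have been measured on a small benchmark set.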

Learning Objectives

Read Chapter 5 →

---

Overall Learning Outcomes

Upon completing this series, you will acquire the following skills and knowledge:

Knowledge Level (Understanding)

Practical Skills (Doing)

Application (Applying)

---

Recommended Study Patterns

Pattern 1: Complete Mastery (for HTC beginners)

Target audience: Those with DFT calculation experience but new to automation

Duration: 2 weeks

Approach:

Week 1:
  • Day 1-2: Chapter 1 (conceptual understanding)
  • Day 3-4: Chapter 2 (ASE/pymatgen implementation)
  • Day 5-7: Chapter 3 (SLURM practice)

Week 2:
  • Day 1-3: Chapter 4 (FireWorks/AiiDA)
  • Day 4-5: Chapter 5 (Cloud HPC)
  • Day 6-7: Integration project (100-material screening)

Deliverables:

Pattern 2: Fast Track (for experienced users)

Target audience: Those with DFT automation experience who want to learn workflow management

Duration: 3-5 days

Approach:

Day 1: Chapter 1 (can skip) + Chapter 2 (review)

Day 2-3: Chapter 4 (intensive FireWorks learning)

Day 4: Chapter 5 (cloud practice)

Day 5: Project integration

Pattern 3: Cloud Specialization

Target audience: Those considering migration from on-premise HPC to cloud

Duration: 1 week

Approach:

Day 1-2: Chapter 1 + Chapter 5 (understand cloud options)

Day 3-4: AWS ParallelCluster construction

Day 5-6: Porting existing workflows

Day 7: Cost optimization and benchmarking

---

Prerequisites

Required

Recommended

Nice to have

---

Tools and Software

Required Tools

| Tool | Purpose | License | Installation |
|--------|------|-----------|-------------|
| Python 3.10+ | Script execution | Open source (PSF) | conda, pip |
| ASE | Structure manipulation & calculation | LGPL | pip install ase |
| pymatgen | Materials science computing | MIT | pip install pymatgen |
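
Before starting the chapters, it can help to confirm that the required tools are importable. The following is a small sketch using only the standard library. The helper names `check_packages` and `python_ok` are made up for this example, and the check only tests whether a package can be found, not whether its version is suitable.

```python
import sys
from importlib.util import find_spec
from typing import Dict, Iterable, Tuple

REQUIRED_PACKAGES = ("ase", "pymatgen")  # from the table above


def python_ok(minimum: Tuple[int, int] = (3, 10)) -> bool:
    """True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def check_packages(names: Iterable[str] = REQUIRED_PACKAGES) -> Dict[str, bool]:
    """Map each top-level package name to whether it can be imported."""
    return {name: find_spec(name) is not None for name in names}


print("Python 3.10+:", python_ok())
for package, available in check_packages().items():
    hint = "" if available else f"  (try: pip install {package})"
    print(f"{package}: {'found' if available else 'missing'}{hint}")
```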

DFT Codes (at least one)

| Code | License | Features |
|--------|-----------|------|
| VASP | Commercial (academic license) | High accuracy, widely used |
| Quantum ESPRESSO | GPL | Open source, plane-wave basis |
| CP2K | GPL | Open source, mixed Gaussian and plane-wave basis |

Workflow Tools

| Tool | Adopting Project | Learning Difficulty |
|--------|----------------|-----------|
| FireWorks | Materials Project | Medium |
| AiiDA | MARVEL (Europe) | High |
| Atomate | Materials Project | Medium |

Cloud HPC

| Service | Recommended use | Initial cost |
|---------|---------|---------|
| AWS ParallelCluster | Large-scale computing | $0 (pay-as-you-go) |
| Google Cloud HPC | ML integration | $0 (pay-as-you-go) |
| Azure CycleCloud | Windows integration | $0 (pay-as-you-go) |

---

FAQ (Frequently Asked Questions)

Q1: Can I take this course without DFT calculation experience?

A: Chapters 2 and beyond assume basic experience with VASP or Quantum ESPRESSO. If you are new to DFT, we recommend first learning the basics through a "Computational Materials Science Fundamentals" series.

Q2: Do I need a commercial VASP license?

A: Most code examples can be executed with open-source Quantum ESPRESSO. VASP is included in examples because of its widespread use in industry, but if you don't have an academic license, you can use alternative codes.

Q3: What if I don't have access to an HPC cluster?

A: Chapter 5 teaches how to use cloud HPC (AWS, Google Cloud). You can start without an upfront investment; a budget of $100-200 is sufficient for initial learning.

Q4: Should I learn FireWorks or AiiDA?

A: FireWorks is recommended if you want the same environment as Materials Project. AiiDA is popular in Europe and has strengths in data provenance tracking. Chapter 4 covers both, but we recommend starting with FireWorks.

Q5: What computational resources do I need?

A: For the learning phase:

For real projects, screening 1000 materials costs approximately $500-1000.

Q6: Is the Materials Project code publicly available?

A: Yes, the core technology of Materials Project is open-sourced at:

This series uses these tools.

Q7: Are the skills learned useful in industry?

A: Extremely useful. High-Throughput Computing is used in the following companies/research institutes:

Materials informatics (MI) engineer positions offer salaries of roughly 7-15 million JPY in Japan and $80-200K in the US.

Q8: What should I be careful about when using calculation results in papers?

A: Please confirm the following:
  1. Reproducibility: Specify calculation conditions (k-points, cutoff, etc.)
  2. License: Cite software used in the paper
  3. Data publication: Publish raw data on NOMAD etc. (recommended)
  4. Validation: Compare at least some results with experimental data
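
For the reproducibility item, one lightweight habit is to store the calculation conditions next to every result in a machine-readable form. The sketch below is one possible layout, not a standard schema: the field names, the `make_record` helper, and the example values (code name, version string, k-point mesh, cutoff) are assumptions for illustration.

```python
import hashlib
import json
from typing import Sequence


def make_record(code: str, version: str, kpoints: Sequence[int],
                cutoff_ev: float, functional: str) -> dict:
    """Collect calculation conditions and add a short content fingerprint."""
    record = {
        "code": code,
        "version": version,
        "kpoints": list(kpoints),
        "cutoff_ev": cutoff_ev,
        "functional": functional,
    }
    # Sorted keys give the same JSON text for the same settings,
    # so identical conditions always yield the same fingerprint.
    canonical = json.dumps(record, sort_keys=True)
    record["fingerprint"] = hashlib.sha256(
        canonical.encode("utf-8")).hexdigest()[:12]
    return record


# Example values are illustrative only.
record = make_record("Quantum ESPRESSO", "7.2", (8, 8, 8), 520.0, "PBE")
print(json.dumps(record, indent=2))
```

Two runs with the same fingerprint used the same recorded settings, which makes it easy to spot results that were computed under different conditions.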

---

Next Steps

Recommended Actions After Series Completion

Immediate (within 1-2 weeks):
  1. ✅ Execute a small-scale project of 100-1000 materials
  2. ✅ Publish calculation results to NOMAD
  3. ✅ Publish workflow code to GitHub

Short-term (1-3 months):
  1. ✅ Contribute to Materials Project codebase
  2. ✅ Develop custom FireWorks workflows
  3. ✅ Submit paper (including computational dataset publication)

Medium-term (3-6 months):
  1. ✅ Execute 10,000-material scale project
  2. ✅ Accumulate cloud HPC cost optimization know-how
  3. ✅ Present at international conferences (MRS, E-MRS)

Long-term (1 year+):
  1. ✅ Build HT computing system in laboratory/company
  2. ✅ Publish proprietary database
  3. ✅ Be recognized as an MI field expert

---

Related Series

---

Feedback and Support

Author

This series was created under Dr. Yusuke Hashimoto at Tohoku University as part of the Materials Informatics Dojo project.

Creation date: October 17, 2025 Version: 1.0

Feedback

Contact: yusuke.hashimoto.b8@tohoku.ac.jp

---

License

This series is published under the CC BY 4.0 (Creative Commons Attribution 4.0 International) license.

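Permitted: sharing (copying and redistributing) and adapting the material, including for commercial purposes.

Conditions: give appropriate credit, provide a link to the license, and indicate if changes were made.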

---

Let's Get Started!

Are you ready? Start with Chapter 1 and begin your journey into the world of High-Throughput Computing!

Chapter 1: The Need for High-Throughput Computing and Workflow Design →

---

Update History

---

Materials discovery acceleration starts here!

Disclaimer