Chapter 1: The Role of Materials Informatics in Drug Discovery
This chapter covers The Role of Materials Informatics in Drug Discovery. You will learn 5 stages of drug discovery, Traditional drug discovery duration (12 years), and Three bottlenecks in drug discovery (time.
Transforming Drug Discovery - From Tradition to Innovation
1.1 Current Status and Challenges of the Drug Discovery Process
1.1.1 Traditional Drug Discovery Process
Developing new drugs is one of the most important scientific challenges in protecting human health. However, the journey is surprisingly long, difficult, and costly.
Typical Drug Discovery Timeline:
Real Numbers: - Development Duration: 10-15 years (average 12 years) - Success Rate: 0.01% (1 approved drug from 10,000 compounds) - Total Cost: $2.6B (approximately 260 billion yen) - Annual Approvals: Approximately 50 drugs (FDA, 2020-2023 average)
1.1.2 Five Stages of Drug Discovery
Stage 1: Target Identification
Objective: Identify proteins and genes involved in disease
Methods: - Genomics analysis (GWAS: Genome-Wide Association Studies) - Proteomics (protein expression profiling) - Bioinformatics (pathway analysis)
Challenges: - Difficult to validate target validity - Complexity of disease mechanisms (multifactorial) - High rate of false positives
Example: Alzheimer's Disease Targets - Amyloid Ξ² (AΞ²) accumulation - Tau protein abnormal phosphorylation - APOE4 gene mutations - Neuroinflammation (IL-1Ξ², TNF-Ξ±)
Stage 2: Lead Discovery
Objective: Find compounds that bind to the target and show activity
Methods: - High-Throughput Screening (HTS): Automated evaluation of hundreds of thousands to millions of compounds - Fragment-based Drug Discovery (FBDD): Starting from small molecular fragments - Virtual Screening (VS): Candidate compound selection using computational chemistry
Challenges: - Low activity of hit compounds (typically IC50 > 10 ΞΌM) - False positives (assay condition-dependent activity) - Patent issues (similarity to existing compounds)
Statistics: - HTS hit rate: 0.01-0.1% (1,000,000 compounds β 100-1,000 hits) - Progression to leads: 1-5% of hits (100 hits β 1-5 leads)
Stage 3: Lead Optimization
Objective: Improve activity, selectivity, and ADMET properties of lead compounds
Optimization Parameters: 1. Potency: Target IC50 < 100 nM 2. Selectivity: Minimize off-target effects 3. ADMET Properties: - Absorption: Oral absorption (Caco-2 > 10^-6 cm/s) - Distribution: Tissue distribution, BBB permeability - Metabolism: Metabolic stability (liver microsomes) - Excretion: Renal clearance - Toxicity: Hepatotoxicity, cardiotoxicity (hERG inhibition)
Challenges: - Multi-objective optimization (activity vs toxicity trade-offs) - Time-consuming structure-activity relationship (SAR) elucidation - Synthetic difficulty (complex chemical structures)
Example: Lipitor (atorvastatin, cholesterol-lowering drug) - Initial lead: IC50 = 50 nM - Post-optimization: IC50 = 2 nM (25-fold improvement) - Development period: 5 years, number of synthesized compounds: 1,000+
Stage 4: Preclinical Studies
Objective: Validate safety and efficacy in animal experiments
Tests Conducted: - In vivo efficacy studies: Mouse/rat disease models - Toxicity studies: Acute toxicity, subacute toxicity, chronic toxicity - Pharmacokinetics (PK): Blood concentration profiles, half-life - Pharmacodynamics (PD): Mechanisms of pharmacological action
Regulatory Requirements: - GLP (Good Laboratory Practice) compliance - Testing in 2 or more animal species - Carcinogenicity studies (for chronic disease therapeutics)
Failure Rate: 90% (10 compounds in preclinical β 1 compound to clinical)
Stage 5: Clinical Trials
Phase I: Healthy volunteers (20-100 people) - Objective: Safety, dose determination - Duration: 1-2 years - Success rate: 70%
Phase II: Patients (100-500 people) - Objective: Initial efficacy evaluation, adverse effects confirmation - Duration: 2-3 years - Success rate: 33%
Phase III: Patients (1,000-5,000 people) - Objective: Large-scale efficacy and safety validation - Duration: 2-4 years - Success rate: 25-30%
FDA Approval: 1-2 years - Application (NDA: New Drug Application): 100,000+ pages - Review cost: $2-3M - Approval rate: 85% (after Phase III clearance)
1.1.3 Three Limitations of Traditional Drug Discovery
Limitation 1: Enormous Time and Cost
Time Problem: - Average 12 years (Target identification β FDA approval) - Too slow to improve 5-year survival rates for cancer patients - Too slow for pandemic response (COVID-19: vaccine development 1 year)
Cost Problem: - $2.6B/drug (2020 estimate, Tufts Center survey) - Breakdown: - Preclinical: $500M - Clinical trials: $1.4B - Failure costs: $700M (amortization of past failed projects)
Economic Impact: - Rising drug prices (to recover development costs) - Stagnation in orphan drug (rare disease therapeutics) development - Dependence on generic drugs
Limitation 2: Low Success Rate
Compound Attrition Rates:
Final Success Rate: 0.00006% (1 million β 0.06)
Major Causes of Failure: 1. Insufficient Efficacy (40%): No expected effect in Phase II/III 2. Toxicity Issues (30%): Hepatotoxicity, cardiotoxicity, carcinogenicity 3. Poor PK/PD (20%): Inappropriate pharmacokinetics, low tissue accessibility 4. Commercial Reasons (10%): Market potential judgment, patent issues
Limitation 3: Insufficient Chemical Space Exploration
Vastness of Chemical Space: - Drug-like chemical space: 10^60 molecules (estimated) - Known compounds: 10^8 molecules (PubChem) - Explored: 0.0000000000000000000000000000000000000000000001%
Limitations of HTS: - Screenable: 10^6 molecules/campaign - Bias in compound libraries (biased toward easily synthesized molecules) - "Low-hanging fruit problem" (easy compounds already discovered)
Example: Kinase Inhibitors - Human kinases: 518 types - Approved drugs: 70 (2023) - Unexplored kinases: Over 85%
1.2 Three Challenges Solved by MI
The integration of Materials Informatics (MI) with AI/machine learning can overcome the limitations of traditional drug discovery.
1.2.1 Challenge 1: Efficient Exploration of Vast Chemical Space
Problems with Traditional Methods
Random Screening: - Exploration: Random or rule-based (Lipinski's Rule of Five) - Efficiency: Low (hit rate 0.01-0.1%) - Bias: Limited to synthetically accessible molecules
Chemist's Intuition: - Empirical rules: Based on past successful examples - Problem: Difficult to discover new scaffolds, strong bias - Scale: Approximately 100-1,000 compounds per year
MI Approach
Virtual Screening:
# Requirements:
# - Python 3.9+
# - pandas>=2.0.0, <2.2.0
"""
Example: Virtual Screening:
Purpose: Demonstrate data manipulation and preprocessing
Target: Intermediate
Execution time: 1-3 seconds
Dependencies: None
"""
# Example: Screen 1 billion compounds in 1 day
from rdkit import Chem
from rdkit.Chem import Descriptors
import pandas as pd
# Compound library (SMILES format)
compounds = pd.read_csv('billion_compounds.csv') # 10^9 rows
# Drug-like filter (Lipinski's Rule of Five)
def lipinski_filter(smiles):
mol = Chem.MolFromSmiles(smiles)
if mol is None:
return False
mw = Descriptors.MolWt(mol)
logp = Descriptors.MolLogP(mol)
hbd = Descriptors.NumHDonors(mol)
hba = Descriptors.NumHAcceptors(mol)
return (mw <= 500 and logp <= 5 and hbd <= 5 and hba <= 10)
# High-speed filtering with parallel processing (completes in 1 day)
filtered = compounds[compounds['smiles'].apply(lipinski_filter)]
# 10^9 β 10^7 compounds (100x reduction)
Machine Learning Prediction Models: - Training: Known active compound data (ChEMBL: 2 million compounds) - Prediction: Predict activity of 10^9 compounds in hours - Enrichment rate: Improve hit rate from 0.01% β 5-10% (500-1,000x)
Real Example: Atomwise (COVID-19 therapeutics) - Screening: 7 million compounds - Time: 1 day - Result: 2 candidate compounds (in vitro validated) - Traditional method: 6-12 months for same-scale screening
1.2.2 Challenge 2: Early Prediction of ADMET Properties
Importance of ADMET Prediction
30% of Clinical Trial Failures Are ADMET Issues: - Phase I failure: Toxicity (hERG inhibition, hepatotoxicity) - Phase II failure: Poor PK (low oral absorption, short half-life) - Cost: $100-500M per failure
Traditional Evaluation Timing: - ADMET testing: Late in lead optimization (2-3 years later) - Problem discovery: Preclinical or early clinical stages - Result: Rework, project termination
Early Prediction by MI
Computational ADMET Prediction:
# Requirements:
# - Python 3.9+
# - numpy>=1.24.0, <2.0.0
"""
Example: Computational ADMET Prediction:
Purpose: Demonstrate machine learning model training and evaluation
Target: Advanced
Execution time: 30-60 seconds
Dependencies: None
"""
# Example: Caco-2 permeability prediction (oral absorption indicator)
from rdkit import Chem
from rdkit.Chem import Descriptors
from sklearn.ensemble import RandomForestRegressor
import numpy as np
# Training data (ChEMBL: Caco-2 measured values)
X_train = ... # Molecular descriptors (ECFP, physicochemical properties)
y_train = ... # log Papp (cm/s)
# Model training
model = RandomForestRegressor(n_estimators=100)
model.fit(X_train, y_train)
# Prediction for new compound
new_smiles = "CC(C)Cc1ccc(cc1)[C@@H](C)C(=O)O" # Ibuprofen
mol = Chem.MolFromSmiles(new_smiles)
descriptors = [Descriptors.MolWt(mol), Descriptors.MolLogP(mol), ...] # 200 dimensions
predicted_papp = model.predict([descriptors])[0]
print(f"Predicted Caco-2 permeability: {10**predicted_papp:.2e} cm/s")
# Measured value: 4.2e-6 cm/s (good absorption)
Predictable ADMET Properties: 1. Absorption: - Caco-2 permeability (RΒ² = 0.75-0.85) - Oral bioavailability (RΒ² = 0.70-0.80) - P-glycoprotein substrate activity
-
Distribution: - Plasma protein binding rate (RΒ² = 0.65-0.75) - Blood-brain barrier permeability (LogBB) (RΒ² = 0.70-0.80) - Tissue distribution volume (Vd)
-
Metabolism: - CYP450 inhibition (2D6, 3A4, etc.) (ROC-AUC = 0.80-0.90) - CYP450 induction - Metabolic clearance
-
Excretion: - Renal clearance (RΒ² = 0.60-0.70) - Half-life (t1/2) (RΒ² = 0.55-0.65)
-
Toxicity: - hERG inhibition (cardiotoxicity) (ROC-AUC = 0.85-0.95) - Hepatotoxicity (ROC-AUC = 0.75-0.85) - Mutagenicity (Ames test) (ROC-AUC = 0.80-0.90)
Advantages: - Timing: Early in lead discovery (days) - Cost: $0 (computational only) vs $10K-100K/compound (experimental) - Throughput: Millions of compounds/day vs 10-100 compounds/week (experimental)
Real Example: Insilico Medicine (IPF therapeutics) - ADMET prediction: 50,000 candidate molecules - Time: 1 week - Result: Narrowed down to 100 compounds (500x reduction) - Experimental validation: 85% prediction accuracy (85 compounds actually met ADMET criteria)
1.2.3 Challenge 3: Optimization of Formulation Design
Challenges in Formulation Design
40% of Drugs Have Low Water Solubility (BCS Class II/IV): - Problem: Low absorption rate after oral administration (< 10%) - Solutions: Formulation technologies (nanoparticles, liposomes, solid dispersions) - Development period: 1-2 years - Cost: $50-100M
Difficulty of Controlled Release: - Sustained-release formulations: Maintain constant blood concentration - Challenge: Polymer selection, particle size optimization - Trial and error: Experimentally evaluate 50-200 formulations
Formulation Design by MI
Solubility Prediction:
# Example: Water solubility prediction
from rdkit import Chem
from rdkit.Chem import Descriptors
def predict_solubility(smiles):
mol = Chem.MolFromSmiles(smiles)
# Prediction formula based on Abraham descriptors
logp = Descriptors.MolLogP(mol)
mw = Descriptors.MolWt(mol)
hbd = Descriptors.NumHDonors(mol)
psa = Descriptors.TPSA(mol)
# Empirical prediction formula (RΒ² = 0.85)
logS = 0.5 - 0.01 * mw - logp + 0.5 * hbd + 0.02 * psa
return 10**logS # mol/L
smiles = "CN1C=NC2=C1C(=O)N(C(=O)N2C)C" # Caffeine
solubility = predict_solubility(smiles)
print(f"Predicted solubility: {solubility:.2f} mol/L")
# Measured value: 0.10 mol/L (21.6 mg/mL, good solubility)
Polymer Material Selection (Drug Delivery): - Training data: 500 polymer types Γ release rate data - Prediction: Optimal polymer composition for target release profile - Experimental reduction: 200 formulations β 10 formulations (20x reduction)
Real Example: AbbVie (liposome formulation) - Goal: Improve tumor accumulation of cancer therapeutics - MI use: Lipid composition optimization (predicted 500 combinations) - Experiments: Evaluated only top 20 compositions - Result: 5-fold improvement in tumor accumulation, 6-month development time reduction
1.3 Industrial Impact of AI Drug Discovery
1.3.1 Market Size and Global Trends
Rapid Growth of AI Drug Discovery Market: - 2020: $1.2B - 2024: $4.5B (estimated) - 2030: $25B (forecast, 33% CAGR) - 2035: $50B+ (forecast)
Investment Trends: - Venture capital investment (2015-2023 cumulative): $20B+ - AI investment by pharma giants: $5-10B/year - M&A activity: 50+ deals from 2020-2023
Regional Trends: 1. North America (50%): Silicon Valley + Boston (biotech clusters) 2. Europe (30%): UK, Switzerland, Germany 3. Asia (20%): China (rapid growth), Japan, South Korea
1.3.2 Reduction in Development Time and Costs
Development Time Reduction: - Traditional: 10-15 years - AI-enabled: 3-5 years (60-70% reduction) - Examples: - Exscientia (OCD drug): 12 months (traditional 4.5 years, 73% reduction) - Insilico Medicine (IPF drug): 18 months (traditional 3-5 years, 70% reduction)
Cost Reduction: - Traditional: $2.6B/drug - AI-enabled: $500M-$1B/drug (60-80% reduction) - Reduction areas: - Lead discovery: $500M β $50M (90% reduction) - Preclinical studies: $500M β $200M (60% reduction) - Lower clinical trial failure rate: $700M failure cost reduction
ROI (Return on Investment): - AI platform investment: $10-50M - Reduction effect: $500M-$1B/drug - ROI: 10-100x
1.3.3 AI Drug Discovery Startup Ecosystem
Major Players
Exscientia (UK, Oxford): - Founded: 2012 - Funding: $525M (IPO 2021, market cap $2.4B) - Pipeline: 30+ compounds (3 in Phase I/II) - Partners: Sanofi, Bayer, BMS
Insilico Medicine (Hong Kong/USA): - Founded: 2014 - Funding: $400M - Technology: Generative Chemistry (GAN), Reinforcement Learning - Pipeline: 30+ compounds, 6 starting Phase I - Partners: Pfizer, Fosun Pharma
Recursion Pharmaceuticals (USA, Salt Lake City): - Founded: 2013 - Funding: $500M (IPO 2021) - Feature: Robotic laboratory (1 million experiments/week) - Technology: Image-based phenotypic screening + AI - Pipeline: 100+ compounds
Atomwise (USA, San Francisco): - Founded: 2012 - Funding: $174M - Technology: AtomNet (Deep Convolutional NN) - Track record: 700+ projects, 50+ partners - Applications: COVID-19, Ebola, Malaria
SchrΓΆdinger (USA, New York): - Founded: 1990 (AI pivot around 2015) - Funding: $532M (IPO 2020) - Technology: Physics-based + ML - Products: Maestro, LiveDesign (computational chemistry platform)
BenevolentAI (UK, London): - Founded: 2013 - Funding: $292M - Technology: Knowledge Graph, NLP - Track record: Drug repurposing (ALS, COVID-19)
AI Drug Discovery in Japan
Major Companies: 1. Preferred Networks (PFN): - MN-166 (ibudilast, ALS treatment): Phase IIb - Matlantica platform (deep learning)
-
MOLCURE: - Small molecule drug discovery AI, 2 pipelines - Funding: $20M (2022)
-
ExaWizards: - Image analysis + AI (pathological diagnosis) - Partnerships with pharmaceutical companies
Challenges: - Investment scale: 1/10 of US (insufficient funding) - Data access: Dependence on Western databases - Talent: Shortage of hybrid AI + drug discovery talent
1.3.4 AI Strategies of Pharmaceutical Giants
Pfizer: - Investment: $1B+ (AI/ML research) - Partners: IBM Watson, Exscientia - Applications: Cancer therapeutics, COVID-19 vaccine
Roche/Genentech: - Organization: Genentech AI Lab established (2020) - Investment: $3B (5-year plan) - Technology: Foundation Models, Multi-omics integration
GSK: - Organization: AI Hub (established 2021) - Partners: Google DeepMind, Exscientia - Goal: 50% of pipeline using AI (by 2025)
Novartis: - Investment: $1B (Microsoft Azure cloud contract) - Technology: Digital twins (patient digital twins) - Applications: Clinical trial design optimization
AstraZeneca: - Investment: $800M (AI/ML) - Partners: BenevolentAI - Track record: 30+ compounds discovered using AI
1.4 History of MI in Drug Discovery
1.4.1 Early Era (1960s-1980s)
Birth of QSAR (Quantitative Structure-Activity Relationship):
- 1962: Hansch & Fujita published first QSAR equation
log(1/C) = a * logP + b * Ο + c * Es + d
- C: Biological activity concentration
- logP: Partition coefficient (lipophilicity)
- Ο: Hammett constant (electronic effect)
- Es: Steric parameter
- 1979: CoMFA (Comparative Molecular Field Analysis)
- 3D-QSAR, analysis of electrostatic and steric fields around molecules
Limitations: - Linear regression models only (cannot capture nonlinear relationships) - Limited descriptors (insufficient computational power) - Small datasets (< 100 compounds)
1.4.2 HTS Era (1990s-2000s)
Rise of High-Throughput Screening (HTS): - Early 1990s: Robotics automation - Throughput: 100,000 compounds/week - Cost: $1/compound (traditional $100)
Combinatorial Chemistry: - Technology: Parallel synthesis, solid-phase synthesis - Productivity: 1,000-10,000 compounds/week - Problem: "Diversity problem" (only similar compounds)
Early Machine Learning: - Around 1995: Neural network application to QSAR - Around 2000: SVM (Support Vector Machine), Random Forest - Dataset: ChEMBL early version (tens of thousands of compounds)
1.4.3 Deep Learning Revolution (2010s)
Impact of AlexNet (2012): - Overwhelming victory in ImageNet image classification - Application to drug discovery began (around 2013)
Major Milestones: - 2012: Merck Kaggle competition (molecular activity prediction) - Winner: George Dahl (University of Toronto) - Method: Deep Neural Network (5 layers) - Performance: 15% improvement over traditional methods
- 2015: Atomwise announced AtomNet
- Deep CNN for drug discovery
-
Discovered Ebola virus therapeutic candidates in 1 day
-
2016: Insilico Medicine, Generative Chemistry
- New molecule generation with GAN
-
Paper: Molecular Pharmaceutics
-
2018: Machine learning potential (MLP) maturation
- SchNet, MEGNet, DimeNet
- MD simulation with DFT accuracy (application to drug discovery)
1.4.4 Foundation Models Era (2020s-Present)
Drug Discovery Application of Transformer (2017): - 2019: SMILES-Transformer - Treat molecules as SMILES strings - GPT-like generative model
- 2020: ChemBERTa (HuggingFace)
- Pre-training: 10 million SMILES
- Transfer learning: High accuracy with 100 samples
AlphaFold 2 (2020): - Protein structure prediction accuracy 90%+ - Impact: Acceleration of structure-based drug discovery - Database: 200 million protein structures released
Multi-modal Models (2021-): - Integration: Molecular structure + Protein structure + Literature + Images - Examples: BioGPT, Galactica (Meta AI) - Potential: Ask "What drug works for this disease?" in natural language
Current Trends (2023-2025): 1. Active Learning: Experimental feedback loops 2. Reinforcement Learning: Multi-objective optimization 3. Explainable AI: Visualization of prediction rationale 4. Autonomous Labs: Fully automated robots + AI
1.5 Learning Objectives Check
After completing this chapter, you will be able to explain:
Basic Understanding
- β The 5 stages of drug discovery and objectives of each stage
- β Traditional drug discovery duration (12 years), cost ($2.6B), success rate (0.01%)
- β Three bottlenecks in drug discovery (time, cost, success rate)
Role of MI
- β Efficient chemical space exploration with Virtual Screening (500-1,000x enrichment)
- β Early ADMET prediction reducing failures by 30%
- β Formulation design optimization shortening development by 6 months
Industry Trends
- β AI drug discovery market size (2024 $4.5B β 2030 $25B)
- β Track record of 6 major startups (Exscientia, Insilico, etc.)
- β AI investment by pharma giants ($1-3B scale)
Historical Context
- β Evolution from QSAR (1960s) to Deep Learning (2010s)
- β Revolutionary impact of AlphaFold 2 (2020)
- β Potential of Foundation Models (2020s)
Practice Problems
Easy (Basic Check)
Q1: At which stage do the most candidate compounds drop out in the traditional drug discovery process? a) Target identification b) Lead discovery c) Lead optimization d) Clinical trials Phase II
View Answer
**Correct Answer**: b) Lead discovery **Explanation**: In HTS, 1 million compounds are screened to get 1,000 hits (99.9% drop out), and then narrowed down to about 50 lead compounds (95% drop out). The highest attrition rate is at the lead discovery stage.Q2: What property does the "T" in ADMET indicate?
View Answer
**Correct Answer**: Toxicity **ADMET**: - **A**bsorption - **D**istribution - **M**etabolism - **E**xcretion - **T**oxicityMedium (Application)
Q3: Insilico Medicine reached Phase I for IPF therapeutics in 18 months. How many years would traditional methods take, and what percentage time reduction does this represent?
View Answer
**Correct Answer**: Traditional 3-5 years β 18 months (1.5 years), 70-85% reduction **Calculation**: - Reduction rate = (Traditional years - AI years) / Traditional years Γ 100 - 3-year basis: (3 - 1.5) / 3 = 50% β but actually 70% up to preclinical - 5-year basis: (5 - 1.5) / 5 = 70%Q4: The AI drug discovery market is forecast to grow from $4.5B in 2024 to $25B in 2030. Calculate the CAGR (Compound Annual Growth Rate) for this period.
View Answer
**Correct Answer**: Approximately 33% **Calculation**: CAGR = (Ending value/Starting value)^(1/years) - 1 = (25/4.5)^(1/6) - 1 = 5.56^0.167 - 1 = 1.33 - 1 = 0.33 = 33%Hard (Advanced)
Q5: Given chemical space contains 10^60 molecules and HTS can screen 10^6 molecules/campaign, how many years would it take to explore the entire chemical space? (Assume 1 campaign = 1 month)
View Answer
**Correct Answer**: Approximately 8.3 Γ 10^46 years (over 10^36 times the age of the universe) **Calculation**: - Required campaigns = 10^60 / 10^6 = 10^54 - Years = 10^54 months / 12 = 8.3 Γ 10^52 years **Lesson**: Exhaustive search is impossible. Efficient exploration by AI/ML is essential.Q6: For Exscientia's OCD drug development, lead discovery took 12 months. Compared to traditional 4.5 years, estimate the cost reduction. (Assume lead discovery phase cost is $500M)
View Answer
**Estimated Cost Reduction**: Over $370M **Calculation**: - Traditional cost: $500M / 4.5 years = $111M/year - AI-enabled: $111M Γ 1 year = $111M - Reduction: $500M - $111M = $389M Even after deducting AI platform costs ($10-50M), still $300M+ reduction.Next Steps
In Chapter 1, you understood the drug discovery process and the role of MI. In the next Chapter 2, you will learn detailed MI methods specialized for drug discovery (QSAR, ADMET prediction, molecular generation models).
Chapter 2: MI Methods Specialized for Drug Discovery β
References
-
Mak, K. K., & Pichika, M. R. (2019). "Artificial intelligence in drug development: present status and future prospects." Drug Discovery Today, 24(3), 773-780.
-
Zhavoronkov, A., et al. (2019). "Deep learning enables rapid identification of potent DDR1 kinase inhibitors." Nature Biotechnology, 37(9), 1038-1040.
-
Paul, S. M., et al. (2010). "How to improve R&D productivity: the pharmaceutical industry's grand challenge." Nature Reviews Drug Discovery, 9(3), 203-214.
-
Jumper, J., et al. (2021). "Highly accurate protein structure prediction with AlphaFold." Nature, 596(7873), 583-589.
-
Vamathevan, J., et al. (2019). "Applications of machine learning in drug discovery and development." Nature Reviews Drug Discovery, 18(6), 463-477.