Chapter 3: Advanced Prompt Techniques

Tree of Thought, Self-Consistency, ReAct, and Reasoning Model Prompts

Reading Time: 25-30 minutes | Prompt Examples: 12 | Exercises: 5 | Difficulty: Intermediate-Advanced

Introduction

Building on the basics from Chapters 1 and 2, this chapter covers advanced prompting techniques that significantly improve LLM performance on complex reasoning tasks. These techniques represent the cutting edge of prompt engineering as of 2025-2026.

By the end of this chapter, you will be able to:

- Explore multiple reasoning paths with Tree of Thought (ToT)
- Improve answer reliability with Self-Consistency
- Combine reasoning and tool use with the ReAct pattern
- Adapt prompts for reasoning models such as o1/o3
- Design effective multi-turn conversations
- Build prompt chains for complex workflows

Tree of Thought (ToT)

Tree of Thought prompting encourages the LLM to explore multiple reasoning paths simultaneously, evaluate each path, and select the most promising direction. This is particularly effective for problems requiring planning, search, or creative exploration.

How Tree of Thought Works

```mermaid
graph TD
    A[Problem] --> B1[Thought 1]
    A --> B2[Thought 2]
    A --> B3[Thought 3]
    B1 --> C1[Evaluation: 70%]
    B2 --> C2[Evaluation: 85%]
    B3 --> C3[Evaluation: 40%]
    C2 --> D1[Continue Path 2]
    D1 --> E1[Thought 2.1]
    D1 --> E2[Thought 2.2]
    E1 --> F1[Final Answer]
    style C2 fill:#4caf50
    style F1 fill:#2196f3
```

Basic ToT Prompt

Tree of Thought Example

Solve this problem using Tree of Thought reasoning.

**Problem**: A farmer needs to cross a river with a wolf, a goat, and a cabbage.
The boat can only carry the farmer and one item. If left alone:
- The wolf will eat the goat
- The goat will eat the cabbage

**Instructions**:
1. Generate 3 different initial moves
2. Evaluate each move (safe/unsafe and why)
3. Select the best move and continue
4. Repeat until solved

**Format for each step**:
```
Move Option [N]:
- Action: [what to take across]
- State after move: [who/what is where]
- Evaluation: [Safe/Unsafe] - [reason]
- Score: [1-10]
```

Begin your tree exploration:

ToT for Creative Tasks

Creative Tree of Thought

Use Tree of Thought to generate a company name for an AI-powered legal assistant startup.

**Step 1: Generate 3 thematic directions**
- Direction A: [Professional/Traditional]
- Direction B: [Tech-forward/Modern]
- Direction C: [Approachable/Friendly]

**Step 2: For each direction, generate 3 name candidates**
Evaluate each on:
- Memorability (1-10)
- Domain availability likelihood (1-10)
- Legal industry fit (1-10)
- Global appeal (1-10)

**Step 3: Select top 3 names across all directions**
Provide final ranking with justification.

**Step 4: For the top choice, generate:**
- Tagline options (3)
- Visual/logo concepts (brief descriptions)
- Potential concerns or trademark issues

When to Use Tree of Thought

| Good for ToT | Not ideal for ToT |
|---|---|
| Strategic planning problems | Simple factual questions |
| Creative brainstorming | Straightforward calculations |
| Multi-step puzzles | Yes/no decisions |
| Decision-making with tradeoffs | Time-sensitive responses |
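The explore-evaluate-select loop above can also be run programmatically as a beam search. A minimal sketch: in real use, `generate_thoughts` and `score_thought` would wrap LLM calls (the generation and evaluation prompts shown earlier); here they are injected as parameters, with toy stand-ins, so the control flow is visible in isolation.

```python
# Minimal Tree of Thought skeleton as beam search. generate_thoughts and
# score_thought are injected callables; in practice they wrap LLM calls.
def tot_search(problem, generate_thoughts, score_thought, depth=3, beam_width=2):
    """Expand each path, score new thoughts, keep the top beam_width paths."""
    frontier = [([problem], 0)]  # list of (path, cumulative score)
    for _ in range(depth):
        candidates = []
        for path, score in frontier:
            for thought in generate_thoughts(path):
                candidates.append((path + [thought], score + score_thought(thought)))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)  # best-scoring first
        frontier = candidates[:beam_width]
    return frontier[0]

# Toy demo: "thoughts" are numbers and the score favors larger ones,
# so the best path repeatedly picks 3.
best_path, best_score = tot_search(
    "start",
    generate_thoughts=lambda path: [1, 2, 3],
    score_thought=lambda t: t,
    depth=2,
)
print(best_path, best_score)  # ['start', 3, 3] 6
```

Raising `beam_width` trades cost for broader exploration; `depth` bounds how many reasoning steps any path can take.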

Self-Consistency

Self-Consistency improves reliability by generating multiple independent reasoning paths and selecting the most common answer. This reduces the impact of individual reasoning errors.

Basic Self-Consistency

Self-Consistency Prompt

Solve this problem 5 times using different reasoning approaches.
Then report the most consistent answer.

**Problem**: A store offers a 20% discount on all items. An additional 10%
is taken off at checkout. What is the total percentage saved on a $100 item?

**Approach 1**: Sequential percentage calculation
[Your reasoning...]
Answer 1: ___

**Approach 2**: Work backwards from final price
[Your reasoning...]
Answer 2: ___

**Approach 3**: Use algebraic formula
[Your reasoning...]
Answer 3: ___

**Approach 4**: Concrete example with different starting price
[Your reasoning...]
Answer 4: ___

**Approach 5**: Verify using a different method
[Your reasoning...]
Answer 5: ___

**Final Answer**: [Most common answer]
**Confidence**: [Based on consistency]
**Note any discrepancies**: [If answers differ, explain why]
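For reference, the arithmetic the five approaches should converge on: sequential discounts compound multiplicatively, so 20% followed by 10% is less than 30% off.

```python
# Checking the discount arithmetic directly: the second discount applies
# to the already-reduced price, not the original.
price = 100
final = price * (1 - 0.20) * (1 - 0.10)   # 100 -> 80 -> 72
saved_pct = (price - final) / price * 100
print(final, saved_pct)  # 72.0 28.0
```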

Self-Consistency via API

For programmatic self-consistency, sample multiple responses with temperature > 0:

Python Implementation

```python
from openai import OpenAI
from collections import Counter

client = OpenAI()

def self_consistent_answer(prompt, n_samples=5, temperature=0.7):
    """Generate multiple answers and return the most common one."""
    responses = []
    for _ in range(n_samples):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
            temperature=temperature,
            max_tokens=500
        )
        # Extract the final answer (assumes it's on the last line)
        answer = response.choices[0].message.content.strip().split('\n')[-1]
        responses.append(answer)

    # Count occurrences and return the most common answer
    answer_counts = Counter(responses)
    most_common = answer_counts.most_common(1)[0]

    return {
        "answer": most_common[0],
        "confidence": most_common[1] / n_samples,
        "all_responses": responses
    }

# Usage
result = self_consistent_answer(
    "What is 15% of 80? Show your work, then state just the number on the last line."
)
print(f"Answer: {result['answer']}")
print(f"Confidence: {result['confidence']:.0%}")
```

ReAct Pattern

ReAct (Reasoning + Acting) is a prompting pattern where the LLM alternates between reasoning about a problem and taking actions (like searching for information or using tools). This is foundational for AI agents.

ReAct Structure

```mermaid
graph LR
    A[Question] --> B[Thought]
    B --> C[Action]
    C --> D[Observation]
    D --> E{Done?}
    E -->|No| B
    E -->|Yes| F[Answer]
    style B fill:#fff3e0
    style C fill:#e3f2fd
    style D fill:#e8f5e9
```

ReAct Prompt Template

ReAct Pattern

Answer the following question using the ReAct format.
You have access to these actions:
- Search[query]: Search Wikipedia for information
- Calculate[expression]: Perform mathematical calculations
- Lookup[term]: Look up a specific term in the current context

**Question**: What is the population density of the country that hosted the 2024 Olympics?

**Format**:
Thought 1: [Your reasoning about what to do first]
Action 1: [Action name][input]
Observation 1: [What you learned - I'll provide this]

Thought 2: [Your next reasoning step]
Action 2: [Next action]
Observation 2: [Result]

... continue until you have the answer ...

Final Answer: [Your answer based on the observations]

Begin:

ReAct Example Execution

ReAct in Practice

Question: Who won more Grand Slam titles - Roger Federer or Rafael Nadal?
And by how many?

Thought 1: I need to find out how many Grand Slam titles each player has won.
Let me start with Roger Federer.
Action 1: Search[Roger Federer Grand Slam titles]
Observation 1: Roger Federer won 20 Grand Slam singles titles.

Thought 2: Now I need to find Nadal's Grand Slam count.
Action 2: Search[Rafael Nadal Grand Slam titles]
Observation 2: Rafael Nadal won 22 Grand Slam singles titles.

Thought 3: Now I can compare and calculate the difference.
Action 3: Calculate[22 - 20]
Observation 3: 2

Thought 4: I have all the information needed to answer the question.
Final Answer: Rafael Nadal won more Grand Slam titles than Roger Federer.
Nadal won 22 titles compared to Federer's 20, a difference of 2 titles.
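A trace like this can be driven programmatically: the model emits Thought/Action lines, a small driver parses each action, runs the matching tool, and feeds the observation back into the next turn. A minimal sketch of the parse-and-dispatch step (the tool implementations here are stand-ins; real ones would call a search API or a safe expression evaluator):

```python
import re

# Parse a line like "Action 2: Calculate[22 - 20]" into (name, argument),
# then dispatch to a tool table.
ACTION_RE = re.compile(r"Action\s*\d*:\s*(\w+)\[(.*)\]")

def parse_action(line):
    """Return (tool_name, argument) or None if the line is not an action."""
    m = ACTION_RE.search(line)
    return (m.group(1), m.group(2)) if m else None

def run_action(name, arg, tools):
    if name not in tools:
        return f"Unknown action: {name}"
    return tools[name](arg)

tools = {
    # eval is unsafe on untrusted input; acceptable only in this toy demo
    "Calculate": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "Search": lambda q: f"[search results for: {q}]",
}

name, arg = parse_action("Action 3: Calculate[22 - 20]")
print(run_action(name, arg, tools))  # 2
```

The full agent loop would repeat this until the model emits a `Final Answer:` line instead of an action.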

ReAct for Code Debugging

ReAct Debugging Pattern

Debug this code using the ReAct pattern.

**Available actions**:
- ReadCode[section]: Examine a specific part of the code
- Trace[input]: Trace execution with a specific input
- Hypothesis[bug]: State a hypothesis about the bug
- Fix[change]: Propose a specific fix

**Code**:
```python
def find_duplicates(lst):
    seen = set()
    duplicates = []
    for item in lst:
        if item in seen:
            duplicates.add(item)
        seen.add(item)
    return duplicates
```

**Bug report**: Function crashes with "AttributeError: 'list' object has no attribute 'add'"

Begin ReAct debugging:
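For reference, the fix a successful ReAct trace should converge on: `duplicates` is initialized as a list, and lists have `append`, not `add`, hence the AttributeError. One repair is to make it a set:

```python
def find_duplicates(lst):
    """Return the distinct items that appear more than once in lst."""
    seen = set()
    duplicates = set()   # was a list; lists have .append, not .add
    for item in lst:
        if item in seen:
            duplicates.add(item)
        seen.add(item)
    return list(duplicates)
```

Using a set also deduplicates the output; if first-seen order matters, keep the list and guard an `append` with a membership check instead.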

Prompting for Reasoning Models (o1/o3)

OpenAI's o1 and o3 models (and similar reasoning models like DeepSeek-R1) use internal "thinking" tokens before responding. These models require different prompting strategies than standard LLMs.

Key Differences from Standard LLMs

| Standard LLMs (GPT-4, Claude) | Reasoning Models (o1, o3, DeepSeek-R1) |
|---|---|
| Benefit from detailed instructions | Prefer concise, clear objectives |
| Need explicit CoT prompting | Perform CoT internally (automatic) |
| Few-shot examples improve output | Few-shot can actually hurt performance |
| Temperature affects creativity | Temperature is typically fixed at 1 |
| System prompts guide behavior | System prompts have limited effect |

Prompting Best Practices for o1/o3

Guidelines for Reasoning Models

  1. Be concise: State the problem clearly without over-explaining
  2. Don't say "think step by step": They do this automatically
  3. Avoid few-shot examples: These can constrain their reasoning
  4. Use delimiters: Clearly separate different parts of the input
  5. Specify output format at the end: "Provide your final answer as..."

Bad Prompt for o1/o3

You are an expert mathematician. Please think step by step and show all your work.

Here's an example of how to solve a similar problem:
[Long example...]

Now solve this problem carefully, making sure to consider all cases:
What is the probability that a random 5-card poker hand contains exactly one pair?

Think about this systematically:
1. First calculate total combinations
2. Then calculate favorable outcomes
3. Finally divide to get probability

Problems: System prompt, CoT instruction, few-shot example, over-specification

Good Prompt for o1/o3

Calculate the probability that a random 5-card poker hand contains exactly one pair
(two cards of one rank, three cards of different ranks).

Provide your final answer as a simplified fraction and decimal (4 decimal places).

Why it works: Concise problem statement, clear output format, lets model reason freely
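The same differences show up in API usage. Below is a sketch of building a reasoning-model request, assuming the OpenAI Chat Completions shape; the model name is an assumption (check what your account lists), and parameter support varies by SDK version. What matters is what is absent: no system prompt, no CoT instructions, no few-shot examples, no temperature.

```python
# Sketch of an o1-style request dict. Reasoning models generally reject a
# temperature override and use max_completion_tokens (reasoning tokens
# count against the budget) rather than max_tokens.
prompt = (
    "Calculate the probability that a random 5-card poker hand contains "
    "exactly one pair (two cards of one rank, three cards of different ranks).\n\n"
    "Provide your final answer as a simplified fraction and decimal (4 decimal places)."
)

request = {
    "model": "o1",  # assumed model name
    "messages": [{"role": "user", "content": prompt}],
    "max_completion_tokens": 4000,
}

# Actual call (requires the openai package and API access):
# from openai import OpenAI
# response = OpenAI().chat.completions.create(**request)
# print(response.choices[0].message.content)
```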

When to Use Reasoning Models

Ideal Use Cases for o1/o3

- Complex math, logic, and probability problems
- Multi-step coding tasks (algorithm design, debugging, refactoring)
- Planning and analysis with many interacting constraints

Suboptimal Use Cases

- Simple factual questions (slower and costlier than standard models)
- Latency-sensitive or high-volume chat interactions
- Tasks that rely on system-prompt steering or few-shot style control

Effective o1/o3 Prompt Examples

Coding Problem

Implement a function that finds the longest increasing subsequence in an array.
Requirements:
- Time complexity: O(n log n)
- Return both the length and the actual subsequence
- Handle edge cases: empty array, single element, all same values

Language: Python
Output: Complete function with docstring

Analysis Problem

A company's revenue data:
Q1: $1.2M, Q2: $1.5M, Q3: $1.1M, Q4: $1.8M
Marketing spend: Q1: $200K, Q2: $300K, Q3: $150K, Q4: $400K
Customer acquisition: Q1: 500, Q2: 700, Q3: 400, Q4: 900

Analyze the relationship between marketing spend and revenue.
Identify any anomalies and suggest optimal marketing allocation.

Output: Key findings (3-5 bullet points) and recommendation.

Multi-Turn Conversation Design

Designing effective multi-turn conversations involves planning how context builds across messages and managing conversation state.

Conversation Flow Patterns

1. Funnel Pattern

Start broad, progressively narrow focus:

Funnel Conversation

**Turn 1 (User)**: I need help with my Python project
**Turn 1 (AI)**: I'd be happy to help! Could you tell me:
- What type of project (web app, data analysis, automation)?
- What specific challenge are you facing?

**Turn 2 (User)**: It's a web scraping project. I'm having trouble with dynamic content.
**Turn 2 (AI)**: Dynamic content usually requires JavaScript rendering.
- What website are you scraping?
- Have you tried Selenium or Playwright?
- What's your current approach?

**Turn 3 (User)**: Using BeautifulSoup. The content loads via AJAX.
**Turn 3 (AI)**: BeautifulSoup can't handle JavaScript. Here are your options:
[Specific solutions for AJAX content...]

2. Iterative Refinement Pattern

Build and improve across turns:

Iterative Refinement

**Turn 1**: Write a function to validate email addresses
**Turn 1 Response**: [Basic regex validation]

**Turn 2**: Add support for plus addressing (user+tag@domain.com)
**Turn 2 Response**: [Updated function]

**Turn 3**: Also check if the domain has valid MX records
**Turn 3 Response**: [Added DNS lookup]

**Turn 4**: Make it async and add rate limiting
**Turn 4 Response**: [Final production-ready version]

Context Management

Explicit Context Summarization

**Before a complex multi-turn task**:

I'll be helping you build a REST API. Let's establish our context:

**Project Summary** (I'll update this as we go):
- Framework: [To be determined]
- Database: [To be determined]
- Auth method: [To be determined]
- Endpoints: [To be listed]

After each major decision, I'll update this summary.
Please confirm or correct if I misunderstand anything.

What framework would you like to use?
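Keeping such a summary accurate is easier when the state lives in a structured object that is re-rendered into the context each turn, rather than relying on the model to carry it across a long conversation. A minimal sketch (the `render_summary` helper and field names are illustrative):

```python
# Explicit conversation state: store decisions in a dict and re-render the
# summary into the prompt each turn, so settled decisions survive even if
# older messages are truncated from the context window.
def render_summary(state):
    lines = ["Project Summary:"]
    for key, value in state.items():
        lines.append(f"- {key}: {value or '[To be determined]'}")
    return "\n".join(lines)

state = {"Framework": None, "Database": None, "Auth method": None}
state["Framework"] = "FastAPI"   # recorded after the user's first decision

context_header = render_summary(state)
print(context_header)
```

Prepending `context_header` to each request keeps every turn grounded in the decisions made so far.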

Prompt Chains

Prompt chains connect multiple prompts where the output of one becomes the input to another. This enables complex workflows that exceed single-prompt capabilities.

Sequential Chain

```mermaid
graph LR
    A[Input] --> B[Prompt 1: Extract]
    B --> C[Prompt 2: Analyze]
    C --> D[Prompt 3: Format]
    D --> E[Output]
    style A fill:#e3f2fd
    style E fill:#e8f5e9
```

Document Analysis Chain

```python
def analyze_document(document):
    """Three-step sequential chain: extract -> analyze -> summarize.
    call_llm(prompt) is a placeholder for your LLM client of choice."""
    # Step 1: Extract key information
    extraction_prompt = f"""
    Extract from this document:
    - Main topics (list)
    - Key entities (people, organizations, places)
    - Dates and numbers mentioned
    - Action items if any

    Document:
    {document}

    Return as JSON.
    """
    extracted = call_llm(extraction_prompt)

    # Step 2: Analyze sentiment and tone
    analysis_prompt = f"""
    Based on this extracted information:
    {extracted}

    Analyze:
    - Overall sentiment (positive/negative/neutral)
    - Tone (formal/informal/urgent/routine)
    - Key concerns or priorities
    - Implicit assumptions

    Return as JSON.
    """
    analysis = call_llm(analysis_prompt)

    # Step 3: Generate executive summary
    summary_prompt = f"""
    Create an executive summary combining:

    Extracted Info: {extracted}
    Analysis: {analysis}

    Format:
    ## Executive Summary
    [2-3 sentence overview]

    ## Key Points
    - [Bullet points]

    ## Recommended Actions
    1. [Numbered list]
    """
    summary = call_llm(summary_prompt)

    return summary
```

Branching Chain

Conditional Branching

```python
def handle_customer_inquiry(inquiry):
    """Classify, branch to a specialized handler, then polish the reply.
    call_llm and the handle_* functions are placeholders."""
    # Step 1: Classify the inquiry
    classification_prompt = f"""
    Classify this customer inquiry into ONE category:
    - BILLING: Payment, invoices, refunds
    - TECHNICAL: Product issues, bugs, how-to
    - SALES: Pricing, features, upgrades
    - COMPLAINT: Negative feedback, escalations

    Inquiry: {inquiry}

    Return only the category name.
    """
    category = call_llm(classification_prompt).strip()

    # Step 2: Branch based on classification
    if category == "BILLING":
        response = handle_billing(inquiry)
    elif category == "TECHNICAL":
        response = handle_technical(inquiry)
    elif category == "SALES":
        response = handle_sales(inquiry)
    else:  # COMPLAINT (or anything unrecognized)
        response = handle_complaint(inquiry)

    # Step 3: Format final response
    formatting_prompt = f"""
    Polish this customer service response:
    {response}

    Ensure:
    - Professional but warm tone
    - Clear next steps
    - Proper greeting and closing
    """
    return call_llm(formatting_prompt)
```

Exercises

Exercise 1: Tree of Thought (Difficulty: Medium)

Task: Use Tree of Thought to plan a 3-day trip to Tokyo with these constraints:

Generate 3 different itinerary approaches, evaluate each, and select the best.

Exercise 2: Self-Consistency (Difficulty: Medium)

Task: Design a self-consistency prompt to determine:

"Is it ethical for companies to use AI to screen job applicants?"

Include 5 different ethical framework perspectives (utilitarian, deontological, etc.)

Exercise 3: ReAct Pattern (Difficulty: Medium)

Task: Create a ReAct prompt for a "Research Assistant" that can:

Test it with the topic: "Latest advances in protein folding prediction"

Exercise 4: Reasoning Model Prompt (Difficulty: Advanced)

Task: Rewrite this prompt for o1/o3:

You are an expert software architect. Please think carefully
step by step about this problem. Here's an example of good
architecture... [long example]

Design a scalable microservices architecture for an e-commerce
platform handling 10M daily users.

Exercise 5: Prompt Chain (Difficulty: Advanced)

Task: Design a 4-step prompt chain for "Code Review Automation":

  1. Parse and understand code structure
  2. Identify potential issues (bugs, security, performance)
  3. Prioritize issues by severity
  4. Generate actionable review comments

Chapter Summary

Key Points

- Tree of Thought explores and scores multiple reasoning paths before committing to one
- Self-Consistency samples several independent solutions and keeps the majority answer
- ReAct interleaves reasoning steps with tool actions and observations
- Reasoning models (o1/o3) prefer concise prompts: no CoT instructions, no few-shot examples
- Multi-turn conversations benefit from explicit, regularly updated context summaries
- Prompt chains decompose complex workflows into sequential or branching steps

Next Steps

In Chapter 4, we'll explore Function Calling and Tool Use: connecting LLMs to external tools and APIs.

