Least-to-Most Prompting: Easy-to-Hard Generalization
Break complex problems into simpler subproblems, solve the easiest first, and feed answers forward. Achieves 99% accuracy on tasks where Chain-of-Thought gets 16%.
The Core Idea
Least-to-Most prompting (Zhou et al. 2022, ICLR 2023) solves the hardest limitation of few-shot prompting: standard CoT can't solve problems harder than the examples it was shown. Least-to-Most breaks hard problems into a sequence of easier subproblems. By solving the easiest first and feeding answers forward, it generalizes to problems far beyond the exemplars.
Standard Few-Shot CoT: See easy examples → Can only solve easy problems
Least-to-Most: See easy examples → Decompose hard problem → Solve in sequence
→ Can solve MUCH harder problems
The SCAN Breakthrough
SCAN is a compositional generalization benchmark: given commands like "jump twice" and "walk left," can the model understand "jump twice after walking left twice"?
| Method | SCAN Accuracy (length split) | Training Data |
|---|---|---|
| Standard CoT (few-shot) | 16% | 14 examples |
| Least-to-Most | 99% | 14 examples |
| Neural-symbolic models (specialized) | ~99% | 15,000+ examples |
Least-to-Most matched specialized models trained on 1,000x more data, using only 14 few-shot examples. That's the power of decomposition.
How Least-to-Most Works
Two distinct phases:
Phase 1: Decomposition
The prompt teaches the model how to break problems down. You provide few-shot examples showing a hard problem and its decomposition into easier subproblems.
Example:
Complex problem: "walk left twice and jump twice after walk"
Decomposition:
1. "walk left twice"
2. "walk left twice and jump twice after walk" (combining)
Simple problem: "jump twice after walk left twice"
Decomposition:
1. "walk left twice"
2. "walk left twice and jump twice" (combining)
Phase 2: Subproblem Solving
Each subproblem is solved in order. The answer to subproblem N is fed into the prompt for subproblem N+1.
Problem: "jump twice and walk left after walk right"
Decomposition:
1. "jump twice" → Easy
2. "walk left after walk right" → Medium
3. "jump twice and walk left after walk right" → Full (using answers from 1 and 2)
The model never faces the hard problem directly. It climbs the difficulty ladder one rung at a time.
Prompt Structure
Decomposition Prompt (Few-Shot)
You will break complex problems into simpler subproblems.
Example:
Complex: "look opposite right twice and walk opposite left"
Decomposition:
1. "look opposite right twice"
2. "look opposite right twice and walk opposite left"
Complex: "run left twice after run right"
Decomposition:
1. "run left twice"
2. "run left twice after run right" (keeping original structure)
Complex: "look twice after run"
Decomposition:
1. "look twice"
2. "look twice after run" (addressing the harder part after)
Now decompose this:
Complex: "{problem}"
Decomposition:
Solving Prompt (Sequential)
Now solve each subproblem one at a time. Use the solution of the
previous subproblem to help solve the next one.
Step 1: {subproblem_1}
Solution 1: {solution_1}
Step 2: {subproblem_2}, using Solution 1
Solution 2: {solution_2}
Step 3: {subproblem_3}, using Solutions 1 and 2
Solution 3: {solution_3}
Implementation
def least_to_most(llm, problem: str, decomposition_examples: list,
solve_examples: list):
"""Least-to-Most: decompose, then solve sequentially."""
# Phase 1: Decompose into subproblems
decomp_prompt = build_decomposition_prompt(
decomposition_examples, problem
)
decomposition = llm.generate(decomp_prompt)
subproblems = parse_subproblems(decomposition) # ["sub1", "sub2", ...]
# Phase 2: Solve each subproblem, feeding answers forward
solutions = []
for i, subproblem in enumerate(subproblems):
solve_prompt = build_solve_prompt(
solve_examples, subproblems, solutions, current_idx=i
)
answer = llm.generate(solve_prompt)
solutions.append(answer)
return {
"decomposition": subproblems,
"solutions": solutions,
"final_answer": solutions[-1]
}
def parse_subproblems(decomposition_text: str) -> list:
"""Extract numbered subproblems from decomposition output."""
import re
matches = re.findall(r'\d+\.\s+(.+)', decomposition_text)
return matches if matches else [decomposition_text]
def build_decomposition_prompt(examples: list, problem: str) -> str:
"""Build the few-shot decomposition prompt (see Prompt Structure above)."""
prompt = "You will break complex problems into simpler subproblems.\n\n"
for ex in examples:
prompt += f"Complex: \"{ex['complex']}\"\n"
prompt += "Decomposition:\n"
for i, step in enumerate(ex['decomposition'], 1):
prompt += f"{i}. \"{step}\"\n"
prompt += "\n"
prompt += f"Now decompose this:\nComplex: \"{problem}\"\nDecomposition:\n"
return prompt
def build_solve_prompt(examples: list, subproblems: list,
solutions: list, current_idx: int) -> str:
"""Build the sequential solving prompt, feeding prior answers forward."""
prompt = "Now solve each subproblem one at a time.\n\n"
for i in range(current_idx):
prompt += f"Step {i+1}: {subproblems[i]}\nSolution {i+1}: {solutions[i]}\n\n"
prompt += f"Step {current_idx+1}: {subproblems[current_idx]}"
if current_idx > 0:
prompt += f", using Solutions 1-{current_idx}"
prompt += "\nSolution:"
return prompt
When Least-to-Most Wins
Strongest on:
- Compositional generalization (SCAN, COGS, CFQ)
- Symbolic manipulation (last-letter concatenation, coin flip tracking)
- Multi-step math with increasing difficulty
- Problems where subproblems have clear dependency chains
No advantage on:
- Single-step problems — decomposition is overhead with no benefit
- Problems where subproblems are interdependent (circular dependencies)
- Tasks where the "hard" part can't be separated from the "easy" parts
- Creative tasks where decomposition destroys coherence
When Decomposition Fails
Not every problem decomposes cleanly:
| Failure Mode | Example | Why It Fails |
|---|---|---|
| Circular dependencies | Subproblem A needs B's answer, B needs A's answer | No valid solving order |
| Non-compositional reasoning | "Is this argument valid?" — can't separate into independent steps | Reasoning is holistic |
| Over-decomposition | Breaking "write a haiku" into "write line 1," "write line 2," "write line 3" | Context needed across subproblems |
| Lost context | Each subproblem's answer loses nuance needed by later subproblems | Information compression |
Detecting Decomposition Failures
# pseudo-code: check for circular dependencies via topological sort.
# In practice, use a library like networkx or a simple DFS cycle detector.
def is_decomposition_valid(subproblems: list, dependencies: dict) -> bool:
"""Check for circular dependencies via topological sort."""
from collections import deque
in_degree = {s: 0 for s in subproblems}
for s in subproblems:
for dep in dependencies.get(s, []):
in_degree[s] += 1
queue = deque([s for s in subproblems if in_degree[s] == 0])
visited = 0
while queue:
node = queue.popleft()
visited += 1
for s in subproblems:
if node in dependencies.get(s, []):
in_degree[s] -= 1
if in_degree[s] == 0:
queue.append(s)
return visited == len(subproblems)
In practice, if the model's decomposition output is nonsensical or the subproblems can't be ordered, fall back to standard CoT.
Least-to-Most vs. Other Strategies
| Technique | Decomposition | Order | Exemplar Requirement | Generalization |
|---|---|---|---|---|
| Standard CoT | None (linear) | Sequential | Few-shot | Up to exemplar difficulty |
| Plan-and-Solve | Top-down plan | Plan-guided | Zero-shot | Same complexity |
| Least-to-Most | Bottom-up | Easiest first | Few-shot (decomp. examples) | Harder than exemplars |
| Tree-of-Thought | Branching | BFS/DFS with evaluation | Few-shot | Same complexity |
| Decomposed Prompting | Modular sub-prompts | Arbitrary | Few-shot | Broader coverage |
The Difficulty Ladder
Least-to-Most works by constructing a difficulty ladder. Each step should be only slightly harder than the previous:
Subproblem 1: One operation (difficulty: 1/10)
Subproblem 2: Two operations (difficulty: 3/10)
Subproblem 3: Three operations (difficulty: 6/10)
Subproblem 4: Full problem (difficulty: 9/10)
If the jump between subproblems is too large, the model fails at the transition. The decomposition prompt should enforce gradual difficulty increase.
Cost Analysis
| Phase | Calls | Cost |
|---|---|---|
| Decomposition | 1 | 1x |
| Solving (N subproblems) | N | N × 1x |
| Total | N + 1 | (N + 1)x |
For a problem decomposed into 3 subproblems: 4 total LLM calls (1 decomposition + 3 solving). This is more expensive than single-call CoT, but the accuracy gain on hard problems makes it worthwhile.
When to pay the cost:
- The problem is harder than your few-shot examples
- Standard CoT gets it wrong reliably
- Wrong answers have high cost (medical, financial, legal)
- You can batch the decomposition phase across similar problems
Real-World Use Cases
Beyond the SCAN benchmark, Least-to-Most applies to:
- Multi-hop QA: "Who is the CEO of the company that acquired Instagram?" → Decompose: "What company acquired Instagram?" → "Who is the CEO of that company?"
- Code generation: "Write a function that sorts a list, removes duplicates, and returns the top 3" → Decompose into sort, deduplicate, top-k
- Data pipeline design: Decompose complex ETL into sequential transformation steps
- Legal reasoning: Break a statute interpretation into definition lookup → condition check → application
Related Articles
Mastering Character Portraits in Midjourney: Techniques, Styles, and Prompts
Create stunning character portraits with Midjourney using advanced prompts, lighting techniques, and artistic parameters. Explore photorealistic, artistic, fantasy, and vintage portrait styles.
DeepSeek Math & STEM Reasoning: Proofs, Calculations & Science
Leverage DeepSeek's strongest domain — math and STEM. Reasoning mode with effort=max for proofs, calculations, scientific analysis, and engineering problems. Verification patterns using visible reasoning tokens.
ChatGPT Resume Writing Guide: Create Professional Resumes
Learn how to leverage ChatGPT to craft compelling, ATS-optimized resumes that highlight your skills and achievements. Includes templates, prompts, and expert tips.