Least-to-Most Prompting: Easy-to-Hard Generalization

Break complex problems into simpler subproblems, solve the easiest first, and feed answers forward. Achieves 99% accuracy on tasks where Chain-of-Thought gets 16%.

June 11, 2026
least-to-mostdecompositionreasoningcompositional-generalizationchain-of-thoughtprompt-engineering

The Core Idea

Least-to-Most prompting (Zhou et al. 2022, ICLR 2023) solves the hardest limitation of few-shot prompting: standard CoT can't solve problems harder than the examples it was shown. Least-to-Most breaks hard problems into a sequence of easier subproblems. By solving the easiest first and feeding answers forward, it generalizes to problems far beyond the exemplars.

Standard Few-Shot CoT:  See easy examples → Can only solve easy problems
Least-to-Most:          See easy examples → Decompose hard problem → Solve in sequence
                        → Can solve MUCH harder problems

The SCAN Breakthrough

SCAN is a compositional generalization benchmark: given commands like "jump twice" and "walk left," can the model understand "jump twice after walking left twice"?

MethodSCAN Accuracy (length split)Training Data
Standard CoT (few-shot)16%14 examples
Least-to-Most99%14 examples
Neural-symbolic models (specialized)~99%15,000+ examples

Least-to-Most matched specialized models trained on 1,000x more data, using only 14 few-shot examples. That's the power of decomposition.

How Least-to-Most Works

Two distinct phases:

Phase 1: Decomposition

The prompt teaches the model how to break problems down. You provide few-shot examples showing a hard problem and its decomposition into easier subproblems.

Example:
Complex problem: "walk left twice and jump twice after walk"
Decomposition:
  1. "walk left twice"
  2. "walk left twice and jump twice after walk"  (combining)

Simple problem: "jump twice after walk left twice"
Decomposition:
  1. "walk left twice"
  2. "walk left twice and jump twice"  (combining)

Phase 2: Subproblem Solving

Each subproblem is solved in order. The answer to subproblem N is fed into the prompt for subproblem N+1.

Problem: "jump twice and walk left after walk right"

Decomposition:
  1. "jump twice"                          → Easy
  2. "walk left after walk right"          → Medium
  3. "jump twice and walk left after walk right" → Full (using answers from 1 and 2)

The model never faces the hard problem directly. It climbs the difficulty ladder one rung at a time.

Prompt Structure

Decomposition Prompt (Few-Shot)

You will break complex problems into simpler subproblems.

Example:
Complex: "look opposite right twice and walk opposite left"
Decomposition:
1. "look opposite right twice"
2. "look opposite right twice and walk opposite left"


Complex: "run left twice after run right"
Decomposition:
1. "run left twice"
2. "run left twice after run right" (keeping original structure)


Complex: "look twice after run"
Decomposition:
1. "look twice"
2. "look twice after run" (addressing the harder part after)


Now decompose this:
Complex: "{problem}"
Decomposition:

Solving Prompt (Sequential)

Now solve each subproblem one at a time. Use the solution of the
previous subproblem to help solve the next one.

Step 1: {subproblem_1}
Solution 1: {solution_1}

Step 2: {subproblem_2}, using Solution 1
Solution 2: {solution_2}

Step 3: {subproblem_3}, using Solutions 1 and 2
Solution 3: {solution_3}

Implementation

def least_to_most(llm, problem: str, decomposition_examples: list,
                  solve_examples: list):
    """Least-to-Most: decompose, then solve sequentially."""

    # Phase 1: Decompose into subproblems
    decomp_prompt = build_decomposition_prompt(
        decomposition_examples, problem
    )
    decomposition = llm.generate(decomp_prompt)
    subproblems = parse_subproblems(decomposition)  # ["sub1", "sub2", ...]

    # Phase 2: Solve each subproblem, feeding answers forward
    solutions = []
    for i, subproblem in enumerate(subproblems):
        solve_prompt = build_solve_prompt(
            solve_examples, subproblems, solutions, current_idx=i
        )
        answer = llm.generate(solve_prompt)
        solutions.append(answer)

    return {
        "decomposition": subproblems,
        "solutions": solutions,
        "final_answer": solutions[-1]
    }


def parse_subproblems(decomposition_text: str) -> list:
    """Extract numbered subproblems from decomposition output."""
    import re
    matches = re.findall(r'\d+\.\s+(.+)', decomposition_text)
    return matches if matches else [decomposition_text]


def build_decomposition_prompt(examples: list, problem: str) -> str:
    """Build the few-shot decomposition prompt (see Prompt Structure above)."""
    prompt = "You will break complex problems into simpler subproblems.\n\n"
    for ex in examples:
        prompt += f"Complex: \"{ex['complex']}\"\n"
        prompt += "Decomposition:\n"
        for i, step in enumerate(ex['decomposition'], 1):
            prompt += f"{i}. \"{step}\"\n"
        prompt += "\n"
    prompt += f"Now decompose this:\nComplex: \"{problem}\"\nDecomposition:\n"
    return prompt


def build_solve_prompt(examples: list, subproblems: list,
                       solutions: list, current_idx: int) -> str:
    """Build the sequential solving prompt, feeding prior answers forward."""
    prompt = "Now solve each subproblem one at a time.\n\n"
    for i in range(current_idx):
        prompt += f"Step {i+1}: {subproblems[i]}\nSolution {i+1}: {solutions[i]}\n\n"
    prompt += f"Step {current_idx+1}: {subproblems[current_idx]}"
    if current_idx > 0:
        prompt += f", using Solutions 1-{current_idx}"
    prompt += "\nSolution:"
    return prompt

When Least-to-Most Wins

Strongest on:

  • Compositional generalization (SCAN, COGS, CFQ)
  • Symbolic manipulation (last-letter concatenation, coin flip tracking)
  • Multi-step math with increasing difficulty
  • Problems where subproblems have clear dependency chains

No advantage on:

  • Single-step problems — decomposition is overhead with no benefit
  • Problems where subproblems are interdependent (circular dependencies)
  • Tasks where the "hard" part can't be separated from the "easy" parts
  • Creative tasks where decomposition destroys coherence

When Decomposition Fails

Not every problem decomposes cleanly:

Failure ModeExampleWhy It Fails
Circular dependenciesSubproblem A needs B's answer, B needs A's answerNo valid solving order
Non-compositional reasoning"Is this argument valid?" — can't separate into independent stepsReasoning is holistic
Over-decompositionBreaking "write a haiku" into "write line 1," "write line 2," "write line 3"Context needed across subproblems
Lost contextEach subproblem's answer loses nuance needed by later subproblemsInformation compression

Detecting Decomposition Failures

# pseudo-code: check for circular dependencies via topological sort.
# In practice, use a library like networkx or a simple DFS cycle detector.
def is_decomposition_valid(subproblems: list, dependencies: dict) -> bool:
    """Check for circular dependencies via topological sort."""
    from collections import deque
    in_degree = {s: 0 for s in subproblems}
    for s in subproblems:
        for dep in dependencies.get(s, []):
            in_degree[s] += 1
    queue = deque([s for s in subproblems if in_degree[s] == 0])
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for s in subproblems:
            if node in dependencies.get(s, []):
                in_degree[s] -= 1
                if in_degree[s] == 0:
                    queue.append(s)
    return visited == len(subproblems)

In practice, if the model's decomposition output is nonsensical or the subproblems can't be ordered, fall back to standard CoT.

Least-to-Most vs. Other Strategies

TechniqueDecompositionOrderExemplar RequirementGeneralization
Standard CoTNone (linear)SequentialFew-shotUp to exemplar difficulty
Plan-and-SolveTop-down planPlan-guidedZero-shotSame complexity
Least-to-MostBottom-upEasiest firstFew-shot (decomp. examples)Harder than exemplars
Tree-of-ThoughtBranchingBFS/DFS with evaluationFew-shotSame complexity
Decomposed PromptingModular sub-promptsArbitraryFew-shotBroader coverage

The Difficulty Ladder

Least-to-Most works by constructing a difficulty ladder. Each step should be only slightly harder than the previous:

Subproblem 1: One operation     (difficulty: 1/10)
Subproblem 2: Two operations    (difficulty: 3/10)
Subproblem 3: Three operations  (difficulty: 6/10)
Subproblem 4: Full problem      (difficulty: 9/10)

If the jump between subproblems is too large, the model fails at the transition. The decomposition prompt should enforce gradual difficulty increase.

Cost Analysis

PhaseCallsCost
Decomposition11x
Solving (N subproblems)NN × 1x
TotalN + 1(N + 1)x

For a problem decomposed into 3 subproblems: 4 total LLM calls (1 decomposition + 3 solving). This is more expensive than single-call CoT, but the accuracy gain on hard problems makes it worthwhile.

When to pay the cost:

  • The problem is harder than your few-shot examples
  • Standard CoT gets it wrong reliably
  • Wrong answers have high cost (medical, financial, legal)
  • You can batch the decomposition phase across similar problems

Real-World Use Cases

Beyond the SCAN benchmark, Least-to-Most applies to:

  • Multi-hop QA: "Who is the CEO of the company that acquired Instagram?" → Decompose: "What company acquired Instagram?" → "Who is the CEO of that company?"
  • Code generation: "Write a function that sorts a list, removes duplicates, and returns the top 3" → Decompose into sort, deduplicate, top-k
  • Data pipeline design: Decompose complex ETL into sequential transformation steps
  • Legal reasoning: Break a statute interpretation into definition lookup → condition check → application