Gemini Built-in Code Execution: Python Sandbox Mastery

Harness Gemini's Python code execution sandbox. Learn self-verification patterns, data analysis with pandas, iterative problem-solving, and error recovery techniques.

June 14, 2026
Gemini, Code Execution, Python, Sandbox, Data Analysis, Prompt Engineering

Gemini's built-in code execution lets the model write Python code, execute it in a sandboxed environment, observe the output, and use that output to refine its answer, all within a single API response. This turns Gemini from a text generator into a computation engine.

The use cases are substantial: data analysis with real computation, self-verifying mathematical reasoning, chart generation, file processing, and iterative problem-solving where Gemini tests its own hypotheses against actual results.

But the code execution sandbox has limits. It can't access the network, can't read your files, can't install arbitrary packages, and has a timeout. Understanding these boundaries — and how to prompt within them — is essential.

Enabling Code Execution

Code execution must be enabled at the API level. It is not on by default.

{
  "tools": [
    {
      "codeExecution": {}
    }
  ]
}

When enabled, Gemini can decide whether to use code execution for a given prompt. You can also instruct it explicitly in your prompt.
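
As a rough sketch, the tool declaration above sits next to the prompt in the request body. The helper below builds such a body in Python. `build_request` is a hypothetical name, and the `contents`/`parts` layout is an assumption about the REST request shape, so check it against the current API reference before relying on it.

```python
import json


def build_request(prompt: str) -> dict:
    """Build a request body with code execution enabled (hypothetical helper)."""
    return {
        # Assumed REST layout: one user turn holding a single text part.
        "contents": [{"parts": [{"text": prompt}]}],
        # Same tool declaration as in the JSON snippet above.
        "tools": [{"codeExecution": {}}],
    }


body = build_request("Use Python to compute the sum of the first 50 primes.")
print(json.dumps(body, indent=2))
```

Keeping the payload in one small function makes it easy to toggle the tool per request instead of enabling it globally.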

Core Patterns

Pattern 1: Self-Verification

The fundamental pattern: generate code → execute → verify output → correct.

Solve this problem step by step:

1. Write Python code to solve the problem
2. Execute the code and report the output
3. If the output matches expectations, explain why it's correct
4. If the output is wrong, analyze what went wrong, fix the code,
   and re-execute
5. Repeat until correct, or report that you can't solve it after 3 attempts

PROBLEM: Calculate the probability of drawing at least two aces
in a 5-card poker hand dealt from a standard 52-card deck.
Show your work.
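
For the poker problem in this prompt, the code Gemini writes and checks might look something like the sketch below. It counts the favorable hands two independent ways (directly and via the complement) and asserts that they agree, which is the self-verification step in miniature. This is one possible solution, not necessarily what the model would produce.

```python
from math import comb

total_hands = comb(52, 5)

# Direct count: choose k aces from 4 and the other 5-k cards from the 48 non-aces.
at_least_two = sum(comb(4, k) * comb(48, 5 - k) for k in range(2, 5))

# Complement count: hands with zero aces or exactly one ace.
zero_or_one = comb(48, 5) + comb(4, 1) * comb(48, 4)

# The two routes must agree; this is the self-check.
assert at_least_two == total_hands - zero_or_one

probability = at_least_two / total_hands
print(f"{at_least_two} / {total_hands} = {probability:.6f}")
```

Both counts give 108,336 favorable hands out of 2,598,960, or about 4.17%.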

Note:

Always set a maximum number of attempts (3 is a good default). Without a limit, Gemini can get stuck in loops where it repeatedly generates slightly different wrong answers without converging.

Pattern 2: Data Analysis

I'll provide a dataset. Use code execution to analyze it.

1. Load the data into a pandas DataFrame
2. Run descriptive statistics: mean, median, std, quartiles for all
   numeric columns
3. Identify and report any outliers (values > 3 std from mean)
4. Generate a correlation matrix for numeric columns
5. Create visualizations (matplotlib):
   - Histogram of the primary metric
   - Scatter plot of the two most correlated variables
   - Box plot by category if categorical columns exist
6. Summarize the 3 most important findings from the analysis

DATA:
[your dataset]
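
To make steps 2 and 3 concrete, here is a minimal sketch of the descriptive statistics and the 3-standard-deviation outlier rule. It uses only Python's standard library so it runs anywhere; inside the sandbox, pandas would typically do the same work on a DataFrame. The function names and the sample values are made up for illustration.

```python
import statistics


def describe(values):
    """Summary statistics for one numeric column."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "q1": q1,
        "q3": q3,
    }


def find_outliers(values, threshold=3.0):
    """Return values more than `threshold` sample std devs from the mean."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > threshold * std]


# Illustrative data: 20 readings near 10 plus one extreme value.
sample = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10] * 2 + [100]
summary = describe(sample)
outliers = find_outliers(sample)
print(summary)
print("outliers:", outliers)
```

One caveat worth passing along in the prompt: with the sample standard deviation, a column needs at least 11 values before any single point can sit more than 3 standard deviations from the mean, so the rule is silent on very small datasets.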

Pattern 3: Iterative Problem Solving

APPROACH: Iterative improvement

1. Start with the simplest approach that might work
2. Write code, execute, observe results
3. Based on results, refine the approach
4. Repeat until the solution is optimal or 5 iterations pass

At each iteration, report:
- Approach: what you're trying
- Code: the implementation
- Results: what happened when executed
- Insights: what you learned
- Next step: what you'll try differently

PROBLEM: Find the shortest path through a 50-city traveling
salesman problem using heuristic approaches. Cities are at
random coordinates in a 100x100 grid.
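
A first two iterations on this problem might resemble the sketch below: a nearest-neighbor tour as the simplest approach, then a bounded 2-opt pass as the refinement. These are standard heuristics chosen here for illustration; the prompt does not prescribe them, and the seed and pass limit are arbitrary assumptions.

```python
import math
import random


def tour_length(cities, tour):
    """Total length of the closed tour visiting cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def nearest_neighbor(cities, start=0):
    """Iteration 1: always move to the closest unvisited city."""
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:
        last = cities[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, cities[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(cities, tour, max_passes=5):
    """Iteration 2: reverse segments while that shortens the tour."""
    best = tour[:]
    best_len = tour_length(cities, best)
    for _ in range(max_passes):  # hard cap, mirroring the iteration limit
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = tour_length(cities, candidate)
                if cand_len < best_len - 1e-9:
                    best, best_len = candidate, cand_len
                    improved = True
        if not improved:
            break
    return best


rng = random.Random(42)  # fixed seed so runs are repeatable
cities = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(50)]

nn_tour = nearest_neighbor(cities)
opt_tour = two_opt(cities, nn_tour)
print(f"nearest neighbor: {tour_length(cities, nn_tour):.1f}")
print(f"after 2-opt:      {tour_length(cities, opt_tour):.1f}")
```

The `max_passes` cap plays the same role as the 5-iteration limit in the prompt: the search stops even if further improvement is still possible.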

Pattern 4: Calculation Verification

For any answer that involves computation, prompt Gemini to verify:

For any calculation in your response:
1. Show the formula
2. Implement it in Python and execute
3. Report the computed result
4. Compare with your text answer

If there's a discrepancy, the computed result is authoritative.

This pattern has caught calculation errors that would otherwise go unnoticed, especially in financial and statistical responses.
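
The comparison in step 4 can be sketched as a small tolerance check. The `reconcile` helper, the future-value example, and the tolerance are illustrative assumptions, not part of any Gemini API.

```python
import math


def reconcile(text_answer, computed, rel_tol=1e-5):
    """Return the value to report and whether the text answer agreed.

    The computed value is always the one returned, since it is authoritative.
    """
    agrees = math.isclose(text_answer, computed, rel_tol=rel_tol)
    return computed, agrees


# Example: future value of $1,000 at 5% annual interest for 10 years.
computed = 1000 * (1 + 0.05) ** 10

value, agrees = reconcile(1628.89, computed)      # text answer rounded to cents
_, agrees_wrong = reconcile(1600.00, computed)    # a mistaken text answer
print(f"computed={value:.2f} agrees={agrees} wrong_agrees={agrees_wrong}")
```

A relative tolerance avoids flagging harmless rounding in the text answer while still catching real discrepancies.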

What the Sandbox Can and Cannot Do

| Capability | Supported? | Notes |
| --- | --- | --- |
| Pure Python computation | Yes | All standard library modules |
| pandas, numpy, matplotlib | Yes | Pre-installed |
| scipy, scikit-learn | Varies | Check current availability |
| Network access | No | No HTTP, no socket connections |
| File I/O | Sandbox only | Can write/read within sandbox; cannot access your files |
| Subprocess / OS commands | No | No shell access |
| External packages (pip install) | No | Only pre-installed libraries |
| Long-running computation | Limited | Timeout applies (typically 30-60 seconds) |
| GPU computation | No | CPU only |

Note:

The sandbox is stateless between API calls. Data you generate in one call is not available in the next. If you need persistent computation across calls, generate the code in Gemini but execute it in your own environment.
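
To run generated code in your own environment, you first need to pull it out of the response. The sketch below walks a response dictionary and collects the code blocks. The field names (`candidates`, `content`, `parts`, `executableCode`, `codeExecutionResult`) are assumptions about the REST response shape, and `sample_response` is hand-written for illustration rather than real API output.

```python
# Hand-written stand-in for an API response; not real output.
sample_response = {
    "candidates": [
        {
            "content": {
                "parts": [
                    {"text": "I'll compute this with Python."},
                    {"executableCode": {"language": "PYTHON", "code": "print(2 + 2)"}},
                    {"codeExecutionResult": {"outcome": "OUTCOME_OK", "output": "4\n"}},
                    {"text": "The result is 4."},
                ]
            }
        }
    ]
}


def extract_code(response):
    """Collect the source of every generated code block, in order."""
    blocks = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            code = part.get("executableCode")  # assumed field name
            if code:
                blocks.append(code["code"])
    return blocks


code_blocks = extract_code(sample_response)
print(code_blocks)
```

Review extracted code before running it locally; your environment has none of the sandbox's restrictions.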

Prompting for Code Execution

Good Prompt (triggers execution)

Calculate the compound annual growth rate of an investment
that grew from $10,000 to $52,000 over 8.5 years. Use Python
to compute the exact value and show the formula.
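
For reference, the computation this prompt asks for is short. The sketch below applies the standard CAGR formula, (end / start)^(1 / years) - 1, and checks the result by compounding it back up to the end value.

```python
start_value = 10_000
end_value = 52_000
years = 8.5

# CAGR = (end / start) ** (1 / years) - 1
cagr = (end_value / start_value) ** (1 / years) - 1

# Sanity check: compounding the start value at this rate should recover the end value.
reconstructed = start_value * (1 + cagr) ** years

print(f"CAGR: {cagr:.2%}")
print(f"check: {reconstructed:,.2f}")
```

The rate works out to roughly 21.4% per year.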

Bad Prompt (may not trigger execution)

What's the CAGR for $10K to $52K over 8.5 years?

Explicit Trigger

If Gemini doesn't use code execution when you want it to:

Please use the code_execution tool to compute this.

Error Recovery Patterns

Code in the sandbox can fail in several ways: syntax errors, runtime exceptions, and timeouts. Prompt Gemini to handle failures gracefully:

For any code execution:

1. Wrap your main logic in a try/except block
2. Catch specific exceptions (ValueError, ZeroDivisionError, etc.)
3. If execution fails:
   a. Report the exact error message
   b. Explain what caused it in plain language
   c. Attempt a fix and re-execute (max 2 retries)
4. If execution times out:
   a. Report that the computation was too expensive
   b. Suggest an optimized or approximate approach

Never silently fail. Always report errors.
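
The code this prompt asks for might follow a shape like the sketch below: specific exceptions are caught, every failure is recorded, and retries are capped. `run_with_recovery` and the two mean functions are hypothetical names for illustration, and returning 0.0 for an empty list is just one possible fix.

```python
def naive_mean(xs):
    """First attempt: fails with ZeroDivisionError on an empty list."""
    return sum(xs) / len(xs)


def guarded_mean(xs):
    """Fixed attempt: handles the empty case explicitly."""
    if not xs:
        return 0.0
    return sum(xs) / len(xs)


def run_with_recovery(candidates, max_retries=2):
    """Try each candidate in order; return the first result plus the error log."""
    errors = []
    # One initial attempt plus at most `max_retries` retries.
    for attempt, candidate in enumerate(candidates[: max_retries + 1], start=1):
        try:
            return candidate(), errors
        except (ValueError, ZeroDivisionError) as exc:
            # Record the exact error rather than failing silently.
            errors.append(f"attempt {attempt}: {type(exc).__name__}: {exc}")
    raise RuntimeError("all attempts failed:\n" + "\n".join(errors))


result, errors = run_with_recovery(
    [lambda: naive_mean([]), lambda: guarded_mean([])]
)
print("result:", result)
print("errors:", errors)
```

Catching only named exception types keeps unexpected failures visible instead of masking them behind a blanket `except`.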

Data Visualization

Gemini can generate charts using matplotlib directly in the sandbox:

Create a visualization of this sales data:

Month,Product A,Product B,Product C
Jan,1200,900,1500
Feb,1350,950,1400
...

Generate:
1. A line chart showing all three products over time
2. A stacked bar chart showing monthly composition
3. A pie chart of total annual sales by product

For each chart:
- Include title, axis labels, and legend
- Use a professional color scheme
- Display the chart
- Provide 2-3 sentences interpreting what the chart shows

Note:

When generating charts, always ask for an interpretation alongside the visual. The chart is returned as a rendered image rather than as part of the response text, so a textual interpretation ensures you still have the analysis when viewing the raw response.

Common Failures

| Failure | Cause | Fix |
| --- | --- | --- |
| No execution triggered | Prompt doesn't signal computation needed | Add "Use Python to calculate" explicitly |
| Infinite computation loops | No iteration limit | Set max attempts (3-5) |
| Massive data in prompt | Including full datasets as text | Summarize data shape; feed essential rows |
| Timeout on large computation | Algorithm too expensive for sandbox | Ask for approximate or optimized approach |
| Missing library | Needed package not pre-installed | Use standard library fallback or pre-installed alternatives |