Claude Computer Use Prompting: UI Targets & Action Sequences
Master Claude's computer use capability. Learn to describe UI targets, structure action sequences, specify error recovery, and prompt for reliable autonomous GUI operation.
Claude's computer use capability lets it view screenshots, move the mouse, click buttons, type text, scroll, and navigate interfaces — essentially operating a computer like a human would. This opens automation possibilities that traditional scripting can't touch: legacy apps without APIs, complex multi-step workflows, and interfaces that change between runs.
But computer use prompting is a distinct discipline. You're not describing desired output — you're describing UI targets, action sequences, error conditions, and recovery strategies.
The Computer Use Prompt Structure
Every computer use prompt needs:
- Goal — What should be accomplished
- UI description — What the interface looks like (landmarks)
- Action sequence — Step-by-step operations
- Verification — How to confirm each step succeeded
- Error recovery — What to do when things go wrong
Basic Computer Use Prompt
GOAL: Download the Q3 financial report from the company intranet.
INTERFACE CONTEXT:
- You're starting at the intranet home page (https://intranet.company.com)
- The navigation bar is at the top with links: Home, Documents, HR, Finance, IT
- The Finance section contains quarterly reports
ACTION SEQUENCE:
1. Look at the navigation bar. Click "Finance."
2. On the Finance page, find the "Quarterly Reports" section.
3. Look for "Q3 2025 Financial Report" — it should be a PDF link.
4. Click the download icon next to the report name.
5. Wait for the download to start (look for browser download indicator).
VERIFICATION:
- After step 1: The address bar should show a URL ending in /finance
- After step 4: A download notification should appear in the browser
ERROR RECOVERY:
- If the "Finance" link is not visible: scroll back to the top of the page, where the navigation bar sits, and look again
- If "Q3 2025" report is not listed: check if it's under "Archived Reports" or "2025 Reports"
- If download doesn't start: check if a popup blocker notification appeared
- If you're asked to log in: STOP and report — you don't have credentials
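If you assemble prompts like this in code, a small builder keeps the five parts in a fixed order. The following is a minimal sketch, not an official API: the `ComputerUsePrompt` class and its field names are hypothetical and simply mirror the section labels used above.

```python
from dataclasses import dataclass, field


@dataclass
class ComputerUsePrompt:
    """Hypothetical container for the five prompt parts listed above."""
    goal: str
    interface_context: list[str]
    action_sequence: list[str]
    verification: list[str] = field(default_factory=list)
    error_recovery: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the prompt as plain text, one labeled section per part."""
        lines = [f"GOAL: {self.goal}", "INTERFACE CONTEXT:"]
        lines += [f"- {item}" for item in self.interface_context]
        lines.append("ACTION SEQUENCE:")
        # Steps are numbered so VERIFICATION lines can refer to them.
        lines += [f"{i}. {step}" for i, step in enumerate(self.action_sequence, 1)]
        lines.append("VERIFICATION:")
        lines += [f"- {item}" for item in self.verification]
        lines.append("ERROR RECOVERY:")
        lines += [f"- {item}" for item in self.error_recovery]
        return "\n".join(lines)


prompt = ComputerUsePrompt(
    goal="Download the Q3 financial report from the company intranet.",
    interface_context=["You're starting at the intranet home page"],
    action_sequence=['Look at the navigation bar. Click "Finance."'],
    verification=["After step 1: The URL should change to /finance"],
    error_recovery=["If you're asked to log in: STOP and report"],
)
print(prompt.render())
```

Keeping the parts as separate fields makes it easy to spot a prompt that has steps but no verification or recovery lines.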
Describing UI Targets
Claude works from screenshots, not from the page's underlying markup, so it has no built-in knowledge of element IDs or types. Describe targets in visual terms:
Good UI Target Descriptions
"Click the blue 'Submit' button in the bottom-right corner of the form."
"Type in the input field labeled 'Email address' — it has a gray placeholder text."
"Click the checkbox next to 'I agree to terms' — it's below the main form."
"Click the dropdown menu that currently says 'Select country' and choose 'Canada'."
"Scroll down until you see the section titled 'Payment History' with a table of transactions."
Poor UI Target Descriptions
"Click the submit button." (Which one? Where?)
"Click element #submit-btn." (Claude doesn't see CSS selectors)
"Type the email." (Where? What field?)
"Click the third button." (Which order? Left to right? Top to bottom?)
UI Landmarking
Before a complex task, have Claude identify landmarks:
First, scan this page and identify the following landmarks:
- Navigation areas (top bar, sidebar, tabs)
- Main content area
- Any forms or input fields
- Buttons visible on screen (list their text and approximate location)
- Any error messages or notifications currently showing
Then report what you see before taking any action.
Action Sequence Patterns
The Checkpoint Pattern
Insert verification after every significant action:
1. Navigate to https://app.example.com/login
VERIFY: Login page is displayed with email and password fields
2. Type [account email] in the email field
VERIFY: Email appears in the field
3. Type the password in the password field
VERIFY: Password field shows dots (not plaintext)
4. Click "Sign In" button
VERIFY: Either dashboard loads (success) OR error message appears
5. If error: [recovery steps]
If success: [continue]
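The checkpoint layout above can be generated from a list of (action, expected result) pairs, so no step ships without its VERIFY line. This is a sketch under that assumption; `with_checkpoints` is a hypothetical helper name.

```python
def with_checkpoints(steps: list[tuple[str, str]]) -> str:
    """Render (action, expected_result) pairs as numbered steps with VERIFY lines."""
    lines = []
    for number, (action, expected) in enumerate(steps, start=1):
        lines.append(f"{number}. {action}")
        lines.append(f"VERIFY: {expected}")
    return "\n".join(lines)


login_steps = [
    ("Navigate to https://app.example.com/login",
     "Login page is displayed with email and password fields"),
    ('Click "Sign In" button',
     "Either dashboard loads (success) OR error message appears"),
]
print(with_checkpoints(login_steps))
```

Because each step is a pair, a step with no expected result fails when the pair is unpacked instead of slipping silently into the prompt.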
The Branching Pattern
Navigate to the user settings page and change the theme to "Dark."
PATH A — Normal flow:
1. Click the user avatar in the top-right corner
2. Click "Settings" in the dropdown
3. Click "Appearance" tab
4. Select "Dark" from the theme options
5. Click "Save changes"
6. VERIFY: "Settings saved" confirmation appears
PATH B — If "Settings" is not in dropdown:
- The user might not have settings access
- Report: "Settings option not available for this account"
PATH C — If "Appearance" tab is not visible:
- Scroll down in the settings page
- If still not found, check if it's under "Display" or "Theme" instead
PATH D — If "Save changes" button doesn't appear:
- The theme might auto-save — look for "Theme updated" toast/indicator
- If no indicator, take a screenshot and report the ambiguity
Error Recovery Patterns
The Stale Screenshot Problem
A screenshot goes stale when the interface changes between the moment it was captured and the moment the action runs. Guard against this with an instruction like:
Before EVERY action:
1. Take a fresh screenshot
2. Compare it to what you expect to see
3. If the screen looks different from expected, describe the difference
4. If the difference is minor (e.g., a notification appeared), proceed
5. If the difference is major (e.g., you're on the wrong page), pause and reassess
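The same check can be backed up on the harness side, outside the prompt. The sketch below assumes your harness exposes two hooks, here called `take_screenshot` and `perform`; both are hypothetical names, not part of any Anthropic SDK. It runs the action only if the screen still matches the screenshot Claude last saw.

```python
import hashlib
from typing import Callable


def act_if_fresh(
    take_screenshot: Callable[[], bytes],
    perform: Callable[[], None],
    seen_digest: str,
) -> bool:
    """Run perform() only if the screen still matches what the model last saw.

    take_screenshot and perform are hypothetical hooks supplied by your harness.
    Returns True if the action ran, False if the screen changed first.
    """
    current = hashlib.sha256(take_screenshot()).hexdigest()
    if current != seen_digest:
        return False  # screen changed: send a new screenshot to the model instead
    perform()
    return True


# Usage with stand-in hooks: the first call matches, the second does not.
clicks = []
frame = b"frame-1"
digest = hashlib.sha256(frame).hexdigest()
act_if_fresh(lambda: frame, lambda: clicks.append("click"), digest)
act_if_fresh(lambda: b"frame-2", lambda: clicks.append("click"), digest)
```

Exact byte comparison is deliberately strict: a blinking cursor or a clock will count as a change. A real harness would likely compare only the region around the target or tolerate small differences.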
The Infinite Loop Prevention
If you attempt the same action 3 times without success:
1. STOP attempting that action
2. Describe what you're trying to do and what's happening
3. Suggest 3 alternative approaches
4. WAIT for human guidance before trying any alternative
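A prompt instruction alone may not stop a loop, so it can help to enforce the same limit in the code that executes actions. The following is a minimal sketch; `RetryGuard` and `MAX_ATTEMPTS` are hypothetical names, and the action key is whatever string your harness uses to identify a repeated action.

```python
from collections import Counter

MAX_ATTEMPTS = 3  # mirrors the three-attempt limit in the prompt above


class RetryGuard:
    """Counts repeated actions; hypothetical helper for an agent loop."""

    def __init__(self, limit: int = MAX_ATTEMPTS) -> None:
        self.limit = limit
        self.attempts = Counter()

    def allow(self, action_key: str) -> bool:
        """Record one attempt and return False once the limit is exceeded."""
        self.attempts[action_key] += 1
        return self.attempts[action_key] <= self.limit

    def reset(self, action_key: str) -> None:
        """Call after the action succeeds so later retries start from zero."""
        self.attempts.pop(action_key, None)


guard = RetryGuard()
results = [guard.allow("click Sign In") for _ in range(4)]
print(results)
```

When `allow` returns False, the harness should stop executing that action and hand control to a human, matching step 4 of the prompt.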
The Recovery Prompt Template
Something went wrong. The expected result was: [expected].
The actual result was: [actual — describe what you see].
Analyze what might have happened:
- Wrong page/state? Navigate back to [correct page]
- Element not found? Look for similar elements with different text/labels
- Popup/overlay blocking? Try pressing Escape or looking for close buttons
- Loading/processing? Wait 3 seconds and re-check
If you can identify the likely cause, attempt ONE fix. If the fix doesn't work,
report the situation and WAIT.
Full Computer Use Prompt Template
TASK: [One-sentence description of what to accomplish]
STARTING STATE:
- Current URL: [if known]
- What should be visible: [describe initial screen]
- You are logged in as: [role/permissions if relevant]
ACTION PLAN:
1. [Step 1]
EXPECT: [what you should see after this step]
2. [Step 2]
EXPECT: [what you should see after this step]
3. [Step 3]
EXPECT: [what you should see after this step]
VERIFICATION:
- Success looks like: [concrete success indicators]
- Take a screenshot of the final state
ERROR HANDLING:
- If [common problem]: [recovery action]
- If screen doesn't match expectations at any step: PAUSE and describe what you see
- After 3 failed attempts at the same step: STOP and report
CONSTRAINTS:
- DO NOT: [actions to avoid — e.g., delete, submit without confirmation]
- STOP if: [conditions that require human intervention]
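Before sending a prompt built from this template, it is worth checking that no bracketed placeholder was left unfilled. A minimal sketch, assuming placeholders always take the `[text]` form used above; `unfilled_placeholders` is a hypothetical helper name.

```python
import re

# Matches one bracketed placeholder on a single line, such as "[Step 1]".
PLACEHOLDER = re.compile(r"\[[^\[\]\n]+\]")


def unfilled_placeholders(prompt: str) -> list[str]:
    """Return bracketed placeholders still present in the prompt, in order."""
    return PLACEHOLDER.findall(prompt)


draft = "TASK: Export the invoice list\n- Current URL: [if known]\n1. [Step 1]"
leftovers = unfilled_placeholders(draft)
print(leftovers)
```

This check also flags legitimate square brackets in a finished prompt, so treat a match as a prompt for review, not as a hard error.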
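Common Pitfall: Relying on a keyboard shortcut without naming the visible result. Claude's computer use tool can send key presses and key combinations, but a shortcut such as Ctrl+F only works when the right window has focus, and bindings differ across operating systems and applications. Instead of only writing "Press Ctrl+F and search for...", say what should appear afterward (for example, the browser's find bar) and give a click-based fallback in case it does not.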
Related Pages
- Human-in-the-Loop Patterns — Before deploying any computer use workflow, implement safety checkpoints. The risk-based autonomy model is essential reading.