Multimodal Prompting

Multimodal prompting combines text with images, audio, or video to give AI models richer context. Modern models like GPT-4o, Claude 3.5, and Gemini can process multiple input types simultaneously, enabling more natural and capable interactions.

Text + Image Prompting

Image Analysis

[Attach image]
What objects are in this image? List them with their approximate positions.

Image Comparison

[Attach image 1]
[Attach image 2]

Compare these two designs. Identify:
1. Key differences
2. Which follows better UX principles
3. Specific improvements for each

Code from Screenshot

[Attach screenshot of code or UI]

Convert this to working code. Include:
- Exact layout structure
- All text content
- Styling details

Text + Audio Prompting

Transcription + Analysis

[Attach audio file]

1. Transcribe the audio
2. Identify key points discussed
3. Extract action items with owners
4. Note any decisions made

Voice Instructions

[Attach voice memo]

Based on these voice notes:
1. Create a structured outline
2. Fill in missing details where unclear
3. Suggest additional points to consider

Best Practices

Image Prompting

Be specific about what you want analyzed
Reference specific parts of the image when needed
Provide context for ambiguous images
Use high-quality, clear images

Audio Prompting

Specify if you need verbatim or summary
Note the language if not English
Indicate speaker identification needs
Mention background noise handling

Modality Combinations

Combination	Use Cases
Text + Image	Design review, code conversion, visual Q&A
Text + Audio	Meeting notes, voice memos, transcription
Text + Video	Content analysis, tutorial creation
Image + Text + Audio	Comprehensive documentation

Prompt Templates

Image Description:

Describe this image in detail, covering:
- Main subjects and their attributes
- Setting and background
- Colors, lighting, and mood
- Any text visible in the image

Visual Comparison:

Compare these two images focusing on:
1. Structural differences
2. Color and style variations
3. Quality and clarity
4. Which better achieves [stated goal]

Audio Summary:

From this audio recording:
1. Provide a 3-sentence summary
2. List key topics discussed
3. Extract direct quotes for important points
4. Identify any unresolved questions

Essay Structure

Learn how to organize and structure your academic essays effectively with these ChatGPT prompts.

Midjourney Character Creation: Master AI Prompts for Unique Designs

Master Midjourney character creation with AI prompts. Learn advanced techniques for generating compelling characters across diverse styles and genres, from portraits to fantasy beings, with detailed guides and examples.

Period Drama & Historical Cinematic SREF Codes

Midjourney SREF codes for authentic historical period aesthetics with proper lighting and color treatment.

Multimodal Prompting