Multi-Agent Orchestrator

A manager agent that decomposes complex tasks and delegates them to specialized worker agents — a researcher, a coder, and a writer. Each worker has its own system prompt and tool set, but all share the same LLM client. The manager decides who does what, in what order, and aggregates the results into a final output. No orchestration framework needed.

Note:

This blueprint demonstrates the manager-worker architecture used by frameworks like CrewAI and AutoGen. The workers are defined as simple sub-agents in the same Python process — no inter-process communication, no message queues. Suitable for tasks that benefit from specialized perspectives (research → code → write).

Agent File Structure

multi-agent-orchestrator/
    orchestrator.py
    workers.py
    tools.py
    config.json

Setup

Install Dependencies

Install the OpenAI Python client. The code below uses the OpenAI client class, which requires openai 1.0 or later.

pip install openai

Create config.json

Configure the orchestrator. All workers share the same model.

{
  "openai_api_key": "sk-...",
  "model": "gpt-4o",
  "max_iterations": 10,
  "worker_temperature": 0.3,
  "verbose": true
}
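Keeping an API key in config.json makes it easy to commit by accident. One option, sketched below, is to let an environment variable override the file and to fill in defaults in one place. The load_config helper and the DEFAULTS table are assumptions for illustration and are not part of the blueprint's code; OPENAI_API_KEY is the variable the OpenAI client itself reads by default.

```python
import json
import os

# Defaults mirror the values shown in config.json above.
DEFAULTS = {
    "model": "gpt-4o",
    "max_iterations": 10,
    "worker_temperature": 0.3,
    "verbose": True,
}


def load_config(path: str = "config.json") -> dict:
    """Load config.json, fill in defaults, and prefer an env var for the key."""
    with open(path) as f:
        config = {**DEFAULTS, **json.load(f)}

    # Assumed convention: OPENAI_API_KEY, if set, wins over the file.
    env_key = os.environ.get("OPENAI_API_KEY")
    if env_key:
        config["openai_api_key"] = env_key

    if not config.get("openai_api_key"):
        raise ValueError("No API key: set OPENAI_API_KEY or openai_api_key in config.json")
    if not 0.0 <= config["worker_temperature"] <= 1.0:
        raise ValueError("worker_temperature must be between 0.0 and 1.0")
    return config
```

With this in place, the "openai_api_key" entry could be left out of config.json entirely on machines where the environment variable is set.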

Verify

Run a multi-step task to verify all workers function.

python orchestrator.py --task "Research the carbon footprint of AI model training, write a Python script to estimate CO2 emissions by model size, then produce a 1-page executive summary."

The orchestrator should delegate to researcher → coder → writer and return a unified result.

System Prompt

You are a task orchestrator. Your job is to decompose complex tasks and delegate
subtasks to specialized workers. You manage three workers:

- RESEARCHER: Searches and synthesizes information. Provides factual, cited answers.
- CODER: Writes and explains Python code. Produces working, documented scripts.
- WRITER: Produces polished prose — summaries, reports, articles, documentation.

Follow this protocol:
1. THOUGHT: What does the user need? Which workers are required? In what order?
2. PLAN: Decompose the task into subtasks, each assigned to a specific worker
3. DELEGATE: Send each subtask to the appropriate worker, one at a time
4. AGGREGATE: Combine worker outputs into a cohesive final result
5. FINAL_OUTPUT: The complete, integrated answer to the user's request

Rules:
- Delegate subtasks sequentially when outputs depend on each other (research → code → write)
- Delegate in parallel when subtasks are independent
- Pass relevant context from earlier workers to later workers
- If a worker produces low-quality output, re-delegate with more specific instructions
- The final output should read as one cohesive piece, not three separate sections
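
The sequential-versus-parallel rule in the prompt above amounts to ordering subtasks by their dependencies. In the blueprint this planning is left to the LLM; the sketch below only makes the idea concrete. The Subtask structure and the execution_levels function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Subtask:
    name: str
    worker: str
    depends_on: list[str] = field(default_factory=list)


def execution_levels(subtasks: list[Subtask]) -> list[list[str]]:
    """Group subtasks into levels; tasks within one level can run in parallel."""
    remaining = {t.name: set(t.depends_on) for t in subtasks}
    done: set[str] = set()
    levels: list[list[str]] = []
    while remaining:
        # A task is ready once everything it depends on has finished.
        ready = sorted(n for n, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("Circular or unknown dependency in plan")
        levels.append(ready)
        done.update(ready)
        for n in ready:
            del remaining[n]
    return levels


plan = [
    Subtask("research", "researcher"),
    Subtask("script", "coder", depends_on=["research"]),
    Subtask("summary", "writer", depends_on=["research", "script"]),
]
print(execution_levels(plan))  # [['research'], ['script'], ['summary']]
```

Each level depends on the one before it, so the research → code → write example yields three single-task levels and nothing can run in parallel.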


Worker Definitions

```python
# workers.py
import json

WORKER_SYSTEM_PROMPTS = {
    "researcher": """You are a research specialist. Your job is to gather and synthesize
information on a given topic. Provide factual, well-organized answers with specific
data points where available. Cite your sources when possible.

Rules:
- Be thorough — cover multiple angles of the topic
- Use specific numbers and data, not vague claims
- Organize findings with clear headings
- If information is uncertain, note the uncertainty""",

    "coder": """You are a Python programmer. Your job is to write clean, documented,
working Python code that solves the given problem. Include comments explaining
key decisions.

Rules:
- Write complete, runnable scripts — no placeholders or TODOs
- Include type hints and docstrings
- Handle edge cases and errors gracefully
- Add a brief explanation of how the code works
- Use only standard library or widely-available packages""",

    "writer": """You are a professional writer and editor. Your job is to produce
polished, clear, and engaging prose from the provided information.

Rules:
- Adapt your tone to the audience and purpose specified
- Structure content with clear headings and logical flow
- Be concise — cut unnecessary words
- Maintain a consistent voice throughout
- If source material has gaps, note them rather than fabricating"""
}


def run_worker(client, model, worker_name, task, temperature=0.3):
    """Run a single worker on a task and return its output."""
    system_prompt = WORKER_SYSTEM_PROMPTS.get(worker_name)
    if not system_prompt:
        return f"ERROR: Unknown worker: {worker_name}. Available: {list(WORKER_SYSTEM_PROMPTS.keys())}"

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": task}
    ]

    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature
    )

    return response.choices[0].message.content
```
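
A worker call of this shape can be exercised without network access by passing a stand-in client. The sketch below is an assumption-laden illustration: FakeClient is hypothetical and mimics only the attributes the worker code touches (chat.completions.create and choices[0].message.content), and run_worker_standalone is a condensed copy of the worker call that takes the system prompt directly so the block runs on its own.

```python
from types import SimpleNamespace


class FakeClient:
    """Stand-in for openai.OpenAI that echoes the request instead of calling the API."""

    def __init__(self):
        self.chat = SimpleNamespace(completions=SimpleNamespace(create=self._create))

    def _create(self, model, messages, temperature=0.3, **kwargs):
        system, user = messages[0]["content"], messages[-1]["content"]
        reply = f"[{model}] system={system[:20]!r} task={user!r}"
        message = SimpleNamespace(content=reply, tool_calls=None)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


def run_worker_standalone(client, model, system_prompt, task, temperature=0.3):
    """Condensed copy of the worker call, taking the system prompt directly."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": task},
        ],
        temperature=temperature,
    )
    return response.choices[0].message.content


print(run_worker_standalone(
    FakeClient(), "gpt-4o", "You are a research specialist.", "Define PUE"))
```

The same FakeClient could be handed to the real run_worker to check prompt selection and error handling before spending API calls.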

Tool Definitions

Orchestrator Tools

delegate_to

Delegate a subtask to a specific worker. Returns the worker's output.

Values: worker_name: 'researcher' | 'coder' | 'writer', task: string

aggregate_results

Synthesize outputs from multiple workers into a cohesive final result. Uses LLM.

Values: task: string, worker_outputs: object[]

Tool Implementation

# tools.py (a separate module; orchestrator.py loads it with `import tools`)
import json

from workers import run_worker


def delegate_to(client, model, worker_name, task, temperature):
    """Delegate a task to a worker and return the result as a JSON string."""
    output = run_worker(client, model, worker_name, task, temperature)
    return json.dumps({
        "worker": worker_name,
        "task": task,
        "output": output
    }, indent=2)


def aggregate_results(client, model, original_task, worker_outputs):
    """Synthesize worker outputs into a cohesive final result."""
    outputs_text = "\n\n---\n\n".join(
        f"Worker: {w.get('worker', 'unknown')}\nTask: {w.get('task', '')}\nOutput: {w.get('output', '')}"
        for w in worker_outputs
    )

    prompt = f"""Synthesize the following worker outputs into one cohesive response
for the original task. The final output should read as a single piece, not separate
sections from different workers.

Original task: {original_task}

Worker outputs:
{outputs_text}

Combine these into a unified, well-structured answer. Use the writer's output as
the narrative backbone, and integrate the researcher's facts and coder's script
where appropriate."""

    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2
    )
    return response.choices[0].message.content

Orchestrator Loop

# orchestrator.py
import argparse
import json

from openai import OpenAI

import tools  # delegate_to and aggregate_results live in tools.py

TOOL_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "delegate_to",
            "description": "Delegate a subtask to a specialized worker: researcher, coder, or writer",
            "parameters": {
                "type": "object",
                "properties": {
                    "worker_name": {"type": "string", "enum": ["researcher", "coder", "writer"]},
                    "task": {"type": "string", "description": "The subtask for the worker to complete"}
                },
                "required": ["worker_name", "task"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "aggregate_results",
            "description": "Synthesize outputs from multiple workers into a cohesive final result",
            "parameters": {
                "type": "object",
                "properties": {
                    "task": {"type": "string", "description": "The original user task"},
                    "worker_outputs": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "worker": {"type": "string"},
                                "task": {"type": "string"},
                                "output": {"type": "string"}
                            }
                        }
                    }
                },
                "required": ["task", "worker_outputs"]
            }
        }
    }
]

SYSTEM_PROMPT = """You are a task orchestrator. You manage three specialized workers:
- RESEARCHER: Searches and synthesizes information. Provides factual, cited answers.
- CODER: Writes and explains Python code. Produces working, documented scripts.
- WRITER: Produces polished prose — summaries, reports, articles.

Protocol:
1. THOUGHT: What does the user need? Which workers? What order?
2. PLAN: Decompose the task into subtasks for specific workers
3. DELEGATE: Send subtasks to workers sequentially (outputs feed forward) or in parallel
4. AGGREGATE: Combine outputs into a cohesive result
5. FINAL_OUTPUT: The complete answer

Rules:
- Delegate sequentially when outputs depend on earlier results
- Pass relevant context when delegating to later workers
- If a worker output is poor, re-delegate with clearer instructions
- The final output must read as one cohesive piece"""


def run_orchestrator(task: str, config: dict):
    client = OpenAI(api_key=config["openai_api_key"])
    model = config.get("model", "gpt-4o")
    worker_temp = config.get("worker_temperature", 0.3)
    verbose = config.get("verbose", True)

    worker_outputs = []

    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Complete this task: {task}"}
    ]

    for i in range(config.get("max_iterations", 10)):
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            tools=TOOL_SCHEMAS,
            temperature=0.2
        )

        msg = response.choices[0].message
        messages.append(msg)

        if msg.content and "FINAL_OUTPUT:" in msg.content:
            return msg.content.split("FINAL_OUTPUT:", 1)[1].strip()

        if not msg.tool_calls:
            messages.append({
                "role": "user",
                "content": "Continue. Delegate subtasks to workers, then aggregate results. End with FINAL_OUTPUT."
            })
            continue

        for tool_call in msg.tool_calls:
            func_name = tool_call.function.name
            func_args = json.loads(tool_call.function.arguments)

            if func_name == "delegate_to":
                worker_name = func_args.get("worker_name")
                worker_task = func_args.get("task", "")
                if verbose:
                    print(f"\n  → Delegating to {worker_name}: {worker_task[:80]}...")
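                result = tools.delegate_to(
                    client, model, worker_name, worker_task, worker_temp)
                # delegate_to returns a JSON string that already wraps the
                # worker's text; unwrap it so "output" is not nested JSON.
                worker_outputs.append({
                    "worker": worker_name,
                    "task": worker_task,
                    "output": json.loads(result)["output"]
                })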

            elif func_name == "aggregate_results":
                result = tools.aggregate_results(
                    client, model,
                    func_args.get("task", task),
                    func_args.get("worker_outputs", worker_outputs))

            else:
                result = f"Unknown tool: {func_name}"

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result
            })

    return "Orchestrator reached max iterations."


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--task", required=True, help="Complex task to decompose and delegate")
    parser.add_argument("--config", default="config.json")
    args = parser.parse_args()

    with open(args.config) as f:
        config = json.load(f)

    result = run_orchestrator(args.task, config)
    print("\n" + "=" * 60)
    print(result)
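
The loop above passes tool_call.function.arguments straight to json.loads, which raises if the model emits malformed JSON. A more defensive variant could look like the sketch below. parse_tool_args is a hypothetical helper, and returning an error string to the model as the tool result is only one possible policy.

```python
import json
from typing import Optional


def parse_tool_args(raw: str) -> tuple[dict, Optional[str]]:
    """Return (args, error). On failure, args is empty and error explains why."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {}, f"ERROR: arguments were not valid JSON ({exc.msg}). Please retry."
    if not isinstance(args, dict):
        return {}, "ERROR: arguments must be a JSON object. Please retry."
    return args, None


args, error = parse_tool_args('{"worker_name": "coder", "task": "Write a script"}')
print(args["worker_name"], error)  # coder None

args, error = parse_tool_args('{"worker_name": ')
print(args, error is not None)  # {} True
```

In the loop, a non-None error could be appended as the "tool" message content for that tool_call_id, giving the model a chance to reissue the call instead of crashing the run.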

Walkthrough

Researching the carbon impact of AI model training, writing estimation code, and producing an executive summary.

Orchestrator plans the workflow

The manager receives the task: "Research carbon footprint of AI model training, write Python to estimate CO2 by model size, produce a 1-page executive summary."

It plans: RESEARCHER → CODER → WRITER (sequential — each depends on the previous).

Delegates to Researcher

delegate_to("researcher", "Research the carbon footprint of AI model training. Include: how carbon emissions are estimated, average emissions for small/medium/large models, the PUE factor, and recent findings from papers like Strubell 2019 and Patterson 2021.")

Researcher returns a 6-paragraph brief with key data: training GPT-3 emitted ~552 tons CO2eq (Patterson et al., 2021); a neural architecture search for a large Transformer emitted roughly as much CO2 as five cars over their lifetimes (Strubell et al., 2019); and PUE (Power Usage Effectiveness) multiplies IT energy use by roughly 1.1-1.6x.

Delegates to Coder with research context

delegate_to("coder", "Write a Python script that estimates CO2 emissions for training AI models. Use these parameters from research: [research_summary]. The script should accept model size (parameters), training hours, GPU type, and PUE as inputs. Output estimated CO2 in kg. Include a function that compares multiple model sizes in a table.")

Coder produces a 60-line script with estimate_co2(), compare_models(), and hardcoded GPU power draw constants (A100: 400W, H100: 700W).
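
The walkthrough does not show the coder's script. A minimal sketch of what functions like estimate_co2() and compare_models() might compute is given below, assuming energy (kWh) = GPUs × power per GPU × hours × PUE and CO2 = energy × grid intensity. The signatures, the GPU_POWER_WATTS table, and the 0.4 kg CO2/kWh default are illustrative assumptions, not the script the coder worker produced; model size appears only as a row label here.

```python
# Assumed per-GPU power draw in watts, matching the constants named above.
GPU_POWER_WATTS = {"A100": 400, "H100": 700}


def estimate_co2(num_gpus: int, gpu_type: str, hours: float,
                 pue: float = 1.0, grid_kg_per_kwh: float = 0.4) -> float:
    """Estimate training emissions in kg CO2eq."""
    if gpu_type not in GPU_POWER_WATTS:
        raise ValueError(f"Unknown GPU type: {gpu_type}")
    if num_gpus <= 0 or hours < 0 or pue < 1.0:
        raise ValueError("num_gpus must be positive, hours non-negative, PUE >= 1.0")
    energy_kwh = num_gpus * GPU_POWER_WATTS[gpu_type] / 1000 * hours * pue
    return energy_kwh * grid_kg_per_kwh


def compare_models(runs: list[tuple[str, int, str, float]]) -> str:
    """Format (label, num_gpus, gpu_type, hours) rows as a plain-text table."""
    lines = [f"{'Model':<14}{'GPUs':<12}{'Hours':>8}{'kg CO2':>12}"]
    for label, num_gpus, gpu_type, hours in runs:
        gpus = f"{num_gpus}x{gpu_type}"
        kg = estimate_co2(num_gpus, gpu_type, hours)
        lines.append(f"{label:<14}{gpus:<12}{hours:>8.0f}{kg:>12.1f}")
    return "\n".join(lines)


# 4 GPUs x 0.4 kW x 24 h = 38.4 kWh; 38.4 kWh x 0.4 kg/kWh = 15.36 kg
print(f"{estimate_co2(4, 'A100', 24):.2f}")  # 15.36
print(compare_models([("1.3B params", 4, "A100", 24.0)]))
```

Passing a PUE above 1.0 scales the estimate linearly, so a facility with PUE 1.5 would raise the first example from 15.36 kg to 23.04 kg.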

Delegates to Writer with all context

delegate_to("writer", "Write a 1-page executive summary on the carbon footprint of AI training. Use these research findings and include reference to the estimation script. [research_output + code_output]. Tone: professional, actionable. Audience: CTO and engineering leadership.")

Writer produces a polished executive summary with a table of model sizes vs CO2 estimates, 3 recommendations (schedule training during low-carbon grid hours, track emissions per experiment, prefer smaller models with fine-tuning), and a call to action.
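
The [research_summary] and [research_output + code_output] placeholders in the delegations above stand for earlier worker outputs being pasted into later tasks. One way to assemble such a task is sketched below. build_task and its max_chars cutoff are hypothetical; a real system might summarize long outputs rather than truncate them.

```python
def build_task(instruction: str, context: dict[str, str], max_chars: int = 4000) -> str:
    """Append labeled outputs from earlier workers to a new worker's task."""
    parts = [instruction.strip()]
    for worker, output in context.items():
        text = output.strip()
        if len(text) > max_chars:
            # Crude length guard so one long output cannot crowd out the rest.
            text = text[:max_chars] + "\n[truncated]"
        parts.append(f"--- Context from {worker} ---\n{text}")
    return "\n\n".join(parts)


writer_task = build_task(
    "Write a 1-page executive summary on the carbon footprint of AI training.",
    {
        "researcher": "GPT-3 training emitted ~552 tons CO2eq.",
        "coder": "def estimate_co2(...): ...",
    },
)
print(writer_task)
```

Labeling each block with the worker that produced it lets the next worker tell facts from code when it decides what to quote.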

Aggregates into FINAL_OUTPUT

aggregate_results combines all three worker outputs into one cohesive document:

FINAL_OUTPUT:

CARBON FOOTPRINT OF AI MODEL TRAINING
Executive Summary for Engineering Leadership

KEY FINDINGS
Training large AI models carries a significant and growing carbon
footprint. GPT-3's training run emitted approximately 552 tons of CO2eq
(Patterson et al., 2021), and a neural architecture search for a large
Transformer emitted roughly as much as 5 cars over their entire
lifetimes (Strubell et al., 2019). A single training experiment on
8×A100 GPUs over 2 weeks emits ~430 kg CO2eq (assuming 400 W per GPU,
PUE 1.0, and an average US grid intensity of 0.4 kg CO2/kWh). Over a
year of active experimentation, a 10-person ML team's compute emissions
can exceed those from their office operations, commuting, and travel
combined.

ESTIMATION TOOL
The attached Python script (estimate_co2.py) lets you input model
size, GPU type, training hours, and PUE to generate CO2 estimates.
Example output (assuming 400 W per A100, PUE 1.0, 0.4 kg CO2/kWh, and
about 4.6 t CO2 per year for a typical car):

| Model Size  | GPUs      | Hours | Est. CO2 (kg) | Equivalent                |
|-------------|-----------|-------|---------------|---------------------------|
| 1.3B params | 4×A100    | 24    | 15.4          | ~1 day of driving one car |
| 13B params  | 8×A100    | 168   | 215           | ~17 days of driving       |
| 175B params | 1024×A100 | 720   | 117,965       | ~26 car-years of driving  |

RECOMMENDATIONS
1. Schedule large training runs during low-carbon grid hours (nighttime,
   weekends) — can reduce emissions by 20-40%.
2. Track emissions per experiment as a key metric alongside accuracy
   and loss. Make it visible in experiment tracking dashboards.
3. Prefer smaller models fine-tuned on domain data over training
   large models from scratch. Fine-tuning emits <1% of pre-training.

Customization

Orchestrator Settings

model

gpt-4o recommended for complex task decomposition. The orchestrator needs strong reasoning to plan delegation order.

Values: gpt-4o, gpt-4o-mini

worker_temperature

Temperature for worker LLM calls. The same value is passed to all three workers. Higher values give more varied output, which can suit the writer; lower values give more repeatable output, which suits research and code.

Values: 0.0 - 1.0 (default 0.3)
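
Because the blueprint passes one worker_temperature to every worker, research and writing cannot be tuned separately. A per-worker override is one possible extension, sketched below. The "worker_temperatures" config key and the temperature_for helper are hypothetical and do not exist in the blueprint's config.json.

```python
def temperature_for(worker_name: str, config: dict) -> float:
    """Return a per-worker temperature, falling back to the shared setting."""
    overrides = config.get("worker_temperatures", {})  # assumed optional key
    return overrides.get(worker_name, config.get("worker_temperature", 0.3))


config = {
    "worker_temperature": 0.3,
    "worker_temperatures": {"researcher": 0.1, "writer": 0.7},
}
for name in ("researcher", "coder", "writer"):
    print(name, temperature_for(name, config))
```

Here the coder has no override and falls back to the shared 0.3.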

max_iterations

Orchestrator loop iterations. Increase for tasks requiring many sequential delegations or retries.

Values: 1-15 (default 10)

verbose

Print delegation steps to console. Useful for understanding the orchestrator's decision-making.

Values: true, false

Note:

Sequential delegation latency. Each worker call is a full LLM API round-trip. A 3-worker pipeline (researcher → coder → writer) takes 3 serial API calls plus the orchestrator's own rounds. Total latency is typically 15-45 seconds. The orchestrator loop also runs tool calls one at a time, even when the model requests several in a single turn. For latency-sensitive applications, consider running independent subtasks concurrently.
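
Since worker calls are blocking network requests, a thread pool is one straightforward way to run independent subtasks at the same time. The sketch below is illustrative only: delegate_parallel is a hypothetical helper, and fake_worker stands in for a real call such as functools.partial(run_worker, client, model).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def delegate_parallel(run, subtasks, max_workers=3):
    """Run independent (worker_name, task) pairs concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run, worker, task) for worker, task in subtasks]
        return [
            {"worker": worker, "task": task, "output": future.result()}
            for (worker, task), future in zip(subtasks, futures)
        ]


def fake_worker(worker_name, task):
    """Stand-in for a blocking LLM call."""
    time.sleep(0.2)
    return f"{worker_name} finished: {task}"


start = time.perf_counter()
results = delegate_parallel(fake_worker, [
    ("researcher", "Summarize PUE"),
    ("researcher", "Summarize grid carbon intensity"),
])
elapsed = time.perf_counter() - start
print([r["output"] for r in results])
print(f"elapsed: {elapsed:.2f}s")  # close to one sleep, not two
```

Results come back in submission order, so they can be appended to worker_outputs exactly as in the sequential loop. Subtasks that depend on each other still have to run in sequence.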

Key Takeaway

A multi-agent orchestrator is most valuable when tasks genuinely benefit from different perspectives — factual research, technical code, and polished writing are distinct skills that a single system prompt struggles to hold simultaneously. The orchestrator's real job is context management: making sure the coder gets the researcher's specific numbers, and the writer gets both the research AND the code output. Get the handoffs right, and the final output reads as if one expert did everything.
