ActorCritic

Generator + adversarial critic, asymmetric by design. The 2026 production literature recommends ActorCritic for quality-critical work. Code generation, security review, important written communications.

The pattern in one line: actor proposes; critic finds problems with a different prompt and ideally a different model; actor revises.


   prompt ───► actor ───► critic ──┬── score ≥ threshold ──► output
                  ▲                │
                  │                └── below ──► refine (apply rubric)
                  │                                  │
                  └──────────── max_rounds cap ──────┘

The critic returns structured JSON ({score, issues, summary}); the actor refines below threshold.

Why a separate Self-Refine?

Self-Refine runs critic + refiner with the parent’s same model and same prompt template. ActorCritic earns its complexity only when the actor and critic have different blind spots. Typically different models. ActorCritic requires both actor and critic Agent instances; for same-model self-critique, use Self-Refine.

Usage


from loomflow import Agent
from loomflow.team import Team
 
actor = Agent(
    "Generate the code.",
    model="claude-opus-4-7",
)
 
critic = Agent(
    "Find bugs, security issues, and edge cases. Return strict JSON: "
    '{"score": 0..1, "issues": [...], "summary": "..."}',
    model="gpt-4o",  # different model on purpose
)
 
team = Team.actor_critic(
    actor=actor,
    critic=critic,
    max_rounds=3,
    approval_threshold=0.9,
    model="claude-opus-4-7",  # coordinator
)
 
result = await team.run("Write a function that sanitizes user input.")

When to use

Code generation. Actor generates, critic looks for bugs / security issues / missing edge cases.
Security review. Actor drafts a policy; critic challenges it against an attacker model.
Important written communication. Actor drafts; critic checks tone / accuracy / sensitive content.

Cost: 2–4× a single-actor ReAct run, depending on rounds. Pays off when shipping a wrong answer is more expensive than the extra calls.

Different models matter. The whole point is blind-spot diversity. Same-model actor and critic share the same blind spots, so the gain collapses. Use Anthropic for the actor + OpenAI for the critic, or vice versa, or two different model sizes (large for actor, small fast critic for cost). Anything but identical.