ActorCritic

from loomflow.architecture import ActorCritic

Generator + adversarial critic, asymmetric by design. The actor and critic are distinct Agent instances, typically backed by different models for blind-spot diversity.

For the conceptual page, see ActorCritic.


Class signature

class ActorCritic:
    name: str = "actor-critic"

    def __init__(
        self,
        *,
        actor: Agent,
        critic: Agent,
        max_rounds: int = 3,
        approval_threshold: float = 0.9,
        critique_template: str | None = None,
        refine_template: str | None = None,
    ) -> None: ...

Constructor parameters

actor

Type: Agent
Default: required

The Agent that generates and refines outputs. Typically your main production model.

critic

Type: Agent
Default: required

The Agent that critiques outputs. Use a different model from the actor for blind-spot diversity (Anthropic for actor, OpenAI for critic, or vice versa). The critic must return structured JSON matching CriticOutput:

class CriticOutput(BaseModel):
    score: float        # in [0.0, 1.0]
    issues: list[str]   # specific problems to address
    summary: str        # one-line assessment

The default critique_template instructs the critic to emit this shape; override only if you customize the schema.
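As a rough illustration of this contract, the sketch below parses a critic reply and checks it against the CriticOutput shape using only the standard library. The parse_critic_output helper and the dataclass stand-in are assumptions made for illustration; they are not loomflow's own parser, and the library's model is a Pydantic BaseModel rather than a dataclass.

```python
import json
from dataclasses import dataclass


@dataclass
class CriticOutput:
    """Stdlib stand-in for the Pydantic model shown above (illustrative only)."""
    score: float        # in [0.0, 1.0]
    issues: list[str]   # specific problems to address
    summary: str        # one-line assessment


def parse_critic_output(raw: str) -> CriticOutput:
    """Hypothetical helper: parse strict JSON and validate the three fields."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on non-JSON
    if not isinstance(data, dict):
        raise ValueError("critic reply must be a JSON object")
    missing = {"score", "issues", "summary"} - data.keys()
    if missing:
        raise ValueError(f"critic reply is missing fields: {sorted(missing)}")
    score = data["score"]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("score must be a number")
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0.0, 1.0]")
    issues = data["issues"]
    if not isinstance(issues, list) or not all(isinstance(i, str) for i in issues):
        raise ValueError("issues must be a list of strings")
    if not isinstance(data["summary"], str):
        raise ValueError("summary must be a string")
    return CriticOutput(score=float(score), issues=issues, summary=data["summary"])
```

A check along these lines is mainly useful when you override critique_template, since a custom prompt is the likeliest way for the critic to drift away from the expected shape.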

max_rounds

Type: int
Default: 3

Maximum number of critique → refine rounds (after the initial actor generation). Each round is one critic call plus, unless the critic approves, one refine call. Must be >= 1.

approval_threshold

Type: float
Default: 0.9

Critic score at or above which the loop terminates. Must be in [0.0, 1.0]. A higher threshold is stricter and tends to trigger more refinement rounds; a lower threshold is more lenient.
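The constraints on max_rounds and approval_threshold can be written as executable checks. This is a minimal sketch under assumptions: validate_loop_settings and is_approved are hypothetical names, and the documentation does not say which exception the real constructor raises, so ValueError here is a guess.

```python
def validate_loop_settings(max_rounds: int, approval_threshold: float) -> None:
    """Hypothetical check mirroring the documented parameter constraints."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be >= 1")
    if not 0.0 <= approval_threshold <= 1.0:
        raise ValueError("approval_threshold must be in [0.0, 1.0]")


def is_approved(score: float, approval_threshold: float = 0.9) -> bool:
    """The loop stops when the critic score is at or above the threshold."""
    return score >= approval_threshold
```

Note that the comparison is inclusive: a score exactly equal to the threshold counts as approval.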

critique_template

Type: str | None
Default: None (uses the built-in default)

Override the prompt sent to the critic. Must instruct the model to return strict JSON {"score": 0..1, "issues": [...], "summary": "..."}.

refine_template

Type: str | None
Default: None (uses the built-in default)

Override the prompt sent to the actor for refinement rounds. The default takes the current output + the critique JSON and asks for a revised output addressing the issues.


Methods

declared_workers

def declared_workers(self) -> dict[str, Agent]:
    return {"actor": self._actor, "critic": self._critic}

Both workers exposed for AgentGraph visualization and introspection.

run

  1. Round 0 (actor). actor.run(prompt) produces an initial output. If the actor interrupts itself (max_turns, budget), terminate.
  2. Critique → refine loop (1..max_rounds):
    • Budget check. Block / warn as needed.
    • Critic. critic.run(critique_prompt) returns CriticOutput. If score >= approval_threshold → terminate.
    • Refiner. actor.run(refine_prompt) produces a revised output. Replaces the current output.

Per-round events: actor_critic.actor_started, actor_critic.actor_completed, actor_critic.critic_started, actor_critic.critic_completed, actor_critic.score, actor_critic.refine_started, actor_critic.refine_completed.
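The control flow of run described above can be sketched with plain callables standing in for the two agents. Everything named here (actor_critic_loop, Critique, and the generate / critique / refine callables) is hypothetical and exists only for illustration; the sketch leaves out the budget check, interrupt handling, and event emission that the real method performs.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Critique:
    """Illustrative stand-in for the critic's structured reply."""
    score: float
    issues: list[str] = field(default_factory=list)


def actor_critic_loop(
    prompt: str,
    generate: Callable[[str], str],           # stand-in for actor.run(prompt)
    critique: Callable[[str], Critique],      # stand-in for critic.run(critique_prompt)
    refine: Callable[[str, Critique], str],   # stand-in for actor.run(refine_prompt)
    max_rounds: int = 3,
    approval_threshold: float = 0.9,
) -> tuple[str, int]:
    """Return the final output and the number of agent runs spent."""
    output = generate(prompt)  # round 0: initial actor generation
    runs = 1
    for _ in range(max_rounds):
        verdict = critique(output)
        runs += 1
        if verdict.score >= approval_threshold:
            break  # approved: skip the refine step and stop
        output = refine(output, verdict)  # revised output replaces the current one
        runs += 1
    return output, runs
```

With a critic that never approves and max_rounds=3, this sketch spends 7 runs; with a critic that approves the first draft, it spends 2.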


Cost model

Round 0: 1 actor run. Each subsequent round: 1 critic + 1 actor refine. For max_rounds=3 worst case: 1 + 3 × 2 = 7 agent runs (each agent run is itself N model calls under ReAct or whatever architecture the actor / critic uses).

Typical use case (code generation with security critic): 2–4× a single-actor ReAct run, depending on rounds.
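The arithmetic above generalizes to any max_rounds and to early approval. The helper below is a hypothetical illustration, not part of loomflow; it counts agent runs only and ignores the model calls made inside each run.

```python
from typing import Optional


def agent_runs(max_rounds: int, approved_at_round: Optional[int] = None) -> int:
    """Count agent runs: 1 initial actor run, then critic + refine per round.

    approved_at_round is the 1-based round in which the critic approves;
    None means the critic never approves (the worst case).
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be >= 1")
    if approved_at_round is None:
        return 1 + 2 * max_rounds
    if not 1 <= approved_at_round <= max_rounds:
        raise ValueError("approved_at_round must be in 1..max_rounds")
    # Rounds before approval cost critic + refine; the approving round costs the critic only.
    return 1 + 2 * (approved_at_round - 1) + 1
```

For example, agent_runs(3) gives the worst-case 7 runs, while agent_runs(3, approved_at_round=1) gives 2 because the critic accepts the first draft.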


Why a separate SelfRefine?

SelfRefine runs the critic and refiner with the same model and prompt template as the parent. ActorCritic earns its complexity only when the actor and critic have different blind spots, which typically means different models. ActorCritic requires both actor and critic Agent instances; for same-model self-critique, use SelfRefine.


Example

from loomflow import Agent
from loomflow.architecture import ActorCritic
from loomflow.team import Team

actor = Agent(
    "Generate the code.",
    model="claude-opus-4-7",
)
critic = Agent(
    "Find bugs, security issues, and edge cases. Return strict JSON: "
    '{"score": 0..1, "issues": [...], "summary": "..."}',
    model="gpt-4o",  # different model on purpose
)

team = Team.actor_critic(
    actor=actor,
    critic=critic,
    max_rounds=3,
    approval_threshold=0.9,
    model="claude-opus-4-7",  # the coordinator
)

# Top-level await: run this inside an async function or an async REPL.
result = await team.run("Write a function that sanitizes user input.")

Or via the explicit Architecture form for nesting:

from loomflow import Agent
from loomflow.architecture import ActorCritic

# actor and critic are the Agent instances defined in the previous example.
agent = Agent(
    "Manage actor + critic.",
    model="claude-opus-4-7",
    architecture=ActorCritic(
        actor=actor,
        critic=critic,
        max_rounds=3,
        approval_threshold=0.9,
    ),
)

Source

loomflow/architecture/actor_critic.py

Different models matter. The whole point is blind-spot diversity. A same-model actor and critic share the same blind spots, so the gain collapses. Use Anthropic for one and OpenAI for the other, or two different sizes (a large actor with a small, fast critic to save cost): anything but identical.
