RetryPolicy + error taxonomy

Network model adapters (AnthropicModel / OpenAIModel / LiteLLMModel) auto-wrap their stream() calls in a typed retry policy. You don’t write try / except for transient failures , the loop retries with exponential backoff and gives up cleanly on permanent errors.

Default policy


3 attempts · 1s → 2s → 4s exponential backoff
capped at 30s · ±10% jitter · honours provider Retry-After

Roughly equivalent to:


from loomflow import Tuning
from loomflow.governance import RetryPolicy
 
agent = Agent(
    "...",
    model="claude-opus-4-7",
    tuning=Tuning(retry_policy=RetryPolicy.default()),
)

For most users this is invisible. The agent just keeps working through provider blips.

Tuning the policy


from loomflow import Tuning
from loomflow.governance import RetryPolicy
 
# Aggressive — tolerates long provider outages
agent = Agent("...", tuning=Tuning(retry_policy=RetryPolicy.aggressive()))
 
# Disabled — handle errors yourself
agent = Agent("...", tuning=Tuning(retry_policy=RetryPolicy.disabled()))
 
# Custom
agent = Agent("...", tuning=Tuning(retry_policy=RetryPolicy(
    max_attempts=5,
    base_delay_s=2.0,
    max_delay_s=60.0,
    jitter=0.2,
    honor_retry_after=True,
)))

Field	Default	Effect
`max_attempts`	3	Total attempts including the first.
`base_delay_s`	1.0	First backoff.
`max_delay_s`	30.0	Cap on the exponential growth.
`jitter`	0.1	±jitter fraction applied to each delay.
`honor_retry_after`	True	Use the provider’s `Retry-After` header when present.

Error taxonomy

Adapters classify provider exceptions into a typed hierarchy:


LoomError
├── ModelError                       (base for any model issue)
│   ├── TransientModelError          (retried)
│   │   ├── RateLimitError           (429; retry-after honored)
│   │   └── ...                      (5xx, network timeouts, connection resets)
│   ├── PermanentModelError          (NOT retried)
│   │   ├── AuthenticationError      (401)
│   │   ├── InvalidRequestError      (400 — bad prompt, missing field)
│   │   ├── ContentFilterError       (provider safety filter)
│   │   └── ...
│   └── OutputValidationError        (output_schema= validation failed)
└── ...

classify_model_error(exc) is the helper the adapters use; you can call it from your own code:


from loomflow.governance import classify_model_error
 
try:
    ...
except Exception as exc:
    typed = classify_model_error(exc)
    if isinstance(typed, RateLimitError):
        ...

What gets retried

Error	Retried?
`RateLimitError` (429)	yes, with `Retry-After` honored
`TransientModelError` (5xx, network blips, timeouts)	yes
`PermanentModelError` (401, 400, content filter)	no. Fail fast
`OutputValidationError` (schema validation failed)	no. Handled separately

For OutputValidationError the framework follows a different path: it appends the validation message to the conversation and asks the model to retry, up to a separate output_schema_max_retries limit.

What about tool errors?

Tool errors are not retried at the model layer. Each tool’s exception is captured in its ToolResult(ok=False, error=...); the model sees the error in the next turn and can decide whether to retry. To retry at the framework level, wrap the tool body yourself:


@tool
async def fetch(url: str) -> str:
    """Fetch a URL with up to 3 retries."""
    for attempt in range(3):
        try:
            return await client.get(url)
        except httpx.NetworkError:
            if attempt == 2:
                raise
            await asyncio.sleep(2**attempt)

Observability

Retry attempts emit structured logs at WARN level:


WARN  loomflow.model.retrying: retrying after RateLimitError;
attempt 2/3, sleeping 4.2s (provider Retry-After=4.0).

When telemetry=OTelTelemetry(...) is wired, the retry count is attached to the loom.model.stream span as the loom.model.retries attribute.

Don’t double-retry. If you’ve configured your provider client with its own max_retries=3, set it to max_retries=0 and let the framework’s RetryPolicy own the retry loop. Otherwise you compound 3×3 = 9 attempts on a single call and the user-visible latency explodes.