Auto-extract observability

AutoExtractMemory runs a small LLM extraction pass after every agent.run() to pull structured (subject, predicate, object) facts into the bi-temporal store. It’s on by default for real network adapters, which means it’s also on your bill. You should know when it fires and how long it takes.

Telemetry signals

Two signals are emitted when Agent(telemetry=...) is wired:

| Metric | Type | Tags | Use |
| --- | --- | --- | --- |
| `loom.auto_extract.duration_ms` | histogram | `user_id`, `status` (`ok`/`error`) | Latency budget per tenant |
| `loom.auto_extract.invocations` | counter | `user_id`, `status` (`ok`/`error`) | Failure rate; cost attribution |
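
As a rough illustration of how these two signals might be consumed, the sketch below aggregates them per `user_id`. The `AutoExtractStats` and `UserStats` classes and the `record` method are hypothetical stand-ins for whatever backend receives the metrics; they are not part of the library. Only the metric semantics (a duration in milliseconds and a `status` of `ok` or `error`) come from the table.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class UserStats:
    """Per-user aggregate of auto-extract calls (hypothetical helper)."""
    ok: int = 0
    error: int = 0
    durations_ms: list = field(default_factory=list)

    @property
    def invocations(self) -> int:
        # Mirrors the counter: every call counts, whatever its status.
        return self.ok + self.error

    @property
    def failure_rate(self) -> float:
        return self.error / self.invocations if self.invocations else 0.0

    @property
    def max_duration_ms(self) -> float:
        return max(self.durations_ms, default=0.0)


class AutoExtractStats:
    """In-memory stand-in for a metrics backend (an assumption, not a real sink)."""

    def __init__(self) -> None:
        self._by_user = defaultdict(UserStats)

    def record(self, user_id: str, status: str, duration_ms: float) -> None:
        # One extraction call yields one counter increment and one histogram sample.
        if status not in ("ok", "error"):
            raise ValueError(f"unexpected status: {status!r}")
        stats = self._by_user[user_id]
        if status == "ok":
            stats.ok += 1
        else:
            stats.error += 1
        stats.durations_ms.append(duration_ms)

    def for_user(self, user_id: str) -> UserStats:
        return self._by_user[user_id]


# Simulated samples; real values would come from the telemetry pipeline.
stats = AutoExtractStats()
stats.record("alice", "ok", 120.0)
stats.record("alice", "error", 450.0)
stats.record("bob", "ok", 95.0)
```

Grouping by `user_id` first keeps both uses from the table available: the invocation count supports cost attribution, and the error share and slowest sample support a per-tenant latency budget.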

Startup notice

An INFO log notice, emitted once per process, tells you when the default-on heuristic fires:


INFO  loomflow.memory.auto_extract: AutoExtractMemory enabled
by default for this model class. Each remembered episode triggers
a small extraction call to pull (subject, predicate, object)
facts. Pass Agent(auto_extract=False) to disable, or
Agent(auto_extract=True) to silence this notice.
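
If you would rather route or filter this notice through logging configuration, a minimal sketch with Python's standard `logging` module follows. It assumes the prefix in the notice (`loomflow.memory.auto_extract`) is a standard logger name; the `route_auto_extract_logs` helper is hypothetical, and the message logged here is simulated because the real one is emitted by the library.

```python
import io
import logging

# Logger name copied from the notice's prefix (assumed to be a stdlib logger).
AUTO_EXTRACT_LOGGER = "loomflow.memory.auto_extract"


def route_auto_extract_logs(stream, level=logging.INFO):
    """Attach a stream handler to the auto-extract logger (hypothetical helper)."""
    logger = logging.getLogger(AUTO_EXTRACT_LOGGER)
    logger.setLevel(level)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s  %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return handler


# Capture INFO-level records from that logger into a buffer.
buffer = io.StringIO()
route_auto_extract_logs(buffer)
logging.getLogger(AUTO_EXTRACT_LOGGER).info("AutoExtractMemory enabled (simulated)")
captured = buffer.getvalue()

# Raising the threshold hides INFO notices from this logger only.
logging.getLogger(AUTO_EXTRACT_LOGGER).setLevel(logging.WARNING)
logging.getLogger(AUTO_EXTRACT_LOGGER).info("hidden")
after_filter = buffer.getvalue()
```

Filtering by logger name hides the notice without changing extraction behavior, so the calls and their cost remain.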

Toggling

To disable for cost reasons, pass auto_extract=False:


agent = Agent("...", model="gpt-4o", auto_extract=False)

To enable explicitly (and silence the startup notice):


agent = Agent("...", model="gpt-4o", auto_extract=True)
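
One possible way to make the toggle configurable per deployment is to derive the keyword argument from an environment variable. This is only a sketch: the variable name `LOOM_AUTO_EXTRACT` and both helper functions are hypothetical and not something the library reads or provides. When the variable is unset, no argument is passed, so the default-on heuristic (and its startup notice) still applies.

```python
import os


def auto_extract_from_env(env=os.environ, var="LOOM_AUTO_EXTRACT"):
    """Return True or False when the variable is set, else None.

    The variable name is a hypothetical convention for this example.
    """
    raw = env.get(var)
    if raw is None:
        return None
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def agent_kwargs(env=os.environ):
    """Build keyword arguments to spread into the Agent(...) call."""
    value = auto_extract_from_env(env)
    # Omit the key entirely when unset so the library default is left alone.
    return {} if value is None else {"auto_extract": value}


# Intended use, following the calls shown above:
#   agent = Agent("...", model="gpt-4o", **agent_kwargs())
disabled = agent_kwargs({"LOOM_AUTO_EXTRACT": "false"})
enabled = agent_kwargs({"LOOM_AUTO_EXTRACT": "1"})
unset = agent_kwargs({})
```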