Multi-tenancy

The framework is multi-tenant by default. One Agent instance serves N users; the framework partitions everything.

    from loomflow import Agent
    from loomflow.governance.budget import BudgetConfig, StandardBudget

    # PerUserPermissions and FileAuditLog also need importing; their module
    # paths are not shown on this page.
    agent = Agent(
        "...",
        model="claude-opus-4-7",
        memory="postgres://...",
        budget=StandardBudget(BudgetConfig(per_user_max_tokens=100_000)),
        permissions=PerUserPermissions(policies=..., default=...),
        audit_log=FileAuditLog("./audit.jsonl"),
    )

    # Same Agent, different user: partitioned automatically
    await agent.run("...", user_id="alice", session_id="conv_42")
    await agent.run("...", user_id="bob", session_id="conv_99")

Alice’s memory never surfaces in Bob’s recall. Alice’s tokens count against Alice’s budget. Audit entries land with user_id="alice". Permissions check against Alice’s policy.

What partitions by user_id

| Primitive | Partition behaviour |
| --- | --- |
| Memory | Episodes, working blocks, and facts all carry user_id. Recalls are scoped. |
| Memory.facts | Bi-temporal queries scoped to the user. |
| StandardBudget | Tracks tokens / cost / wall-clock both globally and per user_id. |
| PerUserPermissions | Routes the policy decision per user_id. |
| FileAuditLog | Every entry has a top-level user_id. HMAC covers it. |
| OTelTelemetry metrics | Every metric tagged with user_id. |
| RunContext (visible to tools) | get_run_context().user_id. |
| approval_handler | Receives the live user_id so Slack approvals can be addressed correctly. |
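
The budget row can be illustrated with a small standalone sketch. `ToyBudget` and `BudgetExceeded` below are hypothetical names invented for this example, not the framework's classes; only the `per_user_max_tokens` parameter name is borrowed from the snippet above. The sketch tracks tokens only and assumes a charge is rejected when it would exceed either limit.

```python
from collections import defaultdict
from typing import Optional


class BudgetExceeded(Exception):
    """Raised when a charge would push a counter past its limit."""


class ToyBudget:
    """Illustrative token budget: one global counter plus one per user_id."""

    def __init__(
        self,
        per_user_max_tokens: int,
        global_max_tokens: Optional[int] = None,
    ) -> None:
        self.per_user_max_tokens = per_user_max_tokens
        self.global_max_tokens = global_max_tokens
        self.global_tokens = 0
        self.by_user = defaultdict(int)

    def charge(self, user_id: str, tokens: int) -> None:
        # Check both limits before mutating anything, so a rejected charge
        # leaves every counter unchanged.
        if self.by_user[user_id] + tokens > self.per_user_max_tokens:
            raise BudgetExceeded(f"user {user_id!r} is over budget")
        if (
            self.global_max_tokens is not None
            and self.global_tokens + tokens > self.global_max_tokens
        ):
            raise BudgetExceeded("global budget exhausted")
        self.by_user[user_id] += tokens
        self.global_tokens += tokens
```

With this shape, one user exhausting their allowance does not affect anyone else's, while the global counter still equals the sum of all accepted charges.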

Memory partition contract

For every memory backend (InMemory / SQLite / Chroma / Postgres / Redis):

  • Recalls are scoped by user_id. memory.recall(query, user_id="alice") never returns Bob’s episodes, even when one of Bob’s episodes is more semantically similar to the query.
  • Writes carry user_id. memory.remember(episode) reads the active RunContext.user_id (or the explicit kwarg) and stores it.
  • GDPR ops are scoped. memory.forget(user_id="alice") deletes only Alice’s data. memory.export(user_id="alice") returns only Alice’s data.
  • The anonymous bucket. When user_id=None, data lands in a reserved sentinel bucket (__jeeves_anon_user__). It is a real partition like any other: forget(user_id=None) deletes the anonymous data. Passing the sentinel string itself as a user_id raises ValueError, which guards against a caller impersonating the anonymous bucket.
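
The four rules above can be sketched with a toy store. `ToyPartitionedMemory` is a hypothetical class written for this page, not one of the shipped backends; it uses substring matching in place of semantic recall and takes `user_id` as an explicit argument rather than reading it from a run context. The sentinel value is the one quoted in the list.

```python
from collections import defaultdict
from typing import List, Optional

# Sentinel name as quoted in the contract above.
ANON_SENTINEL = "__jeeves_anon_user__"


class ToyPartitionedMemory:
    """Illustrative store: every episode lives in exactly one user's bucket."""

    def __init__(self) -> None:
        self._episodes = defaultdict(list)

    def _bucket(self, user_id: Optional[str]) -> str:
        # Only None maps to the anonymous bucket; naming it explicitly is rejected.
        if user_id == ANON_SENTINEL:
            raise ValueError("user_id collides with the reserved anonymous bucket")
        return ANON_SENTINEL if user_id is None else user_id

    def remember(self, text: str, user_id: Optional[str] = None) -> None:
        self._episodes[self._bucket(user_id)].append(text)

    def recall(self, query: str, user_id: Optional[str] = None) -> List[str]:
        # Only the caller's bucket is searched, however well other buckets match.
        bucket = self._episodes.get(self._bucket(user_id), [])
        return [e for e in bucket if query.lower() in e.lower()]

    def export(self, user_id: Optional[str] = None) -> List[str]:
        return list(self._episodes.get(self._bucket(user_id), []))

    def forget(self, user_id: Optional[str] = None) -> int:
        # Drops the whole bucket and reports how many episodes were removed.
        return len(self._episodes.pop(self._bucket(user_id), []))
```

Because every method resolves its bucket through the same helper, there is no code path that reads or deletes across users.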

Bounded in-process state

StandardBudget._by_user and InMemoryMemory._blocks hold per-user state in-process. Without bounds, a runaway tenant or one-shot user_id explosion grows the dict until OOM. Both default to bounded state with LRU + idle-TTL eviction:

    from loomflow import InMemoryMemory
    from loomflow.governance.budget import BudgetConfig, StandardBudget

    # Defaults: 100k users, 24h idle TTL.
    budget = StandardBudget(BudgetConfig())
    memory = InMemoryMemory()

    # Tune:
    budget = StandardBudget(
        BudgetConfig(),
        max_users=10_000,
        user_idle_ttl_seconds=3_600,
    )

    # Disable bounding (single-tenant or fixed N tenants):
    budget = StandardBudget(BudgetConfig(), max_users=None, user_idle_ttl_seconds=None)

See Bounded in-process state.
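
For intuition, LRU plus idle-TTL eviction over a per-user dict can be sketched as follows. `BoundedUserTotals` is a hypothetical class for illustration only; how the framework implements its own eviction is not described on this page. The parameter names and defaults mirror the snippet above, and the injectable `clock` is an assumption added to make the behaviour easy to test.

```python
import time
from collections import OrderedDict
from typing import Callable, List, Optional


class BoundedUserTotals:
    """Per-user token totals, bounded by an LRU cap and an idle TTL."""

    def __init__(
        self,
        max_users: Optional[int] = 100_000,
        user_idle_ttl_seconds: Optional[float] = 86_400,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_users = max_users
        self._ttl = user_idle_ttl_seconds
        self._clock = clock
        # user_id -> (last_seen, total), least recently used first.
        self._entries = OrderedDict()

    def _expire_idle(self, now: float) -> None:
        if self._ttl is None:
            return
        # Entries are ordered by last access, so idle ones sit at the front.
        while self._entries:
            user_id, (last_seen, _) = next(iter(self._entries.items()))
            if now - last_seen < self._ttl:
                break
            del self._entries[user_id]

    def add(self, user_id: str, tokens: int) -> int:
        now = self._clock()
        self._expire_idle(now)
        _, total = self._entries.pop(user_id, (now, 0))
        total += tokens
        self._entries[user_id] = (now, total)  # re-inserted at the MRU end
        if self._max_users is not None:
            while len(self._entries) > self._max_users:
                self._entries.popitem(last=False)  # drop the LRU user
        return total

    def users(self) -> List[str]:
        return list(self._entries)
```

In this sketch an evicted user starts again from zero on their next call. Whether that trade-off is acceptable depends on what the state is used for, which is one reason to tune `max_users` and the TTL to the real tenant count.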

Verifying isolation under load

bench/multi_tenant.py simulates N concurrent users × M turns through one shared Agent and asserts:

  • p50 / p99 latency stays bounded.
  • RSS growth stays linear in tenants.
  • Zero isolation violations. Alice’s data never surfaces in Bob’s recall.
  • Zero budget mismatches. Per-user totals match the sum of charged calls.

    python bench/multi_tenant.py --users 500 --turns 5

A smoke-test variant runs in CI as tests/test_multi_tenant_load.py. See Load testing.
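
The two correctness assertions (isolation and budget totals) can be sketched in a few lines of asyncio. This is not the contents of bench/multi_tenant.py; `ToyAgent` and `simulate` are hypothetical stand-ins, and the sketch does not measure latency or RSS.

```python
import asyncio
from collections import defaultdict


class ToyAgent:
    """Hypothetical stand-in for a shared agent with per-user memory and totals."""

    def __init__(self) -> None:
        self.memory = defaultdict(list)
        self.tokens = defaultdict(int)

    async def run(self, prompt: str, user_id: str) -> int:
        await asyncio.sleep(0)  # yield so concurrent users interleave
        self.memory[user_id].append(prompt)
        cost = len(prompt.split())  # toy cost model: one token per word
        self.tokens[user_id] += cost
        return cost


async def simulate(agent: ToyAgent, users: int, turns: int):
    charged = defaultdict(int)  # what the harness believes each user was charged

    async def one_user(user_id: str) -> None:
        for turn in range(turns):
            prompt = f"secret-{user_id} turn {turn}"
            charged[user_id] += await agent.run(prompt, user_id=user_id)

    user_ids = [f"user{i}" for i in range(users)]
    await asyncio.gather(*(one_user(u) for u in user_ids))

    # An isolation violation is any episode in a user's memory that was
    # written on behalf of a different user.
    isolation_violations = sum(
        1
        for u in user_ids
        for episode in agent.memory[u]
        if not episode.startswith(f"secret-{u} ")
    )
    # A budget mismatch is any user whose tracked total differs from the
    # sum of the charges the harness observed.
    budget_mismatches = sum(1 for u in user_ids if agent.tokens[u] != charged[u])
    return isolation_violations, budget_mismatches
```

Tagging each prompt with its owner's id is what makes a leak detectable: any episode that turns up under the wrong user fails the prefix check.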

Cross-tenant traps to avoid

Even though the framework partitions automatically, two patterns defeat it:

Bypassing user_id in a custom Memory

If you write a custom Memory implementation, accept user_id= in every method and use it as the partition key. The framework forwards it on every call. Ignoring it in even one method leaks data across tenants.
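
As a rough sketch of what "use it as the partition key" can look like in a SQL-backed store: `SqliteEpisodeStore` below is a hypothetical class, its method names and signatures are assumptions, and it does not implement the framework's actual Memory interface. The point is only that every statement filters on user_id, and that user_id is a required keyword argument so it cannot be silently omitted.

```python
import sqlite3
from typing import List


class SqliteEpisodeStore:
    """Illustrative store where user_id appears in every query."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS episodes "
            "(user_id TEXT NOT NULL, body TEXT NOT NULL)"
        )
        self._db.execute(
            "CREATE INDEX IF NOT EXISTS idx_episodes_user ON episodes (user_id)"
        )

    def remember(self, body: str, *, user_id: str) -> None:
        self._db.execute(
            "INSERT INTO episodes (user_id, body) VALUES (?, ?)", (user_id, body)
        )
        self._db.commit()

    def recall(self, query: str, *, user_id: str, limit: int = 5) -> List[str]:
        # The user_id filter comes first; matching only happens inside the partition.
        rows = self._db.execute(
            "SELECT body FROM episodes "
            "WHERE user_id = ? AND body LIKE ? "
            "ORDER BY rowid DESC LIMIT ?",
            (user_id, f"%{query}%", limit),
        ).fetchall()
        return [row[0] for row in rows]

    def forget(self, *, user_id: str) -> int:
        cur = self._db.execute("DELETE FROM episodes WHERE user_id = ?", (user_id,))
        self._db.commit()
        return cur.rowcount
```

Calling `recall("tea")` without user_id raises TypeError here, which turns a potential silent leak into an immediate failure.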

Caching across tenants in tools

    # WRONG: single dict, no partition
    cache = {}

    @tool
    async def expensive_query(q: str) -> str:
        if q in cache:
            return cache[q]
        cache[q] = await api.query(q)
        return cache[q]

If the cache key doesn’t include user_id, two tenants see each other’s responses. Use a (user_id, q) key:

    from loomflow import get_run_context

    cache = {}  # keyed by (user_id, q)

    @tool
    async def expensive_query(q: str) -> str:
        ctx = get_run_context()
        key = (ctx.user_id, q)
        if key in cache:
            return cache[key]
        cache[key] = await api.query(q, user_id=ctx.user_id)
        return cache[key]

user_id is a typed primitive, not a free-form string. The framework treats it consistently across all primitives. But if you plumb tenant scope through your own structures (caches, DB queries, external API calls), be deliberate about including user_id in every key. The framework can’t help with cross-tenant traps in your application code.
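
One way to avoid forgetting the user_id in each cache key is to put the keying in a decorator. The sketch below is standalone: it uses a `contextvars.ContextVar` as a hypothetical stand-in for the framework's run context, and `per_user_cache`, `run_as`, and `demo` are names invented for this example.

```python
import asyncio
import contextvars
import functools

# Hypothetical stand-in for the framework's run context.
_current_user = contextvars.ContextVar("current_user", default=None)


def per_user_cache(fn):
    """Cache an async function's results under (user_id, args), not args alone."""
    store = {}

    @functools.wraps(fn)
    async def wrapper(*args):
        key = (_current_user.get(), args)
        if key not in store:
            store[key] = await fn(*args)
        return store[key]

    return wrapper


calls = []  # records every cache miss, for demonstration only


@per_user_cache
async def expensive_query(q: str) -> str:
    calls.append((_current_user.get(), q))
    return f"{_current_user.get()}:{q}"


async def run_as(user_id: str, q: str) -> str:
    # Set the current user for the duration of one call, then restore it.
    token = _current_user.set(user_id)
    try:
        return await expensive_query(q)
    finally:
        _current_user.reset(token)


async def demo():
    return [
        await run_as("alice", "q"),
        await run_as("bob", "q"),
        await run_as("alice", "q"),  # served from Alice's cache entry
    ]
```

Note that `store` is unbounded in this sketch. In a long-lived multi-tenant process it would need the same kind of cap or TTL discussed under bounded in-process state.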
