TreeOfThoughts
from loomflow.architecture import TreeOfThoughts, ThoughtNode
Branching exploration with per-node evaluation. Yao et al. 2023, Tree of Thoughts: Deliberate Problem Solving with Large Language Models.
For the conceptual page see Tree of Thoughts.
Class signature
class TreeOfThoughts:
    name: str = "tree-of-thoughts"

    def __init__(
        self,
        *,
        branch_factor: int = 3,
        max_depth: int = 3,
        beam_width: int = 2,
        solved_threshold: float = 1.0,
        min_score: float = 0.0,
        parallel: bool = True,
        proposer_prompt: str | None = None,
        evaluator_prompt: str | None = None,
    ) -> None: ...
Constructor parameters
branch_factor
| Type | int |
| Default | 3 |
Candidate “thoughts” the proposer generates per frontier node, per
depth. Higher = wider search; expensive. Must be >= 1.
max_depth
| Type | int |
| Default | 3 |
Maximum depth of the search tree. Root is depth 0; the tree grows
to max_depth levels. Must be >= 1. The “best leaf wins” output
selection picks the highest-scoring node at any depth.
beam_width
| Type | int |
| Default | 2 |
Top-k candidates kept per depth. Higher = wider beam; more LLM
calls. Must be >= 1. With branch_factor=3, beam_width=2, max_depth=3,
expect roughly 2 × (3 + 1) × 3 = 24 model calls, before accounting
for pruning and early exit.
solved_threshold
| Type | float |
| Default | 1.0 |
Score in [0.0, 1.0] at or above which a candidate triggers early
exit. The matching candidate becomes the final answer. The default of
1.0 effectively disables early exit: only the proposer’s
max-confidence “this is the solution” candidate, scored at exactly
1.0, matches.
min_score
| Type | float |
| Default | 0.0 |
Score floor in [0.0, 1.0] below which a candidate is dropped
regardless of beam capacity. Lets bad branches die quickly
instead of riding along just because the beam has room. 0.0
reproduces legacy behavior (no floor); typical production values
are 0.2–0.4.
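The effect of the floor can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the `prune` helper and the `(text, score)` tuple shape are assumptions made for the example.

```python
def prune(scored, beam_width, min_score=0.0):
    """Keep at most beam_width candidates whose score is >= min_score.

    `scored` is a list of (text, score) pairs; this shape is hypothetical.
    """
    survivors = [(text, score) for text, score in scored if score >= min_score]
    survivors.sort(key=lambda pair: pair[1], reverse=True)
    return survivors[:beam_width]


candidates = [("a", 0.9), ("b", 0.1), ("c", 0.05)]

# Without a floor, the weak candidate "b" survives because the beam has room.
no_floor = prune(candidates, beam_width=2)

# With a floor of 0.3, only "a" survives even though the beam could hold two.
with_floor = prune(candidates, beam_width=2, min_score=0.3)
```

With the floor set, the next depth expands one node instead of two, which also cuts the number of model calls at that depth.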
parallel
| Type | bool |
| Default | True |
When True, proposer and evaluator calls within a level run
concurrently via anyio.create_task_group. Pure speedup: the
branch_factor × beam_width independent calls run in parallel in
wall-clock time instead of sequentially. Set to False for deterministic
test ordering or when your provider has tight rate limits.
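The difference between the two modes can be illustrated with a small sketch. It uses `asyncio.gather` from the standard library rather than anyio task groups, so it runs without extra dependencies; `score_candidate` is a hypothetical stand-in for a model call, and the `tracker` dict exists only to record how many calls are in flight at once.

```python
import asyncio


async def score_candidate(candidate: str, tracker: dict) -> float:
    # Hypothetical stand-in for one evaluator call.
    tracker["in_flight"] += 1
    tracker["peak"] = max(tracker["peak"], tracker["in_flight"])
    await asyncio.sleep(0.01)  # simulated network latency
    tracker["in_flight"] -= 1
    return len(candidate) / 10


async def score_level(candidates, parallel, tracker):
    if parallel:
        # All calls are started together; results keep input order.
        return await asyncio.gather(
            *(score_candidate(c, tracker) for c in candidates)
        )
    # Sequential: each call finishes before the next one starts.
    return [await score_candidate(c, tracker) for c in candidates]


def run_level(candidates, parallel):
    tracker = {"in_flight": 0, "peak": 0}
    scores = asyncio.run(score_level(candidates, parallel, tracker))
    return list(scores), tracker["peak"]
```

In this sketch both modes return the same scores in the same order; only the peak number of simultaneous calls differs, which is the quantity that matters for provider rate limits.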
proposer_prompt
| Type | str \| None |
| Default | None (uses built-in default) |
Override the proposer’s system prompt. The default asks for
branch_factor candidate next thoughts toward solving the problem.
evaluator_prompt
| Type | str \| None |
| Default | None (uses built-in default) |
Override the evaluator’s system prompt. The default asks for a
score in [0.0, 1.0] representing how promising the candidate
branch is.
Methods
declared_workers
Returns {}. Single-agent.
run
- Root. Depth-0 node containing the problem. No model call.
- For each depth up to max_depth:
  a. Expand. For every frontier node, the proposer generates branch_factor candidates (parallel across nodes when parallel=True).
  b. Evaluate. The evaluator scores each candidate in [0.0, 1.0].
  c. Prune. Drop candidates with score < min_score. Keep the top beam_width by score as the next frontier.
  d. Early exit. If any candidate scores >= solved_threshold, emit tot.solved and use that branch.
- Best leaf wins. The highest-scoring node across the whole tree becomes the final answer.
Per-depth events: tot.depth_started, tot.expanded,
tot.scored, tot.pruned, tot.frontier_updated. Optional final
events: tot.solved or tot.best_leaf.
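The loop described above can be sketched as a plain, synchronous beam search. This is an illustration under stated assumptions, not the loomflow implementation: `tree_search`, `propose`, `evaluate`, and the `Node` dataclass are hypothetical names, the proposer and evaluator are toy functions instead of model calls, and only the two final event names are recorded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:  # stand-in with the same fields as ThoughtNode
    id: str
    parent_id: Optional[str]
    content: str
    score: float
    depth: int


def tree_search(problem, propose, evaluate, *, branch_factor=3, max_depth=3,
                beam_width=2, solved_threshold=1.0, min_score=0.0):
    root = Node("n0", None, problem, 0.0, 0)  # root: no model call
    nodes, frontier, events = [root], [root], []
    for depth in range(1, max_depth + 1):
        # a. Expand and b. Evaluate every candidate of every frontier node.
        candidates = []
        for parent in frontier:
            for text in propose(parent.content, branch_factor):
                node = Node(f"n{len(nodes)}", parent.id, text,
                            evaluate(text), depth)
                nodes.append(node)  # the full tree is kept, not just the beam
                candidates.append(node)
        # c. Prune: apply the score floor, then keep the top beam_width.
        kept = sorted((c for c in candidates if c.score >= min_score),
                      key=lambda n: n.score, reverse=True)[:beam_width]
        # d. Early exit: the top scorer always survives pruning if it qualifies.
        if kept and kept[0].score >= solved_threshold:
            events.append("tot.solved")
            return kept[0], nodes, events
        if not kept:
            break  # every branch fell below the floor
        frontier = kept
    # Best leaf wins: highest-scoring non-root node across the whole tree.
    best = max(nodes[1:], key=lambda n: n.score, default=root)
    events.append("tot.best_leaf")
    return best, nodes, events


TARGET = "cab"  # toy task: spell TARGET one letter at a time


def propose(state, k):
    return [state + letter for letter in "abc"[:k]]


def evaluate(text):
    matched = 0
    for got, want in zip(text, TARGET):
        if got != want:
            break
        matched += 1
    return matched / len(TARGET)


best, nodes, events = tree_search("", propose, evaluate)
```

In the toy task the root holds an empty partial solution rather than a problem statement, which keeps the proposer trivial. With the defaults the search reaches "cab" at depth 3 and exits early; with max_depth=2 it falls back to the best-leaf rule.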
Related types
ThoughtNode
class ThoughtNode(BaseModel):
    id: str
    parent_id: str | None
    content: str
    score: float
    depth: int
Tree node. Stored on session.metadata["nodes"] after the run for
introspection. The full tree is preserved (not just the surviving
beam) so you can analyze pruning decisions.
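Because each node carries its parent_id, the stored list is enough to rebuild any reasoning path. The sketch below assumes the nodes are available as a flat list and uses a plain dataclass with the same fields as a stand-in for ThoughtNode; `path_to_root` and `childless_ids` are hypothetical helpers, and the sample data is invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:  # stand-in mirroring the ThoughtNode fields
    id: str
    parent_id: Optional[str]
    content: str
    score: float
    depth: int


def path_to_root(nodes, leaf_id):
    """Return the chain of nodes from the root down to leaf_id."""
    by_id = {n.id: n for n in nodes}
    path = []
    current = by_id.get(leaf_id)
    while current is not None:
        path.append(current)
        parent_id = current.parent_id
        current = by_id.get(parent_id) if parent_id is not None else None
    return list(reversed(path))


def childless_ids(nodes):
    """IDs of nodes that were never expanded (pruned, or on the last level)."""
    parents = {n.parent_id for n in nodes if n.parent_id is not None}
    return [n.id for n in nodes if n.id not in parents]


# Invented sample tree: n1 was dropped, n2 was expanded into n3.
sample = [
    Node("n0", None, "problem", 0.0, 0),
    Node("n1", "n0", "step A", 0.4, 1),
    Node("n2", "n0", "step B", 0.7, 1),
    Node("n3", "n2", "step B, then C", 0.9, 2),
]
best = max(sample, key=lambda n: n.score)
best_path = [n.id for n in path_to_root(sample, best.id)]
```

Comparing the scores of childless nodes at intermediate depths against min_score and the beam cutoff is one way to check whether pruning discarded branches you would have wanted to keep.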
Cost model
Per depth: frontier_size × (branch_factor + 1) model calls
(per frontier node, one proposer call yielding branch_factor
candidates plus one evaluator call per candidate; the exact count
depends on how the prompts batch work). Total across full depth:
O(beam_width × max_depth × (branch_factor + 1)).
Real numbers for branch_factor=3, beam_width=2, max_depth=4:
~30 LLM calls vs ~5 for ReAct. Reserve for tasks where ReAct’s
straight-line trajectory commits too early and produces wrong
answers.
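The formula can be turned into a quick estimator. This sketch assumes one proposer call per frontier node plus one evaluator call per candidate, and that the first level expands only the root; `estimate_calls` and `upper_bound` are hypothetical helpers, and neither accounts for min_score pruning or early exit.

```python
def estimate_calls(branch_factor: int, beam_width: int, max_depth: int) -> int:
    """Model calls for a full search, with a one-node frontier at the first level."""
    total = 0
    frontier = 1  # only the root is expanded at the first depth
    for _ in range(max_depth):
        total += frontier * (branch_factor + 1)
        # The next frontier is capped by the beam.
        frontier = min(beam_width, frontier * branch_factor)
    return total


def upper_bound(branch_factor: int, beam_width: int, max_depth: int) -> int:
    """The O(...) expression above, treating every level as a full beam."""
    return beam_width * max_depth * (branch_factor + 1)


calls = estimate_calls(branch_factor=3, beam_width=2, max_depth=4)
bound = upper_bound(branch_factor=3, beam_width=2, max_depth=4)
```

For branch_factor=3, beam_width=2, max_depth=4 the estimate is 28 and the bound is 32, which brackets the ~30 figure quoted above.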
Example
import asyncio

from loomflow import Agent
from loomflow.architecture import TreeOfThoughts

agent = Agent(
    "Solve the puzzle by exploring multiple lines of reasoning.",
    model="claude-opus-4-7",
    architecture=TreeOfThoughts(
        branch_factor=3,
        beam_width=2,
        max_depth=4,
        solved_threshold=0.95,
        min_score=0.3,
    ),
)


async def main() -> None:
    # agent.run is awaited, so it must be called from an async context.
    result = await agent.run("Game of 24 with [3, 7, 8, 1].")
    print(result)


asyncio.run(main())
Or by string:
agent = Agent("...", model="...", architecture="tree-of-thoughts")
Source
loomflow/architecture/tree_of_thoughts.py
Not for general task automation. ToT is 5–10× ReAct cost. Don’t default to it. Reach for it when ReAct’s straight-line trajectory commits too early and produces wrong answers in your domain (puzzles, math, planning, constraint satisfaction).