Skip to Content

TreeOfThoughts

from loomflow.architecture import TreeOfThoughts, ThoughtNode

Branching exploration with per-node evaluation. Yao et al. 2023 , Tree of Thoughts: Deliberate Problem Solving with Large Language Models.

For the conceptual page see Tree of Thoughts.


Class signature

class TreeOfThoughts: name: str = "tree-of-thoughts" def __init__( self, *, branch_factor: int = 3, max_depth: int = 3, beam_width: int = 2, solved_threshold: float = 1.0, min_score: float = 0.0, parallel: bool = True, proposer_prompt: str | None = None, evaluator_prompt: str | None = None, ) -> None: ...

Constructor parameters

branch_factor

Typeint
Default3

Candidate “thoughts” the proposer generates per frontier node, per depth. Higher = wider search; expensive. Must be >= 1.

max_depth

Typeint
Default3

Maximum depth of the search tree. Root is depth 0; the tree grows to max_depth levels. Must be >= 1. The “best leaf wins” output selection picks the highest-scoring node at any depth.

beam_width

Typeint
Default2

Top-k candidates kept per depth. Higher = wider beam; more LLM calls. Must be >= 1. With branch_factor=3, beam_width=2, max_depth=3, expect ~2 × (3 + 1) × 3 = 24 model calls before pruning + early-exit.

solved_threshold

Typefloat
Default1.0

Score in [0.0, 1.0] at or above which a candidate triggers early exit. The matching candidate becomes the final answer. Default 1.0 disables early-exit (only the proposer’s max-confidence “this is the solution” matches).

min_score

Typefloat
Default0.0

Score floor in [0.0, 1.0] below which a candidate is dropped regardless of beam capacity. Lets bad branches die quickly instead of riding along just because the beam has room. 0.0 reproduces legacy behavior (no floor); typical production values are 0.20.4.

parallel

Typebool
DefaultTrue

When True, proposer + evaluator calls within a level run concurrently via anyio.create_task_group. Pure speedup , branch_factor × beam_width independent calls become wall-clock parallel instead of sequential. Set to False for deterministic test ordering or when your provider has tight rate limits.

proposer_prompt

Typestr | None
DefaultNone (uses built-in default)

Override the proposer’s system prompt. The default asks for branch_factor candidate next thoughts toward solving the problem.

evaluator_prompt

Typestr | None
DefaultNone (uses built-in default)

Override the evaluator’s system prompt. The default asks for a score in [0.0, 1.0] representing how promising the candidate branch is.


Methods

declared_workers

Returns {}. Single-agent.

run

  1. Root. Depth-0 node containing the problem. No model call.
  2. For each depth up to max_depth: a. Expand. For every frontier node, the proposer generates branch_factor candidates (parallel across nodes when parallel=True). b. Evaluate. The evaluator scores each candidate in [0.0, 1.0]. c. Prune. Drop candidates with score < min_score. Keep the top beam_width by score as the next frontier. d. Early exit. If any candidate scores >= solved_threshold, emit tot.solved and use that branch.
  3. Best leaf wins. The highest-scoring node across the whole tree becomes the final answer.

Per-depth events: tot.depth_started, tot.expanded, tot.scored, tot.pruned, tot.frontier_updated. Optional final events: tot.solved or tot.best_leaf.


ThoughtNode

class ThoughtNode(BaseModel): id: str parent_id: str | None content: str score: float depth: int

Tree node. Stored on session.metadata["nodes"] after the run for introspection. The full tree is preserved (not just the surviving beam) so you can analyze pruning decisions.


Cost model

Per depth: frontier_size × (branch_factor + 1) model calls (proposer × branch_factor + evaluator × branch_factor, roughly batched depending on prompt design). Total across full depth: O(beam_width × max_depth × (branch_factor + 1)).

Real numbers for branch_factor=3, beam_width=2, max_depth=4: ~30 LLM calls vs ~5 for ReAct. Reserve for tasks where ReAct’s straight-line trajectory commits too early and produces wrong answers.


Example

from loomflow import Agent from loomflow.architecture import TreeOfThoughts agent = Agent( "Solve the puzzle by exploring multiple lines of reasoning.", model="claude-opus-4-7", architecture=TreeOfThoughts( branch_factor=3, beam_width=2, max_depth=4, solved_threshold=0.95, min_score=0.3, ), ) result = await agent.run("Game of 24 with [3, 7, 8, 1].")

Or by string:

agent = Agent("...", model="...", architecture="tree-of-thoughts")

Source

loomflow/architecture/tree_of_thoughts.py

Not for general task automation. ToT is 5–10× ReAct cost. Don’t default to it. Reach for it when ReAct’s straight-line trajectory commits too early and produces wrong answers in your domain (puzzles, math, planning, constraint satisfaction).

Last updated on