Tree of Thoughts
Branching exploration with per-node evaluation. Yao et al. 2023 , Tree of Thoughts: Deliberate Problem Solving with Large Language Models.
Useful for combinatorial reasoning, multi-step planning, math (Game of 24), puzzle solving. Anywhere a single straight-shot ReAct trajectory would commit too early.
Pattern (BFS beam search)
proposer (×branch_factor) evaluator
prompt ──► [t1, t2, t3] ──score──► [0.9, 0.4, 0.7]
│
keep top beam_width
drop below min_score
│
▼
[t1, t3] ←── frontier for depth 2
│
(repeat to max_depth)
│
▼
best leaf wins- Root is the problem statement.
- For each level up to
max_depth: a. Expand. For every frontier node, the proposer generatesbranch_factorcandidate “thoughts” (next steps toward a solution). b. Evaluate. The evaluator scores each candidate 0–1 (how promising is this branch?). c. Prune. Keep only the topbeam_widthscored candidates. d. Early exit. If any candidate scores>= solved_threshold, stop early and use that branch. - Best leaf wins. The highest-scoring leaf across the whole tree becomes the final answer.
Usage
from loomflow import Agent
from loomflow.architecture import TreeOfThoughts
agent = Agent(
"Solve the puzzle by exploring multiple lines of reasoning.",
model="claude-opus-4-7",
architecture=TreeOfThoughts(
branch_factor=3, # candidates per node
beam_width=2, # keep top-2 each depth
max_depth=4,
solved_threshold=0.95, # early-exit if any candidate hits this
min_score=0.3, # drop branches below this
),
)
result = await agent.run("Game of 24 with [3, 7, 8, 1].")Or by string:
agent = Agent("...", model="...", architecture="tree-of-thoughts")Cost
- LLM calls per depth: roughly
frontier_size × (branch_factor + 1). Propose + evaluate. - Total: at full depth,
O(branch_factor^depth)proposals before pruning, thenbeam_width × max_depthafter. - Real numbers: branch=3, beam=2, depth=4 → ~30 LLM calls vs ~5 for ReAct. Reserve for tasks where ReAct’s first commit is wrong.
When ToT pays off
- Math, puzzles, planning. Game of 24, crosswords, mini Sudoku, constraint satisfaction.
- Combinatorial generation. Story branches, counterfactual exploration.
- Backtracking-friendly problems. Situations where committing to the wrong first move is hard to recover from in ReAct.
Not for general task automation. ToT is 5–10× ReAct cost. Don’t default to it. Reach for it when ReAct’s straight-line trajectory commits too early and produces wrong answers in your domain.
Last updated on