Extended Thinking
Extended thinking gives agents a budget-controlled scratchpad for chain-of-thought reasoning before they produce their final answer. Giving the model room to reason first tends to improve answer quality on complex, multi-step tasks.
Enabling thinking
Pass a thinking config when creating the agent:
thinking.py
from veska import Agent

agent = Agent(
    name="analyst",
    system_prompt="You analyze complex problems step by step.",
    model="claude-sonnet-4-6",
    thinking={
        "enabled": True,
        "budget_tokens": 10000,  # Max tokens for thinking
        "output": "discard",     # What to do with thinking text
    },
)

Output modes
| Mode | Behavior |
|---|---|
| discard | Thinking text is used internally but not returned (default) |
| log | Thinking text is logged via the Logger system |
| expose | Thinking text is returned in the response for debugging |
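
The three modes in the table can be pictured as a small dispatch on the thinking text. The sketch below is illustrative only and does not show veska's internals: `route_thinking`, the `"thinking"` response key, and the use of the standard `logging` module in place of veska's Logger system are all assumptions made for this example.

```python
import logging

# Stand-in for veska's Logger system (assumption: the real logger differs).
logger = logging.getLogger("thinking_sketch")

VALID_MODES = {"discard", "log", "expose"}


def route_thinking(mode: str, thinking_text: str, answer: str) -> dict:
    """Hypothetical helper: build a response dict according to the output mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown output mode: {mode!r}")

    response = {"output": answer}
    if mode == "log":
        # Thinking text goes to the log, not to the caller.
        logger.info("thinking: %s", thinking_text)
    elif mode == "expose":
        # Thinking text is handed back alongside the final answer.
        response["thinking"] = thinking_text
    # "discard": thinking text is dropped after it has been used.
    return response
```

In this sketch only `expose` changes what the caller receives; `discard` and `log` return the same response and differ only in the side effect.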
Exposing thinking
expose.py
from veska import Agent

agent = Agent(
    name="debugger",
    system_prompt="You debug code issues.",
    model="claude-sonnet-4-6",
    thinking={"enabled": True, "budget_tokens": 15000, "output": "expose"},
)

result = agent.run("Why does this function return None?")
# result.output contains the final answer.
# Because output is "expose", the thinking text used to reach that answer
# is also returned in the response for debugging.

Streaming with thinking
When streaming, thinking chunks arrive as thinking_delta events:
stream_thinking.py
def on_token(text):
    print(text, end="")

result = agent.run("Solve this math problem", stream=on_token)

ThinkingHandler
handler.py
from veska import ThinkingHandler

handler = ThinkingHandler(
    enabled=True,
    budget_tokens=10000,
    output="log",
)

config = handler.get_config()  # ThinkingConfig for the provider
handler.get_history()          # Past thinking entries
handler.clear_history()        # Clear thinking history

Provider support
Extended thinking is supported on compatible Claude models. Call provider.supports_thinking() to check whether the current model supports it before enabling thinking.
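
One way to use that check is to build the thinking config only when the provider reports support. The sketch below assumes `supports_thinking()` takes no arguments and returns a bool. `StubProvider` and `build_thinking_config` are hypothetical names introduced here so the example runs without veska installed.

```python
from typing import Optional, Protocol


class ThinkingCapable(Protocol):
    """Anything exposing supports_thinking(); the return type is an assumption."""

    def supports_thinking(self) -> bool: ...


class StubProvider:
    """Stand-in provider for illustration; not part of veska."""

    def __init__(self, thinking_ok: bool) -> None:
        self._thinking_ok = thinking_ok

    def supports_thinking(self) -> bool:
        return self._thinking_ok


def build_thinking_config(
    provider: ThinkingCapable,
    budget_tokens: int = 10000,
    output: str = "discard",
) -> Optional[dict]:
    """Return a thinking config dict, or None when the model lacks support."""
    if not provider.supports_thinking():
        return None
    return {"enabled": True, "budget_tokens": budget_tokens, "output": output}


config = build_thinking_config(StubProvider(thinking_ok=True), budget_tokens=5000)
```

When the helper returns None, the caller would simply leave the thinking argument off when creating the agent. How veska itself reacts to a thinking config on an unsupported model is not covered by this page.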