See responses as they happen
Don't make users stare at a loading spinner. Stream tokens in real time with async generators. Enabling it takes one parameter.
Live comparison
The difference is instant
Watch the same request side by side. Without streaming, you wait. With streaming, you see every token as it arrives.
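To get a feel for the gap, here is a minimal sketch that times both modes against a simulated provider. The provider, the token list, and the 50 ms delay are all made up for illustration; nothing here calls Veska or a real model.

```python
import asyncio
import time
from typing import AsyncGenerator

TOKENS = ["Streaming ", "feels ", "faster."]
DELAY = 0.05  # simulated seconds per token; a made-up value


async def fake_provider() -> AsyncGenerator[str, None]:
    """Hypothetical provider that emits one token at a time."""
    for token in TOKENS:
        await asyncio.sleep(DELAY)
        yield token


async def time_to_first_output(stream: bool) -> float:
    """Seconds until the user could see anything on screen."""
    start = time.perf_counter()
    tokens = fake_provider()
    if stream:
        await tokens.__anext__()  # the first token is visible right away
        await tokens.aclose()     # stop the simulated provider early
    else:
        # Nothing can be shown until every token has arrived.
        "".join([token async for token in tokens])
    return time.perf_counter() - start


async def main() -> None:
    with_streaming = await time_to_first_output(stream=True)
    without_streaming = await time_to_first_output(stream=False)
    print(f"first output with streaming:    {with_streaming:.2f}s")
    print(f"first output without streaming: {without_streaming:.2f}s")


if __name__ == "__main__":
    asyncio.run(main())
```

The total work is the same in both cases. Only the moment the first text appears changes.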
How it works
Three steps to real-time responses
User sends a message
Your app calls agent.run() with stream=True. The agent starts processing immediately.
Tokens stream back
The AI provider sends tokens one by one. Veska yields each through an async generator.
App displays instantly
Your frontend renders each token as it arrives. Users see the response building in real time.
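The three steps above can be sketched with a stand-in agent. FakeAgent is hypothetical and is not the Veska API; only the run() call with stream=True comes from this page, and the rest is an assumption about how such a call might be consumed.

```python
import asyncio
from typing import AsyncGenerator


class FakeAgent:
    """Hypothetical stand-in for an agent; not the real Veska class."""

    def __init__(self, reply: str) -> None:
        self._words = reply.split(" ")

    async def _stream(self) -> AsyncGenerator[str, None]:
        for index, word in enumerate(self._words):
            await asyncio.sleep(0)  # pretend to wait on the provider
            yield word if index == 0 else " " + word

    async def _complete(self) -> str:
        return "".join([token async for token in self._stream()])

    def run(self, message: str, stream: bool = False):
        # stream=True returns an async generator of tokens;
        # otherwise the caller awaits the full reply as one string.
        return self._stream() if stream else self._complete()


async def main() -> None:
    agent = FakeAgent("Tokens arrive one at a time")
    # Step 1: the app calls run() with stream=True.
    async for token in agent.run("Hello", stream=True):
        # Steps 2 and 3: each token is shown the moment it is yielded.
        print(token, end="", flush=True)
    print()


if __name__ == "__main__":
    asyncio.run(main())
```

The consuming code is a plain async for loop, so the same pattern fits a terminal, a websocket handler, or an HTTP response.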
Benefits
Why streaming matters
Faster perceived speed
Users see the first token as soon as the provider emits it, instead of waiting for the full response to finish.
Works with tools
Streaming continues to work when agents use tools mid-response. No interruption.
One parameter
Enable streaming by passing stream=True to run(). No config changes, no infrastructure setup.
SSE-ready
Wrap the async generator in Server-Sent Events for real-time web apps. Works with any frontend that can read an event stream.
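As a rough sketch of the SSE wrapping, the helper below turns any async generator of tokens into Server-Sent Events frames. The to_sse and demo_tokens names and the closing "done" event are assumptions made for this example, not part of Veska. In a web app you would hand the resulting generator to your framework's streaming response with the text/event-stream content type.

```python
import asyncio
from typing import AsyncGenerator, AsyncIterator


async def to_sse(tokens: AsyncIterator[str]) -> AsyncGenerator[str, None]:
    """Wrap each token in one Server-Sent Events frame."""
    async for token in tokens:
        # A payload containing newlines needs one "data:" line per line.
        lines = token.split("\n")
        yield "".join(f"data: {line}\n" for line in lines) + "\n"
    # Hypothetical end-of-stream marker so the client knows when to close.
    yield "event: done\ndata: [DONE]\n\n"


async def demo_tokens() -> AsyncGenerator[str, None]:
    """Stand-in for the token stream an agent would produce."""
    for token in ["Hello", ", ", "world"]:
        yield token


async def main() -> None:
    async for frame in to_sse(demo_tokens()):
        print(repr(frame))


if __name__ == "__main__":
    asyncio.run(main())
```

Each frame ends with a blank line, which is how an EventSource client in the browser knows one event is complete.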
Architecture
Tokens flow from provider to user
When streaming is enabled, the AI provider sends tokens one at a time. Veska yields each token through an async generator. Your app renders each one as it arrives, with no buffering in between.
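One way to picture this flow is a chain of async generators in which each layer hands a token on before asking for the next. The provider, passthrough, and render_all functions below are illustrative stand-ins, not Veska internals; the log simply records the order in which things happen.

```python
import asyncio
from typing import AsyncGenerator, AsyncIterator, List


async def provider(log: List[str]) -> AsyncGenerator[str, None]:
    """Hypothetical provider that emits tokens one at a time."""
    for token in ["Hi", " there"]:
        log.append(f"provider sent {token!r}")
        await asyncio.sleep(0)
        yield token


async def passthrough(upstream: AsyncIterator[str]) -> AsyncGenerator[str, None]:
    """Middle layer: yields each token immediately, without collecting."""
    async for token in upstream:
        yield token


async def render_all(log: List[str]) -> None:
    """App layer: records each token as soon as it is received."""
    async for token in passthrough(provider(log)):
        log.append(f"app rendered {token!r}")


if __name__ == "__main__":
    events: List[str] = []
    asyncio.run(render_all(events))
    print("\n".join(events))
```

In the log, every "sent" entry is followed directly by its "rendered" entry, which is what the absence of buffering looks like in practice.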
Start streaming today
Enable real-time responses with a single parameter.