Error Recovery

Veska has a three-level error recovery system that handles failures automatically, escalating from simple retries to multi-agent discussion rooms.

Three severity levels

Level 1 — Retry

The failed operation is simply attempted again.

Level 2 — Agent Fix

The agent receives the error context and attempts to fix the problem itself.

Level 3 — Discussion Room

Multiple agents collaborate to diagnose and fix the problem.

Auto-escalation

Errors automatically escalate based on retry count:

Retries   Severity
0-1       Level 1 (Retry)
2         Level 2 (Agent Fix)
3+        Level 3 (Discussion Room)
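
The mapping in the table can be sketched as a small standalone function. This is only an illustration of the thresholds listed above, not Veska's implementation: the `Severity` enum and the `severity_for` helper are hypothetical names, and rejecting negative counts is an assumption.

```python
from enum import Enum


class Severity(Enum):
    """Hypothetical stand-in for Veska's severity levels."""
    LEVEL_1 = "Retry"
    LEVEL_2 = "Agent Fix"
    LEVEL_3 = "Discussion Room"


def severity_for(retry_count: int) -> Severity:
    """Map a retry count to a severity level, following the table above."""
    if retry_count < 0:
        # Assumption: a negative retry count is treated as a caller error.
        raise ValueError("retry_count must be non-negative")
    if retry_count <= 1:
        return Severity.LEVEL_1
    if retry_count == 2:
        return Severity.LEVEL_2
    return Severity.LEVEL_3


# Show the level chosen for the first few retry counts.
for n in range(5):
    print(n, severity_for(n).value)
```

Under this sketch, the first failure and its first retry both stay at Level 1, so an agent is only asked to intervene once a plain retry has already failed.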

Error categories

TOOL_FAILURE, CODE_ERROR, DEPENDENCY_ERROR, TIMEOUT, PROVIDER_ERROR, PERMISSION_ERROR, CROSS_AGENT, UNKNOWN
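
To make the idea of classification concrete, here is a minimal sketch that assigns one of the categories above to an error message by keyword matching. The `Category` enum mirrors the listed names, but the `KEYWORD_HINTS` table and the `classify` function are purely hypothetical; how Veska actually classifies errors is not described here.

```python
from enum import Enum


class Category(Enum):
    """The eight categories listed above, as a standalone enum."""
    TOOL_FAILURE = "TOOL_FAILURE"
    CODE_ERROR = "CODE_ERROR"
    DEPENDENCY_ERROR = "DEPENDENCY_ERROR"
    TIMEOUT = "TIMEOUT"
    PROVIDER_ERROR = "PROVIDER_ERROR"
    PERMISSION_ERROR = "PERMISSION_ERROR"
    CROSS_AGENT = "CROSS_AGENT"
    UNKNOWN = "UNKNOWN"


# Hypothetical keyword hints, checked in order. Not Veska's actual rules.
KEYWORD_HINTS = [
    ("timed out", Category.TIMEOUT),
    ("permission denied", Category.PERMISSION_ERROR),
    ("connection refused", Category.DEPENDENCY_ERROR),
    ("syntaxerror", Category.CODE_ERROR),
]


def classify(message: str) -> Category:
    """Return the first category whose keyword appears in the message."""
    lowered = message.lower()
    for needle, category in KEYWORD_HINTS:
        if needle in lowered:
            return category
    # Anything unrecognized falls back to UNKNOWN.
    return Category.UNKNOWN


print(classify("Connection refused on port 5432").value)
print(classify("Something odd happened").value)
```

Falling back to UNKNOWN keeps every error classifiable, which matters if counts are later grouped by category.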

ErrorDetector

detector.py
from veska import ErrorDetector

detector = ErrorDetector()

# Detect and classify an error
error = detector.detect(
    agent_name="backend",
    message="Connection refused on port 5432",
    task_id="task_001",
    retry_count=0,
)

print(error.category)    # ErrorCategory.DEPENDENCY_ERROR
print(error.severity)    # ErrorSeverity.LEVEL_1

# Resolve an error
detector.resolve(error.id, "Restarted database connection")

# Query errors
unresolved = detector.get_unresolved(agent_name="backend")
count = detector.get_error_count("task_001")
print(detector.stats)    # {total, resolved, unresolved, by_severity, by_category}

Fix reports

At Level 2 and above, agents generate fix reports that describe the problem and a proposed fix:

FixReport(
    id: str,                    # fix_XXXXXXXX
    error_id: str,              # The error being fixed
    from_agent: str,            # Agent that found the problem
    to_agent: str,              # Agent that needs to fix it
    problem: str,               # Description of the issue
    suggestion: str = "",       # Proposed fix
    affected_files: list[str],  # Files involved
    status: str = "pending",    # pending → accepted → completed
)
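
The field list above can be modelled as a dataclass to show how a report might move through its statuses. This is a sketch under stated assumptions, not the library's class: `FixReportSketch`, `STATUS_ORDER`, and `advance` are hypothetical names, the empty-list default for `affected_files` is an assumption, and the example IDs and file path are made up for illustration.

```python
from dataclasses import dataclass, field

# Status order taken from the comment in the signature above.
STATUS_ORDER = ("pending", "accepted", "completed")


@dataclass
class FixReportSketch:
    """Hypothetical model of a fix report; not Veska's FixReport class."""
    id: str
    error_id: str
    from_agent: str
    to_agent: str
    problem: str
    suggestion: str = ""
    # Assumption: defaults to an empty list when no files are given.
    affected_files: list[str] = field(default_factory=list)
    status: str = "pending"

    def advance(self) -> str:
        """Move to the next status; raise if the report is already completed."""
        position = STATUS_ORDER.index(self.status)
        if position == len(STATUS_ORDER) - 1:
            raise ValueError("report is already completed")
        self.status = STATUS_ORDER[position + 1]
        return self.status


# Example values below are placeholders, not real Veska identifiers.
report = FixReportSketch(
    id="fix_0a1b2c3d",
    error_id="err_example",
    from_agent="frontend",
    to_agent="backend",
    problem="Connection refused on port 5432",
    suggestion="Restart the database connection",
    affected_files=["db/connection.py"],
)
print(report.status)
print(report.advance())
```

Keeping the allowed statuses in one ordered tuple means a report cannot skip from pending straight to completed in this sketch.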