Error Classification
TokenHub classifies provider errors to enable intelligent failover. Each error from a provider is classified into one of four categories that determine the routing engine's next action.
Error Classes
context_overflow
The request exceeds the model's context window.
Triggers:
- HTTP 413 from provider
- Response body contains
context_length_exceeded
Router action: Escalate to a model with a larger context window. If no larger model is available, try the next model in scored order.
rate_limited
The provider is throttling requests.
Triggers:
- HTTP 429 from provider
Router action: Skip to a different provider. If the response includes a Retry-After header, the delay is recorded in the classified error for optional use by the caller.
transient
A temporary server-side failure.
Triggers:
- HTTP 5xx from provider
Router action: Retry the same model with exponential backoff:
- Base delay: 100ms
- Maximum retries: 2
- Backoff multiplier: 2x (100ms, 200ms)
After retries are exhausted, try the next model.
fatal
An unrecoverable client error.
Triggers:
- HTTP 4xx (except 429 and 413)
- Any other unclassified error
Router action: Skip to the next model in scored order. No retry.
Error Flow
Provider returns error
│
├── adapter.ClassifyError(err) → ClassifiedError{Class, RetryAfter}
│
└── Router handles based on class:
├── context_overflow → Find bigger model
├── rate_limited → Different provider (respect RetryAfter)
├── transient → Retry with backoff (up to 2x)
└── fatal → Next model
ClassifiedError Type
type ClassifiedError struct {
Err error
Class ErrorClass // "context_overflow", "rate_limited", "transient", "fatal"
RetryAfter float64 // Seconds to wait (from Retry-After header, 429 only)
}
HTTP Error Responses
Consumer API Errors
| Status | Meaning | When |
|---|---|---|
| 400 | Bad Request | Invalid JSON, missing messages, validation failure |
| 401 | Unauthorized | Missing or invalid API key |
| 403 | Forbidden | Valid key but insufficient scopes |
| 502 | Bad Gateway | All models failed, no eligible models, or provider errors |
Admin API Errors
| Status | Meaning | When |
|---|---|---|
| 400 | Bad Request | Invalid parameters or validation failure |
| 404 | Not Found | Resource not found (model, key, provider) |
| 500 | Internal Server Error | Database or vault errors |
Provider-Specific Classification
OpenAI
| HTTP Status | Body Pattern | Error Class |
|---|---|---|
| 429 | — | rate_limited |
| 500-599 | — | transient |
| 400 | context_length_exceeded | context_overflow |
| Other 4xx | — | fatal |
Anthropic
| HTTP Status | Body Pattern | Error Class |
|---|---|---|
| 429 | — | rate_limited |
| 500-599 | — | transient |
| 400 | context_length_exceeded | context_overflow |
| Other 4xx | — | fatal |
vLLM
| HTTP Status | Body Pattern | Error Class |
|---|---|---|
| 429 | — | rate_limited |
| 500-599 | — | transient |
| 400 | context_length_exceeded | context_overflow |
| Other 4xx | — | fatal |
Reward Impact
Error classification affects the contextual bandit reward system:
- Successful requests: Reward computed from latency and cost
- Failed requests: Reward = 0.0 (regardless of error class)
- Error class is stored in reward entries for analysis
This ensures the Thompson Sampling policy learns to avoid unreliable models over time.