Providers

PRX supports 14 LLM providers, each implementing a common Provider trait. The trait abstracts away differences in API formats, authentication, streaming, and tool calling, presenting a uniform interface to the Router and the rest of the system.
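A minimal sketch of what such an abstraction might look like is below. The trait, type, and method names here are illustrative assumptions, not PRX's actual API; a real implementation would be async and stream events rather than return them in a batch.

```rust
// Hypothetical sketch of a provider abstraction. Names are illustrative
// and do not reflect PRX's real trait definition.

#[derive(Debug, Clone)]
struct ChatRequest {
    model: String,
    messages: Vec<String>, // simplified: real messages carry roles and content blocks
    tools: Vec<String>,    // serialized tool definitions
}

#[derive(Debug)]
enum ProviderEvent {
    Text(String),                               // a chunk of model output
    ToolCall { name: String, arguments: String }, // a structured tool invocation
    Done,
}

// Each backend (Anthropic, OpenAI, Ollama, ...) implements this trait,
// hiding its own auth scheme and wire format behind a uniform interface.
trait Provider {
    fn name(&self) -> &str;
    fn supports_native_tools(&self) -> bool;
    fn complete(&self, req: &ChatRequest) -> Result<Vec<ProviderEvent>, String>;
}

// Trivial stand-in implementation used only to exercise the interface.
struct EchoProvider;

impl Provider for EchoProvider {
    fn name(&self) -> &str { "echo" }
    fn supports_native_tools(&self) -> bool { false }
    fn complete(&self, req: &ChatRequest) -> Result<Vec<ProviderEvent>, String> {
        // Echo the last message back as a single text event.
        let last = req.messages.last().cloned().unwrap_or_default();
        Ok(vec![ProviderEvent::Text(last), ProviderEvent::Done])
    }
}

fn main() {
    let p = EchoProvider;
    let req = ChatRequest {
        model: "demo".into(),
        messages: vec!["hello".into()],
        tools: vec![],
    };
    let events = p.complete(&req).unwrap();
    println!("{} events from {}", events.len(), p.name());
}
```

Because the Router only sees this interface, adding a fifteenth provider means implementing the trait, not touching routing logic.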

| Provider | Models | Auth | Notes |
|---|---|---|---|
| Anthropic | Claude Opus, Sonnet, Haiku | API key, OAuth (auto-refresh) | Primary provider; OAuth token refresh is automatic |
| OpenAI | GPT-4o, GPT-4.1, o1, o3, o4-mini | API key | Full function calling support |
| OpenAI Codex | codex-mini | API key | Code-specialized; tool use via Responses API |
| Google Gemini | Gemini 2.5 Pro/Flash | API key | Native function calling |
| DashScope / Qwen | Qwen-Max, Qwen-Plus, Qwen-Turbo | API key | Alibaba Cloud; compatible API |
| Ollama | Any GGUF model | Local (no key) | Local inference; no tool calling |
| OpenRouter | Any model on OpenRouter | API key | Aggregator; routing across 100+ models |
| AWS Bedrock | Claude, Titan, Llama | IAM credentials | SigV4 signing; enterprise deployment |
| GitHub Copilot | GPT-4o, Claude | Copilot token | Reuses VS Code / CLI Copilot auth |
| GLM / Zhipu | GLM-4, GLM-4V | API key | Chinese market; vision support |
| xAI | Grok | API key | OpenAI-compatible API |
| LiteLLM | Any model behind LiteLLM proxy | API key or local | Unified proxy; useful for custom deployments |
| vLLM | Any model served by vLLM | Local endpoint | High-throughput local inference |
| HuggingFace | Inference API models | API token | HuggingFace Inference Endpoints |

LLM providers differ in how they handle tool/function calling. PRX normalizes this through two modes:

Providers that support structured tool calling natively (Anthropic, OpenAI, Google Gemini, etc.) receive tool definitions as part of the API request. The provider returns structured tool-use blocks that PRX parses and executes directly.

For providers without native tool support (Ollama, some vLLM models), PRX injects tool definitions into the system prompt along with instructions for the model to emit tool calls in a structured text format. PRX then parses the model output to extract tool invocations.
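One way to implement that output-parsing step is sketched below. The `<tool_call>` marker format is a hypothetical assumption for illustration; the actual markers and payload format PRX instructs models to emit are not documented here.

```rust
// Illustrative sketch of prompt-guided tool-call extraction. The
// <tool_call>...</tool_call> wire format is an assumption, not PRX's.

/// Extract the payloads of tool invocations embedded in plain model output.
fn extract_tool_calls(output: &str) -> Vec<String> {
    let mut calls = Vec::new();
    let mut rest = output;
    while let Some(start) = rest.find("<tool_call>") {
        let after = &rest[start + "<tool_call>".len()..];
        match after.find("</tool_call>") {
            Some(end) => {
                // Keep the raw payload; a real parser would decode it as JSON
                // and validate it against the injected tool schema.
                calls.push(after[..end].trim().to_string());
                rest = &after[end + "</tool_call>".len()..];
            }
            None => break, // unterminated marker: stop rather than mis-parse
        }
    }
    calls
}

fn main() {
    let output = "Let me check that file.\n\
        <tool_call>{\"name\":\"read_file\",\"path\":\"a.txt\"}</tool_call>\n\
        Done.";
    let calls = extract_tool_calls(output);
    assert_eq!(calls.len(), 1);
    println!("{}", calls[0]);
}
```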

```
Tool Call Flow

Tools defined ──┬── Native ──────── Provider API (structured)
                │
                └── PromptGuided ── System prompt injection +
                                    output parsing
```

This abstraction means every provider can participate in agentic tool loops, regardless of native support.

Every provider is wrapped in a ReliableProvider that adds resilience:

Failed requests are retried with exponential backoff. The wrapper classifies errors to determine retry behavior:

| Error class | Retry | Behavior |
|---|---|---|
| Rate limited (429) | Yes | Respects `Retry-After` header; exponential backoff |
| Server error (5xx) | Yes | Up to 3 retries with jitter |
| Auth error (401/403) | No | Fails immediately; triggers token refresh for OAuth providers |
| Timeout | Yes | Retries with extended timeout |
| Context length exceeded | No | Fails immediately; caller should truncate |
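The status-code portion of this classification can be sketched as follows. The base delay, attempt cap, and enum names are illustrative assumptions, not PRX's exact values, and timeout/context-length errors (which are not HTTP statuses) are omitted for brevity.

```rust
// Sketch of retry classification mirroring the table above.
// Backoff base (500 ms) and the 3-attempt cap are assumed values.

#[derive(Debug, PartialEq)]
enum Retry {
    Yes { backoff_ms: u64 },
    No,
}

fn classify(status: u16, attempt: u32) -> Retry {
    // Exponential backoff: 500 ms doubled per attempt. A real wrapper
    // would add jitter and honor Retry-After for 429 responses.
    let backoff_ms = 500u64 * 2u64.pow(attempt);
    match status {
        429 => Retry::Yes { backoff_ms },                      // rate limited
        500..=599 if attempt < 3 => Retry::Yes { backoff_ms }, // server error
        401 | 403 => Retry::No, // auth error: refresh token, don't retry
        _ => Retry::No,
    }
}

fn main() {
    assert_eq!(classify(429, 0), Retry::Yes { backoff_ms: 500 });
    assert_eq!(classify(503, 1), Retry::Yes { backoff_ms: 1000 });
    assert_eq!(classify(503, 3), Retry::No); // retries exhausted
    assert_eq!(classify(401, 0), Retry::No);
    println!("classification ok");
}
```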

When a provider is exhausted (all retries failed), the ReliableProvider falls back to the next provider in a configured chain:

```toml
[router.fallback]
chain = ["anthropic/claude-sonnet-4-20250514", "openai/gpt-4o", "google/gemini-2.5-pro"]
```

The Router tries each provider/model pair in order. If the primary is down or rate-limited, the request transparently moves to the next option.
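Walking the chain amounts to trying each target until one responds. A minimal sketch of that loop, with a simplified send function standing in for a real provider call:

```rust
// Sketch of cross-provider fallback. The `send` closure stands in for a
// real provider request; the loop itself is the fallback chain logic.

fn try_chain(
    chain: &[&str],
    send: impl Fn(&str) -> Result<String, String>,
) -> Result<String, String> {
    let mut last_err = String::from("empty fallback chain");
    for &target in chain {
        match send(target) {
            Ok(resp) => return Ok(resp), // first healthy provider wins
            Err(e) => last_err = e,      // remember the failure, try the next
        }
    }
    Err(last_err) // every provider in the chain was exhausted
}

fn main() {
    let chain = ["anthropic/claude-sonnet-4-20250514", "openai/gpt-4o"];
    // Simulate the primary being rate-limited and the secondary succeeding.
    let result = try_chain(&chain, |t| {
        if t.starts_with("anthropic") {
            Err("429 rate limited".into())
        } else {
            Ok(format!("ok via {t}"))
        }
    });
    assert_eq!(result.unwrap(), "ok via openai/gpt-4o");
}
```

In PRX each `send` attempt is itself a `ReliableProvider` call, so the per-provider retry policy above runs before the chain advances.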

Within a single provider, model-level fallback is also supported:

```toml
[providers.anthropic]
models = ["claude-sonnet-4-20250514", "claude-haiku-4-20250414"]
fallback_order = ["claude-sonnet-4-20250514", "claude-haiku-4-20250414"]
```

If the preferred model is unavailable, PRX downgrades to the next model at the same provider before attempting a cross-provider fallback.

```toml
[providers.anthropic]
enabled = true
api_key = "sk-ant-..."
# Or use OAuth (token auto-refreshes)
# oauth_client_id = "..."
# oauth_client_secret = "..."
default_model = "claude-sonnet-4-20250514"

[providers.openai]
enabled = true
api_key = "sk-..."
default_model = "gpt-4o"

[providers.ollama]
enabled = true
base_url = "http://localhost:11434"
default_model = "llama3.1:70b"
tool_mode = "prompt_guided" # No native tool calling
```

Each provider entry specifies credentials, a default model, and optional overrides for tool calling mode, timeout, and retry limits.
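The in-memory shape of one such entry might look like the struct below. The field names mirror the TOML keys above, but the defaults shown (60 s timeout, 3 retries) and the struct itself are assumptions for illustration, not PRX's actual configuration types.

```rust
// Hypothetical deserialized form of a [providers.*] entry.
// Field names follow the TOML keys above; defaults are assumed values.

#[derive(Debug, Clone, PartialEq)]
enum ToolMode {
    Native,
    PromptGuided,
}

#[derive(Debug)]
struct ProviderConfig {
    enabled: bool,
    api_key: Option<String>,  // absent for local or OAuth-based providers
    base_url: Option<String>, // used by local providers like Ollama/vLLM
    default_model: String,
    tool_mode: ToolMode,      // override for providers without native tools
    timeout_secs: u64,
    max_retries: u32,
}

impl ProviderConfig {
    fn new(default_model: &str) -> Self {
        ProviderConfig {
            enabled: true,
            api_key: None,
            base_url: None,
            default_model: default_model.to_string(),
            tool_mode: ToolMode::Native,
            timeout_secs: 60,
            max_retries: 3,
        }
    }
}

fn main() {
    // Mirror the [providers.ollama] example above.
    let mut ollama = ProviderConfig::new("llama3.1:70b");
    ollama.base_url = Some("http://localhost:11434".into());
    ollama.tool_mode = ToolMode::PromptGuided; // no native tool calling
    println!("{:?}", ollama);
}
```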