Configuring provider fallbacks

This page is part of the French-language practical guide to Hermes Agent. It answers the search intent: planning fallback models.

The content is based on the official Hermes Agent documentation associated with this page. The goal is not to replace the Nous Research documentation but to offer a clear, quick-to-scan reading of it, with logical links to the complementary pages on the same site.

Key points

Main topic: Hermes Agent fallback provider.
Page type: spoke.
Cluster: providers.
Canonical source: official Hermes Agent documentation.

When to use this page

Use this page when you want to plan fallback models. It assumes you have already chosen Hermes Agent as the tool to set up or understand, then details the key points from the official documentation.

If you are only just discovering the tool, go back to the parent hub first, then follow the internal links offered at the end of the page.

Official reference

Hermes Agent has three layers of resilience that keep your sessions running when providers hit issues:

Credential pools — rotate across multiple API keys for the *same* provider (tried first)
Primary model fallback — automatically switches to a *different* provider:model when your main model fails
Auxiliary task fallback — independent provider resolution for side tasks like vision, compression, and web extraction

Credential pools handle same-provider rotation (e.g., multiple OpenRouter keys). This page covers cross-provider fallback. Both are optional and work independently.

Primary Model Fallback

When your main LLM provider encounters errors — rate limits, server overload, auth failures, connection drops — Hermes can automatically switch to a backup provider:model pair mid-session without losing your conversation.

Configuration

The easiest path is the interactive manager:

hermes fallback

hermes fallback reuses the provider picker from hermes model — same provider list, same credential prompts, same validation. Use the subcommands add, list (alias ls), remove (alias rm), and clear to manage the chain. Changes persist under the top-level fallback_providers: list in config.yaml.

If you'd rather edit the YAML directly, add a top-level fallback_providers list to ~/.hermes/config.yaml:

fallback_providers:
  - provider: openrouter
    model: anthropic/claude-sonnet-4

Each entry requires both provider and model. Entries missing either field are ignored.
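
The rule above can be sketched in a few lines of Python. This is a minimal illustration, not Hermes' actual code: the function name `valid_fallback_entries` is hypothetical, and the entries are plain dictionaries standing in for the parsed YAML list.

```python
def valid_fallback_entries(entries):
    """Keep only entries that define both a non-empty provider and model."""
    kept = []
    for entry in entries or []:
        if isinstance(entry, dict) and entry.get("provider") and entry.get("model"):
            kept.append(entry)
    return kept


raw = [
    {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"},
    {"provider": "nous"},          # missing model: skipped
    {"model": "my-local-model"},   # missing provider: skipped
]
chain = valid_fallback_entries(raw)
```

Running a check like this on your own config before relying on it can reveal an entry that would be silently ignored.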

fallback_providers (plural, list) is the current config shape and supports multiple fallbacks tried in order. fallback_model (singular) is the legacy single-fallback key — Hermes still honors it for back-compat, but hermes fallback writes the current fallback_providers key and migrates legacy config on write. When both are set, fallback_providers takes priority.
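
The precedence between the two keys could be modelled as follows. This is a sketch under assumptions: the helper name `effective_fallback_chain` is hypothetical, and the legacy `fallback_model` value is assumed to be a single mapping with `provider` and `model` fields, which the text above does not spell out.

```python
def effective_fallback_chain(config):
    """Return the fallback chain, preferring the plural key over the legacy one."""
    providers = config.get("fallback_providers")
    if providers:
        return list(providers)
    legacy = config.get("fallback_model")  # assumed shape: {"provider": ..., "model": ...}
    if legacy:
        return [legacy]
    return []


both = {
    "fallback_providers": [{"provider": "openrouter", "model": "anthropic/claude-sonnet-4"}],
    "fallback_model": {"provider": "nous", "model": "nous-hermes-3"},
}
legacy_only = {"fallback_model": {"provider": "nous", "model": "nous-hermes-3"}}
```

With `both`, only the OpenRouter entry is returned; with `legacy_only`, the single legacy entry becomes a one-item chain.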

Supported Providers

Provider — Value — Requirements
OpenRouter — openrouter — OPENROUTER_API_KEY
Nous Portal — nous — hermes setup --portal (fresh) or hermes auth add nous (OAuth)
OpenAI Codex — openai-codex — hermes model (ChatGPT OAuth)
GitHub Copilot — copilot — COPILOT_GITHUB_TOKEN, GH_TOKEN, or GITHUB_TOKEN
GitHub Copilot ACP — copilot-acp — External process (editor integration)
Anthropic — anthropic — ANTHROPIC_API_KEY or Claude Code credentials
z.ai / GLM — zai — GLM_API_KEY
Kimi / Moonshot — kimi-coding — KIMI_API_KEY
MiniMax — minimax — MINIMAX_API_KEY
MiniMax (China) — minimax-cn — MINIMAX_CN_API_KEY
DeepSeek — deepseek — DEEPSEEK_API_KEY
NVIDIA NIM — nvidia — NVIDIA_API_KEY (optional: NVIDIA_BASE_URL)
GMI Cloud — gmi — GMI_API_KEY (optional: GMI_BASE_URL)
StepFun — stepfun — STEPFUN_API_KEY (optional: STEPFUN_BASE_URL)
Ollama Cloud — ollama-cloud — OLLAMA_API_KEY
Google AI Studio — gemini — GOOGLE_API_KEY (alias: GEMINI_API_KEY)
xAI (Grok) — xai (alias grok) — XAI_API_KEY (optional: XAI_BASE_URL)
xAI Grok OAuth (SuperGrok) — xai-oauth (alias grok-oauth) — hermes model → xAI Grok OAuth (browser login; SuperGrok subscription)
AWS Bedrock — bedrock — Standard boto3 auth (AWS_REGION + AWS_PROFILE or AWS_ACCESS_KEY_ID)
Qwen Portal (OAuth) — qwen-oauth — hermes model (Qwen Portal OAuth; optional: HERMES_QWEN_BASE_URL)
MiniMax (OAuth) — minimax-oauth — hermes model (MiniMax portal OAuth)
OpenCode Zen — opencode-zen — OPENCODE_ZEN_API_KEY
OpenCode Go — opencode-go — OPENCODE_GO_API_KEY
Kilo Code — kilocode — KILOCODE_API_KEY
Xiaomi MiMo — xiaomi — XIAOMI_API_KEY
Arcee AI — arcee — ARCEEAI_API_KEY
Alibaba / DashScope — alibaba — DASHSCOPE_API_KEY
Alibaba Coding Plan — alibaba-coding-plan — ALIBABA_CODING_PLAN_API_KEY (falls back to DASHSCOPE_API_KEY)
Kimi / Moonshot (China) — kimi-coding-cn — KIMI_CN_API_KEY
Tencent TokenHub — tencent-tokenhub — TOKENHUB_API_KEY
Microsoft Foundry — azure-foundry — AZURE_FOUNDRY_API_KEY + AZURE_FOUNDRY_BASE_URL
LM Studio (local) — lmstudio — LM_API_KEY (or none for local) + LM_BASE_URL
Hugging Face — huggingface — HF_TOKEN
Custom endpoint — custom — base_url + key_env (see below)
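
Before adding a provider to the chain, it can help to confirm that its credentials are present. The sketch below covers only four rows of the table and is not part of Hermes: the names `REQUIRED_ENV` and `missing_credentials` are hypothetical, and each provider maps to a list of environment variables of which any one is treated as sufficient.

```python
import os

# Subset of the table above; any one variable in a list is enough.
REQUIRED_ENV = {
    "openrouter": ["OPENROUTER_API_KEY"],
    "deepseek": ["DEEPSEEK_API_KEY"],
    "copilot": ["COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN"],
    "gemini": ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
}


def missing_credentials(provider, env=None):
    """Return the candidate variable names if none is set, else an empty list."""
    env = os.environ if env is None else env
    names = REQUIRED_ENV.get(provider)
    if names is None:
        return []  # provider not covered by this sketch
    if any(env.get(name) for name in names):
        return []
    return list(names)
```

Calling `missing_credentials("copilot")` in the shell that launches Hermes returns an empty list as soon as one of the three tokens is exported.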

Custom Endpoint Fallback

For a custom OpenAI-compatible endpoint, add base_url and optionally key_env:

fallback_providers:
  - provider: custom
    model: my-local-model
    base_url: http://localhost:8000/v1
    key_env: MY_LOCAL_KEY            # env var name containing the API key
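
Note that `key_env` holds the name of an environment variable, not the key itself. The following sketch shows that indirection; the helper `resolve_custom_endpoint` is hypothetical and only illustrates the lookup, not how Hermes builds its client.

```python
import os


def resolve_custom_endpoint(entry, env=None):
    """Look up the API key through the variable named by key_env, if any."""
    env = os.environ if env is None else env
    key_env = entry.get("key_env")
    api_key = env.get(key_env) if key_env else None
    return {
        "base_url": entry["base_url"],
        "model": entry["model"],
        "api_key": api_key,  # None when key_env is absent or the variable is unset
    }


entry = {
    "provider": "custom",
    "model": "my-local-model",
    "base_url": "http://localhost:8000/v1",
    "key_env": "MY_LOCAL_KEY",
}
resolved = resolve_custom_endpoint(entry, env={"MY_LOCAL_KEY": "secret-value"})
```

Keeping only the variable name in config.yaml means the secret itself never lands in the file.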

When Fallback Triggers

The fallback activates automatically when the primary model fails with:

Rate limits (HTTP 429) — after exhausting retry attempts
Server errors (HTTP 500, 502, 503) — after exhausting retry attempts
Auth failures (HTTP 401, 403) — immediately (no point retrying)
Not found (HTTP 404) — immediately
Invalid responses — when the API returns malformed or empty responses repeatedly
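
The HTTP part of this list can be summarized as a small lookup. This is an illustrative sketch only: the function `fallback_timing` and its return labels are hypothetical, and the "invalid responses" case is left out because it is not tied to a status code.

```python
# Status codes taken from the list above.
IMMEDIATE = {401, 403, 404}            # auth failures and not found
AFTER_RETRIES = {429, 500, 502, 503}   # rate limits and server errors


def fallback_timing(status_code):
    """Say when the listed status code would trigger the fallback."""
    if status_code in IMMEDIATE:
        return "immediate"
    if status_code in AFTER_RETRIES:
        return "after_retries"
    return "not_listed"
```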

When triggered, Hermes:

Resolves credentials for the fallback provider
Builds a new API client
Swaps the model, provider, and client in-place
Resets the retry counter and continues the conversation

The switch is seamless — your conversation history, tool calls, and context are preserved. The agent continues from exactly where it left off, just using a different model.

Fallback is turn-scoped: each new user message starts with the primary model restored. If the primary fails mid-turn, fallback activates for that turn only. On the next message, Hermes tries the primary again. Within a single turn, fallback activates at most once — if the fallback also fails, normal error handling takes over (retries, then error message). This prevents cascading failover loops within a turn while giving the primary model a fresh chance every turn.
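
The turn-scoped behaviour can be simulated with a toy model. Everything here is hypothetical (the names `run_turn` and `make_call`, and the use of `RuntimeError` to stand for a provider failure after retries); the sketch only mirrors the rule that each turn starts on the primary and switches at most once.

```python
def run_turn(primary, fallback, call):
    """Try the primary; on failure, switch once to the fallback for this turn."""
    try:
        return primary, call(primary)
    except RuntimeError:
        if fallback is None:
            raise
        # A failure here propagates: there is no second switch within the turn.
        return fallback, call(fallback)


def make_call(failing):
    """Build a fake provider call that fails for the given model names."""
    def call(model):
        if model in failing:
            raise RuntimeError(f"{model} unavailable")
        return f"reply from {model}"
    return call


# Turn 1: the primary is down, so the fallback answers.
used_turn_1, _ = run_turn("primary", "backup", make_call({"primary"}))
# Turn 2: the primary is healthy again and is tried first.
used_turn_2, _ = run_turn("primary", "backup", make_call(set()))
```

Because `run_turn` always starts from `primary`, a temporary outage costs at most one switch per turn and never pins the session to the backup.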

Examples

OpenRouter as fallback for Anthropic native:

model:
  provider: anthropic
  default: claude-sonnet-4-6

fallback_providers:
  - provider: openrouter
    model: anthropic/claude-sonnet-4

Nous Portal as fallback for OpenRouter:

model:
  provider: openrouter
  default: anthropic/claude-opus-4

fallback_providers:
  - provider: nous
    model: nous-hermes-3

Local model as fallback for cloud:

fallback_providers:
  - provider: custom
    model: llama-3.1-70b
    base_url: http://localhost:8000/v1
    key_env: LOCAL_API_KEY

Codex OAuth as fallback:

fallback_providers:
  - provider: openai-codex
    model: gpt-5.3-codex

Where Fallback Works

Context — Fallback Supported
CLI sessions — ✔
Messaging gateway (Telegram, Discord, etc.) — ✔
Subagent delegation — ✔ (subagents inherit the parent fallback chain)
Cron jobs — ✔ (cron agents inherit configured fallback providers)
Auxiliary tasks on provider: auto — ✔ (try per-task fallback, then the main fallback chain before built-in aux discovery)

There are no environment variables for the primary fallback chain — configure it exclusively through config.yaml or hermes fallback. This is intentional: fallback configuration is a deliberate choice, not something a stale shell export should override.

---

Auxiliary Task Fallback

Hermes uses separate lightweight models for side tasks. Each task has its own provider resolution chain that acts as a built-in fallback system.

Tasks with Independent Provider Resolution

Task — What It Does — Config Key
Vision — Image analysis, browser screenshots — auxiliary.vision
Web Extract — Web page summarization — auxiliary.web_extract
Compression — Context compression summaries — auxiliary.compression
Skills Hub — Skill search and discovery — auxiliary.skills_hub
MCP — MCP helper operations — auxiliary.mcp
Approval — Smart command-approval classification — auxiliary.approval
Title Generation — Session title summaries — auxiliary.title_generation
Triage Specifier — hermes kanban specify / dashboard ✨ button — fleshes out a one-liner triage task into a real spec — auxiliary.triage_specifier

Auto-Detection Chain

When a task's provider is set to "auto" (the default), Hermes first tries the main provider + main model for that auxiliary task. If that route is unavailable or later fails with a capacity-style error, Hermes now honors user-configured fallback policy before using the built-in discovery chain:

Main provider + main model → auxiliary.<task>.fallback_chain →
fallback_providers / fallback_model → built-in auxiliary discovery chain
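
The resolution order shown above can be written out as an ordered list of candidates. This is a sketch under assumptions: the function `auxiliary_candidates` and the stage labels are hypothetical, `auxiliary.<task>.fallback_chain` is assumed to be a list of provider and model mappings, and the built-in discovery chain is represented by a placeholder because its contents are not described here.

```python
def auxiliary_candidates(config, task):
    """Return (stage, entry) pairs in the order an 'auto' task would try them."""
    main = config.get("model", {})
    stages = [("main", {"provider": main.get("provider"), "model": main.get("default")})]
    task_cfg = config.get("auxiliary", {}).get(task, {})
    stages += [("task_chain", e) for e in task_cfg.get("fallback_chain", [])]
    stages += [("main_chain", e) for e in config.get("fallback_providers", [])]
    stages.append(("builtin_discovery", None))  # placeholder for the built-in chain
    return stages


config = {
    "model": {"provider": "anthropic", "default": "claude-sonnet-4-6"},
    "fallback_providers": [{"provider": "openrouter", "model": "anthropic/claude-sonnet-4"}],
    "auxiliary": {"vision": {"fallback_chain": [{"provider": "nous", "model": "nous-hermes-3"}]}},
}
order = [stage for stage, _ in auxiliary_candidates(config, "vision")]
```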

The task-specific chain is most precise and wins when present. The top-level fallback_providers chain is the same policy the main agent uses, so free-only or same-provider fallbac

Points to watch

Always check which version of Hermes Agent is active before applying a command or a configuration.
Do not paste an API key into a public chat or a visible page.
Keep secrets in the files or secret managers intended for them.
If a feature depends on a provider, a plugin, or a messaging platform, check that the component is enabled in your profile.
For a production installation, first test the full flow on an isolated machine or profile.

Example of a logical path

Read the current page to understand the Hermes Agent fallback provider.
Open the parent hub of the providers cluster.
Then move on to the complementary pages offered under "Read next".
Return to the official documentation if you need exact details or a recently changed command.

Quick FAQ

Does this page replace the official documentation?

No. It serves as a structured French-language guide. The link to the official source is available at the bottom of the page.

Are the commands guaranteed to be up to date?

They are based on the official documentation as retrieved when this page was generated. For critical use, always check the official page linked at the bottom.

Why so many internal links?

Hermes Agent is a modular system. Installation, providers, tools, memory, skills, security, and platforms all depend on one another. The internal linking helps you follow that path without landing on orphan pages.
