Summarisation backends

Declare backends as [backends.<name>] entries and pick the active one with summarization.default_backend. The MCP summarize tool and the inline summarize sub-arg on fetch accept a per-call backend override, so a client can switch backends without touching the config. See Summarising pages for how modes and styles flow through.

Two kinds need no feature flag and are always available:

extractive is offline, CPU-only, pure Rust. It selects sentences from the source using the TextRank-flavoured implementation in summarizer::extractive. No network, no API key, no setup.
cloud calls a hosted LLM through the genai crate. You bring the provider and the key.

A third kind, local, runs an LLM on your own machine and requires the local-inference Cargo feature, which the stock binary doesn't ship. For kind = "local", model is a HuggingFace repo id rather than a provider model id, and provider is ignored. Setup, model sizes, and build flags live in Optional features.

The implicit default

An empty [backends] map still gives you a working summariser. With nothing declared, Rover installs an implicit default extractive backend, so a fresh install summarises offline with zero configuration.

Adding any explicit [backends.*] block turns that injection off. The validation rules then apply strictly:

summarization.default_backend must name an existing [backends.*] entry.
If summarization.fallback_to_extractive = true (the default), at least one extractive backend must exist.

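So a config that declares only a cloud backend no longer has an extractive backend to fall back on. With summarization.fallback_to_extractive left at its default of true, it fails the second rule until you re-add an extractive backend or set the flag to false. That's why most configs keep a [backends.default] extractive block around.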
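The two rules and the implicit injection can be sketched in a few lines of Python. This is an illustration of the rules as described above, not Rover's actual validator: the function name check_backends, the plain-dict config shape, and the assumption that the implicit backend is named default are all hypothetical.

```python
from typing import Dict, List


def check_backends(
    backends: Dict[str, dict],
    default_backend: str = "default",  # assumed default name, for illustration
    fallback_to_extractive: bool = True,
) -> List[str]:
    """Return a list of validation problems; an empty list means the config passes."""
    # An empty map gets the implicit extractive backend injected.
    # Any explicit entry turns that injection off.
    effective = backends if backends else {"default": {"kind": "extractive"}}

    problems = []
    # Rule 1: default_backend must name an existing entry.
    if default_backend not in effective:
        problems.append(f"default_backend '{default_backend}' is not declared")
    # Rule 2: with the fallback enabled, at least one extractive backend must exist.
    has_extractive = any(b.get("kind") == "extractive" for b in effective.values())
    if fallback_to_extractive and not has_extractive:
        problems.append("fallback_to_extractive is true but no extractive backend exists")
    return problems
```

Under these assumptions, check_backends({}) passes, while a map holding only a cloud backend reports the missing extractive backend until one is added or the fallback is switched off.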

Cloud providers

The provider string picks which hosted API the cloud kind talks to. Each native provider resolves its key from a default environment variable. openai_compat covers anything that speaks the OpenAI Chat Completions shape.

`provider`	Notes
`openai`	Native OpenAI. Default env: `OPENAI_API_KEY`.
`anthropic`	Native Anthropic. Default env: `ANTHROPIC_API_KEY`.
`gemini`	Google Gemini. Default env: `GEMINI_API_KEY`.
`xai`	xAI Grok. Default env: `XAI_API_KEY`.
`groq`	Groq. Default env: `GROQ_API_KEY`.
`deepseek`	DeepSeek. Default env: `DEEPSEEK_API_KEY`.
`together`	Together AI. Default env: `TOGETHER_API_KEY`.
`fireworks`	Fireworks AI. Default env: `FIREWORKS_API_KEY`.
`openai_compat`	Any endpoint speaking the OpenAI Chat Completions shape. Requires `base_url`.

For native providers, leave api_key_env unset and genai resolves the default env var for you. For openai_compat, api_key_env is optional. Rover hands the client a "noop" key when none is configured, which is what most local servers want anyway.
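As a rough model of the lookup order described above, the following Python sketch resolves a key from a backend's provider and optional api_key_env. In Rover the native-provider lookup is done by genai, so this is only an illustration: resolve_api_key is a hypothetical helper, and returning None for a missing variable is a choice made for the sketch, not documented behaviour.

```python
import os
from typing import Mapping, Optional

# Default env vars for the native providers, copied from the table above.
DEFAULT_KEY_ENV = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "xai": "XAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "together": "TOGETHER_API_KEY",
    "fireworks": "FIREWORKS_API_KEY",
}


def resolve_api_key(
    provider: str,
    api_key_env: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Illustrative lookup: explicit api_key_env, then "noop" or the provider default."""
    env = os.environ if env is None else env
    if api_key_env:
        # An explicit api_key_env always wins.
        return env.get(api_key_env)
    if provider == "openai_compat":
        # No key configured: hand the client a placeholder key.
        return "noop"
    # Native provider: fall back to its default env var.
    return env.get(DEFAULT_KEY_ENV[provider])
```

The env parameter exists only so the sketch can be exercised without touching the real process environment.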

`[backends.<name>]` fields

Field	Type	Required	Description
`kind`	string	yes	`extractive`, `cloud`, or `local`. `local` requires the `local-inference` feature.
`provider`	string	yes for `cloud`	One of the provider strings above. Ignored for `local`.
`model`	string	yes for `cloud` and `local`	For `cloud`, the literal provider model id (e.g. `gpt-4o-mini`, `claude-haiku-4-5`). For `local`, a HuggingFace repo id (e.g. `Qwen/Qwen3.5-0.8B`).
`base_url`	string	yes for `openai_compat` only	Custom endpoint URL. Ignored for native providers and for `local`.
`api_key_env`	string	no	Env var holding the API key. When unset, see the provider defaults above.

The model split is the easy thing to get wrong. A cloud backend wants the string the provider's API expects; a local backend wants the repo path Rover passes to HuggingFace to download weights. Cross them and the backend either 404s at the provider or hunts for a repo that doesn't exist.

`openai_compat` base URL normalisation

Rover normalises the base_url for openai_compat so it always ends in /v1/. This applies to both [backends.*] and [captioners.*]. The normalisation is idempotent: a URL that already ends in /v1/ is left alone.

Input	Normalised
`http://localhost:1234`	`http://localhost:1234/v1/`
`http://localhost:1234/`	`http://localhost:1234/v1/`
`http://localhost:1234/v1`	`http://localhost:1234/v1/`
`http://localhost:1234/v1/`	unchanged
`https://api.example.com/custom/`	`https://api.example.com/custom/v1/`
`https://api.example.com/custom/v1/`	unchanged

Whitespace around the URL is trimmed before normalisation, so a stray pasted space won't break the endpoint.
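The table above can be reproduced with a short Python function. It is a minimal sketch of the behaviour as documented, not Rover's implementation, and normalise_base_url is a hypothetical name.

```python
def normalise_base_url(raw: str) -> str:
    """Trim whitespace and make the URL end in exactly one /v1/ segment."""
    # Drop surrounding whitespace, then any trailing slashes.
    url = raw.strip().rstrip("/")
    # Append the /v1 segment only if it is not already the last one.
    if not url.endswith("/v1"):
        url += "/v1"
    return url + "/"


for candidate in ("http://localhost:1234", " http://localhost:1234/v1 "):
    once = normalise_base_url(candidate)
    # Applying the function twice gives the same result as applying it once.
    print(once, once == normalise_base_url(once))
```

Because the function strips trailing slashes before checking for /v1, every row of the table goes through the same two steps, which is what makes the result idempotent.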

Fallback selection

A failing cloud call doesn't have to fail the request. With summarization.fallback_to_extractive = true, a cloud failure (auth, rate limit, model error, invalid request) retries against an extractive backend. The agent still gets a summary, just a cheaper one.

The fallback backend is chosen deterministically:

The backend named default, if it is extractive.
Otherwise, the lexicographically first extractive backend.

The fetch and summarize responses report the swap through the summarizer_fallback envelope: {from: "<original backend>", reason: "<stable code>"}. The reason code is stable, so a client can branch on it without scraping prose.
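The selection order above is small enough to sketch directly. The Python below is an illustration only: pick_fallback is a hypothetical helper, the config is modelled as a plain dict, and "lexicographically first" is taken to mean ordinary string sort order.

```python
from typing import Dict, Optional


def pick_fallback(backends: Dict[str, dict]) -> Optional[str]:
    """Return the name of the extractive backend a failed cloud call falls back to."""
    extractive = sorted(
        name for name, cfg in backends.items() if cfg.get("kind") == "extractive"
    )
    # Prefer the backend named "default", but only if it is extractive.
    if "default" in extractive:
        return "default"
    # Otherwise take the lexicographically first extractive backend, if any.
    return extractive[0] if extractive else None


config = {
    "fast": {"kind": "cloud"},
    "zeta": {"kind": "extractive"},
    "alpha": {"kind": "extractive"},
}
print(pick_fallback(config))  # prints "alpha"
```

Sorting before the lookup keeps the choice independent of the order in which backends appear in the config file.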

Worked examples

Offline only (the implicit default)

No config required. The implicit default extractive backend handles everything.

OpenAI

[backends.fast]
kind = "cloud"
provider = "openai"
model = "gpt-4o-mini"
api_key_env = "OPENAI_API_KEY"            # optional: this is already the default for openai

[backends.default]
kind = "extractive"

[summarization]
default_backend = "fast"
fallback_to_extractive = true

Anthropic

[backends.claude]
kind = "cloud"
provider = "anthropic"
model = "claude-haiku-4-5"
api_key_env = "ANTHROPIC_API_KEY"         # optional: this is already the default for anthropic

[backends.default]
kind = "extractive"

[summarization]
default_backend = "claude"

LM Studio (local, OpenAI-compatible)

[backends.lm_studio]
kind = "cloud"
provider = "openai_compat"
base_url = "http://localhost:1234"        # normalised to http://localhost:1234/v1/
model = "qwen2.5-0.5b-instruct"

[backends.default]
kind = "extractive"

[summarization]
default_backend = "lm_studio"

Ollama (local, OpenAI-compatible)

[backends.ollama]
kind = "cloud"
provider = "openai_compat"
base_url = "http://localhost:11434"       # normalised to http://localhost:11434/v1/
model = "llama3.2:1b"

[backends.default]
kind = "extractive"

[summarization]
default_backend = "ollama"

Multi-backend with named alternatives

[backends.fast]
kind = "cloud"
provider = "openai"
model = "gpt-4o-mini"

[backends.deep]
kind = "cloud"
provider = "anthropic"
model = "claude-sonnet-4-5"

[backends.local]
kind = "cloud"                            # "local" is only this backend's name; the kind is still cloud
provider = "openai_compat"
base_url = "http://localhost:1234"
model = "qwen2.5-0.5b-instruct"

[backends.default]
kind = "extractive"

[summarization]
default_backend = "fast"
fallback_to_extractive = true

A client overrides the backend per call:

{ "url": "https://...", "backend": "deep", "mode": "abstractive", "style": "executive" }

The full set of config keys, and where this file lives, is documented in Configuration.

Validating backend setup

Run rover doctor to confirm the wiring before an agent depends on it.

rover doctor                # checks extractive synthesis and that every cloud backend authenticates
rover doctor --format ndjson

The extractive_synthesis and backends_authenticate checks run as part of the default battery. Build with local-inference and the battery adds the local-model checks, so a misconfigured local backend surfaces here instead of on the first agent call.
