Self-Hosted Endpoints

Wayland Core provides two paths for local and self-hosted inference: the openai-compatible catch-all adapter for any server that speaks the OpenAI /chat/completions wire format, and the ollama plugin for Ollama’s native NDJSON API. A third path, the data-driven provider catalog, extends the built-in provider list without requiring a hand-written match arm.

The openai-compatible provider

wcore-providers/src/openai_compatible.rs implements OpenAICompatibleProvider, a thin wrapper over OpenAIProvider. The only meaningful difference from the named providers is that you must supply an explicit base_url; there is no default. Registering with an empty or whitespace-only base_url is rejected at startup with a RegistryError::EmptyId so the misconfiguration surfaces before any request is sent.
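The startup check described above can be pictured with a minimal sketch. Only the RegistryError::EmptyId variant name comes from this page; the enum definition, the function name validate_base_url, and the choice to return the trimmed URL are assumptions made for illustration, not the crate's actual code.

```rust
// Hypothetical stand-in for the registry error type. Only the `EmptyId`
// variant name is taken from the documentation; the rest is assumed.
#[derive(Debug, PartialEq)]
enum RegistryError {
    EmptyId,
}

/// Returns the trimmed base URL, or an error when the input is empty or
/// consists only of whitespace.
fn validate_base_url(raw: &str) -> Result<&str, RegistryError> {
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        Err(RegistryError::EmptyId)
    } else {
        Ok(trimmed)
    }
}

fn main() {
    // A usable URL passes through; a blank one is rejected before any request.
    println!("{:?}", validate_base_url("http://localhost:8080/v1"));
    println!("{:?}", validate_base_url("   "));
}
```

Failing at registration time rather than at first request keeps the error close to the configuration that caused it.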

The provider is registered under the lowercase id "openai-compatible".

The no-key sentinel

Some self-hosted servers do not require authentication. The underlying OpenAIProvider always sends an Authorization: Bearer <key> header, but servers that do not authenticate simply ignore it. Pass the string "no-key" as the API key in those cases:

[providers.my-local]
provider = "openai-compatible"
base_url = "http://localhost:8080/v1"
api_key  = "no-key"

Or from the CLI:

wayland-core \
  --provider openai-compatible \
  --base-url http://localhost:8080/v1 \
  --api-key no-key \
  "Summarize this file"

vLLM

vLLM exposes an OpenAI-compatible server at /v1 by default. Point base_url at it:

[providers.vllm]
provider = "openai-compatible"
base_url = "http://localhost:8000/v1"
api_key  = "no-key"
model    = "meta-llama/Llama-3-8b-instruct"

# Start vLLM (example):
vllm serve meta-llama/Llama-3-8b-instruct --port 8000

# Use it:
wayland-core \
  --provider openai-compatible \
  --base-url http://localhost:8000/v1 \
  --api-key no-key \
  --model meta-llama/Llama-3-8b-instruct \
  "Refactor this function"

llama.cpp server

llama.cpp’s bundled HTTP server (the server binary, named llama-server in recent builds) serves OpenAI-compatible completions at /v1:

[providers.llamacpp]
provider = "openai-compatible"
base_url = "http://localhost:8081/v1"
api_key  = "no-key"
model    = "local"

# Start llama.cpp (example):
./server -m model.gguf --port 8081

# Use it:
wayland-core \
  --provider openai-compatible \
  --base-url http://localhost:8081/v1 \
  --api-key no-key \
  --model local \
  "Write a test for this function"

LM Studio

LM Studio’s local server listens on port 1234 by default and serves the OpenAI chat completions format:

[providers.lmstudio]
provider = "openai-compatible"
base_url = "http://localhost:1234/v1"
api_key  = "no-key"

wayland-core \
  --provider openai-compatible \
  --base-url http://localhost:1234/v1 \
  --api-key no-key \
  --model "<model-name-from-lm-studio>" \
  "Review this PR"

Ollama

The ollama plugin crate (wayland-ollama) implements LlmProvider over Ollama’s native POST /api/chat NDJSON endpoint. It is not a wrapper over OpenAIProvider.

Invoke it with the ollama: model prefix:

wayland-core --model ollama:llama3
wayland-core --model ollama:mistral
wayland-core --model ollama:codestral
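As a rough illustration of the prefix convention, the sketch below strips a leading ollama: from a model spec. The function name ollama_model is hypothetical, and how Wayland Core actually parses the prefix is not shown on this page. Only the first prefix is removed, so an Ollama tag such as llama3:8b would survive intact.

```rust
/// Returns the model name when `spec` starts with "ollama:" and a model
/// follows the prefix; returns None otherwise.
/// Hypothetical helper, not the CLI's real parser.
fn ollama_model(spec: &str) -> Option<&str> {
    spec.strip_prefix("ollama:").filter(|model| !model.is_empty())
}

fn main() {
    for spec in ["ollama:llama3", "ollama:llama3:8b", "llama3"] {
        println!("{spec} -> {:?}", ollama_model(spec));
    }
}
```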

The plugin connects to http://localhost:11434 by default. To point it at a remote Ollama instance, set base_url in a provider block:

[providers.my-ollama]
provider = "ollama"
base_url = "http://192.168.1.10:11434"
model    = "llama3"

Data-driven provider catalog

Beyond the ~20 hardcoded ProviderType arms, wcore-config/src/catalog.rs loads a bundled TOML table (data/providers.toml, compiled into the binary with include_str!) of additional OpenAI-compatible providers. The catalog holds at least 75 entries (the test suite asserts catalog.len() >= 75).

Each catalog entry has four fields, three required and one optional:

Field	Description
`id`	CLI id for `--provider <id>`. Must be unique.
`base_url`	OpenAI-compatible REST root (no trailing slash).
`env_var`	Environment variable holding the API key.
`api_path`	Optional. Path suffix appended to `base_url` for the chat completions endpoint. `None` defaults to `/v1/chat/completions`.
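The way api_path combines with base_url can be sketched as follows. The field names mirror the table above; the struct definition, the chat_url helper, and the example.com entry are assumptions made for illustration. Trimming a trailing slash is a defensive extra, since catalog URLs are documented as having none.

```rust
// Field names follow the catalog table; the struct itself is a sketch.
#[allow(dead_code)] // `id` and `env_var` are not read in this example
struct CatalogEntry {
    id: String,
    base_url: String,
    env_var: String,
    api_path: Option<String>,
}

/// Builds the chat completions URL: base_url plus api_path, falling back to
/// the documented default suffix when api_path is None.
fn chat_url(entry: &CatalogEntry) -> String {
    let path = entry.api_path.as_deref().unwrap_or("/v1/chat/completions");
    format!("{}{}", entry.base_url.trim_end_matches('/'), path)
}

fn main() {
    // Hypothetical entry; not taken from data/providers.toml.
    let entry = CatalogEntry {
        id: "example-host".to_string(),
        base_url: "https://api.example.com".to_string(),
        env_var: "EXAMPLE_API_KEY".to_string(),
        api_path: None,
    };
    println!("{}", chat_url(&entry));
}
```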

wcore-providers/src/catalog.rs:register_catalog wires each entry as an OpenAIProvider factory, skipping any id already claimed by a native ProviderType arm. The ProviderCompat for each catalog entry is derived from ProviderCompat::from_catalog_entry, which stamps the entry id as the provider_type for cost attribution and applies the api_path override.
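The skip-if-claimed rule can be illustrated with a simplified stand-in. The real register_catalog builds OpenAIProvider factories; this sketch stores only an id-to-URL map. The function name register_catalog_sketch, the map-based registry, and the colliding example entry are all assumptions.

```rust
use std::collections::{BTreeMap, HashSet};

/// Adds each catalog entry to a hypothetical registry (id -> base_url),
/// skipping ids that a native provider has already claimed.
fn register_catalog_sketch(
    native_ids: &HashSet<&str>,
    catalog: &[(&str, &str)],
) -> BTreeMap<String, String> {
    let mut registry = BTreeMap::new();
    for &(id, base_url) in catalog {
        if native_ids.contains(id) {
            continue; // the native provider wins; the catalog entry is ignored
        }
        registry.insert(id.to_string(), base_url.to_string());
    }
    registry
}

fn main() {
    // Assumed for illustration: "openai-compatible" counts as a native id.
    let native_ids: HashSet<&str> = ["openai-compatible"].into_iter().collect();
    let catalog = [
        ("novita-ai", "https://api.novita.ai/openai"),
        ("openai-compatible", "http://example.invalid"), // hypothetical collision
    ];
    for (id, url) in &register_catalog_sketch(&native_ids, &catalog) {
        println!("{id} -> {url}");
    }
}
```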

One confirmed catalog entry is novita-ai (https://api.novita.ai/openai, NOVITA_API_KEY). Use --provider novita-ai to reach it. Other catalog entries follow the same pattern; consult the bundled data/providers.toml for the full list.

API key resolution for catalog providers

A catalog-backed provider resolves its API key in the following order, the same order native providers use:

1. --api-key CLI flag.
2. [providers.<name>].api_key in the config file.
3. The entry’s env_var read from the process environment.
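The three-step order above amounts to a first-match-wins chain, sketched here. The function name resolve_api_key and the injected env_lookup closure are assumptions chosen to keep the example deterministic. The sketch also assumes any value that is present wins, which this page does not confirm for empty strings.

```rust
/// Resolves an API key: CLI flag first, then the config file value, then the
/// environment variable named by the catalog entry.
/// `env_lookup` is injected so the function can be exercised without touching
/// the real process environment.
fn resolve_api_key(
    cli_flag: Option<&str>,
    config_key: Option<&str>,
    env_var: &str,
    env_lookup: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    cli_flag
        .or(config_key)
        .map(str::to_owned)
        .or_else(|| env_lookup(env_var))
}

fn main() {
    // With no flag and no config value, fall back to the real environment.
    let key = resolve_api_key(None, None, "NOVITA_API_KEY", |name| std::env::var(name).ok());
    println!("key present: {}", key.is_some());
}
```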

A non-empty --base-url CLI flag or base_url in the config overrides the catalog entry’s URL.
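The URL override can be sketched the same way. The text does not say whether the CLI flag or the config value wins when both are set; this sketch assumes the CLI flag takes precedence, and the name resolve_base_url is hypothetical.

```rust
/// Picks the first non-empty override (CLI flag, then config), falling back
/// to the catalog entry's URL. The CLI-before-config order is an assumption.
fn resolve_base_url<'a>(
    cli_flag: Option<&'a str>,
    config_url: Option<&'a str>,
    catalog_url: &'a str,
) -> &'a str {
    cli_flag
        .into_iter()
        .chain(config_url)
        .find(|url| !url.trim().is_empty())
        .unwrap_or(catalog_url)
}

fn main() {
    // An empty flag is ignored, so the catalog URL is used.
    println!("{}", resolve_base_url(Some(""), None, "https://api.novita.ai/openai"));
}
```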

ProviderCompat for self-hosted endpoints

The compat sub-table in a [providers.<name>] block lets you adjust per-provider wire behavior without touching the engine. Useful keys for self-hosted servers that diverge from strict OpenAI semantics:

[providers.my-local]
provider = "openai-compatible"
base_url = "http://localhost:8080/v1"
api_key  = "no-key"
compat.max_tokens_field         = "max_tokens"       # some servers use max_completion_tokens
compat.merge_assistant_messages = true               # merge consecutive assistant turns
compat.sanitize_schema          = true               # strip additionalProperties from tool schemas
compat.ensure_alternation       = true               # enforce user/assistant alternation

All compat.* keys are optional. Unset fields inherit the openai-compatible preset defaults.
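The inheritance rule can be pictured as optional overrides layered on a preset. The key names come from the example above; the two structs, the apply_overrides function, and the preset values used here are assumptions, not the real ProviderCompat definition or its defaults.

```rust
// Fully resolved settings. Key names follow the compat.* example above.
#[derive(Debug, Clone, PartialEq)]
struct Compat {
    max_tokens_field: String,
    merge_assistant_messages: bool,
    sanitize_schema: bool,
    ensure_alternation: bool,
}

// What the user wrote in the config file: every key may be absent.
#[derive(Debug, Default)]
struct CompatOverrides {
    max_tokens_field: Option<String>,
    merge_assistant_messages: Option<bool>,
    sanitize_schema: Option<bool>,
    ensure_alternation: Option<bool>,
}

/// Keeps each preset value unless the matching override is set.
fn apply_overrides(preset: Compat, overrides: CompatOverrides) -> Compat {
    Compat {
        max_tokens_field: overrides.max_tokens_field.unwrap_or(preset.max_tokens_field),
        merge_assistant_messages: overrides
            .merge_assistant_messages
            .unwrap_or(preset.merge_assistant_messages),
        sanitize_schema: overrides.sanitize_schema.unwrap_or(preset.sanitize_schema),
        ensure_alternation: overrides.ensure_alternation.unwrap_or(preset.ensure_alternation),
    }
}

fn main() {
    // Hypothetical preset values; the real preset defaults are not listed here.
    let preset = Compat {
        max_tokens_field: "max_tokens".to_string(),
        merge_assistant_messages: false,
        sanitize_schema: false,
        ensure_alternation: false,
    };
    // Only one key is set; the other three fall through to the preset.
    let overrides = CompatOverrides {
        sanitize_schema: Some(true),
        ..Default::default()
    };
    println!("{:?}", apply_overrides(preset, overrides));
}
```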