Coming soon

The runtime is free.
Cloud adds what's hard to self-host.


Contenox stays Apache 2.0.
Cloud is what you reach for when you don't want to run the infra.

The local runtime connects to any provider — Ollama, OpenAI, Gemini, Vertex, vLLM. When you'd rather not operate GPU nodes or skill servers yourself, Contenox Cloud slots in as a provider config line. Same plans, same workflow, different backend.
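As a rough illustration, swapping backends could look like the following. This is a hypothetical sketch — the field names and schema are illustrative, not the actual Contenox config format:

```yaml
# Illustrative only — not the real Contenox schema.
backends:
  - name: local-ollama
    type: ollama
    url: http://localhost:11434
  - name: contenox-cloud           # added: one extra entry, same plans
    type: openai-compatible
    url: https://inference.contenox.example
    api_key: ${CONTENOX_CLOUD_KEY}
```

The point is the shape, not the syntax: the cloud backend is one more provider entry, and removing it reverts you to local inference.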

Layer 1 — Hosted inference

Contenox Cloud runs vLLM on demand. You use it exactly like any other model provider — add it to your backend list and the runtime routes inference there when you need it.

  • Demand-pull pricing — you pay for tokens you use, not idle capacity
  • No GPU node to provision, patch, or restart
  • Switch back to a local or third-party provider in one config change
  • Same model selection semantics as self-hosted vLLM

This makes sense once a team reaches the point where running a spot A100 is cheaper than paying per token to a major API — but doesn't want to operate the infrastructure themselves.

Layer 2 — Hosted skills

Skills are HTTP services your plans can call — vector search, OCR, embeddings, document analytics. Running them yourself means managing Vald clusters, ingestion pipelines, and worker processes. Hosted skills expose the same capabilities as endpoints your local runtime already knows how to reach.

  • Vector search over your documents without a self-hosted Vald cluster
  • OCR and document extraction as a callable tool in any plan
  • Embeddings pipeline — index once, query from any Contenox instance
  • Analytics: usage, latency, and plan execution traces

You can self-host any of these if you prefer. Hosted skills exist for teams that want the capability without the operational overhead.
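Since skills are plain HTTP services, a plan step calling a hosted skill is just a POST with a JSON body. A minimal sketch — the endpoint path, payload fields, and base URL here are assumptions for illustration, not the documented skill contract:

```python
# Hypothetical sketch of calling a hosted vector-search skill over HTTP.
# Path, payload fields, and base URL are illustrative placeholders.
import json
import urllib.request


def build_search_request(base_url: str, query: str, top_k: int = 5) -> urllib.request.Request:
    """Build the HTTP request a plan step would send to the skill endpoint."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode()
    return urllib.request.Request(
        url=f"{base_url}/v1/skills/vector-search",  # illustrative path
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("https://skills.contenox.example", "invoice totals Q3")
```

Because the contract is just HTTP and JSON, the same request works whether the skill runs in Contenox Cloud or on a box you operate yourself.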

Layer 3 — Managed Bob

Bob is the enterprise multi-user control plane — JWT/RBAC, RAG, bots, an event system, a JS sandbox, an OpenAI-compatible API, and background Python workers. It's designed to be self-hosted on your infrastructure, but we can operate it as a managed service for teams where that matters.

  • Multi-user teams with roles and permissions — not just one developer's machine
  • RAG pipeline with Vald — retrieval over your own documents and data
  • Bots: Telegram, GitHub, and custom integrations that react to real events
  • OpenAI-compatible API — drop-in for tools that already speak OpenAI
  • On-prem or managed — your security boundary determines the deployment model
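"Drop-in" here means existing OpenAI clients only need a different base URL. A minimal sketch, assuming a placeholder deployment URL — the request path and payload follow the standard OpenAI chat-completions wire format:

```python
# Sketch: pointing an OpenAI-style chat completion at a Bob deployment
# instead of api.openai.com. Base URL, key, and model name are placeholders.
import json
import urllib.request


def chat_request(base_url: str, api_key: str, model: str, content: str) -> urllib.request.Request:
    """Build a chat-completions request in the standard OpenAI wire format."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }).encode()
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",  # same path OpenAI clients use
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = chat_request("https://bob.example.internal", "sk-test", "llama-3-8b", "hello")
```

Any tool that lets you override the API base URL — most OpenAI SDK wrappers do — can talk to Bob without code changes.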

Early access

We're onboarding design partners for Contenox Cloud and managed Bob.

Each pilot is scoped individually — reach out and we'll figure out what makes sense for your team.

See the full pricing breakdown on the Pricing page.