# Documentation

OS-level memory management for LLM context.
## Quick Start (Cloud)

Add your VC token as a query parameter to the base URL. Your provider API key stays in the header as usual.
### Anthropic SDK

```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://anthropic.virtual-context.com?vckey=vc-YOUR_KEY"
    # api_key uses your normal Anthropic key
)
```

### OpenAI SDK
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openai.virtual-context.com/v1?vckey=vc-YOUR_KEY",
    # api_key uses your normal OpenAI key
)
```

### curl
```shell
curl "https://anthropic.virtual-context.com/v1/messages?vckey=vc-YOUR_KEY" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "content-type: application/json" \
  -d '{"model":"claude-sonnet-4-20250514","max_tokens":1024,"messages":[{"role":"user","content":"Hello"}]}'
```

The `?vckey=` parameter identifies your VC tenant; the adjusted context window is then forwarded to the LLM provider.
## Self-Hosted Install

```shell
# Full install with all extras
pip install "virtual-context[all]"

# Or minimal + specific extras
pip install virtual-context                 # core only (pyyaml + httpx)
pip install "virtual-context[bridge]"       # HTTP proxy
pip install "virtual-context[mcp]"          # MCP server
pip install "virtual-context[embeddings]"   # sentence-transformers
pip install "virtual-context[tui]"          # interactive chat
pip install "virtual-context[ingest]"       # document ingestion (PDF, DOCX, XLSX)
```

Python 3.11+ required. Runs on macOS, Linux, and WSL. To run the setup wizard:
```shell
virtual-context onboard --wizard
```

## HTTP Proxy
The proxy sits between any LLM client and the upstream provider. It auto-detects Anthropic, OpenAI, and Gemini request formats.
```shell
# Start proxy (defaults to localhost:8021)
virtual-context proxy --upstream https://api.anthropic.com

# Multi-instance: separate ports per provider
virtual-context proxy \
  --upstream https://api.anthropic.com \
  --port 8021 \
  --config ./anthropic.yaml &
virtual-context proxy \
  --upstream https://api.openai.com \
  --port 8022 \
  --config ./openai.yaml &
```

**Session continuity:** sessions resume across proxy restarts via SQLite persistence.

**Tool interception:** large tool results (git diff, test output) are auto-truncated to head+tail and indexed into FTS5 for full-text search.
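The tool-result handling described above can be sketched as follows. This is a simplified illustration, not the proxy's actual code; the function name, truncation limits, and table schema are invented for this example:

```python
import sqlite3

def truncate_head_tail(text: str, head: int = 2000, tail: int = 500) -> str:
    """Keep only the first `head` and last `tail` characters of a large tool result."""
    if len(text) <= head + tail:
        return text
    omitted = len(text) - head - tail
    return f"{text[:head]}\n... [{omitted} chars omitted] ...\n{text[-tail:]}"

# Index the full result into an FTS5 table so truncated content stays searchable.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE tool_results USING fts5(tool, content)")

full_output = "diff --git a/app.py b/app.py\n" + "+added line\n" * 5000
db.execute("INSERT INTO tool_results VALUES (?, ?)", ("git_diff", full_output))

compact = truncate_head_tail(full_output)   # what gets forwarded to the model
hits = db.execute(
    "SELECT tool FROM tool_results WHERE tool_results MATCH 'diff'"
).fetchall()                                # the full text remains queryable
```

The point of the split is that the model sees a bounded excerpt, while later recall queries can still hit anything in the original output.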
## MCP Server

Works with Claude Desktop, Cursor, and any MCP-compatible client.
Add the following to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "virtual-context": {
      "command": "virtual-context",
      "args": ["mcp"]
    }
  }
}
```

- Tools: `recall_context`, `recall_all`, `find_quote`, `query_facts`, `expand_topic`, `collapse_topic`
- Resources: `virtual-context://status`, `virtual-context://domains`
- Prompts: `summarize_session`, `recall_topic`
## Python SDK

```python
from virtual_context import VirtualContextEngine

engine = VirtualContextEngine(config_path="./virtual-context.yaml")

# Before sending to LLM: enrich with memory
assembled = engine.on_message_inbound(user_message, history)

# After LLM response: tag, segment, compact
report = engine.on_turn_complete(full_history)

# Ingest external documents
engine.ingest_document("meeting-notes.pdf")
```

## CLI Reference
| Command | Description |
|---|---|
| `virtual-context proxy` | Start the HTTP proxy (main integration point) |
| `virtual-context mcp` | Start the MCP server |
| `virtual-context tui` | Interactive terminal chat |
| `virtual-context status` | Show session state, tags, storage stats |
| `virtual-context recall` | Retrieve context for a query |
| `virtual-context compact` | Force compaction of the current session |
| `virtual-context domains` | List all tags/domains with metadata |
| `virtual-context retrieve` | Tag + fetch summaries for a query |
| `virtual-context transform` | Tag + fetch + assemble (full pipeline) |
| `virtual-context aliases` | Show detected tag aliases |
| `virtual-context ingest` | Ingest a document (PDF, DOCX, XLSX, TXT) |
| `virtual-context onboard` | Interactive setup wizard |
## Configuration

Config is loaded from `virtual-context.yaml` in the working directory, or passed via the `--config` flag.
```yaml
# virtual-context.yaml
storage:
  backend: sqlite
  path: ./vc-store.db

compaction:
  watermark: 0.7           # compact when 70% of budget used
  protected_turns: 6       # never compact the last N turns
  budget_tokens: 128000

tagging:
  provider: ollama         # or openai, anthropic
  model: nomic-embed-text  # embedding model for inbound tagger
  max_tags: 5

retrieval:
  tools: true              # expose vc_* tools to the LLM
  max_context_tokens: 32000
```

## Provider Routing (Cloud)
Use subdomains to route to different LLM providers. Append `?vckey=vc-YOUR_KEY` to any URL below.
| Provider | Base URL |
|---|---|
| Anthropic | https://anthropic.virtual-context.com |
| OpenAI | https://openai.virtual-context.com/v1 |
| Gemini | https://gemini.virtual-context.com |
| Groq | https://groq.virtual-context.com |
| Mistral | https://mistral.virtual-context.com |
| Together | https://together.virtual-context.com |
| OpenRouter | https://openrouter.virtual-context.com |