LLM Proxy
OpenAI-compatible API for multiple LLM providers.
Overview
Artemis proxies requests to multiple LLM providers through a unified OpenAI-compatible API. This means you can use the OpenAI SDK to access Claude, Gemini, Perplexity, and more.
Base URL: https://artemis.meetrhea.com/v1
Endpoints
Chat Completions
POST /v1/chat/completions
Standard OpenAI chat completions API. Supports streaming.
Request:
{
  "model": "claude-sonnet-4-20250514",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false,
  "max_tokens": 1024
}
Response (non-streaming):
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "claude-sonnet-4-20250514",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20
  }
}
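For reference, here is the same request made with the OpenAI Python SDK, using the base URL documented above:

from openai import OpenAI

client = OpenAI(
    api_key="art_xxxx",  # your Artemis API key
    base_url="https://artemis.meetrhea.com/v1"
)

response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello!"}
    ],
    max_tokens=1024
)

print(response.choices[0].message.content)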
Models
GET /v1/models
Lists available models from all configured providers.
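Using the client from the chat example above, this maps to the SDK's standard models call:

# List every model Artemis exposes (standard OpenAI models schema)
for model in client.models.list():
    print(model.id)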
Audio Transcriptions
POST /v1/audio/transcriptions
See Whisper for details.
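Assuming the endpoint mirrors OpenAI's transcription API, a call with the same client would look like the sketch below. The model name whisper-1 is an assumption here; check the Whisper page for the supported models.

# Sketch only: the model name is an assumption, see the Whisper docs
with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=f)
print(transcript.text)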
Model Routing
The model name determines which provider handles the request:
| Model Prefix | Provider |
|---|---|
| claude-* | Anthropic |
| gpt-*, o1-*, o3-* | OpenAI |
| gemini-* | Google |
| sonar-* | Perplexity |
| openrouter/* | OpenRouter |
Examples
# Anthropic Claude
model="claude-sonnet-4-20250514"
# OpenAI GPT-4
model="gpt-4o"
# Google Gemini
model="gemini-2.0-flash"
# Perplexity (web search)
model="sonar-pro"
# OpenRouter (any model)
model="openrouter/meta-llama/llama-3.1-405b-instruct"
Authentication
API Key
Include your Artemis API key in the Authorization header:
Authorization: Bearer art_xxxxxxxxxxxx
Provider Key Selection
Artemis automatically selects the appropriate provider key, checking in order (a sketch follows the list):
- API key overrides (if configured)
- Default key for the provider in your group
- First active key if no default
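A hypothetical sketch of that precedence; the object shapes are illustrative, not Artemis's internals:

# Hypothetical: walk the precedence order until a key matches
def select_key(api_key, group, provider):
    if override := api_key.overrides.get(provider):   # 1. per-key override
        return override
    if default := group.default_keys.get(provider):   # 2. group default
        return default
    for key in group.keys:                            # 3. first active key
        if key.provider == provider and key.active:
            return key
    raise LookupError(f"No provider key found for {provider}")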
Streaming
Artemis supports Server-Sent Events (SSE) streaming:
from openai import OpenAI

client = OpenAI(
    api_key="your-artemis-key",
    base_url="https://artemis.meetrhea.com/v1"
)

stream = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
Request Tracking
Every request is logged with:
- Request ID (returned in the X-Request-ID header; see the example after this list)
- Timestamps (start, end, duration)
- Token counts (prompt, completion)
- Cost calculation
- Model and provider used
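To read that header with the OpenAI Python SDK, use its raw-response wrapper:

# Fetch the raw HTTP response so headers like X-Request-ID are visible
raw = client.chat.completions.with_raw_response.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello"}]
)
print(raw.headers.get("X-Request-ID"))
completion = raw.parse()  # the usual ChatCompletion object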
Custom Tracking
Add metadata to requests for your own tracking:
{
  "model": "claude-sonnet-4-20250514",
  "messages": [...],
  "user": "user-123",
  "metadata": {
    "session_id": "sess-abc",
    "feature": "chat"
  }
}
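With the OpenAI Python SDK, user is a named parameter; metadata can be passed via extra_body, the SDK's escape hatch for additional JSON fields:

response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello"}],
    user="user-123",
    extra_body={"metadata": {"session_id": "sess-abc", "feature": "chat"}}
)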
Error Handling
Artemis returns standard OpenAI-style errors:
{
  "error": {
    "message": "No provider key found for anthropic",
    "type": "configuration_error",
    "param": null,
    "code": "no_provider_key"
  }
}
Common Errors
| Code | Meaning |
|---|---|
| invalid_api_key | Artemis API key is invalid |
| no_provider_key | No provider key configured for this provider |
| provider_error | Upstream provider returned an error |
| rate_limit | Rate limited by provider |
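With the OpenAI Python SDK these surface as exceptions; a minimal sketch using the client from above:

import openai

try:
    client.chat.completions.create(
        model="claude-sonnet-4-20250514",
        messages=[{"role": "user", "content": "Hello"}]
    )
except openai.RateLimitError:
    pass  # rate_limit: back off and retry
except openai.APIStatusError as e:
    print(e.status_code, e.body)  # e.body carries the error object shown above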
Provider-Specific Headers
Some providers require specific headers. Artemis handles these automatically:
Anthropic
x-api-key: <key>
anthropic-version: 2023-06-01
OpenAI
Authorization: Bearer <key>
Google
?key=<key> (query parameter)
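A hypothetical sketch of how a proxy might assemble these credentials per provider (illustrative, not Artemis's code):

# Hypothetical: per-provider auth for the upstream request
def upstream_auth(provider: str, key: str) -> dict:
    if provider == "anthropic":
        return {"headers": {"x-api-key": key, "anthropic-version": "2023-06-01"}}
    if provider == "openai":
        return {"headers": {"Authorization": f"Bearer {key}"}}
    if provider == "google":
        return {"params": {"key": key}}  # sent as a query parameter
    raise ValueError(f"Unknown provider: {provider}")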
Usage with SDKs
Python (OpenAI)
from openai import OpenAI

client = OpenAI(
    api_key="art_xxxx",
    base_url="https://artemis.meetrhea.com/v1"
)
Python (Anthropic)
The Anthropic SDK talks directly to the Anthropic API. To reach Claude through Artemis, use the OpenAI SDK as shown above.
JavaScript/TypeScript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'art_xxxx',
  baseURL: 'https://artemis.meetrhea.com/v1'
});
curl
curl https://artemis.meetrhea.com/v1/chat/completions \
  -H "Authorization: Bearer art_xxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Rate Limits
Artemis does not add its own rate limits. You're subject to the rate limits of the underlying providers based on your API keys.
Timeouts
The default timeout is 120 seconds. For long-running requests, use streaming so tokens start arriving before the timeout is reached.
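Client side, the OpenAI SDK's timeout option can be set to match (timeout and with_options are standard SDK parameters):

from openai import OpenAI

client = OpenAI(
    api_key="art_xxxx",
    base_url="https://artemis.meetrhea.com/v1",
    timeout=120.0  # seconds; matches the Artemis default
)

# Or raise it for a single long-running call:
slow_client = client.with_options(timeout=300.0)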