LLM Proxy
OpenAI-compatible API for multiple LLM providers.
Overview
Artemis proxies requests to multiple LLM providers through a unified OpenAI-compatible API. This means you can use the OpenAI SDK to access Claude, Gemini, Perplexity, and more.
Base URL: https://artemis.meetrhea.com/v1
Endpoints
Chat Completions
POST /v1/chat/completions
Standard OpenAI chat completions API. Supports streaming.
Request:
{
  "model": "claude-sonnet-4-20250514",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false,
  "max_tokens": 1024
}
Response (non-streaming):
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "claude-sonnet-4-20250514",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20
  }
}
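For reference, here is the same request made with the OpenAI Python SDK, using the base URL documented above:

from openai import OpenAI

client = OpenAI(
    api_key="art_xxxx",  # your Artemis API key
    base_url="https://artemis.meetrhea.com/v1"
)

response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello!"}
    ],
    max_tokens=1024
)

print(response.choices[0].message.content)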
Models
GET /v1/models
Lists available models from all configured providers.
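Using the client from the chat example above, this maps to the SDK's standard models call:

# List every model Artemis exposes (standard OpenAI models schema)
for model in client.models.list():
    print(model.id)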
Audio Transcriptions
POST /v1/audio/transcriptions
See Whisper for details.
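Assuming the endpoint mirrors OpenAI's transcription API, a call with the same client would look like the sketch below. The model name whisper-1 is an assumption here; check the Whisper page for the supported models.

# Sketch only: the model name is an assumption, see the Whisper docs
with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=f)
print(transcript.text)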
Model Routing
The model name determines which provider handles the request:
| Model Prefix | Provider |
|---|---|
| claude-* | Anthropic |
| gpt-*, o1-*, o3-* | OpenAI |
| gemini-* | Google |
| sonar-* | Perplexity |
| openrouter/* | OpenRouter |
Examples
# Anthropic Claude
model="claude-sonnet-4-20250514"
# OpenAI GPT-4
model="gpt-4o"
# Google Gemini
model="gemini-2.0-flash"
# Perplexity (web search)
model="sonar-pro"
# OpenRouter (any model)
model="openrouter/meta-llama/llama-3.1-405b-instruct"
Authentication
API Key
Include your Artemis API key in the Authorization header:
Authorization: Bearer art_xxxxxxxxxxxx
Provider Key Selection
Artemis automatically selects the appropriate provider key, checking in order (a sketch follows the list):
- API key overrides (if configured)
- Default key for the provider in your group
- First active key if no default
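A hypothetical sketch of that precedence; the object shapes are illustrative, not Artemis's internals:

# Hypothetical: walk the precedence order until a key matches
def select_key(api_key, group, provider):
    if override := api_key.overrides.get(provider):   # 1. per-key override
        return override
    if default := group.default_keys.get(provider):   # 2. group default
        return default
    for key in group.keys:                            # 3. first active key
        if key.provider == provider and key.active:
            return key
    raise LookupError(f"No provider key found for {provider}")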
Streaming
Artemis supports Server-Sent Events (SSE) streaming:
from openai import OpenAI

client = OpenAI(
    api_key="your-artemis-key",
    base_url="https://artemis.meetrhea.com/v1"
)

stream = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
Request Tracking
Every request is logged with:
- Request ID (returned in the X-Request-ID header; see the example after this list)
- Timestamps (start, end, duration)
- Token counts (prompt, completion)
- Cost calculation
- Model and provider used
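To read that header with the OpenAI Python SDK, use its raw-response wrapper:

# Fetch the raw HTTP response so headers like X-Request-ID are visible
raw = client.chat.completions.with_raw_response.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello"}]
)
print(raw.headers.get("X-Request-ID"))
completion = raw.parse()  # the usual ChatCompletion object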
Custom Tracking
Add metadata to requests for your own tracking:
{
  "model": "claude-sonnet-4-20250514",
  "messages": [...],
  "user": "user-123",
  "metadata": {
    "session_id": "sess-abc",
    "feature": "chat"
  }
}
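With the OpenAI Python SDK, user is a named parameter; metadata can be passed via extra_body, the SDK's escape hatch for additional JSON fields:

response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello"}],
    user="user-123",
    extra_body={"metadata": {"session_id": "sess-abc", "feature": "chat"}}
)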
Error Handling
Artemis returns standard OpenAI-style errors:
{
  "error": {
    "message": "No provider key found for anthropic",
    "type": "configuration_error",
    "param": null,
    "code": "no_provider_key"
  }
}
Common Errors
| Code | Meaning |
|---|---|
| invalid_api_key | Artemis API key is invalid |
| no_provider_key | No provider key configured for this provider |
| provider_error | Upstream provider returned an error |
| rate_limit | Rate limited by provider |
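With the OpenAI Python SDK these surface as exceptions; a minimal sketch using the client from above:

import openai

try:
    client.chat.completions.create(
        model="claude-sonnet-4-20250514",
        messages=[{"role": "user", "content": "Hello"}]
    )
except openai.RateLimitError:
    pass  # rate_limit: back off and retry
except openai.APIStatusError as e:
    print(e.status_code, e.body)  # e.body carries the error object shown above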
Provider-Specific Headers
Some providers require specific headers. Artemis handles these automatically:
Anthropic
x-api-key: <key>
anthropic-version: 2023-06-01
OpenAI
Authorization: Bearer <key>
Google
?key=<key> (query parameter)
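A hypothetical sketch of how a proxy might assemble these credentials per provider (illustrative, not Artemis's code):

# Hypothetical: per-provider auth for the upstream request
def upstream_auth(provider: str, key: str) -> dict:
    if provider == "anthropic":
        return {"headers": {"x-api-key": key, "anthropic-version": "2023-06-01"}}
    if provider == "openai":
        return {"headers": {"Authorization": f"Bearer {key}"}}
    if provider == "google":
        return {"params": {"key": key}}  # sent as a query parameter
    raise ValueError(f"Unknown provider: {provider}")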
Usage with SDKs
Python (OpenAI)
from openai import OpenAI

client = OpenAI(
    api_key="art_xxxx",
    base_url="https://artemis.meetrhea.com/v1"
)
Python (Anthropic)
The Anthropic SDK talks directly to the Anthropic API. To reach Claude through Artemis, use the OpenAI SDK as shown above.
JavaScript/TypeScript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'art_xxxx',
  baseURL: 'https://artemis.meetrhea.com/v1'
});
curl
curl https://artemis.meetrhea.com/v1/chat/completions \
  -H "Authorization: Bearer art_xxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Rate Limits
Artemis does not add its own rate limits. You're subject to the rate limits of the underlying providers based on your API keys.
Timeouts
The default timeout is 120 seconds. For long-running requests, use streaming so tokens start arriving before the timeout is reached.
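Client side, the OpenAI SDK's timeout option can be set to match (timeout and with_options are standard SDK parameters):

from openai import OpenAI

client = OpenAI(
    api_key="art_xxxx",
    base_url="https://artemis.meetrhea.com/v1",
    timeout=120.0  # seconds; matches the Artemis default
)

# Or raise it for a single long-running call:
slow_client = client.with_options(timeout=300.0)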