
Whisper Transcription

Audio transcription via the OpenAI Whisper API.

Overview

Artemis proxies audio transcription requests to OpenAI's Whisper API, providing the same usage tracking and key management as LLM requests.

Endpoint: POST /v1/audio/transcriptions

Usage

Python

from openai import OpenAI

client = OpenAI(
    api_key="your-artemis-key",
    base_url="https://artemis.meetrhea.com/v1",
)

with open("audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(transcript.text)

curl

curl https://artemis.meetrhea.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $ARTEMIS_KEY" \
  -F file="@audio.mp3" \
  -F model="whisper-1"

Request Parameters

Parameter        Type    Required  Description
file             file    Yes       Audio file to transcribe
model            string  Yes       Model to use (whisper-1)
language         string  No        ISO-639-1 language code of the audio (e.g. en)
prompt           string  No        Optional text to guide the model's style
response_format  string  No        One of json, text, srt, vtt, verbose_json
temperature      float   No        Sampling temperature (0-1)
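
The optional parameters can be combined in a single request. The sketch below shows one possible way to do that with the OpenAI Python SDK. `build_options` and `transcribe` are hypothetical helper names, not part of Artemis or the SDK, and the validation only mirrors the table above.

```python
ALLOWED_FORMATS = {"json", "text", "srt", "vtt", "verbose_json"}


def build_options(language=None, prompt=None, response_format=None, temperature=None):
    """Collect optional request parameters, dropping any that were not set."""
    if response_format is not None and response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    options = {
        "language": language,
        "prompt": prompt,
        "response_format": response_format,
        "temperature": temperature,
    }
    return {name: value for name, value in options.items() if value is not None}


def transcribe(path, api_key, **options):
    """Send one file through Artemis with the given optional parameters."""
    # Imported here so build_options can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(api_key=api_key, base_url="https://artemis.meetrhea.com/v1")
    with open(path, "rb") as audio_file:
        return client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            **build_options(**options),
        )
```

A call might then look like `transcribe("audio.mp3", "your-artemis-key", language="en", response_format="verbose_json", temperature=0)`.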

Supported Formats

Audio files must be in one of these formats:

  • mp3
  • mp4
  • mpeg
  • mpga
  • m4a
  • wav
  • webm

Maximum file size: 25 MB
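
Checking a file before uploading avoids a round trip that is bound to fail. The following is a minimal sketch of such a pre-flight check. `check_audio_file` is a hypothetical helper, it looks only at the file extension rather than the actual audio encoding, and it assumes "25 MB" means 25 * 1024 * 1024 bytes.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
# Assumption: the limit is 25 * 1024 * 1024 bytes; the exact cutoff applied upstream may differ.
MAX_BYTES = 25 * 1024 * 1024


def check_audio_file(path):
    """Return a list of problems found; an empty list means the file looks uploadable."""
    path = Path(path)
    problems = []
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {path.suffix or '(none)'}")
    if not path.is_file():
        problems.append("file does not exist")
    elif path.stat().st_size > MAX_BYTES:
        problems.append(f"file is {path.stat().st_size} bytes, above the 25 MB limit")
    return problems
```

For example, `check_audio_file("audio.mp3")` returns an empty list when the file exists, has a listed extension, and is within the size limit.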

Response

JSON (default)

{
  "text": "Hello, this is a transcription of the audio file."
}

Verbose JSON

{
  "task": "transcribe",
  "language": "english",
  "duration": 5.5,
  "text": "Hello, this is a transcription.",
  "segments": [
    {
      "id": 0,
      "start": 0.0,
      "end": 2.5,
      "text": "Hello, this is"
    }
  ]
}

SRT/VTT

Setting response_format to srt or vtt returns the transcript as timestamped subtitle text for use with video.
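
To illustrate what an SRT cue looks like, the sketch below builds SRT text locally from the segments of a verbose_json body like the one shown earlier. It is an illustration of the format only: requesting srt from the endpoint makes this unnecessary, and `srt_timestamp` and `segments_to_srt` are hypothetical helper names.

```python
import json


def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm, the timestamp style used in SRT files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(verbose_json):
    """Build SRT text from the segments of a verbose_json response body."""
    cues = []
    for number, segment in enumerate(json.loads(verbose_json)["segments"], start=1):
        start = srt_timestamp(segment["start"])
        end = srt_timestamp(segment["end"])
        # Each cue is: sequence number, time range, then the text.
        cues.append(f"{number}\n{start} --> {end}\n{segment['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)
```

Each cue consists of a sequence number, a start and end timestamp separated by `-->`, and the cue text.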

Usage Tracking

Whisper requests are logged with:

  • Audio duration (seconds)
  • Cost (based on duration pricing)
  • Provider key used

Pricing

OpenAI Whisper pricing: $0.006 per minute of audio.
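
As a rough worked example, the cost of a clip can be estimated from its duration at the rate quoted above. This sketch assumes billing is linear in duration with no rounding, which may not match how the logged cost is actually computed. `estimate_cost_usd` is a hypothetical helper.

```python
PRICE_PER_MINUTE_USD = 0.006  # rate quoted above


def estimate_cost_usd(duration_seconds):
    """Estimate the cost of a clip, assuming linear billing with no rounding."""
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    return duration_seconds / 60 * PRICE_PER_MINUTE_USD
```

Under that assumption, a one-hour recording (3600 seconds) comes to roughly $0.36.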

Provider Key

Whisper uses OpenAI provider keys. Ensure you have an OpenAI key configured in your group's provider settings.

Error Handling

{
  "error": {
    "message": "File too large. Maximum size is 25MB.",
    "type": "invalid_request_error",
    "code": "file_too_large"
  }
}

Common Errors

Error                Cause
file_too_large       Audio file exceeds 25 MB
invalid_file_format  Unsupported audio format
no_provider_key      No OpenAI key configured
provider_error       OpenAI API error
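
Client code can use the `code` field to decide how to react. The sketch below parses an error body of the shape shown above and maps the code to the cause listed in the table. `explain_error` is a hypothetical helper, and it assumes every error response follows that JSON shape.

```python
import json

# Causes copied from the Common Errors table.
KNOWN_ERRORS = {
    "file_too_large": "Audio file exceeds 25 MB",
    "invalid_file_format": "Unsupported audio format",
    "no_provider_key": "No OpenAI key configured",
    "provider_error": "OpenAI API error",
}


def explain_error(body):
    """Turn an error response body into a one-line, human-readable summary."""
    error = json.loads(body).get("error", {})
    code = error.get("code", "unknown")
    cause = KNOWN_ERRORS.get(code, "Unrecognized error code")
    return f"{code}: {cause} ({error.get('message', 'no message')})"
```

With the OpenAI Python SDK, the raw body is typically available as `exc.response.text` when catching `openai.APIStatusError`.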

Integration with Speaches

For local Whisper transcription without OpenAI, use the Speaches service instead.