API Reference

Endpoints for text-to-speech, speech retrieval, speech-to-text, and account management.

Authentication

Most endpoints expect your Speech7 API key. Send it as apiKey in the JSON request body. For requests that do not carry a JSON body (such as GET requests and multipart uploads), send it in the x-api-key header or as an ?apiKey=... query parameter.

Admin endpoints require an admin login in the web UI or an admin API key (set isAdmin=true when creating the account).

Capabilities depend on the provider keys saved on the account. Call GET https://app.speech7.com/account/capabilities with your API key to check whether TTS and STT are configured.
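
A minimal Python sketch of the header-based option, using only the standard library. The helper names capabilities_request and fetch_capabilities are hypothetical, and the response format of this endpoint is not documented here, so decoding it as JSON is an assumption.

```python
import json
import urllib.request

BASE_URL = "https://app.speech7.com"


def capabilities_request(api_key: str) -> urllib.request.Request:
    """Build a GET request for /account/capabilities, key in the x-api-key header."""
    return urllib.request.Request(
        f"{BASE_URL}/account/capabilities",
        headers={"x-api-key": api_key},
        method="GET",
    )


def fetch_capabilities(api_key: str) -> dict:
    """Send the request and decode the body.

    Assumption: the endpoint answers with a JSON object.
    """
    with urllib.request.urlopen(capabilities_request(api_key)) as resp:
        return json.load(resp)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before any network call is made.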

POST /tts

Generate speech from text. Streams an MP3 response. Requires a TTS provider key saved on the account.

POST https://app.speech7.com/tts
Content-Type: application/json
{
  "apiKey": "your_api_key",
  "text": "Hello from Speech7"
}

Limits: min 500 chars / 20 words, max 500 words.

Add "json": true to return JSON metadata instead of streaming audio (fields: audioPath and id).

Quick test: curl -o out.mp3 -X POST https://app.speech7.com/tts -H "Content-Type: application/json" -d '{"apiKey":"YOUR_KEY","text":"Hello world from the API."}'
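
The same call can be sketched in Python with the standard library. The function names tts_request and save_speech are hypothetical; the body fields (apiKey, text, json) are the ones described above. Error handling and the length limits are left out of this sketch.

```python
import json
import shutil
import urllib.request

TTS_URL = "https://app.speech7.com/tts"


def tts_request(api_key: str, text: str, as_json: bool = False) -> urllib.request.Request:
    """Build the POST /tts request with a JSON body."""
    payload = {"apiKey": api_key, "text": text}
    if as_json:
        # Ask for JSON metadata (audioPath, id) instead of streamed audio.
        payload["json"] = True
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(api_key: str, text: str, out_path: str) -> None:
    """Stream the MP3 response to a local file."""
    with urllib.request.urlopen(tts_request(api_key, text)) as resp:
        with open(out_path, "wb") as out_file:
            shutil.copyfileobj(resp, out_file)
```

For example, save_speech("YOUR_KEY", "Hello from Speech7", "out.mp3") would write the streamed audio to out.mp3, assuming the account has a TTS provider key saved.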

GET /speeches

List recent speeches for your API key (newest first).

GET https://app.speech7.com/speeches?limit=50
Headers: x-api-key: your_api_key

Returns a speeches array whose entries include id, timestamps, usage stats, and audioPath. Pass an id to GET /speeches/:speechId to fetch a single speech.

Quick test: curl -H "x-api-key: YOUR_KEY" https://app.speech7.com/speeches

GET /speeches/:speechId

Fetch details for a specific speech. Add ?download=1 to stream the stored MP3.

GET https://app.speech7.com/speeches/file001-123456?apiKey=your_api_key

Quick test: use an ID from /speeches then curl -L "https://app.speech7.com/speeches/ID?download=1&apiKey=YOUR_KEY" -o speech.mp3
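
A small Python sketch for building these URLs, so the speech ID and key are escaped correctly. The helper name speech_url is hypothetical; the download and apiKey query parameters are the ones shown above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://app.speech7.com"


def speech_url(speech_id: str, api_key: str, download: bool = False) -> str:
    """Return the GET /speeches/:speechId URL, optionally with ?download=1."""
    params = {}
    if download:
        # download=1 asks the server to stream the stored MP3.
        params["download"] = "1"
    params["apiKey"] = api_key
    # Escape the ID so characters like "/" cannot change the path.
    return f"{BASE_URL}/speeches/{quote(speech_id, safe='')}?{urlencode(params)}"
```

The resulting string can be passed to curl -L or to urllib.request.urlopen.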

POST /stt

Transcribe audio to text using the configured speech-to-text provider. Requires an STT provider key saved on the account. STT usage consumes minutes from the same pool as TTS (quantized to 0.5-minute increments). Responses include text, speakers, and transcript entries.

Speaker diarization is disabled by default; enable it by sending enableSpeakerDiarization=true. You can also override speaker labels by passing speaker1Label, speaker2Label, speaker3Label, etc. in the form data (e.g., speaker1Label=John). These labels are used in both the diarized text and the speakers array in the response.
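
To illustrate how the label fields line up with speaker numbers, here is a short Python sketch that builds the extra form fields. The helper name diarization_fields is hypothetical; the field names (enableSpeakerDiarization, speaker1Label, speaker2Label, ...) are the ones described above.

```python
def diarization_fields(labels: list[str]) -> dict[str, str]:
    """Build form fields that enable diarization and name each speaker.

    labels[0] becomes speaker1Label, labels[1] becomes speaker2Label, and so on.
    """
    fields = {"enableSpeakerDiarization": "true"}
    for index, label in enumerate(labels, start=1):
        fields[f"speaker{index}Label"] = label
    return fields
```

Each key/value pair is sent as an additional form field next to the audio file, for example -F "speaker1Label=John" with curl.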

Add json=true to get the structured transcript array: [{ transcript, text, durationMs }].

curl -X POST https://app.speech7.com/stt \
  -H "x-api-key: YOUR_API_KEY" \
  -F "audio=@sample.wav"

Quick test (text only): curl -s -X POST https://app.speech7.com/stt -H "x-api-key: YOUR_KEY" -F "audio=@sample.mp3" | jq .text