Assistants
Create Assistant
Creates an assistant owned by the API key's user. Only `name` is required — all other fields are optional and fall back to the same platform defaults used by the dashboard. Use GET /api/v1/catalog to discover valid provider and model values before calling this endpoint.
Request
/api/v1/assistants
x-api-key or Authorization: Bearer
Body Parameters
| Name | Type | Description |
|---|---|---|
| name (required) | string | Human-readable assistant name (max 200 chars). |
| system_prompt | string | Instructions that shape assistant behavior (max 100 KB). |
| welcome_message | string | First message spoken when a call starts (max 10 KB). |
| ai_provider | string | LLM provider. Use GET /api/v1/catalog for current options. Allowed values: openai, groq, vomyra, xai. |
| model | string | Model identifier (e.g. "gpt-4.1-mini"). Use GET /api/v1/catalog to list valid ids per provider. |
| max_tokens | integer | Maximum tokens the LLM may generate per turn. Range 1–32768. Optional. Default: 256. |
| temperature | number | LLM sampling temperature controlling response randomness. Range 0–2. Optional. Default: 0.3. |
| voice_provider | string | Text-to-speech provider. Allowed values: azure, elevenlabs, cartesia, openai, vomyra, xai, mistral. |
| voice.name | string | Provider-specific voice identifier. Use GET /api/v1/catalog to list available voices. |
| voice.speed | number | Speech rate multiplier. Range 0.1–4.0. Optional. Default: 1.0. |
| voice.stability | number | Voice consistency (ElevenLabs only). Range 0.0–1.0. Optional. Default: 0.75. |
| voice.similarity_boost | number | Voice similarity boost (ElevenLabs only). Range 0.0–1.0. Optional. Default: 0.8. |
| voice.language | string | Voice locale hint sent to the TTS provider (e.g. "hi-IN"). |
| voice.tts_model | string | Provider-specific TTS model id (e.g. "eleven_flash_v2_5" for ElevenLabs). Optional. |
| voice.instructions | string | Style or accent hint sent to the TTS provider (max 500 chars). Optional. Default: "Indian Accent". |
| transcription_provider | string | Speech-to-text provider. Allowed values: azure, deepgram, openai, gladia, cartesia, groq, mistral, smallest_ai. |
| transcription_language | string or string[] | Language code (e.g. "hi-IN") or an array of codes when using multiple-language mode. |
| language_selection_mode | string | How the STT provider handles multiple languages. Applies to Azure transcription. Optional. Default: "single". Allowed values: single, multiple. |
| transcription_prompt | string | Context hint sent to the STT provider to improve recognition accuracy (max 10 KB). Optional. |
| deepgram.model | string | Deepgram model. Optional. Default: "nova-2". |
| deepgram.utterance_end_ms | integer | Silence gap (ms) Deepgram waits before marking an utterance as complete. Range 0–10000. Optional. Default: 1200. |
| deepgram.endpointing | integer | VAD endpointing latency (ms). Range 0–5000. Optional. Default: 300. |
| deepgram.vad_events | boolean | Emit voice-activity-detection events. Optional. Default: true. |
| deepgram.diarize | boolean | Enable speaker diarization. Optional. Default: true. |
| cartesia.model | string | Cartesia STT model. Optional. Default: "ink-whisper". |
| cartesia.min_volume | number | Minimum audio volume that triggers voice activity detection. Range 0–1. Optional. Default: 0.3. |
| cartesia.max_silence_duration_secs | number | Maximum silence (seconds) before end-of-utterance is signalled. Range 0–30. Optional. Default: 2.0. |
| gladia.model | string | Gladia transcription model. Optional. Default: "fast". |
| gladia.languages | string[] | Expected language codes for Gladia multi-language recognition. |
| gladia.main_language | string | Primary language sent to Gladia. Optional. Default: "en". |
| smallest_ai.diarize | boolean | Enable speaker diarization. Optional. Default: false. |
| smallest_ai.redact_pii | boolean | Redact personally identifiable information from transcripts. Optional. Default: false. |
| smallest_ai.emotion | boolean | Detect caller emotion from audio. Optional. Default: false. |
| smallest_ai.word_timestamps | boolean | Return per-word timestamps in the transcript. Optional. Default: false. |
| maintain_context | boolean | Preserve conversation context across turns. Optional. Default: false. |
| maximum_duration | integer | Hard cap on call length in seconds. Range 1–7200. Optional. Default: 600. |
| silence_timeout | integer | Seconds of caller silence before the inactivity message is played. Range 1–300. Optional. Default: 12. |
| inactivity_message | string | Message spoken when silence_timeout elapses. Optional. Default: "Are you still there?". |
| timeout_end_message | string | Message spoken when the call is ended by the timeout. Optional. Default: "Thank you for calling. Goodbye!". |
| filler_words_enabled | boolean | Inject filler words ("hmm", "okay", etc.) while the LLM is generating to reduce perceived latency. Optional. Default: true. |
| filler_words | string | Comma-separated filler words to inject. Leave empty to use the platform defaults. Optional. Default: "". |
| dynamic_welcome_enabled | boolean | Use the Handlebars welcome message template instead of the static welcome_message. Optional. Default: false. |
| dynamic_welcome_message | string | Handlebars template for the dynamic greeting (e.g. "Hello {{name}}"). Optional. Default: "Hello {{name}}". |
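The numeric ranges in the table can be checked on the client before a request is sent, which may save a round trip that would otherwise end in a 422. The sketch below is a minimal, unofficial illustration. `NUMERIC_RANGES`, `lookup`, and `validate_assistant_body` are hypothetical names, and the code assumes that dotted parameter names such as `voice.speed` or `deepgram.endpointing` map to nested JSON objects, as the `voice` object does in the request body example. The server remains the authority on validation.

```python
from typing import Any, Optional

# Ranges copied from the parameter table. Dotted keys are ASSUMED to
# address nested objects in the request body (e.g. body["voice"]["speed"]).
NUMERIC_RANGES: dict[str, tuple[float, float]] = {
    "max_tokens": (1, 32768),
    "temperature": (0, 2),
    "voice.speed": (0.1, 4.0),
    "voice.stability": (0.0, 1.0),
    "voice.similarity_boost": (0.0, 1.0),
    "deepgram.utterance_end_ms": (0, 10000),
    "deepgram.endpointing": (0, 5000),
    "cartesia.min_volume": (0, 1),
    "cartesia.max_silence_duration_secs": (0, 30),
    "maximum_duration": (1, 7200),
    "silence_timeout": (1, 300),
}


def lookup(body: dict[str, Any], dotted_key: str) -> Optional[Any]:
    """Return the value at a dotted path, or None if any segment is missing."""
    node: Any = body
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def validate_assistant_body(body: dict[str, Any]) -> list[str]:
    """Collect human-readable problems; an empty list means no issue was found."""
    errors: list[str] = []

    # `name` is the only required field and is capped at 200 characters.
    name = body.get("name")
    if not isinstance(name, str):
        errors.append("name is required")
    elif len(name) > 200:
        errors.append("name exceeds 200 characters")

    # Omitted optional fields are skipped, since the platform fills in defaults.
    for key, (low, high) in NUMERIC_RANGES.items():
        value = lookup(body, key)
        if value is None:
            continue
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{key} must be a number")
        elif not low <= value <= high:
            errors.append(f"{key} must be between {low} and {high}")
    return errors
```

Calling `validate_assistant_body({"name": "Support Bot", "temperature": 3})` returns a single message about `temperature`. String length limits given in kilobytes are left out of this sketch because the table does not say how they are measured.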
Request Body Example
{
  "name": "Support Bot",
  "system_prompt": "You are a helpful support assistant. Keep replies short and accurate.",
  "welcome_message": "Hi, how can I help you today?",
  "ai_provider": "openai",
  "model": "gpt-4.1-mini",
  "max_tokens": 256,
  "temperature": 0.3,
  "voice_provider": "azure",
  "voice": {
    "name": "hi-IN-AartiNeural",
    "speed": 1,
    "language": "hi-IN",
    "instructions": "Indian Accent"
  },
  "transcription_provider": "azure",
  "transcription_language": "hi-IN",
  "language_selection_mode": "single",
  "maintain_context": false,
  "maximum_duration": 600,
  "silence_timeout": 12,
  "filler_words_enabled": true
}
Notes
All fields except name are optional — omitted fields inherit Vomyra platform defaults.
Provider-specific transcription settings (deepgram, cartesia, gladia, smallest_ai) are stored but only applied when the matching transcription_provider is active.
Content-Type: application/json is required on the request.
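As a rough illustration of the notes above, the following sketch builds (but does not send) a create request using only the Python standard library. Several details are assumptions rather than documented facts: the base URL `https://api.example.com` is a placeholder, the `POST` method is inferred from the "create" semantics and the 201 status, and `build_create_assistant_request` is a hypothetical helper name. The `x-api-key` header is one of the two authentication options listed for this endpoint.

```python
import json
import urllib.request
from typing import Any

# Placeholder host: replace with the real API base URL (ASSUMPTION).
BASE_URL = "https://api.example.com"


def build_create_assistant_request(
    api_key: str,
    body: dict[str, Any],
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build a request for /api/v1/assistants without sending it."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/assistants",
        # The body is serialized as JSON and sent as UTF-8 bytes.
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Required by the endpoint; a missing or different value yields 415.
            "Content-Type": "application/json",
            "x-api-key": api_key,
        },
        # POST is ASSUMED; the method is not shown in this section.
        method="POST",
    )


# Only `name` is required; omitted fields inherit platform defaults.
request = build_create_assistant_request("YOUR_API_KEY", {"name": "Support Bot"})
```

Passing the resulting object to `urllib.request.urlopen` would send it. Responses with a 4xx status raise `urllib.error.HTTPError`, so a real client would wrap that call in a `try` block.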
Response Example
{
  "success": true,
  "data": {
    "id": "665f1a2b3c4d5e6f7a8b9c0d",
    "name": "Sales Bot",
    "system_prompt": "You are a concise sales assistant.",
    "welcome_message": "Hi! How can I help?",
    "ai_provider": "openai",
    "model": "gpt-4.1-mini",
    "max_tokens": 256,
    "temperature": 0.3,
    "voice": {
      "provider": "azure",
      "name": "hi-IN-AartiNeural",
      "speed": 1,
      "language": "hi-IN",
      "stability": 0.75,
      "similarity_boost": 0.8,
      "tts_model": null,
      "instructions": "Indian Accent"
    },
    "transcription": {
      "provider": "azure",
      "language": "hi-IN",
      "mode": "single",
      "prompt": null,
      "deepgram": {
        "model": "nova-2",
        "utterance_end_ms": 1200,
        "endpointing": 300,
        "vad_events": true,
        "diarize": true
      },
      "cartesia": {
        "model": "ink-whisper",
        "min_volume": 0.3,
        "max_silence_duration_secs": 2
      },
      "gladia": {
        "model": "fast",
        "languages": [],
        "main_language": "en"
      },
      "smallest_ai": {
        "diarize": false,
        "redact_pii": false,
        "emotion": false,
        "word_timestamps": false
      }
    },
    "maintain_context": false,
    "maximum_duration": 600,
    "silence_timeout": 12,
    "inactivity_message": "Are you still there?",
    "timeout_end_message": "Thank you for calling. Goodbye!",
    "filler_words_enabled": true,
    "filler_words": "",
    "dynamic_welcome_enabled": false,
    "dynamic_welcome_message": "Hello {{name}}",
    "selected_tools": [
      "66aa2b3c4d5e6f7a8b9c0d11"
    ],
    "created_at": "2026-05-01T10:00:00.000Z",
    "updated_at": "2026-05-02T08:30:00.000Z"
  }
}
Status Codes
| HTTP | Meaning | Description |
|---|---|---|
| 201 | Created | Resource was created. |
| 401 | Unauthorized | Missing, invalid, inactive, or origin-restricted API key. |
| 415 | Unsupported media type | Content-Type header is missing or is not application/json. |
| 422 | Validation error | JSON body failed schema or business validation. |
| 429 | Rate limited | Per-IP or per-key request budget was exceeded. |
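A client might translate the status codes in the table into a next step. The sketch below is one possible mapping, not part of the API: `NEXT_STEP` and `classify_status` are hypothetical names, and the suggested actions (for example, waiting before retrying after a 429) are assumptions about reasonable client behavior rather than documented requirements.

```python
# Status codes from the table, mapped to a suggested client action (ASSUMED policy).
NEXT_STEP: dict[int, str] = {
    201: "created",           # resource was created; read the assistant from "data"
    401: "check_api_key",     # missing, invalid, inactive, or origin-restricted key
    415: "set_content_type",  # send Content-Type: application/json
    422: "fix_request_body",  # body failed schema or business validation
    429: "wait_and_retry",    # per-IP or per-key budget exceeded
}


def classify_status(status: int) -> str:
    """Return the suggested next step for a status code from the table."""
    # Codes that the table does not list are reported as unexpected.
    return NEXT_STEP.get(status, "unexpected")
```

Of the listed codes, only 429 describes a condition that can clear on its own, which is why it is the only one mapped to a retry here. The others point to something the caller has to change before sending the request again.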