Response Delay
Control how quickly Myra responds after a caller finishes speaking. Tune endpointing and linear delay to prevent interruptions without making the conversation feel slow.
Default endpointing
250 ms
Default linear delay
400 ms
Recommended total
400–800 ms
What is Response Delay?
Response Delay is the amount of time Vomyra waits, after detecting that the caller has stopped speaking, before it starts generating a response. It therefore determines how quickly Myra replies once a caller finishes their turn.
Choosing the correct delay is critical. Too low, and callers may be interrupted. Too high, and conversations feel slow and robotic. Most agents perform best between 400 and 800 ms of total response delay.
How voice response timing works
| Stage | Typical latency | What it does |
|---|---|---|
| Speech-to-Text | 80–150 ms | Transcribes the caller's spoken words into text. |
| Endpointing | 150–400 ms | Detects end-of-turn; waits for the configured silence window. |
| LLM Inference | 200–500 ms | Language model processes the context and generates a reply. |
| Text-to-Speech | 80–200 ms | Synthesises the response audio and streams it to the caller. |
Response Delay is the configurable waiting period after caller speech stops: endpointing plus any added linear delay. With the defaults of 250 ms endpointing and 400 ms linear delay, that is 650 ms. Reducing it lowers perceived latency; increasing it reduces false turn-ends.
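As a rough illustration of how the stage figures above add up, the sketch below sums the four typical ranges. It assumes the stages run strictly one after another with no overlap, which is a simplification (streaming pipelines usually overlap them), and the names `STAGES_MS` and `pipeline_range_ms` are hypothetical, not part of any Vomyra API.

```python
# Typical stage latencies in milliseconds, copied from the pipeline stages above.
# Assumption: stages are treated as strictly sequential, with no streaming overlap.
STAGES_MS = {
    "speech_to_text": (80, 150),
    "endpointing": (150, 400),
    "llm_inference": (200, 500),
    "text_to_speech": (80, 200),
}


def pipeline_range_ms(stages):
    """Return (best_case, worst_case) latency by summing each stage's range."""
    best = sum(low for low, _ in stages.values())
    worst = sum(high for _, high in stages.values())
    return best, worst


print(pipeline_range_ms(STAGES_MS))  # (510, 1250)
```

Under this simplified model, endpointing is the only stage in the sum that the Response Delay settings change directly.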
Configurable settings
Endpointing
Voice Activity Detection (VAD)
The duration of silence the transcriber waits before declaring a caller's turn complete. Once this window expires without detected speech, the STT engine emits an end-of-utterance signal; the LLM begins generating after any configured linear delay has elapsed.
Typical range
150–400 ms
Recommended default: 200–250 ms
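To make the silence window concrete, here is a minimal sketch of the endpointing rule described above. It assumes the VAD output arrives as time-ordered `(timestamp_ms, is_speech)` frames; that frame format and the function name `end_of_turn_ms` are illustrative assumptions, not Vomyra's actual interface.

```python
def end_of_turn_ms(frames, endpointing_ms=250):
    """Return the timestamp at which the turn is declared complete, or None.

    frames: iterable of (timestamp_ms, is_speech) pairs in time order
    (a hypothetical VAD output format used only for this sketch).
    """
    heard_speech = False
    silence_start = None
    for timestamp_ms, is_speech in frames:
        if is_speech:
            heard_speech = True
            silence_start = None  # any detected speech resets the silence window
        elif heard_speech:
            if silence_start is None:
                silence_start = timestamp_ms
            if timestamp_ms - silence_start >= endpointing_ms:
                return timestamp_ms  # silence window expired: end of utterance
    return None


# A caller speaks, pauses briefly mid-sentence, continues, then stops.
frames = (
    [(t, True) for t in range(0, 450, 50)]          # speech from 0 to 400 ms
    + [(t, False) for t in range(450, 650, 50)]     # short pause, 450 to 600 ms
    + [(t, True) for t in range(650, 850, 50)]      # speech resumes, 650 to 800 ms
    + [(t, False) for t in range(850, 1500, 50)]    # caller has finished
)

print(end_of_turn_ms(frames, endpointing_ms=250))  # 1100
print(end_of_turn_ms(frames, endpointing_ms=150))  # 600
```

In this example the 150 ms window fires during the mid-sentence pause, while the 250 ms window waits until the caller has actually finished.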
Linear Delay
Incremental response buffer
An additional fixed delay injected after endpointing fires, before the LLM starts. This buffer accounts for callers who naturally pause mid-sentence before completing a thought, which reduces premature responses.
Typical range
300–600 ms
Recommended default: 400–450 ms
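The two settings can be sanity-checked together before they are applied. The sketch below is one possible way to do that, using the typical ranges and the 400–800 ms recommended total from this page; the class `ResponseDelayConfig` and the function `check` are hypothetical helpers, not a Vomyra configuration schema.

```python
from dataclasses import dataclass

# Ranges in milliseconds, taken from this page.
ENDPOINTING_RANGE_MS = (150, 400)
LINEAR_DELAY_RANGE_MS = (300, 600)
RECOMMENDED_TOTAL_MS = (400, 800)


@dataclass
class ResponseDelayConfig:
    """Hypothetical container for the two settings (defaults from this page)."""
    endpointing_ms: int = 250
    linear_delay_ms: int = 400

    @property
    def total_ms(self):
        # Total response delay = endpointing + linear delay.
        return self.endpointing_ms + self.linear_delay_ms


def check(config):
    """Return a list of warnings; an empty list means all values are in range."""
    checks = [
        ("endpointing", config.endpointing_ms, ENDPOINTING_RANGE_MS),
        ("linear delay", config.linear_delay_ms, LINEAR_DELAY_RANGE_MS),
        ("total response delay", config.total_ms, RECOMMENDED_TOTAL_MS),
    ]
    warnings = []
    for name, value, (low, high) in checks:
        if not low <= value <= high:
            warnings.append(f"{name} {value} ms is outside {low}-{high} ms")
    return warnings


print(ResponseDelayConfig().total_ms)         # 650
print(check(ResponseDelayConfig()))           # []
print(check(ResponseDelayConfig(400, 600)))   # total of 1000 ms is flagged
```

Note that both settings can sit inside their own typical range while the combined total still exceeds the recommended 800 ms, so the total is worth checking separately.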
Latency benchmarks
| Threshold | Perception | Guidance |
|---|---|---|
| < 500 ms | Imperceptible | Target for real-time sales and high-engagement outbound calls. |
| 500–800 ms | Natural | Acceptable for most inbound support and booking agents. |
| 800–1200 ms | Noticeable | Acceptable for complex queries; caller may perceive a brief pause. |
| > 1200 ms | Disruptive | Exceeds human conversational tolerance; investigate pipeline bottlenecks. |
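For monitoring, a measured latency can be mapped to the bands in the table above. This is a minimal sketch; the function name `classify_latency` is hypothetical, and because the table lists 800 ms and 1200 ms in two rows each, the code assumes those boundary values fall into the lower band.

```python
def classify_latency(latency_ms):
    """Map a measured latency in milliseconds to a perception band.

    Assumption: exactly 800 ms counts as "Natural" and exactly 1200 ms
    as "Noticeable", since the table does not say which band owns them.
    """
    if latency_ms < 500:
        return "Imperceptible"
    if latency_ms <= 800:
        return "Natural"
    if latency_ms <= 1200:
        return "Noticeable"
    return "Disruptive"


for sample_ms in (450, 650, 1000, 1500):
    print(sample_ms, classify_latency(sample_ms))
```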