Question 1

Does it work for non-native English speakers?

Accepted Answer

Modern STT and TTS providers (Deepgram, ElevenLabs, OpenAI) support 30+ languages with high accuracy. Accented English is usually recognized at close to native-speaker accuracy, with some loss in noisy environments. Multilingual coverage is bounded by the languages that both your STT and your TTS models support.
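One way to make the coverage limit concrete: a language is only usable end to end if every stage of the voice pipeline handles it. The sketch below is a minimal illustration, assuming hypothetical language-code sets; the function name and the sets are not taken from any provider, and real lists should come from each provider's documentation.

```python
def usable_languages(stt_langs: set[str], tts_langs: set[str]) -> set[str]:
    """Languages the agent can both understand (STT) and speak (TTS)."""
    return stt_langs & tts_langs


# Hypothetical language codes, for illustration only (not real provider lists).
stt = {"en", "de", "fr", "es", "pl"}
tts = {"en", "de", "fr", "ja"}

covered = usable_languages(stt, tts)
print(sorted(covered))  # ['de', 'en', 'fr']
```

A language present in only one of the two sets (here "es", "pl", "ja") would need a fallback, such as a text mode or a different model.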

Question 2

Are conversations recorded?

Accepted Answer

Yes. Both audio and transcript are recorded, because structured scoring is impossible without them. At Intrvio, recordings are encrypted, candidates go through an explicit consent step (GDPR / EU AI Act), and the retention period is configurable.

Question 3

What about deaf or hard-of-hearing candidates?

Accepted Answer

Voice should never be the only option. A text or written-response mode is part of the reasonable accommodation duty under the EU AI Act and ADA-style regulation, so accessible alternatives must be offered.

Question 4

What latency feels acceptable?

Accepted Answer

An end-to-end response time under ~800 ms is roughly the threshold at which the conversation stops feeling like a chatbot. The dominant cost is LLM generation; streaming STT and TTS are mostly solved at this point.
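The budget can be checked per turn by summing the stage latencies and seeing which stage dominates. This is a rough sketch under stated assumptions: the stage names and timings are hypothetical, not measurements, and the stages are treated as strictly sequential, which overstates the total when streaming lets them overlap.

```python
BUDGET_MS = 800  # threshold quoted above; a rule of thumb, not a hard limit


def check_turn(stage_ms: dict[str, float], budget_ms: float = BUDGET_MS):
    """Return (total latency, slowest stage name, within budget?) for one turn."""
    total = sum(stage_ms.values())
    slowest = max(stage_ms, key=stage_ms.get)
    return total, slowest, total < budget_ms


# Hypothetical timings in milliseconds, for illustration only.
turn = {"stt_final": 150.0, "llm_first_token": 450.0, "tts_first_audio": 120.0}

total, slowest, ok = check_turn(turn)
print(f"{total:.0f} ms, slowest={slowest}, ok={ok}")
# 720 ms, slowest=llm_first_token, ok=True
```

Logging the slowest stage alongside the total makes it easier to confirm whether LLM generation is in fact the dominant cost in a given deployment.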

Voice AI Interviewer
