Multilingual AI Interviews: The Turkish + English Bridge

9 min · Intrvio Team

Interviews in the Turkish tech, finance, and consulting sectors do not happen in one language. Candidates fluently embed English technical nouns inside Turkish sentence frames (“Biz API'yi deploy ettik, latency yarı yarıya düştü” — “We deployed the API; latency dropped by half”). The code-switching is not an error mode; it is a professional register — and one your interview infrastructure has to support.

This piece walks through how GAIA handles TR + EN code-switching in real time, where Whisper-Turkish is strong and where it is weak, how to score fairly across language slips, and the KVKK/GDPR boundary that comes with cross-language interview data.

Why TR + EN is a real interview pattern

Listen to a typical Istanbul senior backend interview and a candidate will say something like: “Microservice mimarisinde event-driven bir yaklaşım denedik, ama eventual consistency işin içine girince debugging çok zorlaştı” (“We tried an event-driven approach in a microservice architecture, but once eventual consistency came into play, debugging got much harder”). Forcing that into monolingual Turkish or monolingual English would actually be less natural — there is no clean translation, only a loss. The Turkish software ecosystem operates on largely English terminology, so even when the sentence frame stays Turkish, the technical noun phrases stay English.

In linguistics this is called intra-sentential code-switching. The academic literature flags it as a hard case for multilingual ASR: even multilingual models typically process each utterance as monolingual.[2] When a candidate switches mid-utterance between Turkish and English, a naive multilingual ASR will either misroute the word or hallucinate a transcription in the wrong language.

How GAIA handles code-switching in real time

The pipeline has three stages. First, a Whisper-family base ASR model, fine-tuned on hiring-domain audio with LoRA adapters; published results on this approach show LoRA fine-tuning reduces raw WER by up to 52% on domain-matched data.[1] Second, a language-aware decoder that produces a per-token language estimate, so the transcript carries the two languages mixed instead of forcing a single-language interpretation.[3][5] Third, semantic scoring: a language-agnostic LLM operates on the transcript in an embedding space that does not care whether a given token is Turkish or English.
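To make the second stage concrete, here is a minimal sketch of what a per-token language estimate might look like downstream of the decoder. All names here (`Token`, `language_spans`) are illustrative, not GAIA's actual API:

```python
from dataclasses import dataclass

@dataclass
class Token:
    text: str
    lang: str  # per-token language estimate: "tr" or "en"

def language_spans(tokens):
    """Collapse consecutive same-language tokens into spans, so the
    scoring stage can see exactly where the candidate switched."""
    spans = []
    for t in tokens:
        if spans and spans[-1][0] == t.lang:
            spans[-1][1].append(t.text)
        else:
            spans.append([t.lang, [t.text]])
    return [(lang, " ".join(words)) for lang, words in spans]

# Hypothetical decoder output for a real code-switched utterance:
tokens = [
    Token("biz", "tr"), Token("monolith", "en"), Token("yapıdan", "tr"),
    Token("microservice", "en"), Token("architecture'a", "en"), Token("geçtik", "tr"),
]
print(language_spans(tokens))
```

The point of the span view is that nothing is translated or dropped: the transcript stays mixed, and the language boundaries become metadata rather than errors.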

The practical outcome: when a candidate says “biz monolith yapıdan microservice architecture'a geçtik” (“we moved from a monolith structure to a microservice architecture”), the transcript preserves “monolith yapıdan microservice architecture'a” intact — not awkwardly translated. When the evaluator reads it, no information is lost; the scoring captures the intent.

ASR accuracy — strengths and weaknesses

Turkish is classified as a low-resource language in ASR research: rich morphology, agglutinative structure, vowel harmony, and a prosody pattern that is genuinely Turkish-specific. Published Whisper-Turkish benchmarks report raw WER between 4.3% and 14.2% on the base model, depending on dataset and audio quality.[1][4] Those figures are workable for most interview setups; single-digit WER is well into the range where downstream interpretation is robust.
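For readers who want to sanity-check those numbers on their own audio, WER is just word-level edit distance divided by reference length. A compact reference implementation (assuming a non-empty, whitespace-tokenized reference):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Single-row dynamic-programming table over the hypothesis words.
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev_diag, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            cur = min(d[j] + 1,              # deletion
                      d[j - 1] + 1,          # insertion
                      prev_diag + (r != h))  # substitution or match
            prev_diag, d[j] = d[j], cur
    return d[-1] / len(ref)

print(wer("biz apiyi deploy ettik", "biz apiye deploy ettik"))  # one substitution in four words: 0.25
```

A 10% WER on a 100-word answer means roughly ten word-level errors; whether that matters depends entirely on whether the errors hit content words or filler.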

Where GAIA is weakest: background noise, low-bandwidth mobile cellular audio, and very heavy English accents. In those cases the model usually still produces the correct sentence — but individual technical words can get misspelled (“Kafga” instead of “Kafka”). The scoring does not penalize the candidate for this, because the semantic LLM still recognises the concept; it is only the surface text that is wrong.
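One common mitigation for this surface-level misspelling — sketched here with Python's standard-library `difflib`, not as a description of GAIA's internals — is a post-ASR pass that snaps near-miss technical terms back to a known vocabulary. The vocabulary and cutoff below are illustrative:

```python
import difflib

# Hypothetical domain vocabulary of technical terms worth protecting.
TECH_VOCAB = ["kafka", "kubernetes", "microservice", "latency", "redis"]

def normalise_term(word: str, cutoff: float = 0.8) -> str:
    """Return the closest known technical term if the ASR output is a
    near miss (e.g. 'Kafga'); otherwise return the word unchanged."""
    match = difflib.get_close_matches(word.lower(), TECH_VOCAB, n=1, cutoff=cutoff)
    return match[0] if match else word

print(normalise_term("Kafga"))    # snaps to "kafka"
print(normalise_term("yapıdan"))  # ordinary Turkish word, left unchanged
```

The cutoff matters: set it too low and genuine Turkish words start colliding with English technical terms, which is worse than the original misspelling.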

Scoring fairness when languages mix

The hypothetical concern — that a candidate gets penalized for dropping back into Turkish — does not actually arise on a well-designed rubric. Scoring evaluates substance, not surface language: the phrase “event-driven approach” counts the same whether it arrived inside a Turkish or English sentence frame.

The exception is roles where language proficiency is itself a job requirement (e.g. an English-language customer support line). In those roles, fluency becomes its own rubric dimension, scored explicitly. The candidate is told in advance, and the rubric shows the evaluator exactly what the score is measuring against what part of the transcript.
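The distinction between the two rubric types can be made explicit in configuration. The following is a hypothetical sketch (dimension names and weights are invented for illustration): substance dimensions are language-agnostic, and a fluency dimension appears only for roles where it is a disclosed job requirement:

```python
# Language-agnostic rubric: no dimension rewards or penalizes language choice.
RUBRIC_TECH_ROLE = {
    "system_design": 0.4,
    "tradeoff_reasoning": 0.35,
    "communication_clarity": 0.25,  # clarity of ideas, not language choice
}

# English-language support role: fluency is an explicit, disclosed dimension.
RUBRIC_EN_SUPPORT_ROLE = {
    "problem_resolution": 0.4,
    "empathy": 0.25,
    "english_fluency": 0.35,
}

def weighted_score(rubric: dict, dimension_scores: dict) -> float:
    """Combine per-dimension scores (0-100) using the rubric's weights."""
    assert abs(sum(rubric.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(rubric[d] * dimension_scores[d] for d in rubric)

print(weighted_score(RUBRIC_TECH_ROLE,
                     {"system_design": 80,
                      "tradeoff_reasoning": 70,
                      "communication_clarity": 90}))
```

Keeping fluency as a separate weighted dimension, rather than letting it leak into "communication", is what makes the score explainable when a candidate asks what was measured.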

KVKK + GDPR

Interview audio and transcripts are personal data — under KVKK in Turkey, under GDPR in the EU. The lawful basis is typically the recruitment process's legitimate interest plus explicit candidate notice and consent. On retention: logs must be kept for at least six months under EU AI Act Article 26(6) for hiring AI, while KVKK's proportionality principle caps retention at the role's decision window.

The new dimension in TR + EN versus monolingual workflows is translation and normalization: should the transcript be kept in original mixed-language form or translated to a single language? Our recommendation: keep the original code-switched transcript as the primary record of truth, and treat AI translations only as secondary derived data for search and summarization. When a candidate asks for a decision explanation under EU AI Act Article 86, the original record is what counts.
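In data-model terms, the recommendation amounts to a primary/derived split with a retention guard. A minimal sketch, assuming a 180-day floor and hypothetical record names:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

def _now() -> datetime:
    return datetime.now(timezone.utc)

@dataclass
class TranscriptRecord:
    text: str
    kind: str  # "primary" (original code-switched) or "derived" (translation)
    created: datetime = field(default_factory=_now)

def record_set(original: str, translation_en: str) -> dict:
    """Keep the code-switched original as the record of truth and the
    AI translation only as secondary derived data."""
    return {
        "primary": TranscriptRecord(original, "primary"),
        "derived_en": TranscriptRecord(translation_en, "derived"),
    }

def past_retention(record: TranscriptRecord, retention_days: int = 180) -> bool:
    """True once the record has outlived the retention window
    (180 days ~ the six-month log-retention floor)."""
    return _now() - record.created > timedelta(days=retention_days)
```

Deriving translations on demand from the primary record, rather than storing them as equals, also keeps the deletion story simple: expire the primary and everything downstream goes with it.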

Practical setup checklist

  1. Set the primary language to the language the work is done in (TR for a TR role, EN for a global-team role).
  2. Allow candidate preference to override the default.
  3. Keep the opener short and explicitly welcome code-switching (“Speak in Turkish or English — whichever feels natural.”).
  4. Document that the rubric scores substance not language; attach the rubric to the scoring transcript.
  5. Retain logs for six months; mark the original code-switched transcript as the primary record.
  6. When a candidate requests an explanation, surface which rubric score the decision rests on and which transcript span was used.
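The six steps above can be collapsed into a single interview configuration. Every field name below is hypothetical — a sketch of the shape, not a real product schema:

```python
# Hypothetical interview configuration mirroring the checklist above.
INTERVIEW_CONFIG = {
    "primary_language": "tr",                 # 1. language the work is done in
    "allow_candidate_override": True,         # 2. candidate preference wins
    "opener": "Speak in Turkish or English — whichever feels natural.",  # 3.
    "rubric_scores_substance_only": True,     # 4. rubric attached to transcript
    "retention_days": 180,                    # 5. six-month log retention
    "primary_record": "original_code_switched_transcript",
    "explanations_include": ["rubric_dimension", "transcript_span"],     # 6.
}

def validate_config(cfg: dict) -> list[str]:
    """Flag settings that fall below the checklist's minimums."""
    problems = []
    if cfg.get("retention_days", 0) < 180:
        problems.append("retention below six-month floor")
    if not cfg.get("rubric_scores_substance_only"):
        problems.append("rubric must score substance, not language")
    if not cfg.get("allow_candidate_override"):
        problems.append("candidate language preference must be honored")
    return problems

print(validate_config(INTERVIEW_CONFIG))  # an empty list means compliant
```

Running a check like this at interview-creation time, rather than at audit time, is what keeps the retention and disclosure items from silently drifting.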

Sources

  1. MDPI Electronics — Implementation of a Whisper architecture-based Turkish ASR system and evaluation of LoRA fine-tuning. https://www.mdpi.com/2079-9292/13/21/4227
  2. ACL Anthology — DECM: Evaluating Bilingual ASR Performance on a Code-switching/mixing Benchmark. https://www.aclanthology.org/2024.lrec-main.400/
  3. arXiv 2412.16507 — Adapting Whisper for code-switched ASR via encoder refiner and language-aware decoding. https://www.arxiv.org/pdf/2412.16507v2
  4. IEEE Xplore — Comparing the fine-tuning and performance of Whisper pre-trained models for Turkish speech recognition. https://ieeexplore.ieee.org/abstract/document/10304891/
  5. arXiv 2312.08856 — Attention-guided adaptation of Whisper for Mandarin-English code-switching ASR (methodology generalises to TR-EN). https://arxiv.org/pdf/2312.08856

Intrvio platform

The interview infrastructure that respects how Turkish tech actually talks.

GAIA handles real-time language switching, preserves the original mixed transcript, and scores on substance, not language.