Hebrew + English code-switching ASR — 12-point accuracy lift on mixed-language calls
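Link Voice now transcribes Hebrew/English code-switched speech far more accurately: word error rate on our internal code-switching eval set is down 12.3 points, and end-to-end intent accuracy on production traffic from Israeli SMBs is up 7 points.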
The single most common failure mode on Israeli inbound calls is the mid-sentence switch from Hebrew to English. A patient calls a clinic and says "רציתי לקבוע appointment ל-Tuesday" — and most ASR systems either lose the English tokens entirely or hallucinate Hebrew transliterations of the English words. Both failure modes destroy downstream intent classification.
This release ships a code-switching-aware acoustic model fine-tuned on roughly 380 hours of real Israeli call audio volunteered by our largest customers (anonymised, and purged once model training completed). Word error rate on our internal code-switching eval set dropped from 18.4% to 6.1%, a 12.3-point absolute improvement. End-to-end intent accuracy on production traffic improved by 7 points.
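For readers less familiar with the metric: word error rate is the word-level edit distance between the reference transcript and the recogniser output, divided by the number of reference words. The Python sketch below shows that standard definition and the arithmetic behind the headline figure. It is illustrative only and is not our internal eval harness; the function name and the sample sentences are invented for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Invented example: the recogniser drops one English token out of five words.
reference = "רציתי לקבוע appointment ל-Tuesday בבוקר"
hypothesis = "רציתי לקבוע ל-Tuesday בבוקר"
wer = word_error_rate(reference, hypothesis)  # one deletion over five words

# The figures reported in this release, as arithmetic.
before, after = 18.4, 6.1
absolute_drop = before - after          # in percentage points
relative_drop = absolute_drop / before  # share of the original errors removed
```

On these figures, the 12.3-point absolute drop corresponds to removing roughly two thirds of the word errors on the eval set.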
What changed
- New acoustic model variant trained on a Hebrew + English code-switching corpus with explicit language-tag tokens.
- Language-ID is now per-word, not per-utterance. Mixed-language turns no longer get force-collapsed into a single language bucket.
- Faster decoder: median ASR latency is down 80 ms despite the larger model, thanks to a switch to streaming chunked decoding.
- Numeric and date entities ("Tuesday", "3pm", "שני", "שלוש") now route through a shared normaliser, so calendar booking flows behave identically regardless of which language the caller used.
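To make the shared-normaliser idea concrete, here is a toy Python sketch in which English and Hebrew surface forms resolve to the same canonical entity. It is an assumption-laden illustration, not the production normaliser: the lookup tables, the `normalise` function, and the output shape are all hypothetical, and real coverage would be far wider.

```python
import re
from typing import Optional, Tuple, Union

# Hypothetical lookup tables, kept tiny for illustration.
WEEKDAYS = {
    "monday": "MON", "שני": "MON",
    "tuesday": "TUE", "שלישי": "TUE",
}
NUMBERS = {"three": 3, "שלוש": 3}
CLOCK = re.compile(r"^(1[0-2]|[1-9])(am|pm)$")

Entity = Tuple[str, Union[str, int]]


def normalise(token: str) -> Optional[Entity]:
    """Map an English or Hebrew surface form to one canonical entity."""
    key = token.strip().lower()
    if key in WEEKDAYS:
        return ("weekday", WEEKDAYS[key])
    if key in NUMBERS:
        return ("number", NUMBERS[key])
    match = CLOCK.match(key)
    if match:
        hour = int(match.group(1)) % 12  # maps 12 to 0 before the pm offset
        if match.group(2) == "pm":
            hour += 12
        return ("time", f"{hour:02d}:00")
    return None  # not a date or numeric entity this sketch knows about


examples = {tok: normalise(tok) for tok in ["Tuesday", "שלישי", "3pm", "שני", "שלוש"]}
```

Because both languages land on the same canonical value, a downstream booking flow only ever has to handle one representation per entity.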
Who benefits
Every Link Voice deployment that handles bilingual callers — which, in practice, is every customer. Clinics, real-estate offices, and law firms see the biggest lift because their callers code-switch most aggressively. No customer action is required; the upgrade rolled out across all production agents on 2026-05-19.
How to verify
Open any call transcript in your Voice dashboard from the last 48 hours. Mixed-language turns now carry a per-token language tag and the English fragments render in English orthography rather than Hebrew transliteration. Calendar-booking calls show the resolved entity directly underneath the transcript.
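If you would rather spot-check many transcripts than read them by eye, something like the following Python sketch could flag tokens tagged as English that still contain Hebrew letters, which is the transliteration symptom described above. The token structure (`text` and `lang` fields) and the sample data are invented for illustration and are not the dashboard's export schema.

```python
def is_hebrew(ch: str) -> bool:
    """True if the character falls in the Unicode Hebrew block."""
    return "\u0590" <= ch <= "\u05FF"


def suspicious_tokens(tokens):
    """Return the text of tokens tagged 'en' that contain Hebrew letters."""
    return [
        t["text"]
        for t in tokens
        if t["lang"] == "en" and any(is_hebrew(c) for c in t["text"])
    ]


# Hypothetical transcript turns, hand-written for this example.
good_turn = [
    {"text": "רציתי", "lang": "he"},
    {"text": "לקבוע", "lang": "he"},
    {"text": "appointment", "lang": "en"},
]
bad_turn = [
    {"text": "רציתי", "lang": "he"},
    {"text": "לקבוע", "lang": "he"},
    {"text": "אפוינטמנט", "lang": "en"},  # English word left in Hebrew script
]

flagged_good = suspicious_tokens(good_turn)
flagged_bad = suspicious_tokens(bad_turn)
```

An empty result for a mixed-language turn is what you would expect to see after this release.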