Bilingual voice assistant PT-BR/ES-LATAM:
Conversational flow to reduce failures due to accent, code-switching, and false cognates
Self-initiated project in which I designed a conversational flow for voice assistants, focused on Spanish-speaking users interacting with a VUI (voice user interface) configured for Brazilian Portuguese. The goal was to reduce comprehension failures caused by code-switching, accent, and false cognates, creating a more inclusive and efficient experience for Latin American bilinguals.
01 / Summary
01 Summary
02 Project's process
03 Context
04 Problem
05 Design principles for bilingual VUI
06 Objectives
07 My role in the project
08 Proposed solution (conversational flow)
09 Update: modern technologies (2024–2026), how this flow evolves
10 Results and conclusions
11 Learnings
02 / Project's process
empathize > research > ideate > prototype > concept
03 / Context
I live in Brazil and I'm Chilean on my father's side. My daily use of voice assistants reveals a recurring problem: names of Latin artists, commands, and accents influenced by Spanish are interpreted as errors, while requests in English work without difficulty.
Real example of use:
“Play System of a Down” → works
“Play Los Prisioneros” → doesn't work
This inconsistency impacts millions of bilinguals who naturally switch between PT-BR and ES-LATAM daily. When I carried out this project in 2022, voice assistants still treated this variation as “noise,” not as part of the experience.


Flowchart illustrating the interaction steps of a voice interface.
04 / Problem
General-purpose voice assistants (Alexa, Siri, Google Assistant) have structural limitations when recognizing bilingual PT-BR/ES-LATAM speech:
Code-switching: difficulty interpreting sentences that alternate between PT and ES.
Accent: Hispanic phonemes confuse word recognition in PT-BR.
False cognates: misinterpretations of similar words between languages.
Insufficient recovery: repeated responses of "I don't understand".
Perception of bias: users feel that the error is “theirs,” not the technology's.
These failures cause bilinguals to give up on the VUI even for simple tasks, leading to frustration, loss of confidence, and low adoption.
Bilingual proto-personas PT-ES:
Given this scenario, two proto-personas were created:


Raul Gonzáles
Age: 25 years old; education: university degree.
"Tengo que repetir mis palabras. The assistant simply does not understand my accent, since it is configured in Portuguese."
"I have to repeat my words. The assistant simply doesn't understand my accent, as it's configured in Portuguese."
Biography: A Hispanic man who has lived in Brazil for a short time, he uses voice assistants configured in Portuguese to become more familiar with the language, and uses his time at work to practice conversation with his Brazilian colleagues.
Frustration: He has not yet mastered Portuguese and needs to improve his fluency.
Profile: Digital user, accustomed to technology.


Mariela Aguirre
Age: 40 years old; education: high school
“No se entendió porque tengo acento, por lo que no completé la tarea indicada.”
"It was not understood because I have an accent, so I did not complete the indicated task."
Biography: A Hispanic woman who has lived in Brazil for a few years but still has a lot of difficulty with the nasalized sounds typical of Portuguese. Married to a Brazilian and mother of a son, she lives in a bilingual Spanish-Portuguese household because she wants her child to grow up involved with the cultures of both parents.
Frustration: She has difficulty with typical Portuguese phonemes that do not exist in Spanish.
Profile: She uses technology for small daily activities. She is not interested in the latest innovations on the market, but in products that help her solve her problems.
05 / Design principles for bilingual VUI
Prioritize intention over literalness.
Confirm ambiguous meanings (false cognates).
Recovery always has three paths: repeat / options / change language.
Offer short keywords to get around accents.
Keep a neutral, non-punitive tone.
Detect language mix and ask for preferred language.
Incorporate Latin American repertoire (names, slang, bands).
06 / Objectives
Reduce misinterpretations
Decrease errors caused by accent, code-switching, and false cognates.
Minimize error loops
Avoid frustrating repetitions of “I don't understand,” maintaining fluidity in the conversation.
Offer useful and guided recovery
Provide clear paths for repeating, viewing options, or changing the language.
Preserve context and intention
Ensure that the assistant understands the user's main objective, even with speech variations.
Ensure linguistic inclusion
Treat PT-BR/ES-LATAM bilinguals as the norm, not the exception.
Avoid incorrect actions due to ambiguity
Use confirmation and disambiguation to prevent misinterpretations (e.g., false cognates).




07 / My role in the project
I worked throughout the cycle, from analysis to solution design:
State-of-the-art research in speech recognition PT-BR ↔ ES-LATAM.
Investigation of the technical limitations of the AI systems available in 2022.
Construction of the conversational flow decision tree.
Definition of the principles of bilingual interaction.
Prototyping of dialogues and failure scenarios.
Heuristic evaluation (VUI + adapted Nielsen).


Flowchart illustrating the experience decision tree.
08 / Proposed solution (conversational flow)
The solution is composed of four central pillars:
Language detection and code-switching
Whenever the phrase contains PT and ES together, the assistant identifies the mix and asks for the preferred language to continue.
Example:
“Quiero ver mi pedido, mas não lembro el número.” (“I want to see my order, but I don't remember the number.”)
“Detectei português e espanhol. Quer seguir em português ou espanhol?” (“I detected Portuguese and Spanish. Would you like to continue in Portuguese or Spanish?”)
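The detection step above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the marker word lists and the function names (`detect_languages`, `next_prompt`) are hypothetical and not part of the original project, and a real product would use a trained language-identification model rather than hand-picked words.

```python
import re
from typing import Optional, Set

# Hypothetical, deliberately tiny marker lists (assumption for illustration).
# Words shared by both languages, such as "pedido" or "ver", are left out.
PT_MARKERS = {"não", "mas", "lembro", "meu", "minha", "quero", "você"}
ES_MARKERS = {"quiero", "mi", "el", "pero", "usted", "estoy"}


def detect_languages(utterance: str) -> Set[str]:
    """Return the language codes whose marker words appear in the utterance."""
    tokens = re.findall(r"\w+", utterance.lower())
    found = set()
    if any(token in PT_MARKERS for token in tokens):
        found.add("pt")
    if any(token in ES_MARKERS for token in tokens):
        found.add("es")
    return found


def next_prompt(utterance: str) -> Optional[str]:
    """Ask for the preferred language only when both languages are present."""
    if detect_languages(utterance) == {"pt", "es"}:
        return "Detectei português e espanhol. Quer seguir em português ou espanhol?"
    return None  # Single language (or none detected): continue the normal flow.


print(next_prompt("Quiero ver mi pedido, mas não lembro el número."))
```

The point of the sketch is the design choice, not the detection method: a mixed utterance triggers a language question instead of an error.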
Disambiguation of false cognates
The assistant identifies ambiguous terms and offers semantic choices.
Example:
“Estoy embarazada.” ("I'm pregnant.")
“Você quis dizer grávida (espanhol) ou envergonhada (português)?” (“Did you mean pregnant (Spanish) or embarrassed (Portuguese)?”)
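One possible way to model this step is a small glossary lookup that turns a risky term into a confirmation question. The sketch below is an assumption for illustration: the `FALSE_COGNATES` table and `disambiguation_prompt` function are hypothetical names, and the glossary contains only the term from the example above.

```python
import re
from typing import Optional

# Hypothetical glossary (assumption): risky term -> the two readings to confirm.
# A real glossary would be curated with linguists and cover many more terms.
FALSE_COGNATES = {
    "embarazada": ("grávida (espanhol)", "envergonhada (português)"),
}


def disambiguation_prompt(utterance: str) -> Optional[str]:
    """Return a confirmation question if the utterance contains a risky term."""
    for token in re.findall(r"\w+", utterance.lower()):
        if token in FALSE_COGNATES:
            first, second = FALSE_COGNATES[token]
            return f"Você quis dizer {first} ou {second}?"
    return None  # No known false cognate: no confirmation needed.


print(disambiguation_prompt("Estoy embarazada."))
```

Keeping the glossary as data rather than logic means the list of ambiguous terms can grow without touching the dialogue code.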
Intent-based navigation to bypass accents
When recognition repeatedly fails, the VUI offers keyword shortcuts.
Example:
“Quero 'rastreá' meu pedido.” (“I want to 'track' my order.”)
“Você quer acompanhar entrega? Diga ‘entrega’.” (“Do you want to track delivery? Say ‘delivery’.”)
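A rough way to approximate this behavior is fuzzy matching of each recognized token against a small vocabulary per intent, so that an accented or clipped form like 'rastreá' still maps to a short keyword. The sketch below uses Python's standard `difflib.get_close_matches`; the `INTENT_VOCAB` table, the 0.8 cutoff, and the function names are assumptions made for this illustration, not the project's implementation.

```python
import difflib
import re
import unicodedata
from typing import Optional

# Hypothetical intent table (assumption): short keyword -> words that signal it.
INTENT_VOCAB = {
    "entrega": ["rastrear", "acompanhar", "entrega", "rastreo", "seguimiento"],
}


def strip_accents(text: str) -> str:
    """Remove diacritics so 'rastreá' and 'rastrea' compare as equal."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def suggest_keyword(utterance: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the intent keyword whose vocabulary is close to any spoken token."""
    tokens = re.findall(r"\w+", strip_accents(utterance.lower()))
    for keyword, vocab in INTENT_VOCAB.items():
        for token in tokens:
            if difflib.get_close_matches(token, vocab, n=1, cutoff=cutoff):
                return keyword
    return None  # Nothing close enough: fall back to the recovery flow.


keyword = suggest_keyword("Quero 'rastreá' meu pedido.")
if keyword is not None:
    print(f"Diga ‘{keyword}’.")
```

String similarity is only a stand-in here; a production system would more likely compare phonetic representations, but the flow (approximate match, then offer one short word) stays the same.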
Respectful and non-punitive recovery
Instead of “speak correctly,” the assistant takes responsibility:
“I think I got confused here. Let's try it like this…”
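The recovery rule (the assistant takes the blame and always offers the three paths: repeat, options, change language) could be expressed roughly as follows. This is a minimal sketch under assumptions: the `recovery_prompt` function, the English path labels, and the reordering after the first failure are hypothetical choices made for illustration.

```python
def recovery_prompt(failures: int) -> str:
    """Build a non-punitive recovery prompt that always offers three paths."""
    if failures < 1:
        raise ValueError("recovery_prompt is only used after at least one failure")
    opening = "I think I got confused here. Let's try it like this: "
    if failures == 1:
        paths = ["repeat", "options", "change language"]
    else:
        # Repeating has already failed, so lead with the alternatives.
        paths = ["options", "change language", "repeat"]
    quoted = [f"'{path}'" for path in paths]
    return opening + "say " + ", ".join(quoted[:-1]) + f", or {quoted[-1]}."


print(recovery_prompt(1))
print(recovery_prompt(2))
```

Both prompts keep all three exits available, and neither one asks the user to speak differently.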
Failure example
User (ES + PT):
“Quiero ver mi pedido, mas não lembro el número.”
(“I want to see my order, but I don't remember the number.”)
Assistant:
“Desculpe, não entendi. Pode repetir?”
(“Sorry, I didn't understand. Can you repeat that?”)
User repeats.
Assistant:
“Desculpe, não entendi.”
(“Sorry, I didn't understand.”)
Problem: language mix + accent → recognized as error → frustration loop.
Example of success (with the proposed flow)
User (ES + PT):
“Quiero ver mi pedido, mas não lembro el número.”
(“I want to see my order, but I don't remember the number.”)
Assistant:
“Percebi português e espanhol. Quer continuar em espanhol ou português?”
(“I noticed Portuguese and Spanish. Would you like to continue in Spanish or Portuguese?”)
User:
“Espanhol.”
("Spanish")
Assistant:
“Certo. Para ver seu pedido, você pode dizer ‘mis pedidos’ ou ‘último pedido’.”
(“Okay. To view your order, you can say ‘my orders’ or ‘last order’.”)
Result: guided recovery, language defined, intention preserved.
09 / Update: modern technologies (2024–2026), how this flow evolves
The original project was carried out in 2022. Since then, AI has evolved well beyond virtual assistants. For conversational voice interfaces, the following improvements stand out:
Contextual ASR (automatic speech recognition): modern models keep short-term context and reprocess ambiguous excerpts more accurately.
Phonetic embeddings: more robust recognition of accents and hybrid pronunciations.
Real code-switching detection: identifies the languages and separates the blocks within a sentence.
Multimodal intent classifiers: understand commands even with lexical flaws.
Dynamic personalization: the assistant learns the user's repertoire (band names, accent, preferences).
Adaptive tone adjustment: responses change based on perceived frustration and intonation.
These advances make the proposed solution even more feasible, scalable, and relevant for today's voice products.
10 / Results and conclusions
Structural issue (not the user's)
The limitation is structural, not behavioral: in 2022, voice AI relied on complete words and did not handle phonemes and bilingual context well.
Solution is design + engineering
The answer requires integration between UX and technology, not just “stronger” NLU (natural language understanding).
Confirmation + guidance prevent critical errors
Confirming meaning and offering guided paths reduces critical failures and avoids misinterpretations (especially with false cognates).
Recovery restores confidence
Bilingual users regain confidence when the assistant takes responsibility for the failure and offers clear alternatives to follow.
11 / Learnings
Designing for bilinguals requires treating ambiguity as the rule, not the exception. The voice experience needs to be designed for linguistic diversity, with a focus on intention, recovery, and clarity.
This project reinforced how UX, linguistics, and AI should work in partnership to create truly inclusive experiences in voice products.






Fernanda Abarca, 2026
