Improving How Machine Translations Handle Grammatical Gender Ambiguity
Machine Translation (MT) enables people to connect with others and engage with content across language barriers. Grammatical gender presents a difficult challenge for these systems, as some languages require specificity for terms that can be ambiguous or neutral in other languages. For example, when translating the English word “nurse” into Spanish, one must decide whether the feminine “enfermera” or the masculine “enfermero” is appropriate. However, particularly when contextual clues are absent, such as in translating a single sentence, a model cannot determine which would be correct. This…
Multilingual Machine Translation promises to improve translation quality between non-English languages. This is advantageous for several reasons, namely lower latency (no need to translate twice) and reduced error cascades (e.g., avoiding the loss of gender and formality information when translating through English). On the downside, adding more languages reduces model capacity…
Translating text that contains entity names is a challenging task, as culture-related references can vary significantly across languages. These variations may also be caused by transcreation, an adaptation process that entails more than transliteration and word-for-word translation. In this paper, we address the problem of cross-cultural translation on two fronts:…
This paper was accepted at the 5th Workshop on Gender Bias in Natural Language Processing 2024. Machine translation (MT) systems often translate terms with ambiguous gender (e.g., English term “the nurse”) into the gendered form that is most prevalent in the systems’ training data (e.g., “enfermera”, the Spanish term for…