Meta has released SeamlessM4T, a new multimodal translation model that handles speech and text in nearly 100 languages, as the company continues to try to make a universal translator.
Meta releases multilingual speech translation model


SeamlessM4T stands for Massively Multilingual and Multimodal Machine Translation. The company said the model can translate speech-to-text and text-to-text for nearly 100 languages. For speech-to-speech and text-to-speech translation, it recognizes 100 input languages and converts them into 35 output languages.
It is released under a Creative Commons CC BY-NC 4.0 license, which permits noncommercial use and allows researchers to iterate upon it.
Along with SeamlessM4T, Meta also released the metadata for its open translation dataset SeamlessAlign.
“Building a universal language translator, like the fictional Babel Fish in The Hitchhiker’s Guide to the Galaxy, is challenging because existing speech-to-speech and speech-to-text systems only cover a small fraction of the world’s languages,” Meta said.
The Hitchhiker’s Guide Babel Fish, as conceived by author Douglas Adams, is a fish you can place in your ear to instantly understand any language. If you’re a Doctor Who fan, you could compare Meta’s tool to a translation matrix in the TARDIS that turns even alien words into English.
Meta said SeamlessM4T represents “a significant breakthrough” because this new model performs the entire translation task in one go, unlike other large translation models that divide translation across different systems.
One of the interesting features of SeamlessM4T, if it functions correctly, is its alleged ability to recognize when a speaker is code-switching, that is, moving between two or more languages in one sentence. For instance, Meta demonstrated in a video that the model immediately differentiates between Hindi, Telugu, and English. I haven’t tested the model, but I frequently code-switch between my two native languages (Filipino and English), as do most people who speak more than one language, and from personal experience, it’s not something most AI speech recognition software picks up on quickly.
SeamlessM4T builds on previous translation models from Meta. Last year, Meta released its No Language Left Behind text-to-text machine translation model, which supported 200 languages. It also developed SpeechMatrix, a dataset for multilingual speech-to-speech translation, and Massively Multilingual Speech for speech recognition. Meta also demoed its Universal Speech Translator last year, converting spoken Hokkien, a widely used language in China that does not have an official writing system, to English.
Language translation is important for companies like Meta, which employ thousands of people to moderate a flood of Facebook and Instagram posts in different languages. Very often, non-major languages have smaller teams and end up relying on automated moderation that works poorly with those languages. AI, if given access to a dataset of these smaller languages, can be a tool for companies like Meta to improve moderation.
To build SeamlessM4T, Meta said it redesigned its Fairseq sequence modeling toolkit to create more lightweight models and handle more information.
While developing SeamlessM4T, Meta said it built a system that identifies toxic or sensitive words. Meta defines toxic words as instances where the “translation may incite hate, violence, profanity, or abuse.” The goal is to be able to detect when the output translation introduces toxicity that wasn’t present in the original material.
“We filtered unbalanced toxicity in training data. If input or output contained different amounts of toxicity, we removed that training sequence,” Meta said.
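To make that filtering rule concrete, here is a minimal sketch in Python. It is not Meta's code: the word lists are mild placeholders, the names `TOXIC_WORDS`, `count_toxic`, and `filter_pairs` are hypothetical, and treating "different amounts" as "different word-list hit counts" is an assumption made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical placeholder word lists, keyed by language code.
# Meta's actual toxicity lists are not reproduced here.
TOXIC_WORDS: Dict[str, Set[str]] = {
    "en": {"darn", "heck"},
    "es": {"caramba"},
}

PUNCTUATION = ".,!?¡¿"


def count_toxic(text: str, lang: str) -> int:
    """Count tokens in `text` that appear in the word list for `lang`."""
    words = TOXIC_WORDS.get(lang, set())
    tokens = text.lower().split()
    return sum(1 for tok in tokens if tok.strip(PUNCTUATION) in words)


def filter_pairs(
    pairs: List[Tuple[str, str]], src_lang: str, tgt_lang: str
) -> List[Tuple[str, str]]:
    """Keep only pairs whose source and target have equal toxic-word counts.

    Assumption: "unbalanced" means the two counts differ.
    """
    return [
        (src, tgt)
        for src, tgt in pairs
        if count_toxic(src, src_lang) == count_toxic(tgt, tgt_lang)
    ]


pairs = [
    ("Hello there", "Hola"),             # 0 vs 0: kept
    ("Hello there", "¡Caramba, hola!"),  # 0 vs 1: toxicity added, dropped
    ("Darn, hello", "Caramba, hola"),    # 1 vs 1: kept
]
kept = filter_pairs(pairs, "en", "es")
```

In this sketch the second pair is dropped because the translation contains a listed word that the source does not, which is the kind of added toxicity Meta says it wants to catch.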
Researchers also tried to clean up datasets that mistranslate some profanity so that the system more accurately detects when profanity is being used.
Meta claims the model also recognizes gender bias in languages and can quantify it in translations. SeamlessM4T can check if the sentence used a gendered form of a word, say doctora in Spanish, and assign a female pronoun in a target language without equivalently gendered grammar if needed. Approaching it similarly to toxicity, Meta said SeamlessM4T counts how many times a translation adds gendered words to terms that were not specifically gendered in the original language, for example, automatically assuming a doctor is male when the word has no gender distinction in English.
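The counting approach described above can be sketched in a few lines of Python. This is an illustration only, not Meta's method: the marker lists are tiny hypothetical samples, the function names are invented for this example, and real gender detection would need far more than a word lookup.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical, deliberately tiny lists of gendered markers per language.
GENDERED_WORDS: Dict[str, Set[str]] = {
    "en": {"he", "she", "him", "her", "his", "hers"},
    "es": {"él", "ella", "doctora"},
}

PUNCTUATION = ".,!?¡¿"


def gendered_count(text: str, lang: str) -> int:
    """Count tokens in `text` that are listed as gendered for `lang`."""
    markers = GENDERED_WORDS.get(lang, set())
    tokens = text.lower().split()
    return sum(1 for tok in tokens if tok.strip(PUNCTUATION) in markers)


def count_added_gender(
    pairs: List[Tuple[str, str]], src_lang: str, tgt_lang: str
) -> int:
    """Count pairs where the source has no gendered marker but the target does."""
    return sum(
        1
        for src, tgt in pairs
        if gendered_count(src, src_lang) == 0
        and gendered_count(tgt, tgt_lang) > 0
    )


pairs = [
    # Source is neutral; the translation adds a masculine pronoun.
    ("I saw the doctor and thanked them", "Vi al doctor y le di las gracias a él"),
    # Source is already gendered, so nothing is counted as added.
    ("She is a doctor", "Ella es doctora"),
]
added = count_added_gender(pairs, "en", "es")
```

Only the first pair counts as adding gender here, since the English source uses a neutral "them" while the Spanish output commits to "él".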
Meta has been releasing many of its AI models to developers and researchers in a more or less open-source fashion. It recently put out AudioCraft, code that allows for text-to-sound generation. Meta also provided access to its large language model Llama 2.