Browsing by Author "Kala R. Jules, Adetiba Emmanuel, Abayomi Abdultaofeek, Oluwatobi E. Dare, Ifijeh H. Ayodele"

Speech to speech translation with translatotron: A state of the art review
(Elsevier B.V., 2025-10-20) Kala R. Jules, Adetiba Emmanuel, Abayomi Abdultaofeek, Oluwatobi E. Dare, Ifijeh H. Ayodele
Cascade-based speech-to-speech translation has long been considered the benchmark approach. Still, it is plagued by several issues, such as the time needed to translate speech from one language to another and compounding errors. These issues arise because cascade-based methods chain several components: speech recognition (speech-to-text transcription), text-to-text translation, and finally text-to-speech synthesis. Google proposed Translatotron, a sequence-to-sequence direct speech-to-speech translation model designed to address the compounding errors associated with cascade-based models. Today, there are three versions of the model: Translatotron 1, Translatotron 2, and Translatotron 3. Translatotron 1 is a proof of concept that demonstrates direct speech-to-speech translation. This first approach was found to be less effective than the cascade model, but it produced promising results. Translatotron 2 is an improved version of Translatotron 1, with results similar to those of the cascade-based model. Translatotron 3, the latest version, significantly improves translation quality and outperforms the cascade model in some respects. This paper presents a complete review of speech-to-speech translation using Translatotron models. We also show that Translatotron is the best model to bridge the language gap between African languages and other well-formalized languages.