Universal Translator turns spoken English into any of 26 different languages!
REDMOND, Washington - March 13, 2012 - The Universal Translator has long been used by James T. Kirk to speak to aliens and blue women from space - but now Microsoft is on the brink of making a real, working version.
Frank Soong and Rick Rashid have created software that converts English-language speech into any of 26 other languages - and that “speaks” in the user's own voice.
All the user has to do is speak English into the machine, which converts the speech into anything from Spanish to Mandarin.
The hope is that the device will one day allow visitors to foreign countries to have conversations with other people, even though they do not speak the same language - just like in Star Trek.
Soong told Technology Review that his breakthrough could help language students and might also work with navigational devices.
In theory, it could one day be installed on a smartphone, giving tourists a ready-made translation device in their pockets.
Soong said, “We will be able to do quite a few scenario applications. For a monolingual speaker traveling in a foreign country, we'll do speech recognition followed by translation, followed by the final text to speech output in a different language, but still in his own voice.”
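For readers who want to picture the chain Soong describes (speech recognition, then translation, then text-to-speech in the speaker's own voice), here is a minimal sketch in Python. It is an illustration only, not Microsoft's software: every name in it (SpeechPipeline, fake_recognize, fake_translate, fake_synthesize, PHRASEBOOK, the "traveler-1" voice label) is a hypothetical stand-in, and the three stages are toy stubs so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechPipeline:
    """Hypothetical three-stage chain: recognize -> translate -> synthesize."""
    recognize: Callable[[bytes], str]        # audio -> source-language text
    translate: Callable[[str, str], str]     # (text, target language) -> translated text
    synthesize: Callable[[str, str], bytes]  # (text, voice label) -> audio

    def run(self, audio: bytes, target_lang: str, voice_id: str) -> bytes:
        text = self.recognize(audio)                    # stage 1: speech recognition
        translated = self.translate(text, target_lang)  # stage 2: translation
        return self.synthesize(translated, voice_id)    # stage 3: text to speech


# Toy stand-ins (assumptions for illustration; a real system would use trained models).
def fake_recognize(audio: bytes) -> str:
    # Pretend the "audio" bytes are already the spoken words.
    return audio.decode("utf-8")


PHRASEBOOK = {("hello", "es"): "hola"}


def fake_translate(text: str, target_lang: str) -> str:
    # Look the phrase up; fall back to the untranslated text if it is unknown.
    return PHRASEBOOK.get((text.lower(), target_lang), text)


def fake_synthesize(text: str, voice_id: str) -> bytes:
    # Tag the output with the speaker's voice label instead of producing real audio.
    return f"[{voice_id}] {text}".encode("utf-8")


pipeline = SpeechPipeline(fake_recognize, fake_translate, fake_synthesize)
result = pipeline.run(b"hello", "es", "traveler-1")
print(result.decode("utf-8"))
```

Keeping each stage behind its own function means any one of them could be swapped out, for example to change the target language, without touching the other two.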
Soong and Rashid work at Microsoft’s HQ in Redmond, Washington. They created the system with colleagues at Microsoft Research Asia in Beijing, the company's second-largest research lab.
In Star Trek, the Universal Translator was supposedly introduced in the late 22nd century and helped the crew of the Enterprise communicate with aliens as they explored the universe.
However, Soong and Rashid have made their version today, even if the voice that comes out in the foreign language still sounds a little mechanical.
Their device needs about an hour to get used to a person’s voice. It then works by comparing the recorded words with stock models for the target language.
The technology has been designed so that it does not just translate words, which would give it a computerized and disjointed sound. Instead, the sounds are carefully manipulated to mimic real speech as realistically as possible.