Daily Management Review

As Voice Assistants Race To Cover Languages, Apple's Siri Learns Shanghainese


With the broad release of Google Assistant last week, the voice-assistant wars are in full swing: Apple Inc, Amazon.com Inc, Microsoft Corp and now Alphabet Inc's Google all offer electronic assistants to take your commands.
Although Siri is the oldest of the bunch, Apple has squandered its lead when it comes to understanding speech and answering questions, said researchers including Oren Etzioni, chief executive officer of the Allen Institute for Artificial Intelligence in Seattle.
But Siri has one very important capability in a smartphone market where most sales are outside the United States: it speaks 21 languages localized for 36 countries. That is at least one thing Siri can do that the other assistants cannot.
In contrast, Microsoft's Cortana handles eight languages tailored for 13 countries. Google’s Assistant, which began in its Pixel phone but has moved to other Android devices, speaks four languages. Amazon's Alexa is available only in English and German. Siri will soon learn Shanghainese, a special dialect of Wu Chinese spoken only around Shanghai.
The language issue shows the type of hurdle that digital assistants still need to clear if they are to become ubiquitous tools for operating smartphones and other devices.
Speaking languages natively is complicated for any assistant. For example, if someone in Britain asks for a soccer score, the assistant must know to say “two-nil” instead of “two-nothing,” even though the language is English.
At Microsoft, an editorial team of 29 people is customizing Cortana for local markets. In Mexico, for example, a published children’s book author writes Cortana’s lines so that the assistant stands out from its versions for other Spanish-speaking countries.
“They really pride themselves on what’s truly Mexican. (Cortana) has a lot of answers that are clever and funny and have to do with what it means to be Mexican,” said Jonathan Foster, who heads the team of writers at Microsoft.
Google and Amazon said that they plan to bring more languages to their assistants but declined to comment further.
Alex Acero, head of the speech team at Apple, said that when the company starts working on a new language, it has humans read passages in a range of accents and dialects. Those recordings are transcribed by hand so that the computer has an exact representation of the spoken text to learn from. Apple also captures a range of sounds in a variety of voices. From there, it builds a language model that tries to predict word sequences.
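To give a rough sense of what "predicting word sequences" means, here is a minimal sketch of a bigram language model, one of the simplest such models. It is an illustration only and not Apple's method: the function names, the toy transcripts and the bigram approach itself are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word follows another in the transcripts."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the word seen most often after `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical hand-transcribed passages (toy data, not real training text).
transcripts = [
    "the score was two nil",
    "the final score was two nil",
    "the score was three one",
]

model = train_bigram_model(transcripts)
print(predict_next(model, "two"))  # nil
```

Because the model only learns from the transcripts it is given, a system trained on British speech would tend to favor "nil" after "two," which is one reason each language and region needs its own data.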
But Charles Jolley, creator of an intelligent assistant named Ozlo, said that script-writing does not scale. “You can’t hire enough writers to come up with the system you’d need in every language. You have to synthesize the answers,” he said. That is years off, he said.
Working on just that is Viv, a startup founded by Siri's original creators that Samsung acquired last year.
"Viv was built to specifically address the scaling issue for intelligent assistants," said Dag Kittlaus, the CEO and co-founder of Viv. "The only way to leapfrog today's limited functionality versions is to open the system up and let the world teach them."