Read full article here: SpeechTechMag
Semantic parsing and similar techniques can revolutionize natural language processing, according to Phil Cohen, chief scientist for artificial intelligence at Voicebox, a provider of voice technology for the automotive, mobile, home, and internet-connected devices markets. However, acquiring training data for each domain in each supported language remains cumbersome and expensive. This limitation has become increasingly significant as companies demand intelligent, conversational voice applications to support their global product strategies.
Late last week, Voicebox’s Advanced Technologies Team announced a significant innovation that reduces the burden of data collection. More important, the innovation helps overcome the challenge of parsing multilanguage utterances, known in linguistic circles as code-switching.
Voicebox’s approach applies what is learned in one language to another. The team evaluated utterances in German, utterances in English, and utterances mixing both languages. They developed a neural network model trained on both English-only and German-only sentences. Within a single semantic parsing process, this model transfers information from one language to the other, leveraging English data to reduce the amount of data needed for German.
As a result, performance on each language improved. The team also evaluated utterances that could contain a mix of both languages and weren’t in any of the training data. Results were similarly impressive.
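The article does not describe Voicebox’s model in detail, but the core idea of pooling both languages into one shared model can be illustrated with a deliberately simplified sketch. The toy classifier below (a hypothetical example, not Voicebox’s system) is trained only on English-only and German-only utterances with invented intent labels, yet it can still score a code-switched utterance, because words from both languages live in one shared table:

```python
# Toy illustration of cross-lingual sharing: one model trained on
# monolingual English and German data can handle a mixed utterance.
# All utterances and intent labels here are invented for illustration.
from collections import Counter, defaultdict

TRAIN = [
    ("navigate to the airport", "navigation"),
    ("take me to the train station", "navigation"),
    ("fahre zum flughafen", "navigation"),       # German-only
    ("play some music", "media"),
    ("spiele etwas musik", "media"),             # German-only
]

def train(examples):
    # word -> Counter of intent labels: a crude shared lexicon that
    # stands in for the shared parameters of a real neural model
    table = defaultdict(Counter)
    for text, intent in examples:
        for word in text.split():
            table[word][intent] += 1
    return table

def classify(table, text):
    # Sum evidence from every word, regardless of which language it came from
    scores = Counter()
    for word in text.split():
        scores.update(table.get(word, Counter()))
    return scores.most_common(1)[0][0] if scores else None

model = train(TRAIN)
# A code-switched utterance never seen in training:
print(classify(model, "fahre zum airport"))  # -> navigation
```

Because "fahre" and "zum" were seen in German training data and "airport" in English data, the mixed utterance is still classified correctly; a real neural semantic parser achieves an analogous effect through shared representations rather than word counts.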
“We’re excited about the ground-breaking benefits this innovation brings to users,” Cohen says.
To date, voice navigation systems, for example, have struggled with multilingual use cases. This has been a serious usability issue in multilanguage regions, such as Europe or Asia, where code-switching is common. A French person driving through Germany might ask a question in French, for example. A multilingual person might ask a question or phrase a sentence in a mix of languages, using whichever language best describes what he or she is discussing.
“My grandmother did this constantly,” Cohen says. “She said, ‘It sounds better this way.’”
Offering speakers of multiple languages a multilanguage speech recognition solution in connected cars has been a problem for automotive companies ever since connected cars first appeared at the end of the last decade, according to Cohen.
But the Voicebox development is only the first step toward a true multilingual speech recognition system, he adds. A large company specializing in voice recognition would need to develop a comprehensive system, which auto manufacturers would then need to integrate into their own vehicles.
New voice recognition systems come out every two years, according to Cohen. The 2017 versions are out now, so the next versions will arrive in 2019, with the generation after that due in 2021.
Cohen says it is possible for a multilanguage voice recognition system to be ready for the 2019 models, and hinted that a company like Voicebox could develop it. While he wouldn’t go so far as to say that customers are demanding such a system, he says that once one is available, customers will come to expect their cars to be equipped with that capability.