Voicebox is known as a reliable partner across the automotive industry, having worked successfully to bring voice recognition (VR) and natural language understanding (NLU) systems to drivers in many countries who speak many languages.
Over the past decade, the Voicebox team has worked with automakers such as Toyota, Lexus, Chrysler, Fiat, Dodge, Renault, Subaru, Daimler, and Mazda. In that time, Voicebox staff members have shipped dozens of automotive products, from in-dash embedded VR and NLU solutions to after-market navigation devices.
The past decade has seen rapid innovation in automotive interfaces. Voicebox’s history in the industry continues to be one of “firsts,” as we push available technology to do things people never thought possible.
Voicebox delivered the first in-vehicle NLU system for the 2009 model-year Lexus. We shipped the first connected-car systems for 2011’s Toyota Entune and Lexus Enform systems. We created the first true hybrid VR/NLU system in production, blending on-board and cloud-based VR and NLU for connected cars, and we pioneered the first conversational automotive system capable of over-the-air (OTA) updates.
More recently, Voicebox experts created a template system for building conversational automotive systems more rapidly than ever and added machine learning capabilities to hybrid systems, pushing the driver experience to ever-greater levels of accuracy.
To accelerate the next generation of conversational automotive interfaces, the Voicebox team is announcing a new solution called Voicebox Auto.
Voicebox Auto brings together two other offerings: Voicebox Cloud and Voicebox Edge. Cloud-based systems offer far greater flexibility and scope, while embedded systems have lower latency because they have no network delays. Automotive interfaces need both.
Voicebox Cloud is a robust, cloud-based conversational AI platform that handles all the difficult, underlying challenges of voice interaction (tuning of speech recognition, semantic parsing and entity recognition, building the machine learning models that understand linguistic context and identify user requests, and much more), allowing customers to focus on building their unique conversational experience. Voicebox Edge is a software development kit for creating embedded conversational interfaces and connecting them to cloud services. Together, they give automakers everything needed to build their own next-generation conversational interfaces.
A third key ingredient in Voicebox Auto is Voicebox Professional Services. Where other platforms provide one-size-fits-all voice features and leave it at that, Voicebox also offers its lengthy experience in the automotive industry through the company’s team of consulting engineers. These engineers draw on that experience to tailor Cloud and Edge for the scenarios automakers face, helping them bring their solutions to market rapidly and at high quality.
Voicebox Auto natively supports hybrid operation, in which user requests can be serviced by both the vehicle’s embedded system and the cloud-based conversational AI when connected, or by the vehicle alone when connectivity is not available.
Hybrid-mode operation provides another benefit: the “hybrid lift” effect. Because the two systems are distinct, they react slightly differently to user requests. Usually both systems have the right answer, but sometimes the embedded system finds the right answer when the cloud system does not, or vice versa. An arbitrator module, powered by the latest machine-learning technology, selects which response to present to the user. The overall system is thus more accurate than either system alone, generally offering a five to ten percent boost in accuracy.
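To make the hybrid flow concrete, here is a minimal Python sketch of the idea, not Voicebox's actual implementation. Every name in it (`Interpretation`, `arbitrate`, `handle_request`) is hypothetical, and the arbitrator is reduced to a simple confidence comparison standing in for the machine-learning model described above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Interpretation:
    """One engine's reading of an utterance (all fields are illustrative)."""
    source: str        # "embedded" or "cloud"
    intent: str        # e.g. "navigate"
    confidence: float  # engine's self-reported score in [0, 1]


# An engine is assumed to be any callable that maps text to an Interpretation.
Engine = Callable[[str], Interpretation]


def arbitrate(embedded: Interpretation, cloud: Interpretation) -> Interpretation:
    """Stand-in for the learned arbitrator: keep the higher-confidence result.

    Ties go to the embedded result, which needs no network round trip.
    """
    return cloud if cloud.confidence > embedded.confidence else embedded


def handle_request(
    utterance: str,
    embedded_engine: Engine,
    cloud_engine: Engine,
    connected: bool,
) -> Interpretation:
    """Serve a request in hybrid mode, falling back to the vehicle alone."""
    embedded_result = embedded_engine(utterance)
    if not connected:
        return embedded_result  # no connectivity: embedded system only
    try:
        cloud_result = cloud_engine(utterance)
    except ConnectionError:
        return embedded_result  # connection dropped mid-request
    return arbitrate(embedded_result, cloud_result)
```

In this sketch the embedded engine always runs, so the driver gets an answer even when the cloud is unreachable. A production arbitrator would presumably weigh much more than a single confidence score.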
Data mining and analytics are useless without data for them to consume. Collecting large quantities of high-reliability data about what’s happening inside vehicles and how drivers use their cars has always been an obstacle for automakers. Now, the combination of mobile connectivity and conversational interfaces enables Voicebox Auto’s Voice Trace feature to report on dozens of environmental and performance metrics from the processing of the audio in Edge and from the car itself.
For the first time, automakers have access to real-world, real-time data collection from the car. Voicebox uses this data to improve vehicles’ conversational experiences, while automakers can use it to gain a better understanding of the total vehicle environment and improve the comfort and safety of future designs. No other conversational AI platform offers this revolutionary capability to the industry.
Voicebox Auto is based on what are now fifth-generation products, with strengths including standalone or fully hybrid operation, over-the-air updates, broad language support, dialogue support, context sensitivity, multi-modal I/O, and more.
Yet the past decade’s rapid change and innovation in automotive interfaces show no signs of stopping. To meet the demands of tomorrow’s scenarios, Voicebox Auto is following an ambitious roadmap of improvements outlined below.
Today’s vehicle systems rely heavily on embedded operation. As mobile connectivity becomes increasingly robust and ubiquitous, connected car systems will shift towards cloud operation for fast, accurate, robust results. Voicebox’s plan will bring the latency and scalability of Voicebox Auto to the levels needed to meet the forecast demand of billions of utterances per year.
Concurrently, Voicebox will leverage increasingly powerful vehicle systems to provide an “Intelligent Edge” to the cloud, further improving latency and the user experience through advanced caching, predictive analytics, and OTA features.
Conversational Intelligences are a suite of new capabilities for the next generation of natural language understanding. These Intelligences combine semantic parsers built for natural conversations, plan-based reasoning capabilities to enable a true digital-assistant experience, and data sources and services for carrying out intelligent requests. For Voicebox Auto, Conversational Intelligences will be tailored to automotive scenarios, leaving only a small fraction of custom engineering work for automakers.
“Location Intelligence” will be the first Conversational Intelligence for Voicebox Auto. With Location Intelligence, Voicebox Auto will be able to handle complex queries such as “get me directions to the pizza place across from the movie theater on Cherry Street,” and to understand that requests such as “take me to the Seahawks game” actually mean “find somewhere I can park within easy walking distance of the stadium, and get me directions to that.”
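As a rough illustration of how such a request might be expanded into concrete steps, consider the toy Python sketch below. It is an assumption-laden example, not a description of Location Intelligence itself: the `Place` type, the `plan_event_trip` function, the planar coordinates, and the one-kilometer walking limit are all invented for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Place:
    name: str
    x_km: float  # toy planar coordinates, not real geodata
    y_km: float


def distance_km(a: Place, b: Place) -> float:
    """Straight-line distance between two places on the toy map."""
    return hypot(a.x_km - b.x_km, a.y_km - b.y_km)


def plan_event_trip(
    venue: Place,
    parking: list[Place],
    max_walk_km: float = 1.0,  # hypothetical "easy walking distance"
) -> list[str]:
    """Expand "take me to <event>" into driving and walking steps."""
    walkable = [p for p in parking if distance_km(p, venue) <= max_walk_km]
    if not walkable:
        # No parking within walking distance: route to the venue itself.
        return [f"route to {venue.name}"]
    best = min(walkable, key=lambda p: distance_km(p, venue))
    return [f"route to {best.name}", f"walk to {venue.name}"]
```

The point of the sketch is the reinterpretation step: the destination the driver names is not the destination the car navigates to. A real system would also need to resolve the event to a venue and time, and to check parking availability.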
Today’s voice recognition is worlds better than systems of a decade ago, achieving excellent performance in controlled environments. But a vehicle moving in traffic—subject to road noise, radios, and engine noise, as well as the particular echo patterns created by the vehicle’s cabin—is not a controlled environment.
Voicebox’s speech recognizer has been trained on hundreds of thousands of recordings manually collected in vehicles, and does quite well despite these sources of interference. Moving forward, Voice Trace will give Voicebox vastly more real-world training data. With that data, we will train voice recognition systems that are impervious to vehicle noise, accurate across a wide range of speaker accents, and tuned for the particular echo patterns of different cabin sizes.
Starting from Voicebox’s first embedded automotive system in 2008, a relentless series of enhancements has yielded a reliable, mature platform. Voicebox Auto makes that platform easier for car makers to use than ever before, while its roadmap offers car makers the security of knowing that the platform will meet tomorrow’s needs as well as today’s.