The hardest problem in voice interfaces is deriving the meaning of a user’s request. The shortcomings of Natural Language Understanding (NLU) systems have historically restricted users to relatively simple queries. Understanding fully natural language demands a radical breakthrough in how NLU systems tackle users’ queries.
Nowhere is this more evident than in the navigation domain. Current systems can understand a request for directions to a destination, but they often demand more information than is reasonable, such as the destination’s street address. Everyone knows their home address, but how many of us know the street address of our favorite restaurants, or of our local post office, or of gas stations in an unfamiliar town?
As users, we often don’t know those details, but instead have criteria for our destination. For example:
This makes navigation the perfect domain for Voicebox researchers to test their new NLU technology based on semantic parsing. This technology handles natural but vaguely worded queries, taking the user experience from mere navigation to true location intelligence.
Semantic parsing uses machine learning techniques, trained on large data sets, to derive logical forms from user queries. These logical forms encode the meaning of the user’s request in a structure that traditional software can readily correlate with any data sources necessary to answer the user’s request.
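To make the idea of a logical form concrete, here is a minimal sketch in Python. It is not Voicebox’s actual representation: the `LogicalForm`, `Constraint`, and `Place` types, the hand-written form, the sample query, and the toy place list are all hypothetical and exist only for illustration. The sketch skips the machine-learned parsing step entirely and shows only the second half of the pipeline, in which ordinary software matches a structured form against a data source.

```python
import operator
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Place:
    """One record in a toy points-of-interest data source (hypothetical)."""
    name: str
    category: str
    distance_miles: float
    open_now: bool


@dataclass(frozen=True)
class Constraint:
    """A single criterion: <attribute> <op> <value>."""
    attribute: str
    op: str
    value: object


@dataclass(frozen=True)
class LogicalForm:
    """Hypothetical structured meaning of a navigation request."""
    intent: str
    constraints: Tuple[Constraint, ...]
    sort_by: Optional[str] = None


# Comparison operators the toy executor understands.
OPS = {"eq": operator.eq, "le": operator.le}


def execute(form, places):
    """Return the places that satisfy every constraint in the form."""
    matches = [
        p for p in places
        if all(OPS[c.op](getattr(p, c.attribute), c.value)
               for c in form.constraints)
    ]
    if form.sort_by is not None:
        matches.sort(key=lambda p: getattr(p, form.sort_by))
    return matches


# Hand-written stand-in for what a parser might produce for an invented
# query such as "take me to a coffee shop within two miles that is open now".
FORM = LogicalForm(
    intent="navigate",
    constraints=(
        Constraint("category", "eq", "coffee_shop"),
        Constraint("open_now", "eq", True),
        Constraint("distance_miles", "le", 2.0),
    ),
    sort_by="distance_miles",
)

PLACES = [
    Place("Bean There", "coffee_shop", 1.4, True),
    Place("Daily Grind", "coffee_shop", 0.6, False),
    Place("Mocha Stop", "coffee_shop", 0.9, True),
    Place("Fuel Up", "gas_station", 0.3, True),
    Place("Far Roast", "coffee_shop", 5.2, True),
]

if __name__ == "__main__":
    for place in execute(FORM, PLACES):
        print(place.name, place.distance_miles)
```

The point of the sketch is that the user never supplies a street address. The criteria in the request become constraints, and the data source supplies the destination.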
Logical forms capture subtle aspects of user queries that traditional systems simply can’t handle. They can also encode complex, multi-part utterances, allowing the NLU to understand requests such as:
Voicebox’s semantic parser already handles more natural, more complex queries than any other NLU system in the world, and it does so under real-world conditions. The parser is currently undergoing field trials, which are collecting additional data to improve the system in preparation for training it on a wide variety of domains.