One of the most recognisable applications of Artificial Intelligence in our daily lives is the voice assistant. Our smartphones have had them for quite a while, our speakers now have them and, by the looks of it, the next home appliance that you buy might have one integrated into it. Convenience is one of the key reasons voice assistants have become so pervasive in the digital space, and that convenience exists because they have improved to the point where conversing with one feels quite natural. Sure, they aren’t so good that we can have long, casual, contextual conversations, but the very fact that they can understand the intent of the user is a huge leap in their intelligence. This is true not just of voice assistants but of any AI that has to interpret human language. The credit for this improvement goes to the Natural Language Processing (NLP) library that’s running the show behind the curtains. A modern NLP library has many different techniques for interpreting human language, and one of them is Intent Extraction. One such library is NLP Architect by Intel® AI Lab, and we’re going to explore how it performs Intent Extraction.
What is Intent Extraction?
Intent Extraction is a type of Natural Language Understanding (NLU) task that helps a program understand the type of action conveyed in a sentence, what the action applies to, and who has to perform it. For example, if you were to pick up your Android smartphone and give it the following command:
“Ok Google, remind me to check my appointments tomorrow at 6 AM.”
Your Google Assistant will give you the following response:
“OK, I’ll remind you tomorrow at 06:00”
And then it will set a reminder for 6 AM the next day with the text “Check my appointments.”
Here, you have asked the Google Assistant to “set a reminder,” which is the action to be performed; the NLP library detects it from the verb “remind” in the sentence. It then identifies that setting a reminder entails understanding a few key elements. In this case, a reminder has a date/time, an executor and an object. You are the executor, the time is 6 AM, the date is the day after the one on which the statement was spoken, and the object(ive) is to check your appointments.
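To make the breakdown above concrete, here is a minimal Python sketch of how the result of intent extraction for the reminder command could be represented. The `IntentFrame` class and the slot names are hypothetical, chosen only for illustration; they are not part of Google Assistant or NLP Architect, and the values are written by hand rather than predicted by a model.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class IntentFrame:
    """Hypothetical container for one parsed command (illustration only)."""
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)


# Hand-written result for the reminder example; a real system would predict it.
reminder = IntentFrame(
    intent="set_reminder",
    slots={
        "executor": "user",
        "date": "tomorrow",
        "time": "6 AM",
        "object": "check my appointments",
    },
)

print(reminder.intent)
print(reminder.slots["object"])
```

The point of the structure is that the intent names the action, while the slots carry the arguments that the action needs.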
NLP Architect by Intel® AI Lab
NLP Architect is an open-source Python library by Intel® AI Lab, designed for exploring state-of-the-art deep learning topologies and techniques for natural language processing and natural language understanding. Intel intends it to be a platform for future research and collaboration. NLP Architect can be downloaded from GitHub: https://github.com/NervanaSystems/nlp-architect
The library contains NLP and NLU models pertaining to a wide gamut of topics, some of which are:
● Dependency parsing
● Intent detection and Slot tagging model for Intent based applications
● Memory Networks for goal-oriented dialog
● Noun phrase embedding vectors model
● Noun phrase semantic segmentation
● Named Entity Recognition
● Word Chunking
● Reading comprehension
● Language modeling using Temporal Convolution Network
● Unsupervised Crosslingual Word Embedding
● Supervised sentiment analysis
● Sparse and quantized neural machine translation
● Relation Identification and cross document coreference
Intent Extraction Models
There is never a one-size-fits-all solution for any task, which is why libraries tend to offer multiple techniques for performing a certain action. When it comes to intent extraction, the first step is to transcribe the user’s speech into text by means of an automatic speech recogniser; alternatively, the text can simply be typed in by the user. The intent and all other arguments required by the software are then identified. One of the common techniques is slot tagging, i.e. the software tries to determine a map (F: X → Y) from the sequence of input words to slot labels, where X is the word set with vocabulary size N and Y is the slot set with vocabulary size M. NLP Architect by Intel® AI Lab supports two models for performing intent extraction:
1. Multi-task Intent and slot tagging model
2. Encoder-Decoder topology for slot tagging
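The slot tagging map described above can be illustrated with a small Python sketch. It assumes the commonly used BIO labelling scheme (B- starts a slot, I- continues it, O marks words outside any slot); the labels are written by hand for the reminder sentence from earlier, and `group_slots` is a hypothetical helper, not an NLP Architect function.

```python
from typing import Dict, List


def group_slots(words: List[str], labels: List[str]) -> Dict[str, str]:
    """Collect BIO-labelled words into {slot_name: text} pairs.

    Simplification: if a slot name occurs twice, the later span wins.
    """
    if len(words) != len(labels):
        raise ValueError("each word needs exactly one label")
    slots: Dict[str, str] = {}
    current = None  # name of the slot currently being extended
    for word, label in zip(words, labels):
        if label.startswith("B-"):
            current = label[2:]
            slots[current] = word
        elif label.startswith("I-") and current == label[2:]:
            slots[current] += " " + word
        else:
            current = None
    return slots


# X: the input words; Y: one hand-written slot label per word.
words = "remind me to check my appointments tomorrow at 6 AM".split()
labels = ["O", "O", "O", "B-object", "I-object", "I-object",
          "B-date", "O", "B-time", "I-time"]

print(group_slots(words, labels))
```

In a trained model the labels would be predicted word by word; the grouping step that turns them into slots stays the same.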
Multi-task Intent and slot tagging model
This is quite similar to the slot tagging model we mentioned previously. It has two sources of input:
1. Words
2. Characters of words
The three main features that set it apart from other Intent Extraction models are:
1. Character information embedding acting as a feature extractor of words
2. CRF (Conditional Random Field) classifier for slot labels
3. Cascading structure of the intent and tag classification
This model performs intent classification by encoding the context of the sentence from its word embeddings with a bidirectional LSTM (Long Short-Term Memory) layer and training a softmax classifier on the last hidden state of that layer. Word-character embeddings are created by a bidirectional LSTM encoder that concatenates the last hidden states of its two directions. The word-context encoding thus obtained is concatenated with the word-character embeddings and sent through another bidirectional LSTM, which outputs the final context encoding that a CRF layer then uses for slot tag classification.
In the above image, X1-XN are the input words, W denotes the word embeddings and C denotes the word-character embeddings.
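To give a feel for what the CRF layer adds on top of the LSTM encoding, here is a minimal pure-Python sketch of Viterbi decoding for a linear-chain CRF. It is not NLP Architect’s implementation: the function name, the three labels and all scores are made up for illustration, and the per-word emission scores simply stand in for what the final LSTM would produce.

```python
from typing import List


def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring label sequence for a linear-chain CRF.

    emissions[t][j]   : score of label j at word t (stand-in for LSTM output)
    transitions[i][j] : score of moving from label i to label j
    """
    num_labels = len(transitions)
    scores = list(emissions[0])  # best score of any path ending in each label
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_scores, pointers = [], []
        for j in range(num_labels):
            best_prev = max(range(num_labels),
                            key=lambda i: scores[i] + transitions[i][j])
            pointers.append(best_prev)
            new_scores.append(scores[best_prev] + transitions[best_prev][j]
                              + emissions[t][j])
        scores = new_scores
        backpointers.append(pointers)
    # Walk the back-pointers from the best final label.
    best = max(range(num_labels), key=lambda j: scores[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    return path[::-1]


label_names = ["O", "B-time", "I-time"]
# Made-up scores for the words "at", "6", "AM".
emissions = [[2.0, 0.0, 0.0],
             [1.0, 0.9, 0.0],
             [0.0, 0.0, 2.0]]
# Moving from O straight to I-time is heavily penalised.
transitions = [[0.0, 0.0, -5.0],
               [0.0, 0.0, 1.0],
               [0.0, 0.0, 1.0]]

print([label_names[i] for i in viterbi_decode(emissions, transitions)])
```

Picking the best label for each word independently would give O, O, I-time here, which is not a valid tag sequence. Because the transition scores are taken into account, the decoder returns O, B-time, I-time instead.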
Encoder-Decoder topology for slot tagging
This is an equally well-known Long Short-Term Memory topology for performing sequence-to-sequence classification. It supports arbitrary depths of LSTM layers in both the encoder and the decoder. Similar topologies have achieved an F1 score of 95.66% in the slot filling task of the standard ATIS benchmark.
It comes in two variants of the encoder-labeler LSTM: one that uses the labeler LSTM(W) and one that uses the labeler LSTM(W+L). In Fig. (D) of the image below, the encoder LSTM on the left of the dotted line reads the input sentence from right to left, i.e. backwards. Its last hidden state contains the encoded information of the input sentence. The labeler on the right of the dotted line has its hidden state initialised with the last hidden state of the encoder LSTM. The labeler LSTM(W) uses this information to predict the slot label, so slot filling takes sentence-level information into account.
SNIPS and ATIS Benchmarks
SNIPS is an NLU (Natural Language Understanding) benchmark containing approximately 16,000 sentences covering 7 intent types. Each intent in the dataset has about 2,400 sentences for training and 100 sentences for validation. The 7 user intents are:
1. Get weather
2. Play music
3. Book restaurant
4. Search item
5. Add to playlist
6. Rate book
7. Search movie schedule
In a similar vein, ATIS (Air Travel Information Service) is another benchmark commonly used to gauge the accuracy of intent extraction models. Here are the F1-scores for both the data sets. Each model was trained for 100 epochs using default parameters.
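For readers unfamiliar with the metric, the sketch below shows one common way an F1 score for slot tagging is computed: over whole slot spans rather than individual words, so a predicted slot only counts if both its name and its boundaries match the reference. This is an assumption about the evaluation style, not a description of the exact script behind the scores reported here; the function names and the toy labels (in BIO notation, where B- starts a slot, I- continues it and O is outside) are hypothetical.

```python
from typing import List, Set, Tuple


def extract_spans(labels: List[str]) -> Set[Tuple[str, int, int]]:
    """Return {(slot, start, end)} for every BIO span (end is exclusive).

    Simplification: an I- label that does not continue a span is ignored.
    """
    spans, start, name = set(), None, None
    for i, label in enumerate(labels + ["O"]):  # sentinel closes the last span
        inside = label.startswith("I-") and label[2:] == name
        if not inside:
            if name is not None:
                spans.add((name, start, i))
            if label.startswith("B-"):
                start, name = i, label[2:]
            else:
                start, name = None, None
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Harmonic mean of span-level precision and recall over all sentences."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = extract_spans(g), extract_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: the predicted "time" slot is cut one word short.
gold = [["O", "B-date", "O", "B-time", "I-time"]]
pred = [["O", "B-date", "O", "B-time", "O"]]
print(span_f1(gold, pred))
```

In this toy case one of the two predicted spans is correct and one of the two reference spans is found, so precision and recall are both 0.5.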
NLP Architect by Intel® AI Lab is an open-source library which not only incorporates cutting-edge deep learning topologies and techniques for NLP and NLU, but is also optimised to run with some of the most widely used deep learning frameworks in the industry, such as TensorFlow, DyNet and Intel-Optimized TensorFlow with MKL-DNN. Intel is actively researching and developing new methodologies in the NLP domain which can be incorporated into this library. Moreover, its open-source nature encourages community contributions that further enrich the library.
From the benchmarks we can see that NLP Architect by Intel® AI Lab offers near-best-in-class performance for intent extraction models, and it is just as competitive with models for word chunking, named entity recognition, dependency parsing, sentiment classification and language modeling.
For more details, visit the Intel Developer Zone (IDZ) at software.intel.com