NLP models play a significant role in providing natural conversational experiences for your customers and employees. Improving the accuracy of the NLP models is a continuous journey and requires fine-tuning, as you add new use cases to your virtual assistant. The Kore.ai XO Platform proactively validates the NLP training provided to the virtual assistants and provides recommendations to improve the model. This article explains the available validations, how to view these validations, and how to validate the NLU Model.
Goal-Driven Training Validations
The ML engine enables you to identify issues proactively in the training phase itself with the following set of recommendations:
- Untrained Intents – notifies about intents that are not trained with any utterances so that you can add the required training.
- Inadequate training utterances – notifies about intents with insufficient training utterances so that you can add more utterances to them.
- Utterance does not qualify any intent (false negative) – notifies about an utterance for which the NLP model cannot predict any intent. For example, an utterance added to Intent A is expected to predict Intent A. In some cases, however, the model predicts neither the trained Intent A nor any other intent in the model. Proactively identifying such cases helps you rectify the utterance and improve the model's predictions.
- Utterance predicts wrong intent (false positive) – identifies utterances that predict intents other than the trained intent. For example, when you add an utterance similar to utterances from another intent, the model could predict that other intent rather than the intent the utterance is trained for. Knowing this helps you rectify the utterance and improve the model's predictions.
- Utterance predicts intent with low confidence – notifies about the utterances that have low confidence scores. With this recommendation, you can identify and fix such utterances to improve the confidence score during the virtual assistant creation phase.
- Incorrect Patterns – notifies about the patterns that do not follow the right syntax along with the error. You can resolve such incorrect patterns to improve intent identification.
- Wrong Entity Annotations – notifies about wrongly annotated entities. For example, in the utterance ‘I want to travel to Hyderabad on Sunday 2pm’, the Travel Date (Date type) entity is annotated with the value ‘2pm’ (a Time value). The platform checks for such wrong annotations and flags the issue against the utterance, which helps you re-annotate the entity with the right value and improve entity recognition.
- Short Utterance – notifies about utterances whose word count is less than or equal to two. It helps you follow best practices for utterance length, so that training utterances resemble actual end-user queries and further improve the model’s accuracy.
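Several of these checks can be approximated outside the platform before you upload training data. The following is a minimal sketch under stated assumptions, not the platform's implementation: the `validate_training` and `check_prediction` functions, the dictionary layout, and the `MIN_UTTERANCES` and `min_confidence` values are hypothetical. Only the two-word limit for short utterances comes from the list above.

```python
from typing import Dict, List, Optional

# Hypothetical threshold: the platform's actual minimum is not stated in this article.
MIN_UTTERANCES = 5
# From the list above: utterances with two or fewer words count as short.
SHORT_UTTERANCE_MAX_WORDS = 2


def validate_training(training: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each issue type to the intents affected by it."""
    issues: Dict[str, List[str]] = {
        "untrained": [],
        "inadequate": [],
        "short_utterance": [],
    }
    for intent, utterances in training.items():
        if not utterances:
            issues["untrained"].append(intent)
            continue
        if len(utterances) < MIN_UTTERANCES:
            issues["inadequate"].append(intent)
        if any(len(u.split()) <= SHORT_UTTERANCE_MAX_WORDS for u in utterances):
            issues["short_utterance"].append(intent)
    return issues


def check_prediction(
    expected: str,
    predicted: Optional[str],
    confidence: float,
    min_confidence: float = 0.6,  # hypothetical cut-off, for illustration only
) -> Optional[str]:
    """Classify one training utterance's prediction; None means no issue."""
    if predicted is None:
        return "false_negative"   # no intent qualified
    if predicted != expected:
        return "false_positive"   # a different intent was predicted
    if confidence < min_confidence:
        return "low_confidence"   # right intent, weak score
    return None


sample = {
    "Book Flight": ["I want to travel to Hyderabad on Sunday", "book flight"],
    "Check Balance": [],
}
report = validate_training(sample)
print(report)
print(check_prediction("Book Flight", "Check Balance", 0.9))
```

In this sketch, prediction results are passed in as plain values, so `check_prediction` works with whichever model produced them.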
How to View NLU Training Validations
- On the virtual assistant’s Build menu, click Natural Language → Training.
- In the Intents tab, you can see the set of recommendations for the Intents and ML utterances.
Note: The errors and warnings in this screen are examples. The ML validations vary based on the error or warning recommendation, as explained in the Goal-Driven Training Validations section above.
- Hover over the validation options and view the following recommendations:
- Hover on the Warning icon and follow the instructions in the warning to enhance the training for ML utterances.
- Once you click on the Intent with error or warning, hover over the Bulb icon to view the summary of error or warning messages as illustrated below:
Note: A warning is displayed when the issue impacts the VA’s accuracy and can be resolved. Warnings are less severe than errors.
How to Use the NLU Validate Model
The NLU Validate Model provides options to view the training module’s recommendations summary, refresh the recommendations, and view the Confusion Matrix and K-fold Cross Validation reports.
The Validate Model helps you to follow best practices and develop accurate Virtual Assistants quickly.
Recommendations Summary
On the Natural Language → Training page, click Validate Model to see the summary of recommendations.
For any recommendation, the intent count is displayed first, followed by the type of issue. For example, in the recommendation ‘5 intents have Patterns with invalid syntax’, 5 is the number of intents with the issue and ‘Patterns with invalid syntax’ is the issue type.
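The count-then-issue format can be illustrated with a short sketch. This is not the platform's code: the `summarize` function and the input dictionary are hypothetical, and the singular wording for a count of one is an assumption.

```python
from typing import Dict, List


def summarize(issues: Dict[str, List[str]]) -> List[str]:
    """Build one summary line per issue type: intent count first, then the issue."""
    lines = []
    for issue, intents in issues.items():
        count = len(set(intents))  # count each affected intent once
        if count == 0:
            continue  # nothing to recommend for this issue type
        # Assumed wording for a single intent; the article only shows the plural form.
        noun = "intent has" if count == 1 else "intents have"
        lines.append(f"{count} {noun} {issue}")
    return lines


issues = {
    "patterns with invalid syntax": ["A", "B", "C", "D", "E"],
    "very short utterances": ["A", "F"],
    "no training utterances": [],
}
summary = summarize(issues)
for line in summary:
    print(line)
```

Fixing an issue in one intent removes that intent from the relevant list, so the count in the corresponding line drops on the next run.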
If the training model is updated with all the modifications, the platform automatically triggers a background task to refresh recommendations without explicitly informing the users.
Refreshing Recommendations
In the recommendations summary, you can also refresh the recommendations manually, in addition to the automatic refresh described in the previous section.
When you click the Refresh icon, a timestamp of the last refresh is displayed along with the refreshed recommendations list. The timestamp is updated with every refresh.
If there are untrained utterances in the training model, the platform provides you with the following options:
Train and Regenerate
This option allows you to train the model with untrained utterances, and then triggers a background task to generate recommendations from the latest model.
The following steps explain how to use Train & Regenerate with an example.
- Click any intent and add a new utterance.
- Once the utterance is added, go to the main page and click the Refresh icon to refresh the recommendations. The following pop-up is displayed when there are untrained utterances.
- Click the Train & Regenerate button to train the utterances and regenerate the recommendations.
- A message about training initiation is displayed.
- Once training and regeneration of recommendations is completed, the status is displayed in the status docker.
- When you click the Refresh icon, the recommendations summary is refreshed: the count of recommendations increases or decreases, or the existing recommendations are updated, based on the latest results.
Regenerate
Click the Regenerate button in the displayed pop-up if you want to trigger a background task that generates recommendations from the current model.
If you click the Refresh icon when the model is being trained, recommendations from the latest Validate model are generated only once the training is completed.
For example, if new utterances are added to an intent, a new recommendation may be added to the summary list. Similarly, if a recommendation is implemented, the count of recommendations decreases. In a recommendation like ‘13 intents have very short utterances’, if 2 intents are fixed, the recommendation is updated to ‘11 intents have very short utterances’.
NLU Validation Options
Next to Validate Model, there is a drop-down with options to select either Confusion Matrix or K-fold Cross Validation.
You can also click the Go button to access Confusion Matrix and K-fold Cross Validation reports.
Confusion Matrix
Confusion Matrix is useful in describing the performance of a classification model (or classifier) on a set of test data for which the true values are known. The graph generated by the confusion matrix presents an at-a-glance view of the performance of your trained utterances against the virtual assistant’s tasks. To learn more, see the Confusion Matrix section in Model Validation.
The following screenshot shows the confusion matrix report.
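To make the idea concrete, a confusion matrix can be built by counting pairs of true and predicted intents. The following is a minimal, generic sketch and not the platform's report logic. The `confusion_matrix` function and the sample intent names are illustrative assumptions.

```python
from collections import Counter
from typing import List, Tuple


def confusion_matrix(
    actual: List[str], predicted: List[str]
) -> Tuple[List[str], List[List[int]]]:
    """Return the sorted labels and a matrix where rows are true intents
    and columns are predicted intents."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix


# Hypothetical test utterances: the true intent and what a model predicted.
actual = ["Book Flight", "Book Flight", "Check Balance", "Check Balance"]
predicted = ["Book Flight", "Check Balance", "Check Balance", "Check Balance"]

labels, matrix = confusion_matrix(actual, predicted)
print(labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

Values on the diagonal are correct predictions. Off-diagonal values show which intents are confused with which.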
K-Fold Cross Validation
K-Fold Cross-Validation is a resampling procedure used to evaluate machine learning models on a limited data sample. The technique involves partitioning the data into k subsets (folds), training the model on all but one of the folds, and using the held-out fold to evaluate the model’s performance. The process is repeated so that each fold is used for evaluation once. To learn more, see the K-Fold Cross Validation section in Model Validation.
The following screenshot shows the K-Fold Cross-Validation report.
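The partitioning step can be sketched in a few lines. This is a generic illustration of k-fold splitting, not the platform's implementation. The `k_fold_splits` function is hypothetical, and the sketch omits the shuffling and per-intent stratification that practical implementations usually add.

```python
from typing import List, Tuple


def k_fold_splits(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs; each sample is in
    exactly one test fold."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        splits.append((train, test))
        start += size
    return splits


splits = k_fold_splits(n_samples=6, k=3)
for train, test in splits:
    print("train:", train, "test:", test)
```

A model would be trained and scored once per pair, and the per-fold scores averaged to estimate overall performance.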