Virtual Assistant Health and Monitoring

The Health and Monitoring feature offers a goal-driven approach to improving the accuracy of the virtual assistant’s NLP model. The training data is analyzed along with the test coverage and test results of the Batch Test suites to provide insights into the performance of the NLP model. The feature also recommends corrective actions to address issues identified in the training data.

Note: The Health & Monitoring Dashboard is available only in release 9.3 and later, i.e. post-July 24, 2022.

To access the Health and Monitoring Dashboard, follow these steps:

  1. Click the BUILD tab on the top menu of the Virtual Assistant dashboard.
  2. Click Health & Monitoring under Testing on the left navigation menu.

This dashboard lets you achieve the following:

  • Analyze the virtual assistant’s test coverage and key test performance metrics, derived from the latest in-development batch test executions, to determine areas of improvement.
  • Drill down to specific intents and their test cases by looking at their test performance and coverage.
  • Identify proactive training recommendations to fix incorrect patterns, short utterances, incorrect entity annotations, and so on.

Important Health and Monitoring Metrics

The following metrics are useful in validating your ML model’s performance.

  • Accuracy: Determines whether the intent identified by your ML model is correct.
  • F1 Score: Seeks a balance between precision and recall, which is especially useful when the class distribution is uneven. It is calculated as the harmonic mean of Precision and Recall.
  • Precision Score: Defines how precise/accurate your model is. It is calculated as the ratio of true positives over total predicted positives (the sum of true and false positives).
  • Recall Score: Defines the fraction of the relevant utterances that are successfully identified. It is calculated as the ratio of true positives over actual positives (the sum of true positives and false negatives).
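
To make the arithmetic concrete, below is a minimal sketch of how these scores follow from true/false positive and negative counts. This is illustrative only; the type and function names are hypothetical, not part of the XO Platform:

    // Counts from a batch test run (TN is needed only for accuracy).
    interface Counts {
      tp: number; // true positives
      fp: number; // false positives
      fn: number; // false negatives
      tn: number; // true negatives
    }

    function scores({ tp, fp, fn, tn }: Counts) {
      const precision = tp / (tp + fp);  // TP over predicted positives
      const recall = tp / (tp + fn);     // TP over actual positives
      const f1 = (2 * precision * recall) / (precision + recall); // harmonic mean
      const accuracy = (tp + tn) / (tp + tn + fp + fn);
      return { precision, recall, f1, accuracy };
    }

    // Example: scores({ tp: 80, fp: 10, fn: 20, tn: 90 })
    // -> precision ≈ 0.889, recall = 0.8, f1 ≈ 0.842, accuracy = 0.85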

Health and Monitoring Dashboard Components

The key components of the NLP Health and Monitoring dashboard include coverage and execution summary panels for the selected test suite. The details on the available panels are given below:

Bot Health

This section displays the key performance metrics (Accuracy, F1 score, Precision, and Recall) and the total test coverage of the selected test suites. The health meter shows the status based on the total coverage and performance metrics. The Training Accuracy, F1, Precision, and Recall scores of the latest batch test executions are summarized; these are aggregated results across dialogs, FAQs, small talk, and traits. The Total Test Coverage of the NLP model is also auto-calculated and displayed as the percentage of intents covered and not covered out of the total intents identified; for example, if 40 of 50 identified intents have at least one test case, the coverage is 80%. A meter gauge displays whether your bot is trained well based on these scores.

NLP Health

This section displays the test coverage and recommendation scores for these metrics, based on the selected Test Suite, to determine the NLU model performance and bot training needs.

View Test Cases

To view the details of the test cases along with their NLP analysis, click the View Test Cases link in the Health and Monitoring dashboard.

The Test Cases – Detailed Analysis window displays the following sections:

Intents

This section displays a tabular view of the Test Cases executed for the Dialog Intents, FAQs, and Small Talk. You can also add tags to collaborate with your team members and view the historical NLP analysis of the test case execution. For test analysis, the following details from the test results are shown (a sketch of this record shape follows the list):

  • Expected Intent
  • Matched Intent
  • Parent Intent
  • Task State
  • Result Type
  • Engine Confidence Scores (whichever scores are available are shown; engines with no prediction show a NIL value)
    • ML score
    • FM score
    • KG score
    • RR score
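
As an illustration, one row of this analysis could be modeled as the following record. The field names are hypothetical (they mirror the columns above, not an actual platform API), and NIL scores are represented as null:

    // Hypothetical shape of one intent test-case analysis row.
    type EngineScore = number | null; // null stands in for NIL (no prediction)

    interface IntentTestCaseAnalysis {
      expectedIntent: string;
      matchedIntent: string | null;
      parentIntent: string | null;
      taskState: string;              // e.g. "Completed"
      resultType: "TP" | "FP" | "FN"; // outcome of the test case
      scores: {
        ml: EngineScore; // Machine Learning engine
        fm: EngineScore; // Fundamental Meaning engine
        kg: EngineScore; // Knowledge Graph engine
        rr: EngineScore; // Ranking and Resolver
      };
      tags: string[];                 // collaboration tags added by the team
    }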

The complete flow of steps in testing and training a virtual assistant for various user inputs is given below:

1. Create a test suite and provide test case details.

2. Run the test suite and get the summary of test results.

3. Analyze the test results to identify the problem areas from the summary information on the bot, intent-type, individual intent summary, and other recommendations.

4. Decode the reason for failure using the details from the NLP analysis.

5. After analyzing the reason for failure, collaborate with team members using tags, and retrain the model by navigating to the utterance testing option.

6. Train the virtual assistant to improve the NLP engine’s ability to prioritize the right bot task or user intent for a given user input.

Entities

During training, NLP Entity Detection is performed to train entities based on the type of entity detected.

For test analysis, the below details are shown from the test results:

  • Entity Name
  • Expected Value
  • Matched Value
  • Result
  • Expected Order
  • Actual Order
  • Identified by
  • Identified using NER Confidence score

The actual results of the entity match are displayed in a tabular view where the following details are captured:

  • Utterances captured during a conversation.
  • The identified entity name.
  • The expected value set in the system.
  • The matched value after testing.
  • The Entity result (True or False).
  • Tags mapped to the entities that indicate the recommendations to improve the virtual assistant.
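
As a rough sketch, the Entity result can be thought of as a comparison of the expected and matched values and their order. The names below are hypothetical, and the platform’s actual matching logic may be more nuanced:

    // Hypothetical evaluation of a single entity expectation.
    interface EntityExpectation {
      entityName: string;
      expectedValue: string;
      matchedValue: string | null; // value extracted during the test run
      expectedOrder: number;       // expected position for multi-value entities
      actualOrder: number | null;  // actual position observed
    }

    // Result is True only when both the value and its order match.
    function entityResult(e: EntityExpectation): boolean {
      return e.matchedValue === e.expectedValue
        && e.actualOrder === e.expectedOrder;
    }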


Traits

This section displays a tabular view of the Test Cases executed for the Intent Type (Trait), Test Suite, Trait Name (for example: new requests), Expected Trait, Matched Trait, Trait Result (False Positive, True Negative, or False Negative), and the tags (same as above) mapped to the test case indicating the required action.

Dialog Intent Summary

This section provides the performance metrics, test coverage, and analytics for the dialog intent test cases only.

The Test Coverage section displays the count and percentage of covered and not-covered dialog intents. An intent is considered covered when it has at least one test case in the selected test suites.

The Test Results Analysis section gives the breakdown of the result types of the dialog intent test cases. The result type takes one of three values: TP, FP, or FN (as sketched below).
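
Below is a minimal sketch of how per-intent TP/FP/FN tallies could be derived from test-case results, assuming one expected intent per test case (hypothetical logic, not the platform’s implementation): a test case that matches a different intent counts as an FN for the expected intent and an FP for the intent that matched.

    // Hypothetical per-intent tallies from a list of test-case results.
    interface Tally { tp: number; fp: number; fn: number; }

    function tallyByIntent(
      results: { expectedIntent: string; matchedIntent: string | null }[],
    ): Map<string, Tally> {
      const tallies = new Map<string, Tally>();
      const get = (intent: string): Tally => {
        if (!tallies.has(intent)) tallies.set(intent, { tp: 0, fp: 0, fn: 0 });
        return tallies.get(intent)!;
      };
      for (const { expectedIntent, matchedIntent } of results) {
        if (matchedIntent === expectedIntent) {
          get(expectedIntent).tp++;   // correct intent identified
        } else {
          get(expectedIntent).fn++;   // expected intent missed
          if (matchedIntent !== null) get(matchedIntent).fp++; // wrong match
        }
      }
      return tallies;
    }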

Recommendation Notification: Shows if there are any training recommendations available for the dialog intents.

Click View Details on the top right to get a drill-down view of the individual dialog intents and their summary. This helps you know the performance of each intent and make informed decisions on what needs to be fixed first.

Dialog Intent Details Window

This window gives the drill-down view of the individual intents covered in the selected test suites, along with their summaries. The primary objective is to help users identify the top-performing and low-performing intents, along with their training recommendations.

After viewing the intent type summary in the Health and Monitoring dashboard, you can focus on individual intent scores for a deeper understanding of each intent’s performance.

The following metrics are displayed for intents covered in the selected test suites:

  • Intent: The name of the dialog intent.
  • Utterances: The count of the training utterances for that intent.
  • Test Cases: The count of the test cases present in the selected test suites for that intent.
  • True Positive (TP): The count of the intent test cases that resulted in TP.
  • False Negative (FN): The count of the intent test cases that resulted in FN.
  • False Positive (FP): The count of the intent test cases that resulted in FP.
  • F1, Accuracy, Precision, and Recall scores.
  • Covered In: The name of the test suites in which the intent test cases are present.
  • Recommendations: Displays the count of training recommendations for that intent. Clicking on it will display the summary of the utterance and pattern validations under errors and warnings, and the suggested corrections.
  • You can go to the training module by clicking on View Patterns and View Utterances.
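
Putting the tallies and score formulas from the earlier sketches together, a per-intent summary row like the ones in this window could be computed as follows (hypothetical; per-intent precision and recall need only TP/FP/FN, so no true-negative term appears):

    // Hypothetical per-intent summary row, reusing the Tally shape above.
    function intentSummary(
      intent: string,
      t: { tp: number; fp: number; fn: number },
    ) {
      const precision = t.tp / (t.tp + t.fp);
      const recall = t.tp / (t.tp + t.fn);
      const f1 = (2 * precision * recall) / (precision + recall);
      return { intent, ...t, precision, recall, f1 };
    }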

View Recommendations

Click the Recommendations button on the top right to view the summary of all the training recommendations.

View Intents not Covered

Click the three-dot menu icon on the right side of the panel to view the intents not covered for Dialogs in batch testing.

This feature helps identify the intents that are not covered so that you can include them in the test data for better and more holistic testing of the virtual assistant.

FAQ Summary

The FAQ Summary section displays the Training Accuracy, F1, Precision, and Recall scores generated for all the FAQ intents in all the test cases of the selected test suite, as discussed in the Dialog Intent Summary section.

Please refer to the Dialog Intent Summary section to view the metrics and scores displayed for FAQs.

Click View Details to get a drill-down view of the FAQ scores and recommendations information (counts, errors and warnings, and corrective suggestions) on the FAQ Details window.

FAQ Details Window

This window gives the drill-down view of the FAQ-level details for the selected test suite. The primary objective is to help users identify the top-performing and low-performing FAQs using the following metrics and recommendations, and to catch issues proactively in the training phase so they can be fixed early.

  • Intent: The name of the FAQ intent.
  • Path: The node path in the Knowledge Graph.
  • Alt Question: The number of alternative questions mapped to an FAQ.
  • Test Cases: The count of the test cases present in the selected test suites for that FAQ intent.
  • True Positive (TP): The count of the intent test cases that resulted in TP.
  • False Negative (FN): The count of the intent test cases that resulted in FN.
  • False Positive (FP): The count of the intent test cases that resulted in FP.
  • F1, Accuracy, Precision, and Recall scores.
  • Covered In: The name of the test suites in which the FAQ intent test cases are present.

View Recommendations

Click the View Recommendations button on the top right to view the summary of all the FAQ intent training recommendations. This section shows the recommendations for Knowledge Graph training from the KG Inspect functionality. If the Inspect was already run, the report is shown along with the last run time; otherwise, you can run the Inspect from here to see if there are any recommendations.

  • Knowledge Graph: Clicking this button takes you to the Knowledge Graph section.

View Intents not Covered

Click the three-dot menu icon on the right side of the panel to view the intents not covered for FAQs in batch testing. The functionality is the same as for Dialog Intents.

Small Talk

The Small Talk Summary section displays the Training Accuracy, F1, Precision, and Recall scores generated for all the Small Talk intents in the selected test suite, as discussed in the Dialog Intent Summary section.

Click View Details to open the Small Talk Details window, which gives the drill-down view of the Small Talk intent-level details for the selected test suite. The primary objective is to help users identify the top-performing and low-performing intents using the following metrics and recommendations, and to catch issues proactively in the training phase so they can be fixed early.

  • Intent: The customer’s intent that is captured in the Small Talk interaction.
  • Group: The group to which the Small Talk interaction is mapped.
  • Test Cases: The test scenarios that need to be run for Small Talk during batch testing execution.
  • True Positive (TP): The outcome where the NLU model correctly predicts the Small Talk intent.
  • False Negative (FN): The outcome where the NLU model fails to identify the expected Small Talk intent.
  • False Positive (FP): The outcome where the NLU model incorrectly predicts a Small Talk intent.
  • F1, Accuracy, Precision, and Recall scores.
  • Covered In: The test suite that covers the Small Talk intents, utterances, and patterns.
  • Small Talk: Clicking this button displays the Group name and the relevant User and Bot utterances.

View Intents not Covered

Click the three-dot menu icon on the right side of the panel to view the intents not covered for Small Talk in batch testing. The functionality is the same as for Dialog Intents.

Trait Summary

The Trait Summary section displays the Training Accuracy, F1, Precision, and Recall scores generated for all the traits in the test cases of the selected test suite, as discussed in the Dialog Intent Summary section.

The Test Coverage section displays the same information that is displayed for Dialog Intents, in the context of traits coverage.

The Test Results Analysis section displays the same information that is displayed for Dialog Intents, in the context of traits.

Entity Summary

The Entity Summary section displays the Training Accuracy, F1, Precision, and Recall scores generated for all the entities identified in the test cases of the selected test suite, as discussed in the Dialog Intent Summary section.

The Test Coverage section displays the same information that is displayed for Dialog Intents, in the context of entities coverage.

The Test Results Analysis section displays the same information that is displayed for Dialog Intents, in the context of entities.
