Utterance Testing

To make sure your bot responds to user utterances with related tasks, it’s important that you test the bot with a variety of user inputs. Evaluating a bot with a large sample of expected user inputs not only provides insights into bot responses but also gives you a great opportunity to train the bot in interpreting diverse human expressions.

You can perform all training-related activities for a bot from the Test & Train module. The examples in this article use a sample Flight Booking bot.

Testing the Bot

Simply put, testing a bot refers to checking if the bot can respond to a user intent with the most relevant task. Given the flexibility of language, users will use a wide range of phrases to express the same intent.

For example, you can rephrase “I want to change my ticket from San Francisco to Los Angeles on Jan 1” as “Please change my travel date. Can’t make it on Jan 1.” The trick is to train the bot to map both of these utterances to the Modify Booking task.

So, the first step in testing a bot is to identify a representative sample of user utterances. Look for sources of data that reflect real-world usage of the language, such as support chat logs, online communities, and FAQ pages of relevant portals.

How to test the bot

Follow these steps to test a bot:

  1. Open the bot that you want to test.
  2. From the left navigation panel, hover over Testing and click Utterance Testing.
  3. In the Type a user utterance field, enter the utterance that you want to test. Example: Rescheduling my plan. Cancel my ticket to LA.

The result appears with a single, multiple, or no matching intents.

Types of Test Results

When you test a user utterance against a bot, the NLP engine tries to find the bot tasks that match the intent. The engine takes a hybrid approach, combining the Machine Learning, Fundamental Meaning, and Knowledge Graph (if the bot has one) models to score the matching intents on relevance. The models classify each matching intent as either a Possible Match or a Definitive Match.

Definitive Matches get high confidence scores and are assumed to be perfect matches for the user utterance. In a published bot, if the user input has a single Definitive Match, the bot directly executes the task. If the input has multiple Definitive Matches, the bot presents them as options for the end user to choose from.

On the other hand, Possible Matches are intents that score reasonably well against the user input but do not inspire enough confidence to be termed exact matches. Internally, the system further classifies Possible Matches into good and unsure matches based on their scores. If an end-user utterance generates Possible Matches in a published bot, the bot sends these matches to the end user as “Did you mean?” suggestions.
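The behavior of a published bot described above can be sketched roughly as follows. This is only an illustration, not Kore.ai code: the `IntentMatch` class, the `respond` function, and the Cancel Booking task name are hypothetical, and the sketch assumes that Definitive Matches take precedence when both kinds of match are present.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IntentMatch:
    intent: str
    definitive: bool  # True for a Definitive Match, False for a Possible Match


def respond(matches: List[IntentMatch]) -> str:
    """Describe what a published bot would do with the given matches."""
    definitive = [m.intent for m in matches if m.definitive]
    possible = [m.intent for m in matches if not m.definitive]
    if len(definitive) == 1:
        # A single Definitive Match is executed directly.
        return f"execute: {definitive[0]}"
    if len(definitive) > 1:
        # Several Definitive Matches are offered as options.
        return "choose one: " + ", ".join(definitive)
    if possible:
        # Possible Matches become "Did you mean?" suggestions.
        return "Did you mean? " + ", ".join(possible)
    return "no matching intent"


print(respond([IntentMatch("Modify Booking", True)]))
print(respond([IntentMatch("Modify Booking", False),
               IntentMatch("Cancel Booking", False)]))
```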

Below are the possible outcomes of a user utterance test:

  • Single Match (Possible or Definitive): The NLP engine finds a match for the user utterance with a single intent or task. The intent is displayed below the User Utterance field. If it is a correct match, you can move on to test the next utterance or you can also further train the task to improve its score. If it is an incorrect match, you can mark it as incorrect and select the appropriate intent.
  • Multiple Matches (Possible or Definitive or Both): The NLP engine identifies multiple intents that match the user utterance. From the results, select the radio button for the matching task and train it.
  • Unidentified Intent: The user input did not match any task in any of the linked bots. Select an intent and train it to match the user utterance.

Analyzing the Test Results

When you test a user utterance, in addition to the matching intents you will also see an NLP Analysis box. It provides a quick overview of the shortlisted intents, the NLP models that shortlisted them, the corresponding scores, and the final winner. Under the Fundamental Meaning tab, you can see the scores of all the intents, even those that aren’t shortlisted. As mentioned above, the Kore.ai NLP engine uses the Machine Learning, Fundamental Meaning, and Knowledge Graph (if any) models to match intents.

If the NLP engine finds a single Definitive Match through one of the underlying models, you will see the task as the matching intent. If the test identifies more than one Definitive Match, you will receive them as options from which to pick the right intent.

If the models shortlist more than one Possible Match, the Ranking and Resolver re-scores all the shortlisted intents using the Fundamental Meaning model to determine the final winner. Sometimes multiple Possible Matches secure the same score even after re-scoring, in which case they are presented to the developer as multiple matches to choose from. You can click the tab with the name of the learning model in the NLP Analysis box to view the intent scores.

Note: The NLP score is an absolute value, and it can only be used to compare a task against other tasks for the same input. Task scores cannot be compared across different utterances.

In each model dialog, click the icon at the top right to display the configurations and thresholds in place for the corresponding engine.

Machine Learning (ML) Model

The ML model tries to match the user input with the task label and the training utterances of each task. If the user input consists of multiple sentences, each sentence is run separately against the task name as well as the task utterances.
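To make the per-sentence behavior concrete, here is a minimal sketch. It does not reproduce the Kore.ai ML model, whose internals are not described here. It substitutes a simple token-overlap (Jaccard) score as a stand-in, and every name in it (`split_sentences`, `score_tasks`, the sample tasks and utterances) is an assumption made for illustration.

```python
import re
from typing import Dict, List, Set


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokens(text: str) -> Set[str]:
    """Lowercase word tokens of a string."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def overlap(a: str, b: str) -> float:
    """Jaccard similarity of two token sets (stand-in for a real classifier)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def score_tasks(user_input: str, tasks: Dict[str, List[str]]) -> Dict[str, float]:
    """Score each task by its best overlap with any single sentence.

    Each sentence is compared separately against the task name and
    against every training utterance of the task.
    """
    scores = {}
    for name, utterances in tasks.items():
        candidates = [name] + utterances
        best = max(
            (overlap(sentence, candidate)
             for sentence in split_sentences(user_input)
             for candidate in candidates),
            default=0.0,
        )
        if best > 0:  # keep only tasks with a positive score
            scores[name] = best
    return scores


sample_tasks = {
    "Modify Booking": ["change my ticket", "reschedule my plan"],
    "Cancel Booking": ["cancel my ticket"],
}
print(score_tasks("Rescheduling my plan. Cancel my ticket to LA.", sample_tasks))
```

Because the sketch scores each sentence on its own, a two-sentence input such as the one above can surface two different tasks, one per sentence.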

Click the Machine Learning Model button to open the Machine Learning Model section of NLP Analysis. This section shows only the names of the tasks that secure a positive score. In general, the more training utterances you add to a task, the greater its chances of being discovered. For more information, read Machine Learning.

Fundamental Meaning Model

Apart from the ML model, each task in the bot is also scored against the user input using a comprehensive custom NLP algorithm that involves different combinations of task names, synonyms, and patterns.

The Fundamental Meaning Model tab shows the analysis for all the intents in the bot. Click the tab to view the scores of each task.

Clicking the Processed Utterance shows how the user utterance was analyzed and processed.

Next, the list of processed intents is displayed. Selecting an intent (matched or eliminated) displays the details of how its scores are calculated, as explained below:

  • Words Matched: The score given for the number of words in the user input that matched words in the task name or a trained utterance for the task.
  • Word Coverage: The score given for the ratio of the words matched with that of the overall words in the task, including task name, field names, utterances, and synonyms.
  • Exact Words: The score given for the number of words that matched exactly and not by synonyms.
  • Bonus
    • Sentence Structure: Bonus for the sentence structure match to the user input.
    • Word Position: Score given to a word based on its position in a sentence. Individual words towards the start of the sentence are given higher preference, with extra credit if the word is near the sentence start.
    • Order Bonus: Bonus for the number of words in the same order as the task label.
    • Role Bonus: Bonus for the number of primary and secondary roles (subject/verb/object) matched.
    • Spread Bonus: Bonus for the difference between the position of first and last matched words in a pattern. The higher the difference, the greater the score.
  • Penalty: Penalty if there are several phrases before the task name or if there is a conjunction in the middle of the task label.
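As a rough illustration of the first few components, the sketch below computes toy versions of Words Matched, Word Coverage, Exact Words, and the position difference behind the Spread Bonus. The actual formulas and weights are not given in this article, so everything here is an assumption: the function name `fm_components`, the synonym table, and the simplification that word coverage is measured against the task name only.

```python
from typing import Dict, List, Set


def fm_components(user_words: List[str],
                  task_words: List[str],
                  synonyms: Dict[str, Set[str]]) -> Dict[str, float]:
    """Compute toy versions of a few Fundamental Meaning components.

    `synonyms` maps a task word to the user words accepted in its place.
    """
    task_set = set(task_words)
    matched_positions = []     # positions of user words that matched
    matched_task_words = set() # task words covered, exactly or via synonym
    exact = 0                  # matches that did not rely on a synonym

    for pos, word in enumerate(user_words):
        if word in task_set:
            exact += 1
            matched_task_words.add(word)
            matched_positions.append(pos)
            continue
        for task_word in task_words:
            if word in synonyms.get(task_word, set()):
                matched_task_words.add(task_word)
                matched_positions.append(pos)
                break

    coverage = len(matched_task_words) / len(task_set) if task_set else 0.0
    spread = matched_positions[-1] - matched_positions[0] if matched_positions else 0
    return {
        "words_matched": len(matched_positions),
        "word_coverage": coverage,
        "exact_words": exact,
        "spread": spread,
    }


result = fm_components(
    ["please", "change", "my", "booking"],
    ["modify", "booking"],
    {"modify": {"change", "reschedule"}},
)
print(result)
```

In this example, "change" matches "modify" only through a synonym, so it counts toward the words matched and the coverage but not toward the exact words.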

Knowledge Collection

If the bot has a Knowledge Graph, the user utterance is processed to extract its terms, which are then mapped to the Knowledge Graph to fetch the relevant paths. All paths in which the number of matching terms exceeds a preset threshold are shortlisted for further screening. A path with 100% of its terms covered and a similar FAQ in the path is considered a perfect match.
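The shortlisting step can be sketched as below, under stated assumptions. Coverage is taken here to be the fraction of a path's terms found in the utterance, the 0.5 threshold is purely illustrative, and the path names are invented. The FAQ similarity check is left out, so a path with full coverage is only a candidate for a perfect match in this sketch.

```python
from typing import Dict, List, Set, Tuple


def shortlist_paths(utterance_terms: Set[str],
                    paths: Dict[str, List[str]],
                    min_coverage: float = 0.5) -> List[Tuple[str, float]]:
    """Return (path_id, coverage) pairs for paths above the threshold.

    Results are sorted from highest to lowest coverage.
    """
    results = []
    for path_id, path_terms in paths.items():
        if not path_terms:
            continue  # skip empty paths to avoid dividing by zero
        covered = sum(1 for term in path_terms if term in utterance_terms)
        coverage = covered / len(path_terms)
        if coverage > min_coverage:
            results.append((path_id, coverage))
    results.sort(key=lambda item: item[1], reverse=True)
    return results


sample_paths = {
    "baggage/allowance": ["baggage", "allowance"],
    "baggage/fees/international": ["baggage", "fees", "international"],
    "refund/policy": ["refund", "policy"],
}
shortlisted = shortlist_paths({"baggage", "allowance", "international"}, sample_paths)
for path_id, coverage in shortlisted:
    print(path_id, round(coverage, 2))
```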

Ranking and Resolver

The Ranking and Resolver determines the final winner of the entire NLP computation. If either the ML model or the Knowledge Graph finds a perfect match, the Ranking and Resolver doesn’t re-score the intent and presents it as a matched intent. If there are multiple perfect matches, they are presented to the developer as options to choose from.

The Ranking and Resolver re-scores all the other good and unsure matches identified by the three models using the Fundamental Meaning model. After re-scoring, if the final score of an intent crosses a certain threshold, it too is considered a match.
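One way to read this resolution logic is sketched below. It is an interpretation, not the platform's implementation: the `Candidate` class, the `resolve` function, the `rescore` callback (standing in for Fundamental Meaning re-scoring), and the threshold value are all hypothetical. The sketch assumes that perfect matches short-circuit the re-scoring and that tied top scores are all returned as options.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    intent: str
    model: str     # "ml", "fm", or "kg": the model that shortlisted the intent
    perfect: bool  # True if that model flagged a perfect match


def resolve(candidates: List[Candidate],
            rescore: Callable[[str], float],
            threshold: float) -> List[str]:
    """Return the winning intent(s); more than one means the developer chooses."""
    # Perfect matches from the ML model or Knowledge Graph are not re-scored.
    perfect = [c.intent for c in candidates
               if c.perfect and c.model in ("ml", "kg")]
    if perfect:
        return list(dict.fromkeys(perfect))  # de-duplicate, keep order

    # Re-score every remaining intent and keep those above the threshold.
    scored = {c.intent: rescore(c.intent) for c in candidates}
    passing = {intent: s for intent, s in scored.items() if s > threshold}
    if not passing:
        return []

    # Ties at the top score are returned together.
    top = max(passing.values())
    return [intent for intent, s in passing.items() if s == top]


shortlist = [Candidate("Modify Booking", "ml", False),
             Candidate("Cancel Booking", "fm", False)]
toy_scores = {"Modify Booking": 0.8, "Cancel Booking": 0.6}
print(resolve(shortlist, lambda intent: toy_scores[intent], threshold=0.5))
```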

Click the Ranking and Resolver tab to view the details.

The ranking and details for each match can be viewed by selecting the matched utterance.

Training the Bot

Training is how you enhance the performance of the NLP engine to prioritize one bot task or user intent over another based on the user input. You should test and, if needed, train your bot for all possible user utterances and inputs.

Train the bot

  1. After you enter a User Utterance, depending on the test result do one of the following to open the training options:
    1. For an unmatched intent: From the Select an Intent drop-down list, select the intent that you want to match with the user utterance.
    2. For multiple matched intents: Select the radio button for the intent you want to match.
    3. For a single matched intent: Click the name of the matched intent.
  2. The user utterance that you entered gets displayed in the field under the ML Utterances section. To add the utterance to the intent, click Save & Train. You can add as many utterances as you want, one after another. For more information, read Machine Learning.
  3. Under the Intent Synonyms section, each word in the task name appears as a separate line item. Enter synonyms for these words to help the NLP interpreter recognize the correct task more accurately. For more information, read Managing Synonyms.
  4. Under the Intent Patterns section, enter task patterns for the intent. For more information, read Managing Patterns.
  5. When you are done making the relevant training entries, click Re-Run Utterance to check whether the intent now receives a high confidence score.

Mark an Incorrect Match

When a user input matches an incorrect task, do the following to match it with the right intent:

  1. Above the matched intent name, click the Mark as incorrect match link. This opens the Matched Intent drop-down list, where you can select another intent.
  2. Select the corresponding intent for the user input and train the bot.