Virtual Assistant Health and Monitoring

The Health and Monitoring dashboard offers a goal-driven approach to improving the accuracy of the virtual assistant’s Natural Language Processing (NLP) model. The training data is analyzed along with the test coverage and test results of the test suites to provide insights into the NLP Model’s performance.

This dashboard lets you achieve the following:

  • Review the test execution summary for every intent type.
  • Identify incorrect intent patterns, short training utterances, incorrect entity annotations, and training recommendations and take corrective action.
  • Drill down to specific test cases to determine test performance and coverage.
  • View the expected and matched results, and the detailed NLP analysis.
  • Tag specific test case results that need follow-up actions and collaborate with your team to improve the performance.

Note: The Health & Monitoring Dashboard is available only from release 9.3 onward, that is, after July 24, 2022.

Navigating to Health and Monitoring

To navigate to the Health and Monitoring dashboard, follow these steps:

  1. Click the Build tab on the top menu of the Virtual Assistant dashboard.
  2. Click Health & Monitoring under Testing in the left navigation menu.

Important Health and Monitoring Metrics

The following metrics help in your ML Model Validation. Learn more.

  • Accuracy: Measures how often the intent identified by your ML model is correct.
  • F1 Score: Balances the Precision and Recall scores in a single value. It is calculated as the harmonic mean of Precision and Recall.
  • Precision Score: Defines how precise your model is and is calculated as the ratio of true positives over total predicted positives (sum of true and false positives).
  • Recall Score: Defines the fraction of the relevant utterances that are successfully identified and is calculated as the ratio of true positives over actual positives (sum of true positives and false negatives).
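
The definitions above can be sketched in a few lines of Python. This is only an illustration of the standard formulas, not the platform's own implementation: the function names are hypothetical, and the accuracy formula (correct results over all results) is the usual textbook definition, assumed here rather than taken from the product.

```python
def precision(tp: int, fp: int) -> float:
    """True positives over all predicted positives (TP + FP)."""
    predicted = tp + fp
    return tp / predicted if predicted else 0.0


def recall(tp: int, fn: int) -> float:
    """True positives over all actual positives (TP + FN)."""
    actual = tp + fn
    return tp / actual if actual else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Correct results (TP + TN) over all results (assumed textbook definition)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Hypothetical counts from a batch test run.
p = precision(tp=6, fp=2)
r = recall(tp=6, fn=4)
print(p, r, f1_score(p, r), accuracy(tp=6, tn=8, fp=2, fn=4))
```

Because F1 is a harmonic mean, it stays low when either Precision or Recall is low, which is why it is often a better single indicator than Accuracy on unevenly distributed test data.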

Health and Monitoring Dashboard Components

The key components of the NLP Health and Monitoring dashboard include the coverage and execution summary panels under the Bot Health section described below:

Bot Health

The Bot Health section displays the key performance metrics and the total test coverage of the selected test suites for the Dialog intents, FAQs, Small Talks, and Traits. The Health meter depicts if your virtual assistant is trained well or not based on the key recommendation scores.

Note: You can select one, several, or all of the test suites from the dropdown to view NLP Analytics.

Test Cases Detailed Analysis

To get the detailed NLP analysis data of the test cases, click the View Test Cases link. This displays the summary of all the test cases executed in the selected test suite.

The Test Cases – Detailed Analysis window displays test results separately for Intents, Entities, and Traits as described below, so that you can identify the errors or areas of improvement for each category and fix them.

Intents

The Intents section displays a tabular view of the test cases executed for the Dialog Intents, FAQs, and Small Talks. The primary details displayed from the test results include the following:

  • Test Case: The test case that is executed.
  • Intent Type: Displays Dialog, FAQ, or Small Talk.
  • Test Suite: The test suite to which the test case is mapped.
  • Expected Intent: The intent that is expected to be identified from the given set of utterances.
  • Matched Intent: The intent that is matched during test execution from the given set of utterances.
  • Result Type: Result categorized as True Positive, True Negative, False Positive, or False Negative.

Entities and Traits

The Entities and Traits sections display a tabular view of the test case results for the selected Dialog Intents, FAQs, and Small Talks based on the entities and traits identified respectively. The following primary details are displayed in the respective panels:

Entities

  • Utterances: The utterances from the user’s input captured in the test cases.
  • Entity Name: The entity name identified during test execution.
  • Expected Value: The entity value expected to be identified from the given set of utterances.
  • Matched Value: The entity value identified and matched from the given set of utterances during test execution.
  • Entity Result: Result categorized as True Positive, True Negative, False Positive or False Negative.

Traits

All the primary fields for Intents are displayed along with the Trait Name identified for each test case.

Tags

After analyzing the reason for failure, you can collaborate with your team members using tags for test case executions. Tags are labels mapped to the test case results of intents, entities, and traits, indicating follow-up actions or suggestions.

The following tags are available for intents, entities, and traits:

  • Add Negative Pattern: Indicates that the user has to add a negative pattern to the intent/entity/trait test execution.
  • NeedNLPHelp: Indicates that the test execution requires explicit NLP help.
  • Needs Negative Pattern: Indicates that the intent/entity/trait test execution needs a negative pattern to execute as expected.
  • Needs Training: Indicates that the virtual assistant needs training for the identified intent/entity/trait after the test execution.
  • New Intent: Indicates a new intent during test execution.

Analyzing Test Results

The test execution results for the selected test suite(s) and intent type can be analyzed in the details window, which provides a drill-down view of the following performance metrics for intents, entities, and traits:

Metric Name | Description | Intent | Entity | Trait
------------|-------------|--------|--------|------
Expected Intent/Value | Please refer to the Intents section. | Yes | Yes | Yes
Matched Intent/Value | Please refer to the Intents section. | Yes | Yes | Yes
Parent Intent | Learn more. | Yes | No | Yes
Task State | The status of the intent or task against which the intent is identified. Possible values include Configured or Published. | Yes | No | Yes
Result Type | Please refer to the Intents section. | Yes | No | Yes
Matched Intent Score and Expected Intent Score | Displays the individual scores for the following | Yes | No | Yes
Entity Name | Please refer to the Entities section. | No | Yes | No
Result | Returns True if an entity is identified and False if not. | No | Yes | No
Identified by | The NLU engine that identified the entity. | No | Yes | No
Identified using | The reference entity type that was used to identify the entity during test execution. | No | Yes | No
Confidence Score | A score to determine if the test execution resulted in a favorable outcome (high score) or not (low score) when an utterance is trained for the entity. | No | Yes | No

Navigating to the Details Section

To view the Details section, follow these steps:

  1. In the Test Cases – Detailed Analysis window, click the Intents, Entities, or Traits tab.
  2. Hover over the desired entry, and click the detailed view icon.
  3. A sliding window with the test results for the selected test case and intent type appears, showing the Intent and Entity Details.

    Trait Details are displayed in the test case details window if you select the trait intent type.

  4. Click the expansion arrow icon under Entity to view the entity order expected by the ML engine and the actual entity order.

NLP Analysis

The NLP Analysis section displays a detailed view of the historic analysis generated at the time of the test case execution, for both failed and successful test cases. For the selected intent type, it gives an overview of the intents that qualified (the definitive and probable matches) and the intents that were disqualified, which helps you work out why a test case failed. These details are displayed as a graphical representation.

This is different from analyzing the test results under Utterance Testing where the current analysis information is displayed based on the changes to the trained data. Learn more.

To view the NLP Analysis section, follow these steps:

  1. Follow steps 1 to 3 in Navigating to the Details Section.
  2. Click the NLP Analysis tab.

Utterance Testing

Based on the test case failures, you can retrain your virtual assistant using the Utterance testing option for all possible user utterances and inputs. Training is how you enhance the performance of the NLP engine to prioritize one task or user intent over another based on the user input. To learn more, please refer to Training the Bot.

To navigate to the Utterance Testing window, follow these steps:

  1. Click the go to utterance testing (magic wand) icon on the Test Cases – Detailed Analysis page.

In the Utterance Testing window, you can do the following:

  • Test & train your virtual assistant based on these recommendations to understand different user utterances and match them with intents and entities.
  • View the NLP analysis flow and Fields/Entities analysis data including the confidence score based on the NER training.
  • Use the Mark as an incorrect match link to match the user input with the right intent when it is mapped to an incorrect task.

Dialog Intent Summary

This section provides the performance metrics, test coverage and analytics for only the Dialog Intents test cases.

The sub-sections available include:

Test Coverage

This section displays the count and percentage of the intents covered and not covered. You can find the list of intents not covered using the View details option and start adding test cases for them. An Intent is considered as covered when the intent has at least one test case in the selected test suite(s).
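
The coverage rule above (an intent is covered when at least one test case targets it) can be sketched as follows. This is a minimal illustration under assumed inputs, not the platform's code: the function name `coverage_summary`, the `(utterance, expected_intent)` tuple shape, and the sample intents are all hypothetical.

```python
def coverage_summary(all_intents, test_cases):
    """Split intents into covered / not covered.

    all_intents: names of the intents defined in the virtual assistant.
    test_cases:  (utterance, expected_intent) pairs from the selected suites.
    An intent counts as covered when at least one test case expects it.
    """
    intents = set(all_intents)
    covered = intents & {expected for _, expected in test_cases}
    not_covered = intents - covered
    total = len(intents)
    covered_pct = 100.0 * len(covered) / total if total else 0.0
    not_covered_pct = 100.0 * len(not_covered) / total if total else 0.0
    return {
        "covered": sorted(covered),
        "not_covered": sorted(not_covered),
        "covered_pct": covered_pct,
        "not_covered_pct": not_covered_pct,
    }


# Hypothetical intents and test cases.
summary = coverage_summary(
    ["Book Flight", "Check Balance", "Transfer Funds", "Cancel Booking"],
    [
        ("book me a flight", "Book Flight"),
        ("fly to Paris", "Book Flight"),
        ("what is my balance", "Check Balance"),
    ],
)
print(summary["not_covered"])
```

The `not_covered` list corresponds to what the View details option surfaces: the intents for which test cases still need to be added.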

Test Results Analysis

This section gives the breakdown of the test case results for the given intent type. The result type could have one of the following values:

  • True Positive (TP): Percentage of utterances that correctly matched the expected intent.
    In the case of Small Talk, this is when the lists of expected and actual intents are the same.
    In the case of Traits, this includes the traits matched over and above the expected matches.
  • False Positive (FP): Percentage of utterances that matched an unexpected intent. In the case of Small Talk, this is when the lists of expected and actual intents are different.
  • False Negative (FN): Percentage of utterances that did not match the expected intent. In the case of Small Talk, this is when the list of expected Small Talk intents is blank but the actual Small Talk is mapped to an intent.
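
As a rough sketch of how such a breakdown could be derived, the snippet below turns a list of per-test-case result types into percentages. The function name `result_breakdown` and the short labels "TP", "FP", and "FN" are assumptions made for illustration; how the platform assigns a result type to each test case is not shown here.

```python
from collections import Counter


def result_breakdown(result_types):
    """Return the percentage of test cases per result type.

    result_types: one label per executed test case, e.g. "TP", "FP", "FN".
    """
    counts = Counter(result_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical results for four dialog intent test cases.
print(result_breakdown(["TP", "TP", "FP", "FN"]))
```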

Recommendation Notification: Shows any training recommendations available for the dialog intents.

View Recommendations

You can view relevant training recommendations for dialog intents, FAQs, or Small Talks when errors and warnings are triggered during the test execution. To view the recommendations summary, click View Recommendations on the top right of the details page.

To view the details of the utterance validations, errors, warnings, and recommendations and correct them, click the Recommendations column.

Viewing Specific Test Results

To know how to get the drill-down view of a specific test case execution, please refer to the Test Cases – Detailed Analysis section.

FAQ Summary

The FAQ Summary section displays the recommendation scores generated for FAQs from the latest batch test executions.

Viewing Additional FAQ Recommendations

For FAQ Details, clicking View Recommendations displays the report generated during the previous run. To view additional recommendations, run the Inspect function. Learn more.

Knowledge Graph: Clicking this button will take you to the Knowledge Graph section where you can perform KG Analysis.

Small Talk Summary

The Small Talk Summary panel displays the recommendation scores generated for Small Talk interactions from the latest batch test executions.

Small Talk button: Click this button to view the group name, the relevant user utterances, and the Bot utterances.

Trait and Entity Summary Information

The Trait Summary and Entity Summary sections display the recommendation scores generated for traits and entities respectively from the latest batch test executions.

Trait Summary

Entity Summary

Test Coverage and Test Results Analysis

Please refer to Test Coverage and Test Results Analysis for information on the sub-sections of these summary panels.

Intent Details Window

The View Details link in the Dialog intent, FAQ, and Small Talk summary sections provides access to a drill-down view of the key performance metrics and recommendations of the covered intents. This data helps you identify intent-related issues during the training phase so that you can fix them early.

Here’s what you can do:

View the Training Data Summary

You can view the training data summary with the relevant recommendation metrics for Dialog Intents, FAQs, and Small Talks in the details panel.

The summary of all the metrics displayed is given below:

  • Intent: The name of the dialog intent, FAQ intent, or Small Talk intent.
  • Utterances: The count of the training utterances for that intent. Shown as N/A where it does not apply.
  • Test Cases: The count of the test cases that are present in the selected test suites for that intent.
  • True Positive (TP): The count of the intent test cases that resulted in TP.
  • False Negative (FN): The count of the intent test cases that resulted in FN.
  • False Positive (FP): The count of the intent test cases that resulted in FP.
  • Covered In: Name of the test suites in which the intent test cases are present.
  • F1, Accuracy, Precision, and Recall scores: These recommendation scores are displayed based on the outcomes.
  • Recommendations (Dialog Intents only; N/A for FAQs and Small Talks): Displays the count of training recommendations for that intent. Clicking on it will display the summary of the training recommendations and their probable corrective actions.
  • Group (Small Talk only; N/A for Dialog Intents and FAQs): The group to which the Small Talk interaction is mapped.
  • Path (FAQs only; N/A for Dialog Intents and Small Talks): The node path in the Knowledge Graph.
  • Alt Question (FAQs only; N/A for Dialog Intents and Small Talks): The number of alternative questions mapped to an FAQ.
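
To make the per-intent columns concrete, here is a minimal sketch of how Test Cases, TP, FN, FP, and Covered In could be aggregated from individual test results. Everything named here is hypothetical: the `IntentSummary` class, the `summarize_by_intent` function, and the `(expected_intent, test_suite, result_type)` tuple shape are assumptions for illustration, not an export format or API of the platform.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class IntentSummary:
    test_cases: int = 0
    tp: int = 0
    fn: int = 0
    fp: int = 0
    covered_in: set = field(default_factory=set)  # test suite names


def summarize_by_intent(results):
    """Aggregate test results per expected intent.

    results: iterable of (expected_intent, test_suite, result_type) tuples,
    where result_type is "TP", "FN", or "FP" (other labels are only counted
    as test cases).
    """
    summary = defaultdict(IntentSummary)
    for expected_intent, test_suite, result_type in results:
        row = summary[expected_intent]
        row.test_cases += 1
        row.covered_in.add(test_suite)
        if result_type == "TP":
            row.tp += 1
        elif result_type == "FN":
            row.fn += 1
        elif result_type == "FP":
            row.fp += 1
    return dict(summary)


# Hypothetical results from two test suites.
rows = summarize_by_intent([
    ("Book Flight", "Smoke", "TP"),
    ("Book Flight", "Regression", "FN"),
    ("Check Balance", "Smoke", "FP"),
])
print(rows["Book Flight"])
```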

View Intents Not Covered

This feature helps identify the intents not covered so as to include them in the test data for better and holistic testing of the virtual assistant. Click the three-dot menu on the right side of the panel to view the list of intents not covered in batch testing.

You can include the intents from this list to retrain your virtual assistant and improve performance.
