The following steps guide you through running a batch test on your bot and obtaining a detailed analytical report on the utterances based on the test results.
To get started, click Batch Testing in the Natural Language section of the builder.
To run a Test Suite, for example the Developer Defined Utterances suite, click Developer Defined Utterances followed by Run Test Suite. This initiates the batch test for the developer-defined utterances. The test displays the results as depicted here. Each test run creates a test report record and displays a summary of the test result.
The batch test result in the screenshot above includes the following information:
- Last Run Date that displays the date and time of the latest test run.
- F1 Score that is the harmonic mean of Precision and Recall, a combined measure that is high only when both values are high.
- Precision that is the number of correctly classified utterances divided by the total number of utterances that got classified (correctly or incorrectly) to any existing task.
- Recall that is the number of correctly classified utterances divided by the total number of utterances that either got classified correctly to an existing task or got classified incorrectly as matching no existing task.
- Intent Success % that displays the percentage of correct intent recognition that has resulted from the test.
- Entity Success % that displays the percentage of correct entity recognition that has resulted from the test.
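As a rough illustration of how Precision, Recall, and F1 Score relate, the following Python sketch derives them from raw counts. It assumes the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 as the harmonic mean of the two); the platform's exact computation is not documented here, and the function name and sample counts are hypothetical.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) from raw classification counts.

    tp: utterances that matched the expected intent
    fp: utterances that matched an unexpected intent
    fn: utterances that failed to match the expected intent
    Assumption: standard textbook definitions, not a confirmed platform formula.
    """
    # Guard each division so an empty denominator yields 0.0 instead of an error.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Because the harmonic mean is pulled toward the lower of the two values, a suite with high precision but poor recall (or the reverse) still produces a low F1 Score.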
To get a detailed analysis of the test run, click the Download icon to download the test report in CSV format. The top section of the report comprises a summary with the following fields:
- Last Tested: Date of the latest test run for developer defined utterances.
- Utterance Count: Total number of utterances included in the test run.
- Success/Failure Ratio: Total number of successfully predicted utterances divided by the total count of utterances, multiplied by 100.
- True Positive (TP): Percentage of utterances that have correctly matched the expected intent.
- True Negative (TN): Percentage of utterances that were not expected to match any intent and did not match one.
- False Positive (FP): Percentage of utterances that have matched an unexpected intent.
- False Negative (FN): Percentage of utterances that have not matched the expected intent.
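The summary fields above can be approximated from a list of per-utterance result types, as in the Python sketch below. It assumes that "successfully predicted" means True Positive plus True Negative, which the report description does not state explicitly; the function name and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def summarize(result_types: List[str]) -> Dict[str, float]:
    """Compute summary-style percentages from per-utterance result types."""
    total = len(result_types)
    if total == 0:
        raise ValueError("at least one utterance is required")
    counts = Counter(result_types)

    def pct(n: int) -> float:
        # Share of all utterances, expressed as a percentage.
        return 100.0 * n / total

    # Assumption: a "successful" prediction is a True Positive or a True Negative.
    successes = counts["True Positive"] + counts["True Negative"]
    return {
        "Utterance Count": total,
        "Success/Failure Ratio": pct(successes),
        "True Positive (TP)": pct(counts["True Positive"]),
        "True Negative (TN)": pct(counts["True Negative"]),
        "False Positive (FP)": pct(counts["False Positive"]),
        "False Negative (FN)": pct(counts["False Negative"]),
    }


# Hypothetical result types, for illustration only.
sample = (["True Positive"] * 6 + ["True Negative"] * 2
          + ["False Positive"] + ["False Negative"])
for field, value in summarize(sample).items():
    print(f"{field}: {value}")
```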
The report also provides detailed information on each of the test utterances and the corresponding results.
- Utterances – Utterances used in the corresponding test suite.
- Expected Intent – The intent expected to be matched for a given utterance.
- Matched Intent – The intent that is matched for an utterance during the batch test.
- Parent Intent – The parent intent considered for matching an utterance against an intent.
- Task State – The status of the intent or task against which an intent is identified. Possible values include Configured or Published.
- Result Type – Result categorized as True Positive, True Negative, False Positive, or False Negative.
- Expected EntityValue – The entity value expected to be determined during the batch test.
- Matched EntityValue – The entity value identified from an utterance.
- Entity Result – Result categorized as True or False to indicate whether the expected entity value is the same as the actual entity value.
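One way to review the per-utterance section is to filter it down to the failed rows. The Python sketch below does this with the standard csv module on an in-memory sample. It assumes the column headers match the field names listed above and that the per-utterance rows have already been separated from the summary section; the actual layout of the downloaded file may differ, and the sample rows are invented for illustration.

```python
import csv
import io
from typing import Dict, List

# Hypothetical excerpt of the per-utterance section; not real test output.
SAMPLE = """Utterances,Expected Intent,Matched Intent,Result Type
book a flight,Book Flight,Book Flight,True Positive
cancel my ticket,Cancel Booking,Book Flight,False Positive
what's the weather,Get Weather,,False Negative
"""


def failed_rows(csv_text: str) -> List[Dict[str, str]]:
    """Return rows whose Result Type is False Positive or False Negative."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if row["Result Type"].strip() in ("False Positive", "False Negative")
    ]


for row in failed_rows(SAMPLE):
    matched = row["Matched Intent"] or "(none)"
    print(f'{row["Utterances"]!r}: expected {row["Expected Intent"]}, '
          f'matched {matched} [{row["Result Type"]}]')
```

Listing the failed utterances next to their expected and matched intents makes it easier to decide which intents need additional training utterances.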