A component of Kore.ai’s XO Platform, the Knowledge Graph (KG) helps you turn your static FAQ text into an intelligent, personalized conversational experience. It goes beyond the usual practice of capturing FAQs in the form of flat question-answer pairs. Instead, the Knowledge Graph enables you to create an ontological structure of key domain terms and associate them with context-specific questions and their alternatives, synonyms, and Machine learning-enabled traits.
This document explains the concepts, terminology, and implementation of the Knowledge Graph. For a use case driven approach, please , refer here.
Why a Knowledge Graph?
A user expresses a query in multiple ways. It is a difficult task for you to visualize and add all the alternative questions manually.
Thus, Kore.ai has designed the Knowledge Graph with nodes, tags, and synonyms which makes it easier for you to cover all the possible matches. With training, the Knowledge Graph can handle various alternate questions using the nodes, tags, and synonyms.
Whenever a question is asked by the user, the node names in the Knowledge Graph are checked and matched with keywords from the user utterance. Node names, tags, and synonyms are checked and based on a score they receive, questions are shortlisted as likely matches or intents. These shortlisted questions are then compared with the actual user utterance to come up with the best possible intent to present to the user. The response can take the form of either a simple response or execution of a dialog task.
This way, you can add completely different alternative questions in the FAQ and provide tags, synonyms, and node names appropriately such that any untrained question can also be matched. The performance and intelligence of the Knowledge Graph depend on the way you train it with the appropriate node names, tags, and synonyms.
This document is intended to familiarize you with the terms used in building Knowledge Graph.
Terms or Nodes
Terms or Nodes are the building blocks of an ontology and are used to define the fundamental concepts and categories of a business domain.
As shown in the image below, you can organize the terms in a hierarchical order to represent the flow of information in your organization. You can create, organize, edit, and delete terms. There is a platform restriction of 20k maximum number of nodes and 50k number of FAQs.
The Knowledge graph is limited to 20 thousand nodes and 50 thousand FAQs.
For easier representation, we identify nodes using the following names:
The Root node forms the topmost term of your Ontology. A Knowledge Graph consists of only one root node and all other nodes in the ontology become its child nodes. The Root node takes the name of the VA by default, but you can change it if you want to. This node is not used for node qualification or processing. The path qualification starts from first-level nodes. While it is not advisable to have FAQs directly under the root node, in case it is essential to your needs, restrict the number of FAQs to a maximum of 100 at the root node.
The immediate next level nodes after the Root node are known as first-level nodes. There can be any number of first-level nodes in a collection. It is recommended to use first-level nodes to represent high-level terms such as the names of departments or functionalities. For example, in a travel related VA, you might have first level nodes such as: international, reservation or check in.
Any node to which a question-answer set or dialog task, at any level starting with the 2nd, is added is called a Leaf Node.
Depending on their position in the ontology, a node is referred to as first-level nodes, second-level nodes, etc. A first-level node is a category that has one or more sub-categories under it, called the second-level nodes.
For example, reservation is the first-level node of a Travel Assistant. It can be organized into subcategories such as: cancel and update. The Update node can further be subcategorized into subsequent levels, such as: seat, flight and hotel.
Note: This hierarchical organization of nodes is for your convenience to keep related questions together. The Knowledge Graph Engine does not consider any parent-child relation while evaluating the questions for a match. The hierarchy does not in any way influence the FAQ matching processing since all the nodes are considered the same way irrespective of their position in the FAQ organization.
For each term/node, you can add custom tags. Tags work exactly like terms but are not displayed in the Knowledge Graph ontology to avoid clutter. You can add synonyms and traits to tags as you do to terms.
Users use a variety of alternatives for the terms of their ontology. The Knowledge Graph allows you to add synonyms for the terms to include all possible alternative forms. Adding synonyms also reduces the need for training the bot with alternative questions.
For example, the reservation node may have the following synonyms added to it: booking, order, hire, purchase, etc. When you add a synonym for a term in the Knowledge Graph, you can add them as local or global synonyms. Local synonyms (or Path Level Synonyms) apply to the term only in that particular path, whereas global synonyms (or Knowledge Graph Synonyms) apply to the term even if it appears on any other path in the ontology.
Post-release 7.2, you can enable the usage of Bot Synonyms inside the Knowledge Graph engine for path qualification and question matching. With this setting, you need not recreate the same set of synonyms in Bot Synonyms and KG Synonyms.
Note: From v7.0, Traits replace Classes of v6.4 and before.
A trait is a collection of typical end-user utterances that define the nature of a question when they ask for information related to a particular intent. See here for more on traits.
A trait is applied to multiple terms across your Bot Ontology.
Note: Traits also help you filter nodes based on associated user utterances. So, if the user types an utterance that is present in a trait, the bot only searches the nodes to which the trait is applied. If the utterance is present in any other node to which the trait is not applied, the node is ignored.
A VA can respond to a given question from the user either with an execution of a Dialog Task or a FAQ.
- FAQ: The question-answer pairs must be added to relevant nodes in your ontology. A maximum of 50k FAQs is permissible. A question is asked differently by different users and to support this, you must associate multiple alternate forms for each question. Preceding an alternate question with || will allow you to enter patterns for FAQs (post 7.2 release).
- Task: Linking a Dialog task to a KG Intent helps to leverage the capabilities of the Knowledge Graph and Dialog tasks to handle FAQs that involve complex conversations.
The Knowledge Graph engine works well with the default settings. As a VA developer, you can fine-tune the KG engine performance in many ways:
- Configure Knowledge Graph by defining terms, synonyms, primary and alternative questions, or user utterances. Though hierarchy does not affect the KG engine performance, it does help organize and guide your knowledge implementation. .
- Set the following parameters:
- Path Coverage – Define the minimum percentage of terms in the user’s utterance to be present in a path to qualify it for further scoring.
- Definite Score for KG – Define the minimum score for a KG intent match to consider as a definite match and discard any other intent matches found.
- Minimum and Definitive Level for Knowledge Tasks – Define minimum and definitive threshold to identify and respond in case of a knowledge task.
- KG Suggestions Count – Define the maximum number of KG/FAQ suggestions to present when a definite KG intent match is not available.
- The proximity of Suggested Matches – Define the maximum difference to allow between top-scoring and immediate next suggested questions to consider them as equally important.
- While the platform provides default values for the above-mentioned thresholds, these can be customized from the Natural Language > Training > Thresholds & Configurations.
- Qualify Contextual Paths – This ensures that the bot context is populated and retained with the terms/nodes of the matched intent. This further enhances the user experience.
- Traits – As mentioned earlier, traits are used to qualify nodes/terms even if the user utterance does not contain the term/node. Traits are also helpful in filtering the suggested intent list.
How it Works
The Knowledge Graph engine uses a two-step approach while extracting the right response to the user utterance. It combines a search-driven intent detection process with rule-based filtering. The settings for path coverage (percentage of terms needed) and term usage (mandatory or optional) in user utterance helps in the initial filtering of the FAQ intents. Tokenization and the n-gram based cosine scoring model aids in the fulfillment of the final search criteria.
Training of the Knowledge Graph involves the following steps:
- All the terms/nodes along with synonyms are identified and indexed.
- Using these indices, a flattened path is established for each KG Intent.
User Utterance Handling
Once the Knowledge Graph Engine receives a user utterance:
- The user utterance and KG nodes/terms are tokenized, and n-gram is extracted (Knowledge Graph Engine supports a max of quad-gram).
- The tokens are mapped with the KG nodes/terms to obtain their respective indices.
- Path comparison between the user utterance and KG nodes/terms establishes the qualified path for that utterance. This step takes into consideration the path coverage and term usage mentioned above.
- From the list of questions in the qualified path, the best match is picked based upon cosine scoring.