Knowledge Extraction

The Knowledge Graph Extraction Service lets you move your enterprise’s existing Frequently Asked Questions (FAQ) content into the bot’s Knowledge Graph.

The feature supports extraction from unstructured content such as web pages and PDF documents, as well as from structured content such as CSV files.

After completing the extraction, you can edit the questions and answers using an easy-to-use interface and organize them under the relevant Knowledge Graph nodes.

Extraction Process

Moving data to the Knowledge Graph using the Knowledge Extraction Service involves the following steps:

  • Step 1 Extracting: Extract the existing FAQ content from structured or unstructured sources of question-answer data such as PDF, web pages, and CSV files. This extraction can be done before or after creating a Knowledge Graph for the bot.
    Note: The Knowledge Extraction service supports a specific content structure for each source type. Refer to the Supported formats section for details.
  • Step 2 Editing: After a successful extraction, you can edit the question and answer text before moving it to the Knowledge Graph.
  • Step 3 Moving: You can add the extracted data to a bot before or after creating a Knowledge Graph (KG). If you add the extracted content before a KG exists, the bot automatically creates one with the bot’s name.

The Knowledge Extractor lets you add the extracted content to the Knowledge Graph in two ways:

  • Add to Knowledge Graph: Moves the selected questions to the root node of the Knowledge Graph. Use this option when the required term is not yet in the KG or when the bot does not have a Knowledge Graph.
  • Add to Specific Term: If the bot already has a Knowledge Graph, you can drag and drop the selected content onto the required nodes.

Extracting from a Website

  1. Open the bot into which you want to extract the content and click the Knowledge Graph tab.
  2. Under the Extracts section, click Extract from URL.
  3. Enter a Name for the extraction.
  4. Enter the URL of the page, and then click Proceed.
  5. Once the extraction is complete, a page with the Success status appears.

Extracting from File

NOTE: File size should not be more than 5MB.

For file format details, refer to the Supported formats section below.

  1. Open the bot into which you want to extract the content and click the Knowledge Graph tab.
  2. In the Extracts section, on the Extract from file tab, click Browse.
  3. Browse to the PDF or CSV file, select it, and then click Proceed.
  4. Once the extraction is complete, a page with the Success status appears.

Editing the Extracted Content

  1. Open the bot and click the Knowledge Graph tab.
  2. The Knowledge Extraction section displays the list of all extractions.
  3. Click the name of a successful extract whose content you want to edit.
  4. Hover over the question-answer pair to modify it and click the edit icon.
  5. Make the necessary changes and click Save.

Adding the Extracted Content

There are two ways to add the extracted content to the Knowledge Graph.

From the Extracts Section

  1. Open the bot and click the Knowledge Graph tab.
  2. From the Knowledge Extraction section, select the name of a successful extract whose content you want to add.
  3. Drag and drop the required Q&A onto the node or term to which you want to add it. As you drag, the child nodes expand.
  4. You can select multiple Q&As and do a bulk move.

From Knowledge Graph

  1. Open the bot and click the Knowledge Graph tab.
  2. Select the node to which you want to add these Question-Answers.
  3. Click Add from extraction. It opens the list of successful and failed extractions.
  4. Click the name of a successful extract whose content you want to move.
  5. Select the checkboxes next to the question-answer pairs that you want to move and then click Add.

Note: Once you move a question-answer pair from the extract to the Knowledge Graph, you cannot move it again. The platform throws a duplicate error when you try to move a question from the extract that is already present in the collection. You can edit the moved content from the Knowledge Graph. However, if the question is modified or removed in the Knowledge Graph, you can add it from the extract again.

Supported Formats

The Knowledge Extraction service extracts FAQs only from CSV files, PDF files, and web page URLs that follow the formats described in this section.

Note that the file size should not be more than 5MB.

CSV

  • The Knowledge Extraction service interprets the text in the first column as a question and that in the second column as an answer.
  • The file should not have any headers.
  • The Knowledge Extraction service ignores any headers and the text present in the other columns.
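
The CSV layout above can be prepared and checked before upload with a short script. The following is a minimal Python sketch, not part of the platform: the helper names `build_faq_csv` and `check_faq_csv` are hypothetical, the sample questions are invented, and it assumes that 5MB means 5 × 1024 × 1024 bytes.

```python
import csv
import io

# Assumption: the 5MB limit is interpreted as 5 * 1024 * 1024 bytes.
MAX_BYTES = 5 * 1024 * 1024


def build_faq_csv(pairs):
    """Return CSV text with the question in column 1, the answer in column 2, and no header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for question, answer in pairs:
        writer.writerow([question.strip(), answer.strip()])
    return buffer.getvalue()


def check_faq_csv(text):
    """Return a list of problems found; an empty list means the text fits the layout."""
    problems = []
    if len(text.encode("utf-8")) > MAX_BYTES:
        problems.append("file is larger than 5 MB")
    for number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        # Each row needs non-empty text in the first two columns.
        if len(row) < 2 or not row[0].strip() or not row[1].strip():
            problems.append(f"row {number}: needs a question and an answer")
    return problems


# Hypothetical sample data for illustration only.
faq_text = build_faq_csv([
    ("How do I reset my password?", "Use the Forgot Password link on the sign-in page."),
    ("What are your support hours?", "Support is available 9 am to 5 pm, Monday to Friday."),
])
print(faq_text)
print(check_faq_csv(faq_text))
```

Using the `csv` module rather than joining strings by hand keeps answers that contain commas or quotes inside a single column.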

PDF

  • The Knowledge Extraction service can process the content of a PDF and convert it into question-answer pairs.
  • Documents with a table of contents: Ideally, a document with a table of contents is preferred. In such cases, the Knowledge Extraction service extracts the table of contents first and then uses it to parse the document and identify headings. The information in the table of contents is used to derive the hierarchy of headings (headings, subheadings, sub-subheadings, and so on). During extraction, these levels are separated by a vertical line as a delimiter (heading | subheading | sub-subheading).
  • Documents with no table of contents: In such cases the Knowledge Extraction service uses a pre-trained machine learning model that identifies headings based on either font style or font size. In case of using font size, the heading hierarchy can also be derived.
  • The text is then formatted with uniform header and paragraph blocks.
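
To illustrate the delimited heading format described above, here is a minimal Python sketch that flattens a nested outline into `heading | subheading | sub-subheading` paths. It only shows the output shape and is not the service's implementation; the function names and the sample outline are hypothetical.

```python
def heading_path(levels):
    """Join heading levels with the vertical-line delimiter, skipping empty levels."""
    return " | ".join(level.strip() for level in levels if level.strip())


def flatten_outline(outline, parents=()):
    """Yield one delimited path per heading.

    outline is a list of (title, children) tuples, where children is
    another list in the same shape (an assumed, illustrative structure).
    """
    for title, children in outline:
        path = parents + (title,)
        yield heading_path(path)
        yield from flatten_outline(children, path)


# Hypothetical table of contents for illustration only.
sample_outline = [
    ("Accounts", [
        ("Opening an Account", [("Required Documents", [])]),
        ("Closing an Account", []),
    ]),
]

for line in flatten_outline(sample_outline):
    print(line)
```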

Web Pages

The Knowledge Extraction service supports the following three different formats of FAQ web pages:

  • Plain FAQ Pages with linear question-answer pairs
  • Pages with question hyperlinks that point to answers on the same page.
  • Pages with question hyperlinks that point to answers on a different page.

Extraction of certain FAQs on the webpage fails under the following conditions:

  • The question text is split between multiple HTML tags on the FAQ page.
  • The tag applied to the answer is neither the child nor the sibling of the extracted question as per the HTML DOM structure.
  • The question doesn’t have a hyperlink to the answer (applies to FAQs with hyperlinks)
  • The question hyperlinks to the answer, but the question statement isn’t repeated above the answer (applies to FAQs with hyperlinks)

The extraction of the entire FAQ page fails if the page contains more than one of the FAQ page types mentioned above.
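
Before submitting a URL, it can help to check which of the three page types an FAQ block resembles and whether it mixes types. The following is a rough Python sketch using only the standard library. It is not how the service detects page types: it assumes the HTML passed in contains only the FAQ block, treats every link in that block as a question link, and uses hypothetical names throughout.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in the parsed HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def classify_faq_page(html):
    """Return a rough label for the FAQ layout of an HTML fragment."""
    collector = LinkCollector()
    collector.feed(html)
    # Links starting with "#" point to an anchor on the same page.
    same_page = [href for href in collector.hrefs if href.startswith("#")]
    other_page = [href for href in collector.hrefs if not href.startswith("#")]
    if same_page and other_page:
        return "mixed"  # more than one page type in the same block
    if same_page:
        return "same-page links"
    if other_page:
        return "different-page links"
    return "plain"


# Hypothetical fragment for illustration only.
fragment = '<a href="#a1">What is a bot?</a><h3 id="a1">What is a bot?</h3><p>An answer.</p>'
print(classify_faq_page(fragment))
```

A "mixed" result suggests splitting the content into separate pages, one per FAQ layout, before extraction.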
