Knowledge Extraction

The Knowledge Graph Extraction Service enables you to move your enterprise’s existing Frequently Asked Questions (FAQ) content into the bot ontology.

The feature supports extraction from unstructured content such as web pages and PDF documents, as well as from structured content such as CSV files.

After completing the extraction, you can edit the questions and answers in an easy-to-use interface and organize them under the relevant Knowledge Graph nodes.

The Knowledge Extraction Process

Moving data to the Knowledge Graph using the Knowledge Extraction (KE) service involves the following steps:

  • Step 1 Extracting: Extract the existing FAQ content from structured or unstructured sources of question-answer data such as PDF, web pages, and CSV files. This extraction can be done before or after creating a Knowledge Graph for the bot.
    Note: The KE service supports a specific content structure for each source type. Refer to the Supported Formats section for details.
  • Step 2 Editing: Upon successful data extraction, you can edit the question and answer text before moving it to the Knowledge Graph.
  • Step 3 Moving: You can add the extracted data to a bot before or after creating a Knowledge Graph (KG). If you try to add the extracted content before a KG exists, the bot automatically creates one with the bot’s name.
    The Knowledge Extractor provides two options to add the extracted content to the Knowledge Graph:

    • Add to Knowledge Graph: Moves all the selected questions to the root node of the Knowledge Collection. Developers typically use this option when the required term has not yet been added to the KG or when the bot doesn’t yet have a KG.
    • Add to Specific Term: If the bot already has a Knowledge Collection, selecting this option opens the bot ontology. You can expand the required nodes and drag and drop the selected content onto them.
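
The two options above can be pictured with a small in-memory model. This is only a sketch of the behavior described in this section, not platform code: the Bot and KnowledgeGraph classes, the ROOT label, and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (question, answer)
ROOT = "root"           # hypothetical label for the root node


@dataclass
class KnowledgeGraph:
    name: str
    # Maps a node (term) name to the question-answer pairs stored under it.
    nodes: Dict[str, List[Pair]] = field(default_factory=lambda: {ROOT: []})


@dataclass
class Bot:
    name: str
    kg: Optional[KnowledgeGraph] = None  # None until a Knowledge Graph exists


def add_to_knowledge_graph(bot: Bot, pairs: List[Pair]) -> None:
    """Model of 'Add to Knowledge Graph': everything lands on the root node."""
    if bot.kg is None:
        # As described above, a missing KG is created with the bot's name.
        bot.kg = KnowledgeGraph(name=bot.name)
    bot.kg.nodes[ROOT].extend(pairs)


def add_to_specific_term(bot: Bot, term: str, pairs: List[Pair]) -> None:
    """Model of 'Add to Specific Term': requires an existing KG and term."""
    if bot.kg is None or term not in bot.kg.nodes:
        raise ValueError(f"No existing term {term!r} to add content to")
    bot.kg.nodes[term].extend(pairs)


bot = Bot(name="Support Bot")
add_to_knowledge_graph(bot, [("How do I reset my password?", "Use the Forgot Password link.")])

bot.kg.nodes["Billing"] = []  # stands in for a term created in the ontology
add_to_specific_term(bot, "Billing", [("When am I billed?", "On the first of each month.")])
```

In this model the first call creates the graph because none exists yet, which mirrors why the first option suits bots without a KG, while the second call only succeeds once the target term is present.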

Extracting FAQs from a Website

  1. Open the bot into which you want to extract the content and click the Knowledge Collection tab.
  2. At the bottom of the Knowledge Collection window, under the Extracts section, click Extract from URL.
  3. Enter a Name for the extraction.
  4. Enter the URL of the page, and then click Proceed.
  5. Once the extraction is complete, a page with the Success status appears.

Extracting FAQs from a CSV or PDF Document

  1. Open the bot into which you want to extract the content and click the Knowledge Collection tab.
  2. In the Extracts section, on the Extract from file tab, click Browse.
  3. Browse to the file, select it, and then click Proceed.
  4. Once the extraction is complete, a page with the Success status appears.

Editing the Extracted Content

  1. Open the bot and click the Knowledge Collection tab.
  2. At the bottom of the Knowledge Collection window, click the Knowledge Extraction section. It opens the list of extractions.
  3. Click the name of a successful extract whose content you want to edit.
  4. Hover over the question-answer pair that you want to modify and click the edit icon.
  5. Make the necessary changes and click Save.

Moving Selected Question-Answer Pairs to the Knowledge Graph

  1. Open the bot and click the Knowledge Collection tab.
  2. Select the node to which you want to add the question-answer pairs, and then click Add from extraction. It opens the list of successful and failed extractions.
  3. Click the name of a successful extract whose content you want to move.
  4. Select the checkboxes next to the question-answer pairs that you want to move and then click Add.
  5. Select one of these options:

    1. Add to Knowledge Graph: Moves the selected questions to the root node of the Knowledge Collection. If the bot doesn’t have a Knowledge Collection, one is created automatically with the same name as the bot.
    2. Add to Specific Term: If the bot already has a Knowledge Collection, selecting this option opens the bot ontology. You can expand the required nodes and drag and drop the selected content onto them.

Once you move a question-answer pair from the extract to the Knowledge Collection, you cannot move it again while it remains there. The platform returns a duplicate error when you try to move a question from the extract that is already present in the collection. You can make any changes to the moved content from within the Knowledge Collection. However, if the question is modified or removed in the KG, you can add it to the KG from the extract again.
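
The duplicate rule above can be sketched as a simple membership check. This is an illustrative model only: the DuplicateQuestionError class and the move_to_collection function are hypothetical names, and comparing questions by exact text is an assumption, since this section does not say how the platform decides that two questions match.

```python
class DuplicateQuestionError(Exception):
    """Raised when a question is already present in the collection."""


def move_to_collection(collection: dict, question: str, answer: str) -> None:
    # Assumption: a question counts as a duplicate when its text matches exactly.
    if question in collection:
        raise DuplicateQuestionError(f"Already in the collection: {question!r}")
    collection[question] = answer


collection = {}
move_to_collection(collection, "What is a bot?", "A conversational program.")

# A second move of the same question is rejected while it is still present.
try:
    move_to_collection(collection, "What is a bot?", "A conversational program.")
    second_move = "moved"
except DuplicateQuestionError:
    second_move = "duplicate"

# After the question is removed from the collection, it can be added again.
del collection["What is a bot?"]
move_to_collection(collection, "What is a bot?", "A conversational program.")
```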

Supported Formats

The Knowledge Extraction service, currently in Beta, supports extracting FAQs only from the supported CSV, PDF, and URL formats. Because the mechanisms for extracting FAQs accurately from various types of content are still being developed and improved, the feature has a few constraints, which are noted against each data source below.

CSV

The service interprets the text in the first column as questions and the text in the second column as answers, and it ignores the text in any other columns. The file should not contain a header row.
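
As a rough illustration of this layout, the following sketch writes question-answer pairs to a two-column CSV with no header row and reads them back the way the paragraph above describes. The function names and sample questions are made up for the example, and the reader is only a local approximation of the described column rule, not the service's parser.

```python
import csv
import io


def write_faq_csv(pairs, stream):
    """Write (question, answer) pairs as two columns, with no header row."""
    writer = csv.writer(stream)
    for question, answer in pairs:
        writer.writerow([question, answer])


def read_faq_csv(stream):
    """Column 1 is the question, column 2 the answer; extra columns are dropped."""
    pairs = []
    for row in csv.reader(stream):
        if len(row) < 2:
            continue  # skip rows that have no answer column
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs


faqs = [
    ("How do I reset my password?", "Use the Forgot Password link."),
    ("What are the fees, if any?", "There are no fees."),
]

buffer = io.StringIO(newline="")
write_faq_csv(faqs, buffer)
buffer.seek(0)
parsed = read_faq_csv(buffer)
```

The csv module quotes the second question automatically because it contains a comma, so the value stays in the first column.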

PDF

The Knowledge Extraction service can process the content in a PDF and convert it into question-answer pairs. It extracts content most reliably from documents that have a table of contents and text formatted in uniform header and paragraph blocks. The service has limitations in extracting content from documents with nested headers and multi-format text.

Web Pages

The Knowledge Extraction service supports the following three formats of FAQ web pages:

  • Plain FAQ pages with linear question-answer pairs.
  • Pages with question hyperlinks that point to answers on the same page.
  • Pages with question hyperlinks that point to answers on a different page.

Extraction of individual FAQs on a web page fails under these conditions:

  1. The question text is split across multiple HTML tags on the FAQ page.
  2. The tag applied to the answer is neither a child nor a sibling of the extracted question in the HTML DOM structure.
  3. The question doesn’t have a hyperlink to the answer (applies to FAQs with hyperlinks).
  4. The question hyperlinks to the answer, but the question statement isn’t repeated above the answer (applies to FAQs with hyperlinks).

The extraction of the entire FAQ page fails if the page contains more than one of the FAQ page types mentioned above.
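
The second failure condition in the list above (the answer must be a child or a sibling of the question in the DOM) can be checked on your own pages before extraction. The sketch below is not the service's algorithm. It assumes well-formed markup so that Python's standard XML parser can read it, and the element ids and the answer_is_child_or_sibling helper are hypothetical; real HTML pages would generally need an HTML parser.

```python
import xml.etree.ElementTree as ET


def answer_is_child_or_sibling(root, question, answer):
    """Return True if `answer` is a child or a sibling of `question`."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    if parent_of.get(answer) is question:
        return True  # answer is nested directly inside the question element
    question_parent = parent_of.get(question)
    return question_parent is not None and parent_of.get(answer) is question_parent


page = ET.fromstring("""
<div>
  <div><h3 id="q1">What is a bot?</h3><p id="a1">A conversational program.</p></div>
  <div><h3 id="q2">Is it free?</h3></div>
  <section><p id="a2">There is a free tier.</p></section>
  <div id="q3">Can I export data?<span id="a3">Yes.</span></div>
</div>
""")
by_id = {el.get("id"): el for el in page.iter() if el.get("id")}

sibling_case = answer_is_child_or_sibling(page, by_id["q1"], by_id["a1"])
unrelated_case = answer_is_child_or_sibling(page, by_id["q2"], by_id["a2"])
child_case = answer_is_child_or_sibling(page, by_id["q3"], by_id["a3"])
```

Here the first and third pairs satisfy the condition, while the second pair does not, because its answer sits in a separate section element rather than next to or inside the question.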
