The Entity Type tells the NLP interpreter what type of data to expect from a user utterance, which improves recognition and system performance. The Kore.ai NLP interpreter extracts the entity from the user utterance. If the user does not enter a required entity, you can define a Bot Response node to prompt the user for it. For more information, see Working with the Bot Response Node.
The following Entity Types can be specified for an Entity node.
Address
Captures addresses written in the standard US and German address formats, for example, 200 E Main ST Phoenix AZ 85123 USA. The complete address is captured as a string: “200 E Main ST Phoenix AZ 85123 USA.”
"entities": { "AddressEntity": "200 E Main ST Phoenix AZ 85123 USA" }
For other country addresses, the platform captures strings that end with a recognizable city or country name. For more details, refer to the City entity.
Airport
Captures airport details with the following inputs:
- city name,
- airport name,
- IATA code,
- ICAO, or
- abbreviations of US cities.
Airport details are returned as a JSON entity with the elements shown below:
"AirportEntity": {"IATA": "LHR", "AirportName": "London Heathrow Airport", "City": "London", "ICAO": "EGLL", "Latitude": "51.4775", "Longitude": "-0.461389" }
We use https://github.com/opentraveldata/opentraveldata for all the airport details.
Input | Description | Examples |
---|---|---|
City name | Identifies the airport name from the city name in the user utterance. If the city has multiple airports, shows the list of airports to choose from. | Utterance: Flying to Los Angeles Response: The airport you entered seems to be ambiguous. Tell me the option you would like to choose. <Names of five airports in Los Angeles> |
Airport name | Identifies the airport name from full airport name or partial name with the prominent keyword. | Utterance: Flying to Heathrow Captured: London Heathrow Airport with the necessary details in the bot. |
IATA | Identifies airport names by the International Air Transport Association (IATA) codes. | Utterance: Flying to LHR Captured: Details of the London Heathrow Airport |
ICAO | Identifies International Civil Aviation Organization (ICAO) codes. | Utterance: Flying to EGLL Captured: Details of the London Heathrow Airport |
Abbreviations of cities | Identifies city abbreviations that are listed in www.geonames.org | Utterance: Flying to LA Response: The airport you entered seems to be ambiguous. Tell me the option you would like to choose. <Names of five airports in LA> |
Attachment (Image / File)
The user can attach a file, image, or email of up to 25 MB. The bot returns the attachment description that you enter as a string.
"entities": { "AttachmentEntity": "send" }
Note: Currently, the attachment entity is supported only for the following channels – Facebook, Twitter, Web/Mobile, and Slack.
City
Captures the name of a city from an utterance such as “What is the temperature in New York”. The bot captures, as a string, the name of any city with a population over 5,000. We use www.geonames.org for all the city details.
"entities": { "CityEntity": "New York" }
Country
Captures the name of a country from a user utterance such as “What is the capital of United States of America”.
Country details are returned as a JSON entity with the elements shown below:
"CountryEntity": { "alpha3": "USA", "alpha2": "US", "localName": "United States of America", "shortName": "United States", "numericalCode": 840 }
For the complete list of countries, see https://www.nationsonline.org/oneworld/country_code_list.htm.
Element | Description | Examples |
---|---|---|
alpha3 | The three letter code of the country | USA, GBR, or IND |
alpha2 | The two letter code for the country | US, GB, or IN |
localName | Name of the Country | United States of America, United Kingdom, or India |
shortName | Short name | United States, United Kingdom, or India |
numericalCode | The United Nations M49 numerical code for the country | 840, 826, or 356 |
Company Name or Organization Name
Captures the name of a company from user utterances such as “Nearest branch for Amazon”. The value for Company Name is returned (Amazon) as a string. See Supported Companies list.
Apart from the supported companies, the bot recognizes the words starting with a capital letter and followed by these suffixes as a company type: Inc, Incorporated, Corp, Corporation, Group, Ltd, Limited, Co, Company, LP, LLP, LLLP, LLC, PLLC.
"entities": { "OrganizationEntity": "amazon" }
Color
Captures the name of a color from a user utterance, for example, “Set the status to green”. Returns the value for Color (green) as a string. See Supported Colors list.
"entities": { "ColorEntity": "green" }
Currency
Captures the amount and type of currency from the user utterance, for example, “This handbag is priced at 200 dollars” – where 200 is the amount and USD is the currency.
This entity type recognizes:
- full currency names (Dollar, Rupees, Indian national rupees, Dinar),
- currency symbols ($, S$, £),
- standard currency abbreviations (INR, USD), and
- commonly used slang for currencies (Buck, Nickel, Dime, Quid, Loonie, Toonie, Benjamin, Jackson, Hamilton).
"CurrencyEntity": [ { "code": "SGD", "amount": 20 } ]
Custom
Define a regular expression to validate the user input in the Regex field displayed.
For example, enter: [a-zA-Z]{3}[-]\d{4}
to return a sample response as: {"regex":"NLP-1234"}
For more information, see Regex Expressions.
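As a rough illustration of how the sample pattern behaves, the sketch below applies it with Python's standard `re` module. The function name `extract_custom` is hypothetical, and the platform's own regex engine may differ from Python's in details; only the pattern and the shape of the sample response come from the example above.

```python
import re

# Pattern copied from the example above: three letters, a hyphen, four digits.
PATTERN = re.compile(r"[a-zA-Z]{3}[-]\d{4}")


def extract_custom(utterance):
    """Return the first match wrapped like the sample response, or None."""
    match = PATTERN.search(utterance)
    return {"regex": match.group(0)} if match else None


print(extract_custom("Please check ticket NLP-1234 today"))
print(extract_custom("no ticket number here"))
```

The first call finds NLP-1234 inside the sentence; the second finds nothing, which is the case where the bot would need to prompt the user again.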
Composite
Composite entities can be used to capture multiple entity values in one entity.
For example, consider car sales inquiries. Typical queries take forms such as: ‘I am interested in Tesla Model S 2018 model’ or ‘What would a red Tesla 2010 model cost’ or ‘Tell me about Tesla Model S’.
As you can see, the bot typically needs to process a combination of details like Make, Model, Year and Color to respond to those queries.
These scenarios are taken care of by the Composite Entity Type.
Refer here to know more about Composite Entity Types.
Date
Captures a date mention from a user utterance. For example, “Book a flight on the 10th of October”, returns the value for Date in ISO8601 date format as YYYY-MM-DD.
The bot recognizes all possible ways and formats of dates, like:
- formatted dates like YYYY-MM-DD, DD-MM-YYYY, DD-MM-YY, YYYY/MM/DD, DD/MM/YYYY, DD/MM/YY, YYYY.MM.DD, DD.MM.YYYY, DD.MM.YY
- all-number dates like YMD and DMY, for example, 20180518 and 09102013
- formatted dates with space separators like YYYY MM DD, dd/mm yyyy, dd-mm, dd-mm-yyyy, dd-mm-yy, mm-dd, dd / mm / yyyy, dd . mm . yyyy, ddmm yyyy, mmdd
- named months like yyyy/dd/monthNames, yyyy-dd-monthNames, or dd.monthNames.yyyy, for example, 2018/28/Dec, 2018-28-Dec, or 28.Dec.2018
- absolute dates related to now like today, tomorrow, yesterday, tonight, this evening, this afternoon, day after tomorrow, day before yesterday, yesterday morning, tomorrow night, tomorrow for 1 hour, 3 days ago, 24 hours ago, in 3 days, 2 months hence, this day next month last year, next June, June 26 of year after next, in a week, 2 weeks ago, 22nd of this month, Next month this day, Next 25th, This month 30th, 27th of this month, 3rdmonth, 2nd of month
- named dates like Christmas Day, Christmas Eve, Memorial Day this year, Thanksgiving 2018, last Thanksgiving, week after Thanksgiving, Passover, day before new year, day after Christmas
- relative dates from absolute like 2 more days from tomorrow, 3 days after July 4th, 3 days from now, 5days from today, Need two more days, in 2days
- weekdays like Saturday, coming Monday, Sunday, Saturday, next weekend, First Saturday of the upcoming year, First Sunday of the upcoming month, first Saturday of next month, First Sunday of next year
"entities": { "DateEntity": "1982-04-13" }
Date Time
Captures a date grouping along with time in a user utterance.
For example, “Book a flight on the 10th of October at 6 pm”, returns the value for Date Time in ISO 8601 date format as YYYY-MM-DDThh:mm:ss.sTZD.
The bot recognizes all possible ways and formats of expressing dates and time.
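On the consuming side, an ISO 8601 value of this kind can be read with Python's standard library. The sketch below is only an illustration: the function name `parse_date_time_entity` and the sample string are assumptions chosen for the example, not part of the platform.

```python
from datetime import datetime, timezone


def parse_date_time_entity(value):
    """Parse an ISO 8601 date-time string that carries a UTC offset."""
    return datetime.fromisoformat(value)


# Hypothetical captured value: 10 October 2017, 6 pm, offset +05:30.
captured = parse_date_time_entity("2017-10-10T18:00:00+05:30")

print(captured.date())                               # calendar date part
print(captured.time())                               # local time part
print(captured.utcoffset())                          # offset from UTC
print(captured.astimezone(timezone.utc).isoformat()) # same instant in UTC
```

Keeping the offset attached to the parsed value makes it possible to compare or convert times across time zones without guessing the user's locale.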
"entities": { "DateEntity": "2017-10-10T18:00:00+05:30" }
Date Period
Captures start date and end date from the user input, for example, Book the hotel for five days starting May 5. If the user input doesn’t include one or both of the dates, the bot prompts the user to provide the necessary input.
Note: Unlike other entities, Date Period entities allow you to enter two sets of user and error prompts:
- User and Error Prompts for From Date
- User and Error Prompts for To Date
The following table lists how the entity works in different scenarios:
Input Type | Bot Behaviour |
---|---|
Doesn’t include both From and To dates [e.g., Book hotel] | Shows User Prompts for From Date to the user |
Includes either From or To date [e.g., Book a hotel from 15th Aug] | Shows User Prompts for From Date or User Prompts for To Date based on which is missing from the input |
Includes implicit reference to From Date and duration [e.g., Book a hotel for five days starting from Tuesday] | Determines both dates |
Includes From Date and duration [e.g., Book a hotel for five days from 15th Nov] | Determines both dates |
Includes From Date and To Date [e.g., Book a hotel from 5th to 10th] | Determines both dates |
Description
Captures statements or paragraphs of text from the user utterance. The value for Description is returned as a string and can include wild characters.
"entities": { "Description": "text here" }
Email
Captures an email address from the utterance. For example, “Send an email to help@koremessenger.com” returns the value of Email as a string.
"entities": { "Email": "help@koremessenger.com" }
List of Items (enumerated)
Display a list of values to the end-user. To define the list type, click the Settings icon located to the immediate right of the Type field to open the List of items (enumerated) Setup page to define one of the following list types.
- Static List – Enter the Display Name for the key, the Value for the key, and optionally, enter synonyms for the key. Set up Auto Correction value for the user inputs.
- List from context – Define a context variable to use for this item in the following fields:
  - Specify Context Variable to Use – Defines the context object type, for example, EnterpriseContext, BotContext, UserContexts, or session variables such as context.entities. Enter "context." and then select a context object type.
  - Display Name Key – The name displayed to the end user.
  - Value Key – The key that represents the value of the item in the list.
  - Synonyms Key – Enter one or more synonyms for the key.
- Auto Correction – Set up auto-correct thresholds for the LOV entity type so that it accepts not only exact matches but also close matches with small variations. For example, for a list value called Apple, a typo such as “appel” could be accepted, depending on your threshold settings. The Auto Correction setting works as follows:
  - The bot identifies the number of letters that must be changed (inserted, deleted, or replaced) in the user input to match a value in the list.
  - The number is converted to a percentage of the total number of letters in the input.
  - The list value with the highest similarity is taken as the input if its score is greater than or equal to the configured percentage.
Spell correction doesn’t apply to dictionary words or alphanumeric inputs.
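The auto-correction steps can be sketched roughly as follows. This is one interpretation under stated assumptions, not the platform's implementation: it uses plain Levenshtein edit distance, computes similarity as 100 × (letters − edits) / letters, and ignores the dictionary-word and alphanumeric exclusions. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum inserts, deletes, and replacements."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # replace (or keep)
        prev = curr
    return prev[-1]


def similarity(user_input, value):
    """Similarity as a percentage of the input length (assumed formula)."""
    text = user_input.lower()
    if not text:
        return 0.0
    edits = edit_distance(text, value.lower())
    return max(0.0, 100.0 * (len(text) - edits) / len(text))


def auto_correct(user_input, values, threshold):
    """Return the closest list value if it meets the threshold, else None."""
    best = max(values, key=lambda v: similarity(user_input, v), default=None)
    if best is not None and similarity(user_input, best) >= threshold:
        return best
    return None


print(auto_correct("appel", ["Apple", "Banana", "Cherry"], threshold=60))
print(auto_correct("appel", ["Apple", "Banana", "Cherry"], threshold=80))
```

Under these assumptions, “appel” needs two letter changes to become “apple”, which gives a 60% score, so it is accepted at a threshold of 60 and rejected at 80.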
For the list to be presented to the user, set Display List of Values to Yes. The list of values is then presented to the user in the channel-specific format; you may want to use a template as per your requirement (see here for more).
List of items (lookup)
Display a list of values to the end-user. To define the lookup list, click the Settings icon located to the immediate right of the Type field to open the List of items (lookup) Setup page.
You can define either a Static List or a Remote List.
Static List
Use Static List to define the entity values as one of the following list types:
- JSON tab – Enter a list of key/value pairs and synonyms, for example:
[{ "title": "United States", "value": "US", "synonyms": ["united states", "USA", "US", "U.S.A", "America"] }, { "title": "John F. Kennedy International Airport", "value": "JFK", "synonyms": ["John F. Kennedy International Airport", "New York International Airport", "JFK"] } ]
- Editor tab – Enter the Display Name for the key, the Value for the key, and optionally, enter Synonyms for the key.
- Upload File – Click to navigate to and select a JSON-formatted list file or a .csv file containing a list of key/value pairs.
For the list to be presented to the user, set Display List of Values to Yes. The list of values is then presented to the user in the channel-specific format; you may want to use a template as per your requirement (see here for more).
Remote List
Remote List can be used when the entity extraction needs to be done by an external service due to security restrictions or for any other reason. It can also be used to handle large data.
The steps would involve:
- Define the service call: You can set up a service call similar to how a service node is currently set up. You can set headers, body (for POST), etc. (see here for more).
The external service being invoked should be able to accept and handle the user utterance data that the platform populates. The context.inputData object with the following fields is used for that purpose:
  - input – array containing the list of inputs received from the user for the current dialog;
  - usedUp – index form of words that are already used for other entities or intents. The format is x-y-z where
    - x represents the sentence/utterance index (0 to n),
    - y represents the start index of the used-up word within the x utterance, and
    - z represents the end index of the used-up word;
    - sentenceindex-x-x represents no used-up words in that sentence;
    - if multiple words are used up in a single sentence, these should be entered as comma-separated values;
  - isMultiItem – flag that should be set if multiple values are expected from the service call.
"inputData": { "input": [ "get account" ], "usedUp": [ "0-x-x" ], "isMultiItem": false }
- Map Response from the service call with the following fields:
  - the context variable that will hold the response data from the service call. This should be in an array format.
  - Display Key Name – name used to refer to this field; this name is used when interacting with users, for example, in a disambiguation scenario. This can be accessed using {{context.entities.<entity-name>.title}}.
  - Value Key – the field name in the response body from the service call that holds the value; the entity is assigned this value. This can be accessed using {{context.entities.<entity-name>.value}}.
  - Synonyms Key – the field containing the synonyms for this field, if any. This is the value that the user might refer to, for example, in response to a disambiguation question. This can be accessed using {{context.entities.<entity-name>.synonym}}.
  - Matched Word Index – indicates the words in the inputData that were used for entity extraction (in the same format as the usedUp value in the context.inputData object). The platform uses this to mark the word as used in the user utterance.
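An external service would need to decode the x-y-z index strings used by usedUp and the Matched Word Index. The sketch below shows one way to do that. It assumes that multiple spans for one sentence arrive as comma-separated x-y-z groups within a single string, and it does not interpret whether the indexes count words or characters; the function name `parse_used_up` is hypothetical.

```python
def parse_used_up(entries):
    """Turn usedUp strings into (sentence, start, end) tuples.

    Entries of the form "<sentence>-x-x" mean that nothing is used up
    in that sentence and are skipped.
    """
    spans = []
    for entry in entries:
        for part in entry.split(","):
            sentence, start, end = part.strip().split("-")
            if start == "x" and end == "x":
                continue  # no used-up words in this sentence
            spans.append((int(sentence), int(start), int(end)))
    return spans


# Shape taken from the inputData sample: one utterance, nothing used up.
print(parse_used_up(["0-x-x"]))
# Hypothetical input: two used-up spans in sentence 0, none in sentence 1.
print(parse_used_up(["0-0-0,0-2-3", "1-x-x"]))
```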
Flow: The platform will:
- Populate context.inputData with the values mentioned above.
- Make the service call to fetch the entity values, passing the values as configured in the service call.
- Use the values returned as per Response Mapping.
- Handle the following exceptions:
  - If the response from the service call is empty or not in the expected format, the User Prompt settings for the entity are used to prompt the user for input.
  - In case of ambiguity, for example, when the service returns multiple values when one is expected, the user is prompted to choose one from the list of values returned by the service.
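The exception handling in this flow amounts to a small decision rule, sketched below. It is only a model of the behavior described here, not platform code: the function name, the action labels, and the assumption that a well-formed response is a non-empty array of objects are all illustrative.

```python
from typing import Any, List, Optional, Tuple


def resolve_remote_response(
    response: Any, is_multi_item: bool
) -> Tuple[str, Optional[List[dict]]]:
    """Decide what to do with the mapped service response.

    Returns (action, items), where action is one of:
      "prompt"       - empty or malformed response: use the User Prompt
      "disambiguate" - several values returned but only one was expected
      "assign"       - the returned value(s) can be assigned to the entity
    """
    # Assumed shape: a non-empty array whose items are objects.
    if not isinstance(response, list) or not response:
        return ("prompt", None)
    if not all(isinstance(item, dict) for item in response):
        return ("prompt", None)
    if len(response) > 1 and not is_multi_item:
        return ("disambiguate", response)
    return ("assign", response)


savings = {"title": "Savings", "value": "SAV"}
current = {"title": "Current", "value": "CUR"}

print(resolve_remote_response([], False)[0])
print(resolve_remote_response([savings, current], False)[0])
print(resolve_remote_response([savings, current], True)[0])
```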
Location
Captures the location details of a city or state from a user utterance.
For example, in “Bellagio, Las Vegas”, the entity captures the location details of Las Vegas. The entity returns the location as a JSON object with the address and coordinates.
"Location": { "formatted_address": "Las Vegas, NV, USA", "lat": 36.1699412, "lng": -115.1398296 }
Number
Captures a number from a user utterance. For example, “Book a room for 16 people”. In this example, the value 16 is returned as the Number.
The Bots Platform recognizes spelled-out numbers as well as standard abbreviations such as 1M. Consecutive number words are combined into one number, for example, “one two three” becomes 123.
"entities": { "NumberEntity": 16 }
Person Name
Captures the full name of a person from a user utterance.
For example, “Send an email to John Smith”, where John Smith is identified as Person Name.
The Kore.ai Bots Platform treats the first capitalized word in the user utterance as the first name and the next two words, if they are also capitalized, as part of the name.
For example, if the user utterance is “I want to talk to John Smith,” it recognizes John Smith as the name. If the utterance is “I want to talk to John smith immediately” it recognizes only John as the name.
"entities": { "PersonName": "John Smith" }
Percentage
Captures the percentage value from a user utterance.
For example, “The chance of rain today is more than 60 percent”, where 60 is the Percentage and is returned as a float value of 0.6 in a range of (0.0-1.0). It supports percent, percentage, and the % sign.
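The normalization to the 0.0-1.0 range can be illustrated with a small sketch. The regular expression and the function name `extract_percentage` are assumptions made for this example; the platform's actual recognizer is not documented here and is likely more permissive.

```python
import re

# A number followed by "%", "percent", or "percentage" (case-insensitive).
_PERCENT = re.compile(r"(\d+(?:\.\d+)?)\s*(?:%|percent(?:age)?\b)", re.IGNORECASE)


def extract_percentage(utterance):
    """Return the percentage as a fraction between 0.0 and 1.0, or None."""
    match = _PERCENT.search(utterance)
    if match is None:
        return None
    return float(match.group(1)) / 100


print(extract_percentage("The chance of rain today is more than 60 percent"))
print(extract_percentage("Apply a 25% discount"))
print(extract_percentage("No numbers here"))
```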
"entities": { "PercentageEntity": 0.6 }
Phone Number
Captures standard 10-digit or 12-digit telephone numbers.
For example, in “Please call 4075551212”, the value for Phone Number is 4075551212 and is returned as a number.
"entities": { "PhoneNumber": "+4075551212" }
Quantity
Captures the quantity in an utterance with the following details from the user utterance:
- Type of the quantity (length, area, volume, etc.),
- unit of measurement (kilometers, square kilometer, cubic meter, etc.),
- the amount (100, 500, 1.5, etc.).
When you select Quantity entity type, you also need to select a unit type for the quantity and the default measure.
For example, to capture volumes, select Volume as the Unit type and Milliliters as the Default unit. Then, if a user utterance is “Add 500 ml of water”, the following JSON is returned:
"Quantity": { "unit": "millilitre", "amount": 500, "type": "volume", "source": "500 ml" }
The Bots Platform identifies all these quantities and units, along with their standard abbreviations, codes, and symbols.
Type | Units |
---|---|
Length | |
Area | |
Volume | |
Time | |
Speed | |
Pressure | |
Energy | |
Memory | |
Weight | |
Angle | |
Age | |
Temperature | |
String
Works identically to the Description entity type but is limited to one sentence.
No validation is performed on the user utterance for string entities unless they are trained. Hence, use this entity type only as a last resort, when none of the platform-supported entity types meets your requirement.
Time
Captures the time from a user utterance.
For example, “Set my alarm for 6 am”, returns the value of Time in ISO 8601 time format as hh:mm:ss.sTZD.
It recognizes the following denotations:
- am, a.m., AM, pm, p.m., PM, P.M.
- Numbers spelled out, for example, Six AM.
- Morning and evening, for example, Six in the evening.
"entities": { "TimeEntity": "T06:00:00+05:30" }
Time Zone
Captures a time zone from the user utterance, for example, “Eastern standard time”. The bot converts the time zone into its GMT offset and stores the resulting value.
For example, if you type EST, it is stored as -6:00.
Bots Platform recognizes the standard time zones.
"entities": { "TimeZoneEntity": "-06:00" }
URL
Captures a web URL from the utterance. The bot recognizes all standard formats of URLs.
For example, “Visit our website: www.kore.ai”. The value for URL is returned as a string.
"entities": { "URLEntity": "www.kore.ai" }
Zip Code
Captures a US zip code from the user utterance, for example, “What is the weather for 32746?” The value for Zip Code is 32746 and is returned as a string.
"entities": { "ZipcodeEntity": "32746" }