Entities are the fields, data, or words that the developer designates as necessary for the chatbot to complete a task: a date, time, person, location, description of an item or product, or any number of other designations. Through our NLP engine, the bot identifies words in the user’s utterance that fill the fields required by the task at hand, and collects any missing field data if needed. The goal of entity extraction is to fill the gaps needed to complete the task while ignoring unneeded details. It’s a subtractive process that keeps just the necessary information, whether the user provides it all at once or through a guided conversation with the chatbot. The platform supports identification and extraction of 20+ system entities out of the box.
You can use the following two approaches to train the bot to identify entities in the user utterance:
- Named Entity Recognition (NER) based on Machine Learning.
- Entity pattern definition and synonyms.
Entity training is optional; without it, Kore.ai can still extract entities. Training, however, guides the engine as to where to look in the input. Kore.ai still validates the value it finds and moves on to another location if those words are not suitable. For example, if a Number entity pattern is “for * people” but the user says “..for nice people,” the bot understands that “nice” is not a number and continues searching.
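The “for * people” example above can be pictured with a small sketch. This is not Kore.ai's engine or API; the function name `extract_number` and the regex-based matching are assumptions made purely for illustration. It shows the idea that a pattern only suggests where to look first, the candidate value is still validated, and the search continues elsewhere when validation fails.

```python
import re

def extract_number(utterance, pattern="for * people"):
    """Hypothetical helper: try the pattern slot first, then search elsewhere."""
    # Turn the illustrative pattern "for * people" into a regex whose
    # single capture group stands for the "*" slot.
    prefix, suffix = (part.strip() for part in pattern.split("*"))
    slot = re.compile(
        r"\b" + re.escape(prefix) + r"\s+(\S+)\s+" + re.escape(suffix) + r"\b",
        re.IGNORECASE,
    )
    match = slot.search(utterance)
    # Validate the candidate: the pattern is a hint, not a guarantee.
    if match and re.fullmatch(r"\d+", match.group(1)):
        return int(match.group(1))
    # The slot held something unsuitable (e.g. "nice"), so keep searching
    # the rest of the utterance for a number.
    fallback = re.search(r"\b\d+\b", utterance)
    return int(fallback.group()) if fallback else None

print(extract_number("book a table for 4 people"))
print(extract_number("book 2 tables for nice people"))
print(extract_number("a table for nice people"))
```

In the second call the pattern matches “for nice people”, “nice” fails validation, and the sketch falls back to the number found elsewhere in the utterance. In the third call no number exists, so nothing is extracted and a real bot would need to prompt the user.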
The best approach to training entities depends on the entity type, as explained below:
- Entity types like List of Items (enumerated, lookup), City, Date, and Country do not need any training unless the same entity type is used multiple times in the same task. If the same entity type is used more than once in a bot task, use either of the training approaches to find each entity within the user utterance.
- When the entity type is String or Description, the recommended approach is to use Entity patterns and synonyms.
- For all other entity types, both NER and Patterns can be used in combination.
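The selection rules in the list above can be summarized as a small lookup. The sketch below is only an illustration of that decision logic; the function `recommend_training`, the type-name strings, and the returned labels are hypothetical and do not correspond to any Kore.ai API.

```python
# Entity types that, per the list above, need no training when used once.
NO_TRAINING_TYPES = {"List of Items", "City", "Date", "Country"}
# Entity types for which patterns and synonyms are recommended.
PATTERN_TYPES = {"String", "Description"}

def recommend_training(entity_type, occurrences_in_task=1):
    """Hypothetical helper: map an entity type to the suggested training approach."""
    if entity_type in PATTERN_TYPES:
        return "patterns and synonyms"
    if entity_type in NO_TRAINING_TYPES:
        # Training only matters when the same type appears more than once
        # in a task, to tell the occurrences apart.
        if occurrences_in_task > 1:
            return "NER or patterns"
        return "none"
    # All other entity types: both approaches can be combined.
    return "NER and patterns combined"

print(recommend_training("City"))
print(recommend_training("City", occurrences_in_task=2))
print(recommend_training("Description"))
print(recommend_training("Number"))
```

For instance, a flight-booking task with both a source and a destination City would fall into the second case, since the same entity type occurs twice.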
Entity Training Recommendations
- Use NER training where possible – NER coverage is higher than patterns.
- The NER approach is best suited to detecting entities where information is provided as unformatted data. For entities like Date and Time, the platform has already been trained with a large set of data.
- NER is a neural-network-based model and needs to be trained with at least 8-10 samples to work effectively.