Named Entity Recognition

This plugin provides a recipe to extract Named Entities from text data

This plugin provides a tool for extracting Named Entities (e.g. people's names, dates and places), which can be useful for extracting knowledge from your texts.

The plugin comes with a single recipe that extracts entities using one of two possible models:
– SpaCy: a faster but slightly less precise model. Another advantage of SpaCy is its support for many languages.
– Flair: a slower but more precise model for Named Entity Recognition.
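As a rough illustration of what entity extraction looks like in code, here is a minimal sketch that is not taken from the plugin's source. The function name extract_entities and the stand-in pipeline fake_nlp are hypothetical. The sketch only assumes the interface of a loaded SpaCy pipeline: calling it on a string returns a document whose .ents items expose .text and .label_. The label names you get in practice depend on the model you load.

```python
from collections import namedtuple


def extract_entities(text, nlp):
    """Return (entity_text, entity_label) pairs found in `text`.

    `nlp` is any callable returning an object with an `.ents` sequence whose
    items expose `.text` and `.label_`, as a loaded SpaCy pipeline does.
    """
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


# Hypothetical stand-in so the sketch runs without downloading a model.
# With SpaCy and a model installed, you would instead use something like:
#     import spacy
#     nlp = spacy.load("en_core_web_sm")
FakeEntity = namedtuple("FakeEntity", ["text", "label_"])
FakeDoc = namedtuple("FakeDoc", ["ents"])


def fake_nlp(text):
    """Tag a fixed set of known names; a placeholder, not a real model."""
    known = {"Marie Curie": "PER", "Paris": "LOC"}
    ents = [FakeEntity(name, label) for name, label in known.items() if name in text]
    return FakeDoc(ents)


pairs = extract_entities("Marie Curie worked in Paris.", fake_nlp)
print(pairs)
```

Keeping the model behind a single callable makes it easy to swap a faster model for a more precise one without touching the rest of the code.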

Plugin Information

Version: 1.2.0
Author: Dataiku (Alexandre COMBESSIE and Hicham EL BOUKKOURI)
Released: 2018-09-21
Last updated: 2020-07-02
License: Apache Software License
Source code: GitHub
Reporting issues: GitHub


Extract Named Entities

This recipe extracts named entities such as LOC (location) and PER (person) from your texts. The default model is SpaCy, which is available for both English and French. To use a more precise (but slower) model, choose Flair.

How to use the recipe
Using the recipe is straightforward. Just plug in your dataset, select the column containing your texts and run the recipe!

Optionally, you can set some advanced settings. For example, you can choose Flair (only available in English) for a more precise extraction. You can also choose the format in which the extracted entities are presented: a separate column for each entity type (default) or a single column containing a JSON object with all the entities.
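To make the two output formats concrete, here is a minimal sketch of how a list of (text, label) pairs could be turned into either layout. It is an illustration only: the function names, the "entities" column name and the choice to store each per-type cell as a JSON list are assumptions, not the plugin's actual output schema.

```python
import json
from collections import defaultdict


def group_by_type(entities):
    """Group (text, label) pairs into {label: [text, ...]}."""
    grouped = defaultdict(list)
    for text, label in entities:
        grouped[label].append(text)
    return dict(grouped)


def to_columns(entities, entity_types):
    """One output column per entity type (assumed cell format: a JSON list)."""
    grouped = group_by_type(entities)
    return {label: json.dumps(grouped.get(label, [])) for label in entity_types}


def to_json_column(entities, column_name="entities"):
    """A single output column holding one JSON object with all entities."""
    return {column_name: json.dumps(group_by_type(entities), sort_keys=True)}


entities = [("Marie Curie", "PER"), ("Paris", "LOC"), ("Warsaw", "LOC")]
print(to_columns(entities, ["PER", "LOC", "ORG"]))
print(to_json_column(entities))
```

The per-type layout is convenient for filtering on a single entity type, while the single JSON column keeps the output schema fixed regardless of which entity types appear in the data.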

WebApp Templates

SpaCy WebApp

This plugin offers a WebApp template for testing SpaCy’s NER model. To successfully run the webapp you will need to:

  • Create a Standard WebApp using the template, then enable a Python backend.
  • Create a Python code environment following these requirements:
  • In the WebApp’s settings page, select the previously created code environment and activate Bootstrap.


