Dataiku
Product
Plugins
Sentiment Analysis

Sentiment Analysis

This plugin provides a recipe to estimate the sentiment polarity of text data.

⚠️ Starting with Dataiku 14.0, this plugin has been deprecated. It is mostly superseded by Generative AI feature.

Plugin Information

Version	1.6.0
Author	Dataiku (Alex COMBESSIE, Hicham EL BOUKKOURI)
Released	2018-07
Last updated	2023-04
License	BSD 3-Clause License
Source code	Github
Reporting issues	Github

With this plugin, you will be able to estimate the sentiment polarity (positive/negative) of text data in English.

Table of contents

How to set up
How to use

How to set up

Right after installing the plugin, you need to build its code environment.

Code Environment Creation

This plugin requires Python version 2.7, 3.6, or 3.7 to be installed on the machine hosting DSS.

How to use

Let’s assume that you have a Dataiku DSS project with a dataset containing text data in English. This text data must be stored in a dataset, inside a text column, with one row for each document.

Navigate to the Flow, click on the + RECIPE button and access the Natural Language Processing menu. If your dataset is selected, you can directly find the plugin on the right panel.

Plugin Recipe Creation

Sentiment analysis recipe

Estimate the sentiment polarity (positive/negative) of text data in English

Input

Text dataset: Dataset with a text column (in English)

Example of Text Dataset

Settings

Sentiment Analysis Recipe Settings

Fill Input parameters
- The Text column parameter lets you choose the column of your input dataset containing text data.
Choose Model parameters
- Choose your Sentiment scale.
  - Either binary (0 = negative, 1 = positive) or 1 to 5 (1 = highly negative, 5 = highly positive).
  - Default is binary.
- Choose whether to Output predictions as numbers and/or Output predictions as categories.
  - These parameters depend on the chosen Sentiment scale.
  - Default is yes to both.
- Choose whether to Output confidence scores for the predicted sentiment polarity.
  - Confidence scores are from 0 to 1.
  - Default is false.

Output

Output dataset: Copy of the input dataset with additional columns on predicted sentiment polarity

Example of Output Dataset

Troubleshooting

When installing / setting up the plugin’s code environment, getting an error Unsupported compiler -- at least C++11 support is needed: installing the python3-dev system package may help, see here.
When installing / setting up the plugin’s code environment, getting a fatal error: Python.h: No such file or directory: installing the python3-dev system package may help, see here.

Happy natural language processing!

Get the Dataiku Data Sheet

Learn everything you ever wanted to know about Dataiku (but were afraid to ask), including detailed specifications on features and integrations.

Get the Data Sheet