License: Apache Software License
Unstructured text holds a great deal of valuable information, but it is hard to process automatically. MeaningCloud’s Text Analytics plugin lets you include NLP processing in your Dataiku flow, so you can give unstructured texts a structure, extract their meaning, and combine them with other data sources.
MeaningCloud’s plugin for Dataiku provides the following recipes:
- Language Detection: detect the dominant language of a text (language name and ISO 639 code) using MeaningCloud’s Language Identification API.
- Sentiment Analysis: analyze the sentiment polarity, subjectivity, irony, and emotional agreement of a text using MeaningCloud’s Sentiment Analysis API.
- Topic Extraction: extract Named Entities (people, organizations, etc.), concepts, money expressions and quantities from a text using MeaningCloud’s Topics Extraction API.
- Text Summarization: summarize a text according to a specified number of sentences using MeaningCloud’s Summarization API.
- Deep Categorization: assign one or more categories to a text using advanced rule-based language models with MeaningCloud’s Deep Categorization API.
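To make concrete what a recipe such as Sentiment Analysis does for each row of text, here is a minimal Python sketch of a single request. It is not the plugin's code. The endpoint path (`/sentiment-2.1`), the request parameters (`key`, `txt`, `lang`) and the response field names (`score_tag`, `subjectivity`, `irony`, `agreement`) are assumptions based on MeaningCloud's public Sentiment Analysis API and should be checked against the API documentation; the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed SaaS base URL and endpoint path; verify against the MeaningCloud API docs.
DEFAULT_SERVER = "https://api.meaningcloud.com"
SENTIMENT_PATH = "/sentiment-2.1"


def build_sentiment_request(text, license_key, lang="en", server=DEFAULT_SERVER):
    """Return the URL and form-encoded body for one sentiment request."""
    url = server.rstrip("/") + SENTIMENT_PATH
    # Parameter names (key, txt, lang) are assumptions, see the note above.
    body = urllib.parse.urlencode(
        {"key": license_key, "txt": text, "lang": lang}
    ).encode("utf-8")
    return url, body


def parse_sentiment_response(payload):
    """Keep only the fields named in the recipe description."""
    return {
        "polarity": payload.get("score_tag"),
        "subjectivity": payload.get("subjectivity"),
        "irony": payload.get("irony"),
        "agreement": payload.get("agreement"),
    }


def analyze_sentiment(text, license_key, lang="en", server=DEFAULT_SERVER):
    """Send one text to the API and return a flat dictionary of results."""
    url, body = build_sentiment_request(text, license_key, lang, server)
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_sentiment_response(json.loads(response.read().decode("utf-8")))
```

Calling `analyze_sentiment("I love this product", "<your license key>")` would perform the network request. Building the request and parsing the response are kept in separate functions so they can be checked without a key.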
How to use
Once you have the plugin installed, there’s only one thing you need in order to analyze your data: a MeaningCloud license key. Getting one is a very easy process:
- If you haven’t already, create an account in MeaningCloud. You will be sent a validation email.
- Copy your license key from the subscription section.
One of the components provided in the plugin is a parameter set. It lets you define a connection to the MeaningCloud APIs that you can reuse in any of the plugin’s recipes. You can edit the parameter set in the plugin settings: once there, click API configuration in the left sidebar to see the available presets.
As the parameter “Users can provide values for the preset directly when using them” indicates, defining the connection details here is not mandatory; you can also enter them directly in the recipe settings when you add a recipe to a flow. Defining the credentials in a preset is recommended, though, because it keeps them in one place.
When you add a new preset, the fields to fill in include:
- Name of the preset
- MeaningCloud license key, where you paste the key you copied from the subscription section.
- MeaningCloud server, set by default to our SaaS environment; users with on-premises deployments can change it.
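The relationship between a shared preset and values entered directly in a recipe can be pictured with a small Python sketch. Everything here is hypothetical and for illustration only: the `ApiConfig` class, the `resolve_config` function, the default server URL, and the rule that a selected preset takes precedence are assumptions, not the plugin's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed SaaS base URL; an on-premises deployment would use its own address.
DEFAULT_SERVER = "https://api.meaningcloud.com"


@dataclass(frozen=True)
class ApiConfig:
    """Hypothetical container mirroring the preset fields listed above."""
    license_key: str
    server: str = DEFAULT_SERVER
    name: str = ""  # preset name; left empty for values typed into a recipe


def resolve_config(preset: Optional[ApiConfig],
                   inline_key: str = "",
                   inline_server: str = "") -> ApiConfig:
    """Use the shared preset if one is selected, otherwise the recipe's own values."""
    if preset is not None:
        chosen = preset
    else:
        chosen = ApiConfig(
            license_key=inline_key.strip(),
            server=inline_server.strip() or DEFAULT_SERVER,
        )
    if not chosen.license_key:
        raise ValueError("A MeaningCloud license key is required")
    # Normalize the server URL so endpoint paths can be appended safely.
    return ApiConfig(
        license_key=chosen.license_key,
        server=chosen.server.rstrip("/"),
        name=chosen.name,
    )
```

With a preset, every recipe in the flow reads the same key and server, which is why centralizing credentials there is the recommended setup.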
Once you have set your API configuration, you can start analyzing text right away!
You can read more about the analyses provided in the plugin documentation. Do make sure to request access to any of our language or vertical packs you may want to use in your analysis.
Happy natural language processing!