Highlighted Updates
Explore many more feature updates organized by Dataiku capability below.
Generative AI
Generative AI capabilities in Dataiku
LLM Fine Tuning
If you need to refine your LLM to perform better on highly specific tasks, or you have a use case that requires continually incorporating labeled data, Dataiku now enables you to fine-tune your model.
Users now have two options to take advantage of fine-tuning in Dataiku:
- Dataiku’s new fine-tune recipe is a unique low/no-code approach that opens up fine-tuning to non-coders.*
- Alternatively, users can fine-tune through Python code, with full flexibility and customizability when fine-tuning local LLMs from Hugging Face Hub and access to state-of-the-art techniques from the open source community, all while benefiting from the LLM Mesh.
Updates to LLM Mesh
The latest LLM Mesh improvements introduce support for new models, connections, and features, including:
- Guardrails: specialized local models from Hugging Face for toxicity detection*
- LLM Mesh API support for function calling, additional parameters, and connecting external tools for advanced use cases (LLM agents, etc.)
- Support for Llama3 / Mistral (7B, 8x7B, Large) / Titan embeddings v2 / Cohere Command (R, R+) models through AWS Bedrock
- Support for Gemma in Hugging Face connection
- New “Clear data” option in the Knowledge Banks handler
*These features are currently only available as part of the Early Adopter Program.
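As a rough illustration of what function calling involves, the sketch below shows the generic pattern: the model emits a structured request naming a tool and its arguments, and application code dispatches it to a registered function. This is not the LLM Mesh API. The `get_weather` tool, the `TOOLS` registry, the `dispatch_tool_call` helper, and the JSON shape of the model output are all hypothetical names chosen for the example.

```python
import json


def get_weather(city):
    """Hypothetical tool: a real one would call an external service."""
    return {"city": city, "forecast": "sunny"}


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(tool_call_json):
    """Parse a tool call (assumed JSON shape) and run the matching tool."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Assumed example of what a model might return when it decides to call a tool.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch_tool_call(model_output)
print(result)
```

In an agent loop, the result would typically be sent back to the model as a new message so it can produce its final answer.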
RAG Updates
Chunking data is done prior to embedding in a vector store, and is a key step in preparing documents for use cases like RAG. The chunking technique and its parameters can have a great influence on the end result for augmented chatbots.
Dataiku’s Prepare recipe now includes a “split into chunks” processor. This new processor allows users to specify separators, visualize the chunks interactively, and apply post-processing steps to ensure chunks are separated as expected.
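To make the idea of separator-based chunking concrete, here is a minimal Python sketch. It is not the processor's implementation: the function name `split_into_chunks`, the default separators, and the `max_chars` parameter are assumptions made for illustration. It splits on the coarsest separator first, merges neighboring pieces while they fit, and falls back to finer separators only for pieces that are too long.

```python
def split_into_chunks(text, separators=("\n\n", "\n", " "), max_chars=200):
    """Split text into chunks of at most max_chars where separators allow it.

    Illustrative only: tries separators in order, coarsest first. A piece
    that cannot be split any further is kept whole even if it is too long.
    """
    text = text.strip()
    if len(text) <= max_chars or not separators:
        return [text] if text else []

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        piece = piece.strip()
        if not piece:
            continue
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_chars:
            # Piece still fits: merge it into the chunk being built.
            current = candidate
            continue
        # Piece does not fit: close the current chunk and start a new one.
        if current:
            chunks.append(current)
        current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # Piece alone is too long: retry with the finer separators.
            chunks.extend(split_into_chunks(piece, finer, max_chars))
    if current:
        chunks.append(current)
    return chunks


doc = (
    "First paragraph about chunking.\n\n"
    "Second paragraph, a bit longer than the first one.\n\n"
    "Third."
)
chunks = split_into_chunks(doc, max_chars=60)
for number, chunk in enumerate(chunks, start=1):
    print(number, repr(chunk))
```

Changing `separators` or `max_chars` changes where boundaries fall, which is why inspecting the resulting chunks before embedding them is worthwhile.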
Universal Ops
DataOps capabilities and MLOps capabilities in Dataiku
Unified Monitoring Updates
Unified Monitoring provides a comprehensive view of all deployments and their overall health. Unified Monitoring for batch projects on the automation node now includes a new Govern card and status indicator. Users can check the status of both batch projects and model endpoints, with deployment status fetched from Dataiku Govern, without leaving the Unified Monitoring dashboard. The result is a complete, centralized view of ML project health.
Data Quality Updates
Since its introduction in 12.6, Data Quality has continued to receive improvements. Updates to Data Quality in 13.1 include multi-column support on all column-based rules, the ability to publish data quality statuses to dashboards, and template updates.
Data Prep
Data Prep capabilities in Dataiku
Multi-Row Formula
Users can now pass an optional offset argument to existing functions that access a column value. The offset argument is available in the Prepare recipe only, in all processors that support Formula. It is useful for cases such as iterative calculations or auto-incrementing IDs.
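The Python sketch below mimics offset-based column access to show how the two use cases mentioned above can work. It is not Dataiku Formula syntax: the `column_value` helper, its `default` parameter, and the convention that a negative offset points to an earlier row are assumptions made for illustration.

```python
def column_value(rows, index, column, offset=0, default=None):
    """Return the value of `column` in the row at index + offset.

    Hypothetical helper: returns `default` when that row does not exist.
    """
    target = index + offset
    if 0 <= target < len(rows):
        return rows[target][column]
    return default


rows = [{"sales": 10}, {"sales": 15}, {"sales": 12}]

for i, row in enumerate(rows):
    # Iterative calculation: change in sales compared with the previous row.
    previous = column_value(rows, i, "sales", offset=-1, default=row["sales"])
    row["delta"] = row["sales"] - previous
    # Auto-increment ID: previous row's ID plus one, starting from 1.
    row["id"] = column_value(rows, i, "id", offset=-1, default=0) + 1

for row in rows:
    print(row)
```

Both calculations rely on rows being processed in order, since each row reads a value already computed for the row before it.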
Build Flow Zones from Scenario
Users can now build everything within a zone as a scenario step, rather than building individual items within that zone.
Visualization & Data Storytelling
Visualization capabilities in Dataiku
Dashboard Enhancements
Dashboards have received UX enhancements, such as page and title settings, as well as performance improvements when loading visible tiles.
Charts Enhancements
Charts have received the following enhancements:
- Median/percentile aggregation for numeric columns
- New gauge chart type
Governance
Governance capabilities in Dataiku
Dataiku Govern Improvements
Dataiku Govern now has enhanced auditability with a global instance timeline, a centralized view of all Dataiku Govern item events. This global timeline is accessible to all users and can be filtered on multiple conditions. In addition, new custom filters enable users to filter on more metadata, with support for conditional formatting and nested and/or conditions.
Dataiku Solutions
New Dataiku Solutions have been added:
- Clinical Site Intelligence: Leverage insights from clinical trial studies around the world to facilitate new study competitive intelligence and site analytics.
- Store Segmentation: Group stores with similar characteristics based on demographic data and/or category sales data in an effort to develop a bespoke approach to optimizing operations.
- Customer Satisfaction Reviews: Analyze your customer-rated reviews. Extract valuable insights from a large amount of text data. Uses the LLM Mesh.
- Survival Analysis Plugin: Survival analysis is an advanced statistical technique. This plugin creates new recipes to support statistical tests and survival probability estimation.
- Dataiku Answers Updates: Updates to the Dataiku Answers plugin v1.2.4 include UI updates, automatic knowledge bank usage, the ability to document metadata context, WT1 events specific to Answers, and bug fixes.
* This feature is currently only available through the Early Adopter Program.
Find all details in our release notes.