
Dataiku 12

Dataiku 12 delivers new features to help data, domain, and IT experts keep AI under control.
Learn more in our release notes and in the crash course.

Generative AI Capabilities

 

Build real and safe Generative AI applications at enterprise scale

 

LLM Mesh

The LLM Mesh provides the components in Dataiku that empower IT to take control and help teams build safe, secure, enterprise-ready GenAI applications. With dedicated components for AI service routing, PII screening, LLM response moderation, performance and cost tracking, and auditing of entire application flows, you get maximum control while delivering the performance your business expects.

 

AI Assistants

Supercharge data prep, project explanations, and coding tasks with AI assistants.

  • AI Prepare

With Dataiku’s AI Prepare assistant, simply describe the transformation you want to apply and the AI assistant automatically generates the necessary data preparation steps. The ability to modify both your prompt and the resulting steps means you can prepare data faster than ever, yet still stay in complete control.

 

  • AI Explain

Unsure what’s happening in a data pipeline? Use AI Explain and powerful LLMs to automatically generate descriptions that explain Dataiku Flows or individual Flow Zones. Say goodbye to tedious documentation and reverse engineering cycles, and hello to your new favorite feature!

 

  • AI Code Assistants

Experience AI Code Assistants in Dataiku! Submit a code-related question or magic command through AI Code Assistant and receive context-enriched answers. This handy feature helps you write, explain, or debug code, comment and document your work, create unit tests, and more.

 

Prompt Studios & Recipe

With Prompt Studios in Dataiku, iteratively design and evaluate LLM prompts, compare performance and cost across models, and operationalize Generative AI in your data projects.

 

Retrieval Augmented Generation

Chatbots powered by generic LLMs can save time for common queries, but are unable to access recent data or critical internal documentation and so may miss out on key details.

By applying Retrieval Augmented Generation (RAG) and semantic search techniques in Dataiku, you can augment foundational LLMs with your own knowledge base to ensure chatbots provide the most relevant, accurate, and trustworthy information possible.
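To make the idea concrete, here is a minimal sketch of the retrieval step in plain Python. It is not Dataiku's implementation: the knowledge base, the stopword list, and the function names are all hypothetical, and a simple word-overlap score stands in for the embedding-based semantic search a real system would use.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list and knowledge base, for illustration only.
STOPWORDS = {"the", "is", "a", "an", "to", "for", "of", "in", "what", "when"}

KNOWLEDGE_BASE = [
    "Expense reports must be submitted within 30 days of purchase.",
    "The VPN client is required for remote access to internal systems.",
    "Annual leave requests need manager approval two weeks in advance.",
]


def tokenize(text):
    """Lowercase the text, split it into words, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Prepend the retrieved passages to the question before calling an LLM."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


print(build_prompt("What is the deadline for expense reports?", KNOWLEDGE_BASE))
```

The resulting prompt carries the most relevant internal passage along with the question, which is what lets a generic LLM answer from your own knowledge base.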

 

LLM-Powered NLP Recipes

Updating traditional NLP pipelines with modern Generative AI techniques is fast and easy with out-of-the-box, visual components. Dataiku offers no-code text recipes enhanced with pre-trained HuggingFace models and LLMs for text summarization, classification, sentiment & emotion analysis, and other common language tasks.

 

Increase Transparency

 

Help Everyone Understand AI Projects and Outputs

 

Auto Feature Generation

Improve efficiency and model performance by automatically generating new features from your existing datasets in a fraction of the time. Learn more in the Academy Knowledge Base article and in the reference documentation.


 

Universal Feature Importance

Model-agnostic visualizations for feature importance provide consistent & comparable explainability for models of all types. Learn more in the Academy Knowledge Base article and in the reference documentation.


 

Uplift Modeling

Apply this causal machine learning capability to measure cause & effect relationships and estimate an intervention’s impact on outcomes. Learn more in the Academy hands-on tutorial and in the reference documentation.
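As a rough illustration of what uplift measures, the sketch below compares conversion rates between treated and untreated customers within each segment. This is the simplest possible estimate, not the algorithm Dataiku uses; the function name, the segments, and the records are made up for the example.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Estimate uplift per segment as treated rate minus control rate.

    records: iterable of (segment, treated, converted) tuples.
    """
    # For each segment, track [conversions, total] for treated and control.
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, converted in records:
        cell = counts[segment][treated]
        cell[0] += int(converted)
        cell[1] += 1

    result = {}
    for segment, groups in counts.items():
        treated_conv, treated_n = groups[True]
        control_conv, control_n = groups[False]
        if treated_n == 0 or control_n == 0:
            continue  # uplift is undefined without both groups
        result[segment] = treated_conv / treated_n - control_conv / control_n
    return result


# Hypothetical campaign data: (segment, received the offer, converted).
records = (
    [("new", True, True)] * 3 + [("new", True, False)] * 1
    + [("new", False, True)] * 1 + [("new", False, False)] * 3
    + [("loyal", True, True)] * 2 + [("loyal", True, False)] * 2
    + [("loyal", False, True)] * 2 + [("loyal", False, False)] * 2
)

print(uplift_by_segment(records))
```

In this toy data the offer raises conversion among new customers but makes no difference for loyal ones, which is the kind of insight that tells you where an intervention is worth its cost.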


 

Data Quality

With Dataiku, you gain a visual, persistent view of your data quality issues. Data quality rules offer a new and improved way to proactively monitor those issues, and anyone from data engineers to analysts can quickly set up checks for specific parameters.
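Conceptually, a data quality rule is a named check that reports a status for a dataset. The sketch below shows that idea in plain Python; it does not use the Dataiku API, and the rule helpers, dataset, and status labels are hypothetical.

```python
def not_empty(column):
    """Rule: every row has a non-empty value in the column."""
    def rule(rows):
        return all(row.get(column) not in (None, "") for row in rows)
    return rule


def in_range(column, low, high):
    """Rule: every present value in the column lies within [low, high]."""
    def rule(rows):
        return all(
            low <= row[column] <= high
            for row in rows
            if row.get(column) is not None
        )
    return rule


def run_rules(rows, rules):
    """Evaluate each named rule and report a status for it."""
    return {name: ("OK" if rule(rows) else "ERROR") for name, rule in rules.items()}


# Hypothetical dataset and rule set, for illustration only.
ORDERS = [
    {"order_id": "A1", "amount": 25.0},
    {"order_id": "A2", "amount": -3.0},
    {"order_id": "", "amount": 10.0},
]

RULES = {
    "order_id is filled": not_empty("order_id"),
    "amount is filled": not_empty("amount"),
    "amount between 0 and 1000": in_range("amount", 0, 1000),
}

report = run_rules(ORDERS, RULES)
for name, status in report.items():
    print(f"{status:5}  {name}")
```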


 

 

Standardize Components

 

Ensure Success With Best Practices and Approved Components

 

Help Center

Quickly access reference documentation, educational materials, and personalized dynamic content recommendations without ever leaving the screen you’re on. Learn more in the reference documentation.


 

Data Catalog

Custom data collections and a central place to browse all of your organization’s connected data make it easier to discover high quality data to use in your projects. Learn more in the Academy Knowledge Base article and in the reference documentation.


 

New Dataiku Solutions

Accelerate speed to value for industry-proven use cases with pre-built project templates and plug-and-play applications. Recently developed solutions include Process Mining, Financial Forecasting, Credit Scoring, Batch Performance Optimization, Pharmacovigilance, Social Determinants of Health, and Product Recommendations. Learn more and browse additional solutions in the Dataiku solutions catalog, or explore projects directly by searching for “solution” in the Dataiku Gallery.


 

 

Centralize Operations

Deliver Projects With Consistent Deployment and Management

 

 

Unified Monitoring

Unified Monitoring is the central hub for overseeing pipelines and models developed and deployed across diverse platforms. Operators can extend coverage with the External Models and Deploy Anywhere capabilities, consolidating monitoring for deployments, projects, and APIs in one place.

 

Deploy Anywhere

Deploy models to other production environments like AWS SageMaker, Azure ML, and Google Vertex AI. This gives teams the flexibility to develop a model in one place but deploy it in another, all while leveraging Dataiku as a central location to monitor, govern, and democratize access to all models.

 

External Models

You can now use existing AWS SageMaker, Azure ML, or Google Vertex AI models in Dataiku. By integrating external models, you can apply the capabilities available for native Dataiku models to models deployed externally.

Model Overrides

Ensure safe predictions by applying guardrails on models via business rules that enforce outcomes for known cases. Learn more in our Academy hands-on tutorial and in the reference documentation.
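The pattern behind an override can be sketched in a few lines of Python: business rules are checked first, and the model is only consulted when no rule applies. This is an illustration of the concept rather than Dataiku's implementation; the toy model, the rule, and all names are hypothetical.

```python
def predict_with_overrides(model_predict, overrides, row):
    """Return (prediction, name of the override that fired, or None)."""
    for name, condition, outcome in overrides:
        if condition(row):
            return outcome, name  # a business rule enforces the outcome
    return model_predict(row), None  # no rule matched: trust the model


def toy_model(row):
    """Stand-in for a trained model (hypothetical decision logic)."""
    return "approve" if row["income"] >= 30000 else "decline"


# Hypothetical guardrail: a known case where the outcome must be enforced.
OVERRIDES = [
    ("applicant under 18", lambda row: row["age"] < 18, "decline"),
]

print(predict_with_overrides(toy_model, OVERRIDES, {"age": 16, "income": 90000}))
print(predict_with_overrides(toy_model, OVERRIDES, {"age": 40, "income": 90000}))
```

Returning the name of the rule that fired alongside the prediction keeps overridden outcomes traceable, which matters when predictions are audited later.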


 

Enhanced Deployment & Monitoring

Additional drift metrics and a streamlined deployment and monitoring setup make operations more efficient. Learn more in the reference documentation.

 

New Governance Views

A Kanban board and synchronized deployment details allow operations and leaders to assess the status of all governed projects at a glance. Learn more in the reference documentation.

 

 

Schema Management and Flow Build Improvements

Enhanced options and default settings for building recipes, datasets, and Flow Zones, engine selection, and schema propagation make it easier than ever for designers and operators alike to refresh and maintain data pipelines. Learn more in the Academy Knowledge Base article and the reference documentation.

 

 


Find all details in our release notes.


For previous releases