Bundles and Packages

February 23, 2017

What Are Bundles and Packages?

In Dataiku Data Science Studio (DSS), development and production environments are separate to allow for testing and experimentation before deploying your models.

Project Bundles

The Design Node is your development environment. It’s the place to design your flow, build, test, and improve your data logic.

The Automation Node is your production environment. It runs the flows you deploy from the Design Node, using production data as inputs.

Deploying from the Design Node to the Automation Node is done at the project level with project bundles. Bundles contain the version of the flow you built in the Design Node. In other words, bundles are what allow you to run flows in your production environment.

A bundle is not simply a project export that moves a project and all of its content from one node to another. Rather, a bundle carries the project’s metadata together with only the data needed to replay, in the production environment, the tasks that should run there.
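To make the difference between a bundle and a full export concrete, here is a minimal conceptual sketch in Python. It does not use the DSS API. The `Project` class, the `create_bundle` function, the `include_data` parameter, and the dataset names are all hypothetical, and the idea that the extra data is chosen explicitly by name is an assumption made for the illustration.

```python
import copy
from dataclasses import dataclass


@dataclass
class Project:
    """Toy stand-in for a Design Node project (hypothetical, not the DSS API)."""
    flow_definition: dict  # metadata: recipes, dataset definitions, and so on
    datasets: dict         # dataset name -> rows stored on the Design Node


def create_bundle(project, bundle_id, include_data=()):
    """Build a bundle: all project metadata, plus only the datasets named in include_data."""
    missing = [name for name in include_data if name not in project.datasets]
    if missing:
        raise KeyError(f"unknown datasets: {missing}")
    return {
        "bundle_id": bundle_id,
        # The flow definition is always carried, as a snapshot of the current version.
        "metadata": copy.deepcopy(project.flow_definition),
        # Development data stays behind unless it is needed to replay the flow.
        "data": {name: list(project.datasets[name]) for name in include_data},
    }


project = Project(
    flow_definition={"recipes": ["join_orders_with_countries", "train_model"]},
    datasets={
        "raw_orders": [{"order_id": 1, "country": "FR"}],  # replaced by production data
        "country_codes": [{"code": "FR", "name": "France"}],  # reference data the flow needs
    },
)

bundle = create_bundle(project, "v1", include_data=["country_codes"])
print(sorted(bundle["data"]))
```

In this sketch the development copy of `raw_orders` is left out of the bundle, because the production environment supplies its own inputs, while the small reference dataset travels with the metadata.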

See the Tutorial: Deploying to Production for a practical introduction to creating bundles and deploying to an Automation Node.

Packages

The API Node allows you to expose predictive models through a REST API. Packages are the physical representation of the different versions of a service that you deploy and activate on the API Node. A package contains all of the service’s endpoints.

Each time you want to update a service (for example, to use retrained models), you create a new package. The package is then transferred to the API Node instances and activated. After activation, queries hitting an API Node instance use the newer version of all models in the service.
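The transfer-then-activate life cycle can be sketched as a small Python model. This is a conceptual illustration only, not the DSS implementation or API. The `Package` and `ApiNodeInstance` classes, their methods, and the endpoint and model names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Package:
    """One version of a service: a snapshot of all of its endpoints (hypothetical)."""
    version: str
    endpoints: dict  # endpoint name -> label of the model version it serves


@dataclass
class ApiNodeInstance:
    """Toy stand-in for an API Node instance (hypothetical)."""
    packages: dict = field(default_factory=dict)
    active_version: Optional[str] = None

    def transfer(self, package):
        # Transferring makes the package available; it does not change what is served.
        self.packages[package.version] = package

    def activate(self, version):
        if version not in self.packages:
            raise KeyError(f"package {version!r} has not been transferred")
        # Activation switches every endpoint of the service at once.
        self.active_version = version

    def query(self, endpoint):
        if self.active_version is None:
            raise RuntimeError("no active package")
        return self.packages[self.active_version].endpoints[endpoint]


node = ApiNodeInstance()
node.transfer(Package("v1", {"churn": "churn-model-jan", "fraud": "fraud-model-jan"}))
node.activate("v1")

# A retrained service is packaged as v2 and transferred, but not yet activated.
node.transfer(Package("v2", {"churn": "churn-model-feb", "fraud": "fraud-model-feb"}))
before = node.query("churn")

node.activate("v2")
after = (node.query("churn"), node.query("fraud"))
```

Here `before` still comes from package `v1`, because transferring alone does not change the served version. After `activate("v2")`, both endpoints answer from the newer package, which mirrors the statement that queries use the newer version of all models in the service.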

See the Tutorial: Deploying to Real-Time Scoring for a practical introduction to creating packages and deploying to an API Node.