Running a Successful Data Science POC

A successful data science proof of concept (POC) should prove the larger value of a system and confirm that it advances the company’s longer-term strategic objectives. Discover how a typical POC should be optimized for the data science space.

A proof of concept (POC) is a popular way for businesses to evaluate the viability of a system, product, or service and to ensure it meets specific needs or a set of predefined requirements. A successful POC should prove the larger value of a system and confirm that it advances the company’s longer-term strategic objectives.

When evaluating data science solutions, a POC should prove not just that a solution solves one specific problem, but that the system will provide widespread value to the company. Data science POC projects should show that they can bring a data-driven perspective to a range of the business’s strategic objectives.


The 7 Key Steps to a Valuable Data Science POC

Here are the seven essential elements for keeping the project on track toward an efficient, effective, and, most of all, successful data science POC.

1. Choose a real, concrete use case. The first, and possibly most important, step to running a successful POC for data science is choosing a use case. Without this, a POC simply can’t exist. Start with a list of critical business issues from which to choose, possibly soliciting feedback and ideas from teams across the company for a variety of use cases. Evaluate the current processes and ask whether data science and machine learning techniques could specifically help improve them and, if so, how.

2. Restrict to a reasonable time frame. In general, a maximum of 60 days is sufficient for a data science POC project because it allows for proper evaluation without taking too much time away from staff who are balancing other ongoing work and projects.

3. Clearly define deliverables. One of the most important factors in restricting a data science POC to a reasonable time frame is the presence of clear deliverables. Without them, the process can drag on, because no one is really sure what counts as done, what counts as a success, or when either has been reached.

4. Involve the right people. To run a successful, efficient data science POC project, all relevant stakeholders from across the organization need to be involved. The data scientists and/or analysts will necessarily be the most closely connected to the project, but the IT team, any business teams involved with or impacted by the results, and the end users of the solution should all take part as well.


5. Consider production. Data science and data projects shouldn’t happen in a vacuum, so neither should a data science POC. The goal of a POC isn’t just to complete one simple project. Rather, a successful POC will allow the platform to continue to deliver business value even after the POC is over. To deliver that value, projects (including the use case for the data science POC) need to actually go into production and not get stuck in a prototyping or sandbox phase.

6. Ensure autonomy. Often, a data science POC affords companies the opportunity to work with experts in the field who have extensive experience getting data projects off the ground and into production. No matter how simple a product seems, working with experts (likely the product’s sales and/or technical teams) comes with the added advantage of learning from other companies’ experience of what works and what doesn’t.

7. Be agile but focused. In the data science POC process, the best results come from teams that are agile (eager to pivot in new directions they didn’t foresee) but also focused, not straying too far from the original problem when interesting insights inevitably come up.


What to Expect During a POC With Dataiku

With Dataiku, the work on your data science POC will not be throwaway; we take pride in working with customers to make sure each POC is successful and brings real value. That is part of why we strongly suggest that teams choose a test case that is tied to strategic goals and has real, measurable results. After the data science POC project, you will be left with new learnings and insights, a new model, and/or a new data product (or three!) that you can continue to use even after the POC is done.


An important part of bringing value is ensuring that team members use Dataiku without fear of losing the work they put into a project if the POC doesn’t ultimately end in a purchase. Therefore, at the end of a data science POC:

  • Dataiku reverts to the free edition, so projects are still accessible and work is not lost (note, however, that the free edition has no production capabilities and somewhat reduced functionality).
  • All of the code used can be extracted by the admin.
  • The GUI models can be seen and still used by the admin.
  • The non-prep recipes can still run on several connections.

Ultimately, adhering to the seven essential components of a successful POC using a centralized platform such as Dataiku will mean a faster decision and an easier transition from POC to implementation.

