Running a Successful Data Science POC

A successful data science proof of concept (POC) should prove the larger value of a system and show that it advances the company’s longer-term strategic objectives. Discover how a typical POC should be optimized for the data science space.

A proof of concept (POC) is a popular way for businesses to evaluate the viability of a system, product, or service and to ensure it meets specific needs or a set of predefined requirements. Successful POCs should prove the larger value of a system and show that it advances the company’s longer-term strategic objectives.

When it comes to the evaluation of data science solutions, POCs should prove not just that a solution solves one particular, specific problem, but that a system will provide widespread value to the company. Data science POC projects should show that they are capable of bringing a data-driven perspective to a range of the business’s strategic objectives.

The 7 Key Steps to a Valuable Data Science POC

Here are the seven essential elements for keeping the project on track for an efficient, effective, and, most of all, successful data science POC.

1. Choose a real, concrete use case. The first, and possibly most important, step to running a successful POC for data science is choosing a use case. Without this, a POC simply can’t exist. Start with a list of critical business issues from which to choose, possibly soliciting feedback and ideas from teams across the company for a variety of use cases. Evaluate the current processes and ask whether data science and machine learning techniques could specifically help improve them and, if so, how.

2. Restrict to a reasonable time frame. In general, a maximum of 60 days is sufficient for a data science POC project because it allows for proper evaluation without taking too much time away from staff who are balancing other ongoing work and projects.

3. Clearly define deliverables. One of the most important factors in restricting a data science POC to a reasonable time frame is the presence of clear deliverables. Without them, the process can drag on, as no one is really sure what counts as done, what counts as a success, or when either should be reached.

4. Involve the right people. To run a successful, efficient data science POC project, all relevant stakeholders from across the organization need to be involved. The data scientists and/or analysts will necessarily be the most closely connected to the project, but the IT team, any business teams involved with or impacted by the results, and the end users of the solution should all take part as well.

5. Consider production. Data science and data projects shouldn’t happen in a vacuum, so neither should a data science POC. The goal of a POC isn’t just to complete one simple project. Rather, a successful POC will allow the platform to continue to deliver business value even after the POC is over. In order to deliver that value, projects (including the use case for the data science POC) need to actually go into production and not get stuck in a prototyping or sandbox phase.

6. Ensure autonomy. Often, a data science POC affords companies the opportunity to work with experts in the field who have extensive experience in getting data projects off the ground and into production. No matter how simple a product seems, working with experts (likely the product’s sales and/or technical teams) comes with the added advantage of learning what has and hasn’t worked at other companies.

7. Be agile but focused. In the data science POC process, the best results come from teams that are agile (willing to pivot in directions they didn’t foresee) yet focused, not straying too far from the original problem when interesting insights inevitably come up.
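
To make steps 2 and 3 concrete, a team could write the POC plan down as data before work starts. The following is a minimal sketch only: the class names, fields, and the example use case are hypothetical and are not part of any Dataiku API. It assumes the 60-day ceiling from step 2 and treats a POC as complete only when every agreed deliverable is marked done.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List

MAX_POC_DAYS = 60  # upper bound suggested in step 2 (assumed to be a hard limit here)


@dataclass
class Deliverable:
    """One agreed output of the POC and how success is judged (hypothetical structure)."""
    name: str
    success_criterion: str
    done: bool = False


@dataclass
class PocPlan:
    """A time-boxed POC tied to a single concrete use case (hypothetical structure)."""
    use_case: str
    start: date
    duration_days: int
    deliverables: List[Deliverable] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject plans that exceed the agreed time box.
        if not 0 < self.duration_days <= MAX_POC_DAYS:
            raise ValueError(f"duration must be between 1 and {MAX_POC_DAYS} days")

    @property
    def end(self) -> date:
        return self.start + timedelta(days=self.duration_days)

    def days_left(self, today: date) -> int:
        # Never negative: an overdue POC simply has 0 days left.
        return max((self.end - today).days, 0)

    def open_deliverables(self) -> List[str]:
        return [d.name for d in self.deliverables if not d.done]

    def is_complete(self) -> bool:
        # A plan with no deliverables can never be "done" (see step 3).
        return bool(self.deliverables) and not self.open_deliverables()


# Example with made-up values.
plan = PocPlan(
    use_case="customer churn prediction",
    start=date(2024, 1, 8),
    duration_days=45,
    deliverables=[
        Deliverable("baseline model", "beats the current rule-based score on holdout data", done=True),
        Deliverable("scheduled scoring job", "runs nightly without manual steps"),
    ],
)
print(plan.end, plan.days_left(date(2024, 2, 1)), plan.open_deliverables())
```

Writing the plan this way forces the questions from steps 1 to 3 to be answered up front: which use case, by when, and what exactly counts as done.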

What to Expect During a POC With Dataiku

With Dataiku, the work on your data science POC will not be throw-away; we take pride in working with customers to make sure each POC is successful and brings real value. That is part of why we strongly suggest that teams choose a test case that is tied to strategic goals and has real and measurable results. After the data science POC project, you will be left with new learnings and insights, a new model, and/or a new data product (or three!) that you can continue to use even after the POC is done.

An important part of bringing value is ensuring that team members use Dataiku and aren’t afraid of losing the work they put into a project if the POC doesn’t ultimately end in a purchase. Therefore, at the end of a data science POC:

  • Dataiku reverts to the free edition, so projects are still accessible and work is not lost (note, however, that the free edition has no production capabilities and somewhat reduced functionality).
  • All of the code used can be extracted by the admin.
  • The GUI models can be seen and still used by the admin.
  • The non-prep recipes can still run on several connections.

Ultimately, adhering to the seven essential components of a successful POC using a centralized platform such as Dataiku will mean a faster decision and an easier transition from POC to implementation.
