
Scale business expertise with trusted AI agents

A senior procurement manager at a mid-market manufacturer decides which of her suppliers need requalification. Delivery trends. Open quality incidents. Contract renewals coming up. A dozen softer signals nobody has bothered to write down, like which plant manager always overstates a defect and which one underreports. She does this well, for about 200 suppliers. The company has 2,000.

Her judgment is the kind of expertise every enterprise depends on and almost no enterprise has captured. It lives in her head. It scales as far as her calendar allows. When she retires, most of it leaves with her.

This is exactly the work AI agents are supposed to do. Mostly, they don't.


Three attempts, three dead ends

The procurement team starts with one of the big LLM providers. The agent it builds sounds great in a demo and produces plausible recommendations based on nothing — it can't read the supplier database, can't check open POs in SAP, can't pull quality logs from the MES. So the team goes to IT.

Engineering builds a prototype in LangChain. Database access takes six weeks. The first version misses half the decision logic because the engineers don't know procurement processes, and the procurement manager isn't sure how to explain to them why a 12% delivery slip from one supplier is fine and a 4% slip from another isn't. She marks up the output. It goes back into the queue. Three months in, they have a demo that handles a fraction of what she does in a week.

Someone suggests a low-code agent-building tool. She can build it herself, and it connects to some of the systems, but IT blocks it. There's no way to test the agent's recommendations against last year's decisions before it goes live. For an agent making supplier calls worth millions, that's not a tradeoff — that's the end of the conversation.

Three tools. Three different ways to fail. She's still doing her work manually.

Why this keeps happening

Every enterprise has thousands of processes like this. Claims adjudication. Demand planning. Credit decisioning. Quality inspection. Freight allocation. Each one runs on a few people's judgment, and each one stalls on the same set of problems.

Production systems were built to keep unauthorized software out, and most agent-building tools were built for test data and public APIs. The moment an agent needs to read a real supplier record or query the legacy database, the project stops.

Expert judgment isn't a list of if-then rules. It's a mix of thresholds, exceptions, pattern recognition, judgment calls, and the kind of soft knowledge that only shows up in the margin notes. Capturing it in a prompt produces something shallow. Capturing it in code requires a developer who doesn't know the domain. The expert needs to shape the logic herself, but most tools either oversimplify or shut her out.

And IT can't approve what it can't evaluate. No way to test the agent against real history, no audit trail, no deployment gate — the architecture review takes ten minutes and the answer is a quick no.

Any one of these stops a project. Most enterprises hit all three.

What Dataiku E2A does

Dataiku, the Platform for AI Success, has long focused on the hard part of enterprise AI: making systems that can operate reliably inside the complexity of real organizations. Agents don't remove that complexity. They amplify it. The challenge is scaling business expertise through agents that enterprises can trust in production. Dataiku E2A (Expert-to-Agent) helps organizations do exactly that.

Dataiku E2A is the engine for turning business expertise into trusted AI agents — agents that can operate on real enterprise data, reflect how experienced operators actually make decisions, and be evaluated before they ever go live.

It works because three things sit in one place.

The subject matter expert designs the agent. In a visual interface, the procurement manager shapes how the agent works step by step — when to query the supplier database, when to apply a rule, when to call a forecasting model her team already trusts, when to escalate to a human. Instead of starting from raw code or a blank prompt, she works with governed building blocks managed by IT and AI teams. The result strikes the right balance between precision and ease of use: structured enough to encode the real logic behind her decisions, transparent enough that she can inspect and refine it over time.
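To make the idea concrete, here is a minimal sketch of what "structured enough to encode the real logic, transparent enough to inspect" can look like. This is illustrative Python, not Dataiku's API: the supplier fields, thresholds, and branch order are all assumptions, chosen to mirror the example in the story where a 12% slip can be fine while a 4% slip is not.

```python
# Hypothetical sketch: one supplier-requalification decision step written as
# logic the expert can read and tune. None of these names come from Dataiku.
from dataclasses import dataclass

@dataclass
class Supplier:
    name: str
    delivery_slip_pct: float     # late-delivery rate over the review window
    open_quality_incidents: int
    contract_renewal_days: int   # days until the contract comes up for renewal

def requalification_action(s: Supplier, slip_threshold: float = 10.0) -> str:
    """Recommend an action for one supplier.

    Every branch is inspectable: a 12% slip with no quality issues stays
    "keep", while a 4% slip plus open incidents near a renewal escalates.
    """
    if s.open_quality_incidents > 0 and s.contract_renewal_days < 90:
        return "escalate"    # soft signals plus renewal risk: human review
    if s.delivery_slip_pct > slip_threshold and s.open_quality_incidents > 0:
        return "requalify"   # threshold breach confirmed by quality data
    return "keep"

# The same readable logic runs over 2,000 suppliers, not just the 200 her
# calendar allows.
actions = [requalification_action(s) for s in [
    Supplier("A", 12.0, 0, 300),   # tolerated slip, clean quality record
    Supplier("B", 4.0, 2, 60),     # small slip, but incidents near renewal
]]
```

The point of the sketch is the shape, not the thresholds: each rule is a named, tunable building block rather than a line buried in a prompt or a codebase only engineers can touch.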

The agent reaches production data on day one. Governed connectors link the agent to SAP, the supplier database, the quality system, and whatever else lives across cloud and on-prem environments — through the same secure credentials and access controls IT already runs. No sandboxes, no six-week access queues. The logic she designed runs on the real data, not a demo extract.

The agent gets tested before it ever goes live. This is the piece most teams underestimate. Inside Dataiku, the procurement manager reviews the agent with the AI team against last quarter's actual disruptions — the supplier that went dark in March, the quality issue at the Tier 2 plant, the contract renewal that almost slipped.

They watch the agent's decisions replay step by step, adjust a threshold where it's too conservative, and run it again. When she signs off, she's not approving a black box. She's approving a version of her own judgment that's been stress-tested against situations she's actually lived through. And because the evaluation and deployment controls already sit inside the platform, IT can operationalize the same checks in production instead of rebuilding them from scratch.
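The evaluation step above amounts to a backtest: replay the agent over recorded cases and compare its calls with what the expert actually decided. A minimal sketch, assuming a simple list of (case, expert decision) pairs — the function names are illustrative, not part of the platform:

```python
# Hypothetical sketch of pre-production evaluation: run the agent's decision
# function over last quarter's history and surface every disagreement for
# the expert to review one by one.

def backtest(agent_decide, history):
    """history: list of (case, expert_decision) pairs from past quarters.

    Returns the agreement rate and the disagreements to walk through.
    """
    disagreements = []
    for case, expert_decision in history:
        agent_decision = agent_decide(case)
        if agent_decision != expert_decision:
            disagreements.append((case, expert_decision, agent_decision))
    agreement = 1 - len(disagreements) / len(history)
    return agreement, disagreements

# Tune a threshold, rerun, and sign off only when the remaining
# disagreements are ones the expert is comfortable explaining.
history = [({"slip": 12.0}, "keep"), ({"slip": 4.0}, "escalate")]
rate, diffs = backtest(
    lambda c: "keep" if c["slip"] >= 10 else "escalate", history
)
```

Signing off on a passing backtest is what turns "approving a black box" into approving a stress-tested version of her own judgment, and the same checks can then run as deployment gates in production.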

From the start, the work runs inside a platform IT already governs — with secure connections, user permissions, audit trails, and deployment controls built into the lifecycle. By the time the agent is ready for production, IT isn't reviewing a workaround. They're reviewing something that already operates within the policies and controls they put in place from day one.

The procurement manager builds the agent. It runs on live data. IT approves it. It handles 2,000 suppliers. She inspects the output weekly and tunes the logic when conditions change. At 3 a.m. on a public holiday, the agent makes the calls she would have made.

From one agent to a system

Vendor requalification is one process at one company. Once it works, the question changes from whether AI can run a process to which one is next. E2A is built for that. Multi-agent orchestration coordinates workflows where several agents, models, and human checkpoints work together across an entire ecosystem. Lifecycle management keeps every agent visible, evaluated, and accountable.

This is the shift we think matters. AI as a productivity layer made individuals slightly faster. AI as an operating layer that turns expert judgment into reliable, governed decisions at enterprise scale changes how enterprises run.

Dataiku E2A is available now as part of the Dataiku platform. Get started today.
