№ 04 · MAY 4, 2026 · 7 MIN READ

What Are We Actually Funding When We Fund 'AI'?

We are measuring something we never properly defined.

We keep debating AI ROI as if we know what we're measuring. We don't.

'AI initiative' has quietly become one of those terms that means whatever each conversation needs it to mean. The more I sit with that, the stranger it seems. The phrase gets used as a budget line, a strategic theme, a board update, a vendor pitch. The portfolio gets reviewed, the spend gets defended, the ROI question gets asked. But the unit of analysis - what specifically is being funded, what specifically is being measured, what specifically is supposed to deliver value - is rarely defined with any precision. The vibe is doing the work the definition should be doing.

What sits underneath that, I think, is something the rest of the conversation quietly depends on. Enterprises are funding 'AI initiatives' without a stable definition of the object they're funding, and every measurement debate downstream stays a little incoherent until the unit gets fixed. Sometimes the phrase points at a contained automation use case. Sometimes at a cross-functional decision-support program. Sometimes at a multi-year transformation budget. Sometimes at a vendor platform contract. Those are not the same kind of object. They produce value through different mechanisms, fail through different mechanisms, and resist being measured with the same instrument.

What I find quietly odd about the AI ROI debate is that it skips past all of that. The familiar statistics - most pilots fail, only a quarter deliver expected returns, few executives can measure ROI confidently - get cited as if 'AI initiative' were a stable unit and the only open question were how to measure it more rigorously. The measurement debate seems to be answering a question that hasn't been properly asked yet.

When I push for precision on the unit, two things tend to surface fairly quickly:

(i) The first is that AI use cases sit in different positions on a matrix I have written about before - the role of AI in the decision (automate, support, make) by the scope of the decision (single-function, cross-functional). The bottom-left is contained automation. The top-right is cross-functional decision-making. These do not feel like points on a continuum. They feel like different value structures, and the more I look at how they fail, the more each one seems to fail for its own reasons.

(ii) The second is that current measurement frameworks - the traditional stage-gate models that walk use cases through technical, adoption, operational, and financial layers - were largely built for the bottom-left.

For an invoice-processing AI, or a routing model, or a meeting transcription tool, those frameworks tend to fit. The use case is contained, the value is local, the measurement is local, and the instrument matches what is being measured.

For the top-right, the same instrument seems to lose what matters. Cross-functional AI decisions produce value through coordination across functions, not within them. The value lives in how decisions interact, in which trade-offs get resolved explicitly versus implicitly, in whether downstream functions are quietly absorbing variance from upstream models. Not much of that registers inside a use-case-level frame. Stage-gating the wrong unit produces something that looks like measurement and isn't quite - disciplined instrumentation pointed at an object that doesn't carry the value.
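A minimal sketch of the matrix, for readers who think better in code. Everything here is an illustrative assumption rather than a formal model: the names Role, Scope, UseCase, and use_case_frame_is_sufficient are hypothetical, the cross-functional example is invented, and treating scope alone as the deciding factor simplifies the argument above.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    """Role of AI in the decision (one axis of the matrix)."""
    AUTOMATE = "automate"
    SUPPORT = "support"
    MAKE = "make"


class Scope(Enum):
    """Scope of the decision (the other axis of the matrix)."""
    SINGLE_FUNCTION = "single-function"
    CROSS_FUNCTIONAL = "cross-functional"


@dataclass(frozen=True)
class UseCase:
    name: str
    role: Role
    scope: Scope


def use_case_frame_is_sufficient(use_case: UseCase) -> bool:
    # Assumption: a use-case-level (stage-gate) frame captures the value
    # only when the value is local, i.e. the decision stays in one function.
    return use_case.scope is Scope.SINGLE_FUNCTION


# Bottom-left of the matrix: contained automation.
invoice_ai = UseCase("invoice processing", Role.AUTOMATE, Scope.SINGLE_FUNCTION)
# Top-right of the matrix: a hypothetical cross-functional decision-maker.
planning_ai = UseCase("cross-functional planning", Role.MAKE, Scope.CROSS_FUNCTIONAL)

for uc in (invoice_ai, planning_ai):
    print(uc.name, "->", use_case_frame_is_sufficient(uc))
```

The point of the sketch is only that the two use cases are different kinds of object: the same check that passes for the first one fails for the second, before any metric has been collected.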

When I read the headline AI ROI statistics with that distinction in mind, they look less like a verdict on AI overall and more like a verdict on a specific part of the portfolio. The underperforming work isn't the bottom-left automation. It tends to be the cross-functional initiatives that were positioned and budgeted as transformation, and that are then measured with frameworks built for automation.

The most common response to that pattern, particularly in thought leadership, is what I have started thinking of as the deferral move: "It's still early days. Many AI deployments are still experimental. The real value will come with deeper integration into core workflows over time."

What I find unconvincing about that response is that it treats the gap as a maturity problem time will resolve. The more I look at it, the less it feels like a timing issue and the more it feels like a sequencing one. Deeper integration without architecture seems to produce more drift, not less, because each new cross-functional AI deployment compounds the coordination debt the previous one created. The longer integration runs without an architecture beneath it, the more expensive the eventual redesign becomes.

What I keep landing on, almost reluctantly, is that the inversion the conversation needs is one of order rather than of method.

Most enterprises seem to be funding AI use cases in the hope that value will materialize. The value seems unlikely to materialize on those terms, because the architecture that would let it materialize hasn't been designed. Architecture-first, in the way I have been thinking about it, would mean a few things became explicit before any cross-functional AI shipped: the cross-functional decisions themselves, the trade-off rules that govern them, the decision rights for each one, the constraints that get embedded into the systems that execute them, and the architecture-level measurement instruments that sit alongside use-case metrics rather than being replaced by them.

In that frame, use cases would become implementations of an architecture rather than independent investments looking for one. Measurement would happen at two levels: (i) does the architecture work as designed, and (ii) does each use case respect it. Use-case measurement alone would then be insufficient by definition.
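The two levels can be sketched roughly as follows. This is a thought aid under stated assumptions, not a measurement framework: Decision, UseCase, architecture_is_explicit, and use_cases_outside_architecture are hypothetical names, the example decisions and owners are invented, and the level (i) check is only a placeholder that tests whether the architecture has been made explicit, not whether it works as designed in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    name: str
    owner: str          # who holds the decision rights ("" if undefined)
    tradeoff_rule: str  # the explicit trade-off rule ("" if undefined)


@dataclass(frozen=True)
class UseCase:
    name: str
    implements: str  # name of the cross-functional decision it implements


def architecture_is_explicit(decisions: list[Decision]) -> bool:
    # Level (i), placeholder: every decision has an owner and a trade-off rule.
    return bool(decisions) and all(d.owner and d.tradeoff_rule for d in decisions)


def use_cases_outside_architecture(
    decisions: list[Decision], use_cases: list[UseCase]
) -> list[str]:
    # Level (ii): use cases that implement a decision the architecture never named.
    known = {d.name for d in decisions}
    return [u.name for u in use_cases if u.implements not in known]


# Invented example data, for illustration only.
decisions = [
    Decision("inventory vs. service level", "COO", "protect service level first"),
]
use_cases = [
    UseCase("demand forecast model", "inventory vs. service level"),
    UseCase("pricing assistant", "margin vs. volume"),
]

print(architecture_is_explicit(decisions))
print(use_cases_outside_architecture(decisions, use_cases))
```

In this toy example the second use case is an independent investment looking for an architecture: it may score well on its own metrics while implementing a decision nobody has defined.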

The order seems to be doing more work than the method. With the order wrong, every measurement framework - current ones, future ones, the next generation's - keeps producing partial answers. With the order right, the questions executives have been asking begin to have somewhere to land.

We are measuring something we never properly defined. Until that changes, the rest of the conversation will continue to look coherent without actually being coherent.
