Vol. 4, No. 4 — The Infrastructure Decision

Arlo

THE BRIEF

The CFO has entered the room. After three years of AI infrastructure spend flowing through R&D and innovation budgets — where the metrics were loose and the accountability looser — enterprise AI costs are landing on income statements as COGS. Token consumption is now a P&L line item. Forrester reports fewer than one-third of AI decision-makers can tie AI value to P&L changes; only 15% reported an EBITDA lift in the past 12 months. The companies that built AI programs on the implicit assumption that ROI measurement could wait are discovering that the CFO's patience ran out while they were still in pilot.

The ROI data is brutal in its consistency. McKinsey's State of AI survey (November 2025) found only 39% of organizations attributing any EBIT impact to AI — 61% report no EBIT impact at any level. Gartner's survey of 782 infrastructure and operations leaders found only 28% of enterprise AI use cases fully succeed and meet ROI expectations; 20% fail outright. The most common failure driver — cited by 57% of executives reporting AI project failures — is "unrealistic expectations." Not bad models. Not weak infrastructure. Unrealistic expectations, set by people who were not held accountable for delivery.

This creates a specific organizational crisis entering Q4 2026. Boards approved multi-year infrastructure commitments. Average enterprise AI budgets ballooned from $1.2 million to $7 million, with 73% of companies exceeding their AI budget projections. The global enterprise AI spend figure sits at $665 billion this year. Forrester is already predicting that enterprises will defer 25% of planned AI spend into 2027 as ROI scrutiny intensifies. That deferral won't be evenly distributed — it will fall hardest on organizations that cannot produce a measurement framework by Q3.

The infrastructure spending, meanwhile, is not slowing. Worldwide AI spend is projected at $2.59 trillion in 2026, up 47% year-over-year (Gartner, May 2026). Of that, AI infrastructure — hardware, AI-optimized cloud capacity, networking, semiconductors — accounts for over 45% of total spending, roughly $1.17 trillion. The five largest hyperscalers are on pace to deploy $660–700 billion in capital expenditure this year, roughly double 2025 levels, with more than 75% directed at AI-related infrastructure. Alphabet alone raised its 2026 capex guidance to $195–205 billion on its Q2 2026 earnings call (July 22), citing demand acceleration so strong it is bridging supply gaps with third-party capacity in Q3. This is infrastructure investment of a scale that is not reversed by one bad quarterly review cycle.

The compute picture gained new complexity late this month. AMD launched its Helios rack-scale AI platform — directly benchmarking against NVIDIA's Vera Rubin NVL72 — claiming 15% more FP4 compute, 50% more HBM4 memory capacity, and 30% better tokens per dollar. Microsoft Azure confirmed it will deploy Helios, explicitly diversifying beyond NVIDIA at rack scale. This matters because NVIDIA has captured approximately 75% of AI compute revenue (Goldman Sachs, May 2026) and its Q1 FY2027 data center revenue hit $75.2 billion — up 92% year-over-year. A credible challenger with hyperscaler backing changes procurement math for any organization committing to multi-rack AI infrastructure today.

The energy constraint is the story that the infrastructure conversation has been slow to absorb. Goldman Sachs research estimates that agentic AI is 60–130x more energy-intensive than standard chatbot interactions, driven by multi-step reasoning and always-on operation. NVIDIA's Blackwell-era racks draw 120–140 kW each — compared to 35–45 kW for prior-generation H100 racks. Legacy data centers built around 10–15 kW per rack cannot support these densities without full redesigns and liquid cooling retrofits. Microsoft Azure confirmed capacity restrictions in Northern Virginia and Texas due to power grid limitations, with GPUs reportedly sitting in warehouses unused for lack of sufficient power infrastructure. The constraint on enterprise AI scaling in the next 18 months is not the model, not the chip, and not the budget — it is the electrical grid.

The practical architecture shift is already underway. The cloud-first default is breaking at volume. Flexera's 2026 State of the Cloud Report found that 73% of organizations now use hybrid cloud architecture; public cloud as the primary home for production AI inference fell to 41%, with 56% of enterprises now running or planning private cloud for AI inference. The self-hosting break-even for inference workloads sits at roughly 80 million tokens per month — any organization above that threshold running pure API strategies is almost certainly overpaying. Wasted cloud spend rose to 29% in 2026 — the first increase in five years — explicitly linked to surging cloud-based AI workloads. The CFO who hasn't yet reviewed cloud AI spend line items is managing risk they don't know they have.
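A break-even figure like the one above comes from comparing a fixed monthly self-hosting cost against per-token API pricing. The Python sketch below shows that arithmetic. The function names and every dollar input are illustrative placeholders, chosen only so the example lands on the 80 million figure. They are not sourced prices, and a real analysis would also need staffing, utilization, and ops overhead.

```python
def break_even_tokens_per_month(fixed_monthly_cost, api_price_per_million,
                                self_host_price_per_million):
    """Monthly token volume at which API and self-hosted costs are equal."""
    saving_per_million = api_price_per_million - self_host_price_per_million
    if saving_per_million <= 0:
        raise ValueError("self-hosting never breaks even unless its "
                         "per-token cost is lower than the API price")
    return fixed_monthly_cost / saving_per_million * 1_000_000


def monthly_costs(tokens_per_month, fixed_monthly_cost, api_price_per_million,
                  self_host_price_per_million):
    """Cost of the same workload under each strategy, in dollars."""
    millions = tokens_per_month / 1_000_000
    return {
        "api": millions * api_price_per_million,
        "self_hosted": fixed_monthly_cost + millions * self_host_price_per_million,
    }


# Placeholder inputs (assumptions for illustration, not sourced prices).
FIXED_MONTHLY = 2_000.0   # hypothetical fixed self-hosting cost, $/month
API_PRICE = 30.0          # hypothetical API price, $ per million tokens
SELF_HOST_PRICE = 5.0     # hypothetical self-hosted marginal cost, $ per million

threshold = break_even_tokens_per_month(FIXED_MONTHLY, API_PRICE, SELF_HOST_PRICE)
print(f"break-even: {threshold / 1e6:.0f}M tokens/month")
print(monthly_costs(100_000_000, FIXED_MONTHLY, API_PRICE, SELF_HOST_PRICE))
```

The useful property of the formula is its sensitivity: as API prices fall, the saving per million tokens shrinks and the break-even volume rises, so the threshold should be recomputed whenever pricing changes.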

THE REALITY CHECK

Worldwide AI infrastructure spend will cross $1.17 trillion in 2026, and fewer than three in ten enterprise AI use cases will fully meet their ROI targets, according to the Gartner survey of 782 infrastructure and operations leaders cited in The Brief. The math executives aren't discussing openly: the average enterprise is now spending $7 million annually on AI while the majority report no EBIT impact whatsoever (McKinsey, November 2025). This isn't a technology adoption problem. It is a measurement and accountability problem, and the CFO's arrival at the infrastructure table means the grace period ends this quarter.

THE SIGNAL

The money is moving — but not to where it's creating value

There is a structural tension sitting at the center of every enterprise AI conversation in the second half of 2026: the organizations spending the most on AI infrastructure are not, in aggregate, the ones extracting the most value from it. This is not a lag effect that will self-correct with time. It is an architectural failure that the C-suite can fix — if it understands what's actually broken.

Start with the competitive reality at the provider layer. Google Cloud's Q2 2026 revenue surged 82% year-over-year to $24.8 billion, led by enterprise AI solutions and AI infrastructure. Google Cloud's backlog exceeded $514 billion — rising more than $50 billion sequentially in a single quarter. Microsoft Azure AI's annual recurring revenue surpassed $37 billion, up 123% year-over-year. Amazon Web Services is running at a $150 billion annualized revenue rate with AI services contributing more than $15 billion of that run rate. These are not incremental businesses. This is the fastest infrastructure build in the history of enterprise technology.

The hyperscalers are supply-constrained, not demand-constrained. That distinction matters. Alphabet CFO Anat Ashkenazi confirmed on the Q2 2026 call that Google is using third-party capacity as a bridge in Q3 because its own infrastructure can't keep pace with customer demand. Microsoft has confirmed that demand has exceeded supply for multiple consecutive quarters, with CFO Amy Hood noting that most GPU and capex spending is contracted for useful life. Amazon is guiding $200 billion in capex for 2026, with the majority directed at AI and data center infrastructure.

For enterprise buyers, supply constraints at the hyperscaler level have a direct consequence: not all workloads can run at scale in public cloud, and the premise that cloud can absorb unlimited AI workloads on demand is no longer true. Organizations that built their 2026 AI infrastructure plans around unlimited cloud elasticity are running into queues, capacity limits, and cost overruns. Wasted cloud spend rose to 29% this year, the first increase in five years, explicitly linked to surging cloud-based AI workloads (Flexera, 2026 State of the Cloud Report).

Now stack the AMD Helios announcement against this landscape. AMD's late July 2026 launch of Helios — a rack-scale AI platform integrating 72 MI455X GPUs and claiming 30% better tokens per dollar versus NVIDIA's Vera Rubin NVL72 — is the most significant competitive development in AI compute infrastructure since Blackwell's introduction. Microsoft Azure, Meta, OpenAI, and Oracle are confirmed early customers. The Helios platform runs on open standards — AMD's ROCm stack and Meta's Open Rack Working Group design — which is a direct play on enterprise procurement teams that have spent the last 18 months accumulating NVIDIA lock-in anxiety.

The competitive framing is this: NVIDIA captures approximately 75% of AI compute revenue. Its Q1 FY2027 data center revenue of $75.2 billion was up 92% year-over-year. But Goldman Sachs projects $765 billion in annual AI infrastructure capex for 2026, rising to roughly $1.6 trillion annually by 2031. The total cumulative investment over 2026–2031 exceeds $7.6 trillion. At those numbers, even modest market share movement represents hundreds of billions in revenue. AMD doesn't need to beat NVIDIA to change enterprise procurement dynamics — it only needs to provide credible, benchmarked competition on the specific metric that CFOs care about: tokens per dollar.

The revenue implication for enterprise buyers is direct. Organizations committing to large GPU infrastructure purchases in Q3 2026 without evaluating AMD Helios are potentially locking in a cost structure that will look expensive by mid-2027. The specific claim — 30% better tokens per dollar — requires independent validation, but it is now backed by hyperscaler adoption, which is the validation that procurement teams use when they can't run their own benchmarks. Microsoft doesn't deploy AMD Helios across Azure data centers because of a press release.

The market share battle is unfolding simultaneously at the model layer. Closed models (OpenAI, Anthropic, Google) captured the dominant share of enterprise AI spend in 2026, driven by context lock-in and production reliability. Practitioners estimate that roughly 89% of enterprise production spend flows to closed APIs versus approximately 11% to open-source models, though no primary survey confirming that split has been identified. But the inference cost collapse is changing the calculus: prices for equivalent performance have fallen roughly 90–97% since 2022–2023 (Gartner, March 2026). When closed model inference costs $0.40 per million tokens versus $20 three years ago, the price advantage of open-source self-hosting narrows. What doesn't narrow is the operational overhead. Mozilla CTO Raffi Krikorian articulated this precisely: the "real tax" of open-source models isn't compute, it's evals, observability, on-call, compliance, and safety work that closed APIs bundle. That ops burden is the hidden line item in build-vs-buy analysis that most enterprise TCO models still undercount.

The timeline is concrete. Forrester is predicting that enterprises will defer 25% of planned AI spend into 2027. That deferral is not an industry-wide event — it will sort between organizations that have ROI accountability structures and those that don't. The companies with measurement frameworks and P&L-connected AI use cases will accelerate; the ones running expensive pilots against unmeasured outcomes will face board-level challenges to continued investment. The executives who use Q3 2026 to build measurement infrastructure are protecting their AI programs from Q4 budget scrutiny. The ones who don't may find that the CFO's arrival at the infrastructure table is accompanied by a significant haircut.

THE DEEP DIVE

Thesis: The enterprise AI stack has a decisive failure point — and it isn't the model.

Every post-mortem on a failed enterprise AI deployment reaches the same conclusion. The model performed adequately in the sandbox. The model failed in production — not because it changed, but because everything around it did. The orchestration layer broke. The legacy ERP integration blocked the agent at a permissions wall. The prompt context exploded. The vendor routing was hardcoded to a single provider that had a five-minute outage. The enterprise shipped a model and called it a product. That is the failure mode.

The data is consistent across sources at every confidence level. Gartner found that 28% of enterprise AI use cases fully succeed; 20% fail outright. McKinsey's State of AI survey found 61% of organizations reporting no EBIT impact at any level. Practitioners tracking agentic deployments report that 29% of AI agents are abandoned within 90 days — not because the model was wrong, but because the surrounding stack was: no defined success metric before deployment, agents unable to reach legacy systems due to permissions or compliance blockers, and the "pilot-to-production cliff" where success rates fall from roughly 60% in clean sandbox environments to roughly 25% when connected to real enterprise systems. Over 40% of abandoned agents had no defined success metric before they were deployed. Those aren't model failures. Those are governance failures at the stack layer.

The enterprise AI technical stack in 2026 has four distinct layers, and the failure modes are concentrated in the middle two. The first layer — frontier cloud APIs for the hardest use cases — is largely solved. The fourth layer — RAG retrieval grounding the model in proprietary data — is increasingly understood; RAG appears in 51% of enterprise production AI deployments versus just 9% for fine-tuning alone, making it the dominant technical pattern for enterprise LLM deployment. The technology at these layers is mature enough that failure here is a procurement or implementation error, not an architectural one.

The second and third layers are where enterprises are leaving money on the table and accumulating technical debt simultaneously.

Layer 2: The orchestration harness. This is the routing, caching, governance, and fallback logic that sits between your application and the model provider. Research across 22 enterprise tasks and six models (practitioner analysis, @IntuitMachine, July 2026) found that orchestration optimization alone — prompt caching, history compaction, context offload — cut costs 33–61%, reduced latency 44%, and reduced tokens per task by 38%. That range is not a marginal improvement; it is the difference between an AI infrastructure cost structure that is justifiable at board level and one that isn't. The enterprises with sophisticated orchestration harnesses are running the same underlying models at a fraction of the cost of the enterprises without them.
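Of the techniques named above, history compaction is the simplest to illustrate. The Python sketch below keeps only the most recent conversation turns that fit a token budget and marks how many were dropped. It is a minimal sketch under stated assumptions: the function name is hypothetical, and counting tokens by splitting on whitespace is a stand-in for a real tokenizer. Production harnesses typically summarize the dropped turns rather than discard them.

```python
def compact_history(messages, max_tokens, count_tokens=lambda s: len(s.split())):
    """Keep the newest messages that fit within max_tokens.

    count_tokens is an assumption: whitespace splitting only approximates
    how a real model tokenizer would count.
    """
    kept = []
    used = 0
    # Walk backwards so the most recent turns are kept first.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    dropped = len(messages) - len(kept)
    kept.reverse()  # restore chronological order
    if dropped:
        # The marker itself is not counted against the budget in this sketch.
        kept.insert(0, f"[{dropped} earlier messages omitted]")
    return kept


history = ["a b c", "d e", "f g h i", "j"]
print(compact_history(history, max_tokens=5))
```

Because every turn of an agent loop resends its context, trimming the history once reduces the tokens billed on each subsequent call, which is why this kind of change shows up in per-task token counts.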

The orchestration layer is also where model routing decisions live. Despite closed models capturing the dominant share of enterprise AI spend, enterprises with model-agnostic orchestration layers are increasingly routing routine tasks to cheaper or self-hosted models and reserving frontier API calls for the highest-complexity work. Practitioners describe the pattern as using on-premise or local inference for 80% of routine tasks and routing the hardest 20% to frontier cloud models. Box CEO Aaron Levie has framed model routing as a CFO-level concern, not just an engineering decision. He is right: the inference token cost structure at enterprise volume makes routing decisions worth millions of dollars annually.
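The routing pattern described above can be sketched in a few lines of Python. Everything here is hypothetical: the `Provider` and `Router` names, the 0-to-1 complexity score, and the 0.8 threshold are assumptions for illustration, and the stub functions stand in for real model clients. The sketch shows two ideas only: send routine work to the cheaper tier, and fall back to another provider when the preferred one fails.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Provider:
    name: str
    tier: str                     # "local" or "frontier" (hypothetical labels)
    call: Callable[[str], str]    # stand-in for a real model client


class Router:
    def __init__(self, providers: List[Provider], threshold: float = 0.8):
        self.providers = providers
        self.threshold = threshold  # assumed cut-off for "hard" tasks

    def candidates(self, complexity: float) -> List[Provider]:
        preferred = "frontier" if complexity >= self.threshold else "local"
        # Stable sort: preferred tier first, every other provider as fallback.
        return sorted(self.providers, key=lambda p: p.tier != preferred)

    def complete(self, prompt: str, complexity: float) -> Tuple[str, str]:
        errors = []
        for provider in self.candidates(complexity):
            try:
                return provider.name, provider.call(prompt)
            except Exception as exc:  # any provider failure triggers fallback
                errors.append(f"{provider.name}: {exc}")
        raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub clients for demonstration only.
def local_stub(prompt: str) -> str:
    return "local:" + prompt


def frontier_stub(prompt: str) -> str:
    return "frontier:" + prompt


def outage_stub(prompt: str) -> str:
    raise TimeoutError("simulated outage")


router = Router([
    Provider("local-model", "local", local_stub),
    Provider("frontier-api", "frontier", frontier_stub),
])
print(router.complete("summarize this ticket", complexity=0.2))
print(router.complete("draft the merger analysis", complexity=0.95))
```

Because the application only talks to `Router.complete`, swapping or adding a provider is a configuration change rather than a re-engineering project, which is the property that makes the layer survive a model generation change.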

Layer 3: The integration and governance layer. This is where the most expensive failures happen, and it is the layer in which enterprise architecture most consistently underinvests. The constraint cited in 33% of failed agentic deployments is agents unable to reach legacy systems due to permissions or compliance blockers. The infrastructure investment in Layer 3 looks unglamorous in a board deck: it is mostly API wrappers, auth configuration, access controls, and observability tooling. The MLflow ecosystem has reached 55%+ adoption for experiment tracking; specialized LLMOps tools like Langfuse, LangSmith, and Arize are standardizing observability. But Forrester reports that fewer than one-third of AI decision-makers can tie AI value to P&L changes (Forrester, 2026), which suggests most organizations do not have the observability infrastructure to know when Layer 3 is the problem.

The framework for C-suite decision-making here is not "what model should we use?" That is an engineering question with a six-month shelf life. The executive question is: what are we investing in that will survive the next model generation?

The answer has three components. First: build the orchestration harness before picking the model. A model-agnostic routing layer is an infrastructure asset; a hardcoded OpenAI integration is a liability that requires re-engineering every time a better or cheaper model becomes available. Second: treat Layer 3 investment as a prerequisite for Layer 2 returns. The orchestration layer can only extract value if the underlying data connections, permissions structures, and observability stack are functional. Third: measure everything that touches P&L before deploying at scale. The 41% of abandoned agentic deployments that had no defined success metric before deployment is not a technology failure — it is a governance failure that was preventable at the planning stage.
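The third component can be enforced mechanically as a pre-deployment gate. The Python sketch below is one possible shape for such a check, not a prescribed process: the `Initiative` fields and the `deployment_blockers` function are hypothetical names, and the three conditions are assumptions about what "defined success metric" would mean in practice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Initiative:
    name: str
    success_metric: Optional[str] = None   # e.g. a named P&L-linked metric
    baseline: Optional[float] = None       # value measured before deployment
    target: Optional[float] = None         # value the initiative commits to


def deployment_blockers(item: Initiative) -> List[str]:
    """Return the reasons an initiative should not ship yet (empty if none)."""
    problems = []
    if not item.success_metric:
        problems.append("no success metric defined")
    if item.baseline is None:
        problems.append("no baseline measured")
    if item.target is None:
        problems.append("no target set")
    return problems


pilot = Initiative(name="invoice-triage-agent")
ready = Initiative(name="support-deflection-agent",
                   success_metric="tickets resolved without escalation",
                   baseline=0.42, target=0.55)

for item in (pilot, ready):
    blockers = deployment_blockers(item)
    status = "BLOCKED: " + ", ".join(blockers) if blockers else "cleared"
    print(f"{item.name}: {status}")
```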

The failure mode that is accumulating fastest: organizations that respond to the ROI crisis by cutting the AI budget rather than redesigning the accountability structure. Forrester predicts 25% of planned AI spend will be deferred into 2027. Some of that deferral is rational; it will fall on use cases that should never have been funded. But organizations that cut without first building measurement infrastructure will find themselves in Q1 2027 unable to distinguish the valuable AI programs from the expensive demos. The consequence is not just wasted money — it is the competitive disadvantage of rebuilding an AI capability from scratch while peers who maintained investment pull further ahead.

The power and energy constraint adds a dimension to this framework that most enterprise stack discussions have not yet incorporated. Goldman Sachs research estimates that agentic AI is 60–130x more energy-intensive than standard chatbot interactions. NVIDIA Blackwell-era racks draw 120–140 kW each, roughly 3–4x the 35–45 kW of prior H100 racks. Legacy data centers built around 10–15 kW per rack are architecturally incompatible with these densities; full redesigns with liquid cooling are required at scale. Global data center power demand is projected to exceed 1,000 TWh by 2026 (IEA data). Organizations planning meaningful on-premise AI deployments are making real estate and energy procurement decisions, not just hardware purchases.
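The rack figures quoted above can be turned into density ratios and rack counts with simple arithmetic. The Python sketch below uses the kW ranges from the text. The 1,000 kW hall is a hypothetical power budget chosen for illustration, and the calculation ignores cooling and distribution overhead.

```python
def ratio_range(new_kw, old_kw):
    """Return (lowest, highest) ratio of new rack draw to old rack draw."""
    new_low, new_high = new_kw
    old_low, old_high = old_kw
    return new_low / old_high, new_high / old_low


def racks_supported(hall_kw, rack_kw):
    """Whole racks that fit a fixed power budget, ignoring cooling overhead."""
    return int(hall_kw // rack_kw)


# Rack power ranges in kW, as quoted in the text.
BLACKWELL_KW = (120, 140)
H100_KW = (35, 45)
LEGACY_KW = (10, 15)

HALL_KW = 1_000  # hypothetical power budget for one data hall

low, high = ratio_range(BLACKWELL_KW, H100_KW)
print(f"Blackwell vs H100 racks: {low:.1f}x to {high:.1f}x")

low, high = ratio_range(BLACKWELL_KW, LEGACY_KW)
print(f"Blackwell vs legacy racks: {low:.1f}x to {high:.1f}x")

print("legacy racks in hall:", racks_supported(HALL_KW, LEGACY_KW[1]))
print("Blackwell racks in hall:", racks_supported(HALL_KW, BLACKWELL_KW[1]))
```

The same power envelope that once fed dozens of legacy racks feeds only a handful of current-generation ones, which is why the power inventory has to come before the hardware order.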

The consequence is direct. Any CIO who has built a 2026 AI infrastructure roadmap without a power and cooling inventory of existing data center capacity has an incomplete plan. New AI data center capacity now faces grid interconnection queues of 4–7 years in major markets. Enterprises that move on this constraint in Q3 2026 have options; enterprises that wait until 2027 may not. The AMD Helios positioning (50% more memory capacity and 30% better tokens per dollar, running on open standards) is partly about performance and partly about energy efficiency at rack scale. As power cost becomes a meaningful variable in infrastructure TCO, tokens per watt will join tokens per dollar as a procurement metric that matters.

The enterprise AI stack conversation in 2026 has been dominated by model selection. The competitive advantage is being built at the layers nobody is presenting at board meetings.

THE PLAYBOOK

C-Suite

Commission a P&L mapping of all active AI programs by September 30 because Forrester projects 25% of planned AI spend will be deferred into 2027 and you need a measurement framework before the board asks for one — not after.
Require every AI initiative to name a business metric it owns because Gartner found 57% of AI project failures are attributed to "unrealistic expectations" — which is a governance failure that originates at the executive approval stage, not the engineering stage.
Put the infrastructure economics question on the Q4 agenda because the self-hosting break-even for AI inference now sits at roughly 80 million tokens per month and most enterprise cloud contracts were signed before this math changed.

CMO/VP Marketing

Audit which marketing AI deployments are using frontier API calls for tasks that a routed smaller model could handle because 80% of enterprise AI GPU compute spending is now inference-focused, and token cost optimization at marketing workload volumes is a seven-figure annual decision.
Map every personalization and content AI workflow to a defined conversion or retention metric before Q4 planning because Forrester reports fewer than one-third of AI decision-makers can tie AI value to P&L changes, and marketing organizations that cannot make this case will face disproportionate budget scrutiny.
Evaluate whether your marketing data is structured for RAG retrieval because RAG appears in 51% of enterprise production AI deployments and organizations without clean, retrievable proprietary data are getting commodity outputs regardless of what model they run.

CIO/CTO

Evaluate AMD Helios against your NVIDIA procurement roadmap before any rack-scale commitments because Microsoft Azure, Meta, OpenAI, and Oracle are confirmed Helios early deployers, and AMD's claimed 30% tokens-per-dollar advantage over NVIDIA's Vera Rubin NVL72 is now a hyperscaler-validated data point you owe your procurement process.
Inventory data center power capacity and cooling infrastructure before expanding on-premise AI because NVIDIA Blackwell racks draw 120–140 kW each versus 35–45 kW for prior-generation hardware, rendering legacy 10–15 kW/rack environments architecturally incompatible without redesign.
Build the model-routing orchestration layer before committing to a single model provider because practitioner analysis (@IntuitMachine, July 2026) across 22 tasks and six models found orchestration optimization alone reduced costs 33–61% and tokens per task 38% — and a provider-agnostic architecture survives every model generation change that a hardcoded integration does not.

Department Leads / AI Initiative Owners

Define the success metric for every agentic deployment before it ships to production because 41% of abandoned AI agents had no defined success metric before deployment, and the 29% abandonment rate within 90 days is almost entirely a governance failure, not a model failure.
Document legacy system integration dependencies before building orchestration logic because 33% of failed agent deployments trace to agents unable to reach target systems due to permissions or compliance blockers — a problem that is trivially preventable in planning and expensive to diagnose in production.
Build LLMOps observability before scale, not after, because MLflow (55%+ adoption) and specialized tools like Langfuse and LangSmith are now standard, and organizations that cannot distinguish a model performance issue from an orchestration failure in production are debugging expensive problems with no instruments.

THE NUMBERS

$2.59 trillion

Projected worldwide AI spending in 2026, up 47% year-over-year. (Gartner, May 2026)

$1.17 trillion

AI infrastructure's share of that total (45%+), led by AI-optimized servers, IaaS, networking, and semiconductors. (Gartner, May 2026)

$660–700 billion

Combined capex guidance of the five largest hyperscalers (Amazon, Microsoft, Alphabet, Meta, Oracle) for 2026, roughly double 2025 levels, with 75%+ directed at AI-related infrastructure. (Futurum Group, Goldman Sachs, company earnings, 2026)

$75.2 billion

NVIDIA data center revenue in Q1 FY2027 (ended April 2026), up 92% year-over-year. (NVIDIA Q1 FY2027 earnings, May 2026)

82%

Google Cloud revenue growth year-over-year in Q2 2026, with cloud backlog exceeding $514 billion. (Alphabet Q2 2026 earnings release, July 22, 2026)

$37 billion+

Microsoft Azure AI annual recurring revenue, up 123% year-over-year. (Microsoft FY2026 Q2 earnings)

28%

Share of enterprise AI use cases that fully succeed and meet ROI expectations. (Gartner survey of 782 infrastructure and operations leaders, April 2026)

73%

Enterprises using hybrid cloud architecture in 2026. Public cloud as the primary home for AI inference fell to 41%. (Flexera, 2026 State of the Cloud Report)

29%

Wasted cloud spend in 2026 — first increase in five years, explicitly linked to AI workloads. (Flexera, 2026 State of the Cloud Report)

60–130x

Goldman Sachs estimate of how much more energy-intensive agentic AI is vs. standard chatbot interactions, driven by multi-step reasoning and always-on operation. (Goldman Sachs, reported by Fortune, May 2026)

~80 million tokens/month

Current self-hosting break-even threshold for AI inference. Organizations consistently above this volume are likely overpaying with pure API strategies. (Practitioner analysis, multiple sources, 2026)

90–97%

Decline in LLM inference prices for equivalent performance since 2022–2023. (Gartner, March 2026; Epoch AI inference price trend data)

TAKEAWAY: Enterprises are collectively spending $665 billion on AI this year while 61% report no EBIT impact — not because the technology doesn't work, but because they built the model before they built the measurement.

(McKinsey State of AI, November 2025; Gartner, April 2026)

WHAT'S NEXT + WHAT'S COMING

The orchestration layer is becoming the moat. The signal gaining the most cross-channel momentum — across X practitioner threads, enterprise architecture discussions, Gartner research, and deployment post-mortems — is the rapid maturation of the model routing/orchestration layer as the structural differentiator in enterprise AI. Specifically: MCP (Model Context Protocol) and A2A (Agent-to-Agent) protocols are moving from experimental tooling toward enterprise-grade infrastructure standards. Provider-agnostic orchestration middleware — the layer that routes tasks to the appropriate model, manages context, and enforces governance — is the category that hasn't hit mainstream executive coverage yet but is producing the most visible competitive separation at the practitioner level. The enterprises building this layer now, before it becomes table stakes, are positioning for a 2027 market where switching models is as routine as switching cloud storage tiers. Watch for enterprise software companies (not AI labs) announcing model-routing or orchestration middleware products in Q3 2026. That is the emerging category signal. (Trend Spotter, Arlo Vol. 4 No. 4, July 2026)

One thing to watch before next Tuesday: Amazon Q2 2026 earnings (reporting July 30–31). AWS is running at a $150 billion annualized revenue rate with AI services exceeding $15 billion run rate. Q2 data will show whether the supply-constraint narrative confirmed by Google and Microsoft also holds at Amazon — and whether Andy Jassy updates 2026 capex guidance from the current $200 billion baseline.

M&A and structural moves:

Meta and Anthropic are in preliminary talks for a deal valued at up to $10 billion over two years, in which Meta would lease AI compute capacity from its own data centers to Anthropic (NYT, CNBC, Reuters, July 17, 2026) — preliminary and unconfirmed as of July 29. If it closes, Meta becomes a compute provider and the hyperscaler competitive structure changes.
AMD confirmed Microsoft Azure, Meta, OpenAI, and Oracle as Helios early deployers at its "Advancing AI 2026" event (late July 2026) — meaningful because it signals hyperscalers are actively diversifying beyond NVIDIA at rack scale.
NVIDIA Vera Rubin NVL72 is in expanding global production deployment, with CoreWeave benchmarks showing 10x performance per megawatt versus prior-generation Grace Blackwell NVL72 hardware.

Upcoming events to watch:

Amazon Q2 2026 earnings: July 30–31, 2026
Meta Q2 2026 earnings: July 29, 2026 (potential venue for compute deal updates)
Gartner Data & Analytics Summit: monitor for updated enterprise AI deployment data, particularly on agentic AI project cancellation rates (currently projected at 40%+ by end of 2027)
Any Q3 enterprise software announcements on model-routing or MCP/A2A middleware products — this is the category to track

WRITER NOTES:

Confidence flags (QA double-check):

[M14] Orchestration 33–61% cost savings / 44% latency reduction / 38% token reduction: Source is @IntuitMachine practitioner analysis on X citing "22 enterprise tasks, six models" — no published white paper or methodology document was found. Strong directional signal but I have flagged it as a practitioner claim. Used in The Deep Dive with attribution to practitioner analysis. QA should attempt to find the underlying study or add explicit uncertainty framing.
[M15] 89% closed model / 11% open-source enterprise spend split: Source is @MilkRoadAI on X; primary survey not identified. I used this figure with hedged language in The Signal and as directional context. QA should attempt independent corroboration or downgrade language to "practitioners estimate."
Goldman Sachs 60–130x energy intensity for agentic AI: Reported via Fortune secondarily; original Goldman research note not directly verified. Plausible given token economics. Used with Fortune attribution.
[H11] Enterprise AI budget balloon from $1.2M to $7M: Primary sourcing for the budget balloon figure is secondary; Gartner and Flexera corroborate spend escalation direction. Used in The Brief with directional framing.

Sections needing additional sourcing:

The Playbook — the 80M token/month self-hosting break-even threshold is drawn from practitioner analysis in Trend Spotter Signal 6. It is not from a named independent research firm. If QA can identify a primary source (Epoch AI, Gartner) that validates the threshold, update the citation.
The Deep Dive — "29% of AI agents abandoned within 90 days" and "41% had no success metric" figures: drawn from Trend Spotter Signal 5 practitioner sources (@DMVG_JTK, @johniosifov, @0xCodila). Directionally consistent with Gartner and Forrester data but QA should attempt to identify a primary survey source.

Research gaps incorporated:

Amazon Q2 2026 earnings gap: Amazon's Q2 2026 earnings (July 30–31) were not available at research compilation. Q1 2026 data ($37.6B AWS revenue, $150B ARR, $15B AI run rate, $200B capex guidance) was used throughout. Q2 data flagged explicitly in What's Next as the primary near-term watch item.
Microsoft FY2026 Q4 earnings: Released July 29, 2026 — same day as this issue. Full transcript not available. Q2 FY2026 data used (ARR $37B, Azure +39%, 123% YoY AI ARR growth).
Open-source vs. closed model enterprise spend (89%/11%): Used directionally with attribution disclosure; primary survey source not identified. Note this in QA.
On-premise LLM market share (~60%): Not cited as a specific figure in body copy per QA guidance. Cited directionally as "the cloud-first default is breaking at volume" with Flexera hybrid cloud data as the anchor.
Data quality as root cause of AI failures: Excluded — source (techment.com) did not trace to an identifiable primary study.
$500M single-month AI bill: Excluded per QA guidance (anecdotal, no attribution to named organization).
Industry-specific breakdown (financial services, healthcare, manufacturing): Not covered in source materials; noted as a future research priority for a sector-specific edition.

This report was produced with AI assistance and human editorial review.

Vol. 04, No. 04 · July 2026 · Confidential – Subscriber Use Only

arlobriefing.ai
