AI Infrastructure
The physical layer of artificial intelligence — servers, networking, power, and cooling. Every AI model runs on real hardware consuming real electricity in real buildings. We analyze the companies building this infrastructure at the intersection of compute demand and physical-world constraints.
Our Thesis
Artificial intelligence is the most capital-intensive technology buildout in human history. The combined hyperscaler capex from Microsoft, Alphabet, Amazon, and Meta exceeded $300 billion in 2025, with commitments to sustain or increase spending through 2027. Every dollar of this capital expenditure flows through a physical supply chain: servers manufactured by Dell and Supermicro, networking equipment from Arista and Broadcom, power infrastructure from Vertiv and Eaton, cooling systems from Schneider Electric and Modine, and data center shells built by specialized construction firms. This is not speculative: the purchase orders are signed, the fabs are running, and the construction crews are pouring concrete.
Our research focuses on a critical insight: while the ultimate returns from AI models are uncertain, the infrastructure to build and run them is being purchased regardless. NVIDIA has already shipped the GPUs. Arista is already booking the switch orders. Vertiv is already manufacturing the power distribution and cooling units. The question for AI infrastructure investors is not whether AI will work — it is whether the current capex cycle is sustainable, which companies capture the highest-margin positions in the supply chain, and what happens to these stocks when capex inevitably moderates or pauses.
The most underappreciated constraint in AI infrastructure is power. A modern GPU cluster running NVIDIA B200 GPUs consumes 70-120kW per rack, compared to 10-15kW for a traditional enterprise server rack. This 7-10x increase in power density is creating a genuine infrastructure crisis: utilities cannot deliver enough power to data center campuses, grid interconnection queues stretch 3-5 years, and some hyperscalers are exploring nuclear power, fuel cells, and on-site generation to solve the bottleneck. Companies positioned at this intersection of AI compute demand and power infrastructure constraint represent the most compelling investment opportunities in the sector.
Research Framework
GPU Server Economics
- NVIDIA allocation dynamics: H100 → H200 → B200 generation transitions and pricing
- Server configuration economics: 8-GPU vs. 72-GPU rack-scale systems, NVLink domain sizing
- Total cost of ownership: capital cost, power, cooling, maintenance, 3-5 year refresh cycles (see the TCO sketch after this list)
- Custom silicon competition: Google TPU v5p, Amazon Trainium2, Microsoft Maia as threats to NVIDIA's dominance
- Inference vs. training hardware divergence: different optimization targets creating market segmentation
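To make the TCO bullet concrete, here is a minimal Python sketch of the server-level arithmetic. Every input (the $250K server price, 12kW draw, PUE, electricity rate, maintenance rate, and refresh cycle) is an illustrative assumption, not a vendor quote:

```python
# Illustrative GPU server TCO sketch. Every number here is an assumption
# for demonstration, not a vendor quote or measured figure.

def server_tco(capex_usd: float,
               power_kw: float,
               pue: float = 1.25,              # assumed facility overhead
               elec_usd_per_kwh: float = 0.07,
               maintenance_pct: float = 0.05,  # annual, as share of capex
               years: int = 4) -> dict:
    """Total cost of ownership for one GPU server over a refresh cycle."""
    hours = years * 8760
    energy_cost = power_kw * pue * hours * elec_usd_per_kwh
    maintenance = capex_usd * maintenance_pct * years
    total = capex_usd + energy_cost + maintenance
    return {
        "capex": capex_usd,
        "energy": round(energy_cost),
        "maintenance": round(maintenance),
        "total": round(total),
        "opex_share": round((energy_cost + maintenance) / total, 3),
    }

# Hypothetical 8-GPU server: $250K capex, 12 kW draw, 4-year refresh.
print(server_tco(capex_usd=250_000, power_kw=12.0))
```

Under these assumptions, energy and maintenance add roughly a third on top of the purchase price over the refresh cycle, which is why TCO rather than sticker price drives purchasing decisions.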
Data Center Networking
- Fabric architecture: InfiniBand vs. Ultra Ethernet for GPU cluster interconnect
- Optical transceiver demand: 800G ramping, 1.6T in qualification, co-packaged optics on the roadmap
- Switch ASIC economics: Broadcom Tomahawk/Jericho, Marvell, and custom hyperscaler silicon
- Network topology optimization: fat-tree, dragonfly, and rail-optimized designs for training workloads (see the sizing sketch after this list)
- East-west bandwidth scaling: traffic patterns within AI clusters vs. traditional north-south
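As a concrete topology example, the sketch below sizes a classic k-ary fat-tree, one of the designs named above. The standard result is that k-port switches yield k³/4 hosts from 5k²/4 switches; the 64-port radix is an illustrative choice:

```python
# Back-of-envelope k-ary fat-tree sizing (standard result: a fabric built
# from k-port switches supports k^3/4 hosts using 5k^2/4 switches).

def fat_tree(k: int) -> dict:
    """Switch and host counts for a k-ary fat-tree (k must be even)."""
    assert k % 2 == 0, "fat-tree radix must be even"
    hosts = k**3 // 4
    edge = agg = k * (k // 2)   # k pods, each with k/2 edge + k/2 agg switches
    core = (k // 2) ** 2
    return {"hosts": hosts, "edge": edge, "agg": agg,
            "core": core, "total_switches": edge + agg + core}

# Example: 64-port switches, a common ASIC radix at 800G.
print(fat_tree(64))   # ~65K hosts from 5,120 switches
```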
Power Infrastructure
- Data center power demand forecasting: AI-driven growth inflection beyond historical trend lines
- Grid interconnection constraints: 3-5 year queue backlogs, utility infrastructure investment cycles
- On-site generation: natural gas turbines, fuel cells (Bloom Energy), small modular reactors
- Power distribution: UPS systems, switchgear, busway, PDU density at 70-120kW per rack (see the power budget sketch after this list)
- Battery energy storage systems (BESS): backup power and grid services at data center scale
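The power budget sketch below shows why rack density dominates facility planning. The 100MW of IT capacity, the PUE of 1.2, and the per-rack densities are assumptions for illustration, not data from any specific campus:

```python
# Illustrative campus power budget at AI rack densities. All inputs are
# assumptions for demonstration, not measured facility data.

def campus_racks(it_capacity_mw: float, rack_kw: float, pue: float = 1.2):
    """Racks that fit in a given IT capacity, and the implied grid draw."""
    it_kw = it_capacity_mw * 1_000
    racks = int(it_kw // rack_kw)
    grid_mw = it_capacity_mw * pue   # facility draw including overhead
    return racks, grid_mw

for rack_kw in (15, 70, 120):        # enterprise vs. AI rack densities
    racks, grid = campus_racks(it_capacity_mw=100, rack_kw=rack_kw)
    print(f"{rack_kw:>4} kW/rack: {racks:>6} racks, {grid:.0f} MW from grid")
```

At 120kW per rack, the same 100MW shell hosts roughly one-eighth as many racks as at enterprise densities, concentrating enormous value in power distribution and cooling per rack.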
Cooling & Thermal Management
- Liquid cooling transition: direct-to-chip, immersion cooling, rear-door heat exchangers
- Coolant distribution unit (CDU) market: Vertiv, CoolIT, GRC competitive landscape
- PUE (Power Usage Effectiveness) optimization at high-density rack configurations (see the PUE sketch after this list)
- Water consumption analysis: evaporative cooling constraints in water-scarce regions
- Thermal management supply chain: heat sinks, cold plates, pumps, manifolds, facility piping
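PUE itself is simple arithmetic: total facility power divided by IT power. The sketch below contrasts assumed air-cooled and liquid-cooled overheads; the specific kW figures are illustrative, not measured data:

```python
# PUE arithmetic: total facility power divided by IT power. The kW figures
# are illustrative assumptions contrasting air and liquid cooling.

def pue(it_kw: float, cooling_kw: float, other_kw: float) -> float:
    """Power Usage Effectiveness = total facility power / IT power."""
    return (it_kw + cooling_kw + other_kw) / it_kw

air    = pue(it_kw=1000, cooling_kw=400, other_kw=100)   # ~1.5
liquid = pue(it_kw=1000, cooling_kw=150, other_kw=100)   # ~1.25
print(f"air-cooled PUE ~{air:.2f}, liquid-cooled PUE ~{liquid:.2f}")
# At 100 MW of IT load, that PUE gap is ~25 MW of facility power.
```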
How We Analyze AI Infrastructure Demand
Step 1: Capex Commitment Tracking
We start with the source of demand: hyperscaler capital expenditure commitments. Microsoft, Alphabet, Amazon, and Meta collectively guided to over $300 billion in 2025 capex, with a significant and growing portion allocated to AI infrastructure. We decompose these commitments into categories: land and buildings (long lead time, construction firms), power infrastructure (utilities, switchgear, backup generation), networking equipment (switches, transceivers, cabling), and compute (GPU servers, storage, custom silicon). Each category has different timing, different suppliers, and different margin profiles. We track quarterly capex actuals against guidance, leading indicators in construction permits and equipment orders, and forward commitments in earnings calls and investor days to maintain a rolling 12-24 month demand outlook.
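A minimal sketch of this decomposition follows. The category splits below are our illustrative assumptions, not disclosed hyperscaler guidance:

```python
# Step 1 sketch: decompose an aggregate capex guide into supply-chain
# categories. Splits and the total are placeholder assumptions.

GUIDED_CAPEX_B = 300.0   # assumed combined hyperscaler guide, $B

CATEGORY_SPLIT = {       # assumed share of AI data center capex
    "land_and_shell": 0.15,   # construction firms, longest lead time
    "power":          0.15,   # switchgear, UPS, backup generation
    "networking":     0.10,   # switches, optics, cabling
    "compute":        0.60,   # GPU servers, storage, custom silicon
}

for category, share in CATEGORY_SPLIT.items():
    print(f"{category:<15} ~${GUIDED_CAPEX_B * share:>6.1f}B")
```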
Step 2: Supply Chain Bottleneck Mapping
Every supply chain has binding constraints, and in AI infrastructure, these constraints shift over time. In 2023-2024, the primary bottleneck was GPU supply (NVIDIA allocation). In 2025, the constraints have migrated to advanced packaging (TSMC CoWoS capacity), high-bandwidth memory (HBM3E from SK Hynix and Samsung), networking (800G optical transceivers), and increasingly, power infrastructure (grid interconnection, transformer lead times, switchgear availability). We map the full supply chain, identify the current binding constraint, and invest in companies positioned at or adjacent to the bottleneck. Bottleneck positions command pricing power; commodity positions do not. When the bottleneck shifts, so does the investment thesis.
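The logic reduces to a min() over stage capacities. In the sketch below, every capacity figure is hypothetical; the point is the structure, not the numbers:

```python
# Step 2 sketch: a supply chain ships at the rate of its binding constraint.
# Capacities are hypothetical, normalized to accelerator-equivalents/quarter.

stage_capacity = {
    "wafer_starts":       1_200_000,
    "cowos_packaging":      900_000,   # advanced packaging
    "hbm_stacks":           950_000,   # high-bandwidth memory
    "optics_800g":        1_100_000,
    "powered_rack_slots": 1_000_000,   # grid + facility power, normalized
}

bottleneck = min(stage_capacity, key=stage_capacity.get)
throughput = stage_capacity[bottleneck]
print(f"binding constraint: {bottleneck} at {throughput:,}/quarter")
# Pricing power concentrates at the bottleneck; when capacity there expands,
# the constraint (and the thesis) migrates to the next-tightest stage.
```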
Step 3: Unit Economics Analysis
We model the full cost structure of AI infrastructure at the rack level, facility level, and workload level. A single DGX B200 system costs approximately $200-300K, consumes 10-14kW, and requires liquid cooling. A 100MW data center campus represents $2-4 billion in total investment (land, building, power, cooling, compute, networking) and can host roughly 5,000-8,000 GPU servers depending on rack density. At the workload level, we model the cost per token for inference, cost per training FLOP, and the implied revenue requirements for hyperscalers to earn their cost of capital on this infrastructure. If the total installed AI infrastructure base costs $500 billion but generates only $100 billion in AI-attributable revenue, the return on investment is insufficient — and future capex will decelerate regardless of technological promise.
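The implied-revenue hurdle from the last sentence can be sketched in a few lines. The cost of capital, useful life, and margin below are assumptions chosen for illustration:

```python
# Step 3 sketch: the annual revenue an installed AI infrastructure base must
# generate to earn its cost of capital. All inputs are illustrative.

def required_revenue_b(installed_base_b: float,
                       wacc: float = 0.10,        # assumed cost of capital
                       useful_life_yrs: int = 5,  # assumed refresh cycle
                       ebitda_margin: float = 0.50) -> float:
    """Revenue at which cash operating profit covers depreciation plus a
    capital charge on the installed base (a crude steady-state hurdle)."""
    depreciation = installed_base_b / useful_life_yrs
    capital_charge = installed_base_b * wacc
    return (depreciation + capital_charge) / ebitda_margin

# A $500B installed base needs ~$300B/yr of AI-attributable revenue under
# these assumptions; $100B/yr falls well short of the hurdle.
print(f"~${required_revenue_b(500):.0f}B per year")
```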
Step 4: Power Constraint Severity Assessment
Power is the most binding long-term constraint on AI infrastructure buildout. We analyze this at multiple levels: macro (total U.S. data center power demand vs. grid capacity additions), regional (Northern Virginia, Dallas, Phoenix, and other data center hubs approaching utility capacity limits), and facility (individual campus power procurement agreements, utility tariff structures, renewable energy PPA terms). The grid interconnection queue now exceeds 2,600 GW of proposed projects, with average approval timelines of 3-5 years. Utility capex plans, transmission line construction schedules, and generation capacity additions are leading indicators of where data center construction can and cannot proceed. Companies that solve the power bottleneck — through on-site generation, nuclear partnerships, or innovative grid solutions — capture strategic value that extends well beyond their immediate product sales.
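One way to formalize the regional analysis is a toy deficit-year model: demand compounds immediately, while new capacity only energizes after the interconnection lag. All inputs below are hypothetical:

```python
# Step 4 sketch: when does regional data center demand outrun deliverable
# grid capacity? All figures are hypothetical, not utility data.

def first_deficit_year(base_demand_gw: float,
                       demand_growth: float,       # annual growth rate
                       deliverable_gw: float,      # current grid headroom
                       additions_gw_per_yr: float,
                       interconnect_lag_yrs: int = 4):
    """Year (from now) when demand exceeds deliverable capacity, given that
    new additions only energize after the interconnection queue lag."""
    demand, capacity = base_demand_gw, deliverable_gw
    for year in range(1, 16):
        demand *= 1 + demand_growth
        if year > interconnect_lag_yrs:   # queue delay before energization
            capacity += additions_gw_per_yr
        if demand > capacity:
            return year
    return None

print(first_deficit_year(base_demand_gw=5.0, demand_growth=0.15,
                         deliverable_gw=7.0, additions_gw_per_yr=1.0))
```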
Step 5: Cycle Risk & Valuation
AI infrastructure stocks carry significant cycle risk. The history of technology infrastructure buildouts — fiber optics in 2000, cloud computing in 2018, 5G in 2021 — shows that capex cycles overshoot, pause, and then resume at a more sustainable rate. We model three scenarios: sustained acceleration (hyperscalers increase capex further as AI revenue materializes), plateau (capex remains flat as the current infrastructure digests), and correction (capex pulls back 20-30% as ROI disappoints). Each scenario produces dramatically different outcomes for infrastructure suppliers. Our valuation framework incorporates cycle-adjusted earnings, through-cycle FCF yields, and explicit capex sensitivity analysis. The goal is to own the right companies at prices that offer acceptable returns even in the correction scenario — and exceptional returns if the acceleration continues.
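In sketch form, the framework reduces to probability-weighted scenario values with an explicit downside check. The probabilities, EPS figures, and multiples below are placeholders, not estimates for any covered company:

```python
# Step 5 sketch: probability-weighted value across the three capex scenarios.
# Scenario earnings, multiples, and probabilities are all assumptions.

SCENARIOS = {
    #  name:         (probability, capex path, assumed EPS, assumed P/E)
    "acceleration": (0.30, "+20%", 12.0, 30),
    "plateau":      (0.45, "flat", 10.0, 22),
    "correction":   (0.25, "-25%",  6.5, 15),
}

expected = sum(p * eps * pe for p, _, eps, pe in SCENARIOS.values())
downside = min(eps * pe for _, _, eps, pe in SCENARIOS.values())
print(f"probability-weighted value ~${expected:.0f}, "
      f"correction-scenario value ~${downside:.0f}")
# The discipline: pay a price that still works at the correction value.
```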
Coverage Universe
- Dell Technologies (DELL)
- Super Micro Computer (SMCI)
- Hewlett Packard Enterprise (HPE)
- Applied Digital (APLD)
- CoreWeave (CRWV)
- Oracle (ORCL)
- Arista Networks (ANET)
- Broadcom (AVGO)
- Ciena (CIEN)
- Juniper Networks (JNPR)
- Cloudflare (NET)
- Cisco Systems (CSCO)
- Vertiv Holdings (VRT)
- Eaton Corporation (ETN)
- Powell Industries (POWL)
- Bloom Energy (BE)
- Generac (GNRC)
- Schneider Electric (SBGSF)
- Modine Manufacturing (MOD)
- nVent Electric (NVT)
- Regal Rexnord (RRX)
- Johnson Controls (JCI)
- Trane Technologies (TT)
- Datadog (DDOG)
- MongoDB (MDB)
- Confluent (CFLT)
- Elastic (ESTC)
- AIPI (AI Infrastructure)
- BOTZ (Robotics & AI)
- QTUM (Quantum/AI)
- IRBO (iShares Robotics & AI)
- IGV (Software)
Current Themes
Hyperscaler Capex Sustainability
The central question for AI infrastructure investors is whether current capex levels are sustainable or represent a peak. Microsoft guided $80B+ for FY2025, Meta $60-65B, Alphabet $75B, and Amazon $100B+. If these companies see measurable ROI from AI products — Copilot revenue lift, AI-powered ad targeting improvements, AWS AI service adoption — capex sustains or accelerates. If ROI disappoints, capex moderates. We track leading indicators of AI revenue realization at each hyperscaler to gauge the sustainability of the infrastructure buildout.
Edge Inference Buildout
While training workloads concentrate in massive centralized data centers, inference is migrating to the edge. On-device inference (phones, PCs, cars), enterprise inference (private clouds, on-premise servers), and telecom edge inference create demand for a different class of infrastructure — smaller, distributed, optimized for latency rather than throughput. Companies like Dell (enterprise servers), Cloudflare (edge network), and Applied Digital (distributed GPU cloud) are positioned for this second wave of AI infrastructure demand that follows the centralized training buildout.
Power Bottleneck
Data center power demand in the U.S. is projected to double from 17 GW in 2022 to 35+ GW by 2030, driven almost entirely by AI workloads. Utilities cannot build generation and transmission infrastructure at this pace — permitting alone takes 3-5 years for new power plants and transmission lines. This creates an infrastructure bottleneck that benefits companies in power equipment (Vertiv, Powell Industries, Eaton), on-site generation (Bloom Energy fuel cells, small modular nuclear), and grid modernization (transformers, switchgear, smart grid). The power constraint is the single most important variable in AI infrastructure investing over the next five years.
Sovereign AI Clouds
Governments worldwide are mandating that AI infrastructure for sensitive applications be hosted within national borders, on nationally controlled hardware, under domestic data sovereignty laws. This drives demand for AI infrastructure outside the U.S. hyperscaler footprint — local data centers, nationally sourced power, and sometimes locally manufactured hardware. Countries from Saudi Arabia to France to India are investing billions in sovereign AI infrastructure. Companies that can deliver turnkey AI data center solutions internationally capture a market that barely existed 18 months ago.
Interested in our AI infrastructure research?
All inquiries are treated with strict confidentiality.
Investor Inquiries