FOMO is causing people to buy too many GPUs, leaving 95% of their capacity unused.

A report from Cast AI, a global automation platform for cloud-native and AI workloads, says companies are pouring billions of dollars into AI infrastructure that sits largely idle.
Its analysis of data from 23,000 Kubernetes clusters found that average GPU utilization in enterprise environments is under 5%, meaning 95% of provisioned GPU capacity goes unused.
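The headline figure is a usage-weighted average across many clusters. A minimal sketch of that calculation, using hypothetical cluster numbers rather than Cast AI's actual dataset:

```python
# Illustrative only: hypothetical per-cluster samples, not Cast AI's data.
def fleet_utilization(clusters):
    """Average GPU utilization across a fleet, weighted by GPU count.

    `clusters` is a list of (gpu_count, avg_utilization) pairs,
    where utilization is a fraction in [0, 1].
    """
    total_gpus = sum(n for n, _ in clusters)
    busy_gpus = sum(n * u for n, u in clusters)
    return busy_gpus / total_gpus

# Three hypothetical clusters: most provisioned capacity sits idle.
sample = [(96, 0.04), (32, 0.08), (16, 0.02)]
used = fleet_utilization(sample)
print(f"{used:.1%} used, {1 - used:.1%} idle")
```

With numbers like these, the fleet lands in the same sub-5% range the report describes, even though individual clusters vary.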

The report noted that an idle CPU core costs pennies per hour, while an idle GPU costs thousands. GPU prices are rising for the first time since EC2 launched in 2006: AWS raised the price of H200 Capacity Blocks by 15% in January 2026, citing supply and demand, reversing a 20-year downward trend in pricing.

The report argued that at these prices, the urge to hoard makes sense: long lead times make releasing capacity that can't be recovered feel riskier than overpaying. But at 5% utilization the math doesn't add up, and hoarding feeds the scarcity cycle that drives prices up.

Laurent Gil, president of Cast AI, told TechNewsWorld, “This shocked us and our customers. Very few people knew they weren’t using those machines well.”
Fear of Being Compute-Strained

“Your goals have to be pretty big to buy too many GPUs,” said Alvin Nguyen, a senior analyst covering infrastructure outsourcing, data center services, and semiconductor research at Forrester Research, a global market research company based in Cambridge, Massachusetts.

He told TechNewsWorld, “Unless you’re a hyperscaler, neocloud, or AI startup, you probably don’t have the use cases to justify the overpurchase of GPUs.”

Dan Herbatschek, CEO and founder of Ramsey Theory Group, a technology holding and innovation company based in New York City, said companies are over-indexing on capacity in anticipation of AI use cases that haven’t yet been deployed.

He told TechNewsWorld, “C-suites are afraid of being compute-strained when agentic AI systems go live. We work with big companies, and they acquire capacity before they need it. But it’s hard to justify that investment right now, when most organizations don’t have any production-ready use cases.”

He went on, “The last time I saw this was with the cloud. We’re definitely in a bubble moment for AI capacity. Leaders are forgetting that it’s not about who has the most compute, but whether you can turn computation into ROI or business outcomes.”
Fear Fuels GPU Overcapacity

Debo Ray, founder of DevZero, a Seattle-based cloud infrastructure and developer productivity company, agreed that fear is the main reason companies are investing in AI infrastructure that goes unused.

He told TechNewsWorld, “If you have one bad outage, teams overprovision. If you miss one GPU reservation, leaders panic-buy capacity. Provisioning decisions get made in response to events, and no one revisits them once the crisis is over.”

He remarked, “We’ve seen clusters with 96 GPUs allocated running at 23% utilization, with 31 replicas sitting idle for 22 hours a day. Teams get called careless for this, but when there’s no feedback loop and no one is watching the gap, overprovisioning is the rational choice. The urge to hoard is a direct response to the fear of scarcity, and that fear is partly real.”
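Ray's numbers translate into concrete dollars. A back-of-envelope sketch, using his quoted cluster size and utilization figure together with an assumed GPU-hour rate (the rate is illustrative, not a price from the article):

```python
# Back-of-envelope idle cost using the figures Ray quotes; the hourly
# rate is an assumption for illustration, not a quoted vendor price.
GPUS_ALLOCATED = 96
UTILIZATION = 0.23           # fraction of allocated GPU-hours doing work
HOURLY_RATE = 4.00           # assumed $/GPU-hour; varies widely by vendor

hours_per_month = 730        # average hours in a month
idle_gpu_hours = GPUS_ALLOCATED * hours_per_month * (1 - UTILIZATION)
monthly_waste = idle_gpu_hours * HOURLY_RATE
print(f"~${monthly_waste:,.0f}/month paid for idle GPU-hours")
```

Even at a modest assumed rate, a single cluster of this shape burns six figures a month on capacity that does no work.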

“When it’s genuinely hard to get capacity back, it makes sense to hold on to it,” he said. “The structural problem is that the team that sets resource requests isn’t the team that pays the cloud bill. So the padding never gets revisited, the cluster autoscaler treats inflated requests as real demand, and waste builds up quietly.”
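The mechanism Ray describes can be sketched in a few lines. In Kubernetes, the cluster autoscaler sizes the cluster from declared resource requests, not observed usage, so padded requests translate directly into extra nodes. A simplified model with hypothetical numbers:

```python
# Simplified model of request-driven scaling: a cluster autoscaler sizes
# the cluster from declared resource requests, not observed usage.
# Node shape and workload numbers are hypothetical.
import math

NODE_GPUS = 8  # GPUs per node (assumed node shape)

def nodes_needed(gpu_requests):
    """Nodes required to fit the given per-replica GPU requests."""
    return math.ceil(sum(gpu_requests) / NODE_GPUS)

# 12 replicas each *request* 4 GPUs but *use* about 1 on average.
requested = [4] * 12
actually_used = [1] * 12

print("nodes provisioned:", nodes_needed(requested))      # scales on padding
print("nodes sufficient: ", nodes_needed(actually_used))  # real demand
```

The gap between the two numbers is exactly the quiet waste he describes: the autoscaler cannot tell padding from demand.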

Idle GPUs carry a direct cost. “Companies are paying a lot of money for economy-class use,” Ray said.

There is also the problem of what doesn’t get built. “AI infrastructure capacity that exists but goes unused is not just waste. It’s an opportunity cost,” Ray added. “Teams say they are waiting for GPU access so they can run experiments. They already have the capacity in their own clusters; they just don’t realize it.”

He went on to say, “There’s also a reliability paradox that most people miss. The assumption is that overprovisioning keeps you safe. It often does the reverse.”

Impact on Capital

Gerald Ramdeen, founder, CEO, and CTO of Luxcore, a semiconductor and optical networking company in New York City, said one of the worst effects of idle GPUs is that they erode capital efficiency and lower the returns on infrastructure spending.

He told TechNewsWorld, “These systems lose value quickly, but the costs of power, cooling, and data centers keep running whether the GPUs are working or not. It also ties up money that could have gone into talent, data, or products.”

He added, “More broadly, it distorts the market by making demand look bigger than actual use, which can lead to more overbuilding and even more defensive buying.”

Hoarding compute also affects the AI ecosystem as a whole. “It concentrates power in the hands of the biggest players and makes it harder for startups, researchers, and smaller businesses to get in,” Ramdeen said. “That can drive prices up, slow down experimentation, and stifle innovation throughout the ecosystem.”

“It also makes a market where success depends too much on holding back supply and not enough on using it well,” he said. “That structure is not good for the industry in the long run.”
Better Management Needed

Ramdeen said some hoarding is reasonable rather than harmful. “It’s a normal reaction to uncertainty about supply,” he remarked. “But in the long run, the companies that win in AI infrastructure won’t just be the ones with the most GPUs. The winners will be the ones who turn hardware into reliable, high-utilization compute through superior orchestration, networking, and economics.”

Lakshya Jain, a technology analyst in Irving, Texas, said underutilization is not a technology problem.

He told TechNewsWorld, “It’s a problem with how companies are set up. Businesses are still figuring out how to use AI. Until they learn to better manage their AI initiatives, track their spending, and get everyone aligned, they will keep buying more compute than they need.”

Siddardha Vangala, co-founder and technical advisor at Tiered World Studios, a Salt Lake City-based games and immersive technology company, said, “The irony of AI compute hoarding is that it undermines the very outcomes these investments are supposed to drive.”

“Companies are betting on AI transformation at the board level, but their infrastructure teams have no targets for how much they should spend, no way to hold people accountable for costs, and no way to connect spending to production output,” he told TechNewsWorld.

He said, “The Cast AI data isn’t surprising to anyone who builds real AI systems. It’s only now becoming visible at the industry level.”
