

Companies are shelling out billions of dollars in the race to buy GPUs and scale data centres, but that hyperscale buildout hinges on a fundamental assumption: powerful AI models need centralised compute for training and inference.
Now, some are challenging that very assumption, suggesting that AI models may not need data centres at all.
Aravind Srinivas, CEO and co-founder of Perplexity, in a recent podcast with Prakhar Gupta, argued that the biggest threat to data centres is local intelligence, where applications do not depend on compute hosted remotely. In this model, compute shifts closer to the user, reducing reliance on a centralised data centre-based infrastructure.
Gavin Baker, CIO and managing partner at Atreides Capital, echoed this view in a recent podcast. He imagined a future in which smartphones house more memory to accommodate pruned versions of frontier AI models, allowing users to access them without relying on the cloud or high-end hardware.
Baker pointed to Apple’s strategy, focused heavily on on-device, privacy-first AI rather than relying directly on powerful cloud-based models. That approach has improved privacy guarantees but limits massive data collection, contributing to Apple lagging in the broader AI ecosystem.
Small AI Models Accelerate the Shift
Efficient and increasingly capable small language models strengthen the on-device case. Google continues to build large frontier systems such as Gemini 3 while also shipping the Gemma family of models. Smaller Gemma variants can run locally, and their performance has consistently improved across benchmarks and evaluations with each release.
Paras Chopra, founder of AI lab Lossfunk, while testing a 270-million-parameter variant of Gemma, observed in a post on X that it was “absolutely wild how coherent and fast it is on my phone.”
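The memory arithmetic behind that coherence is simple: weight storage scales with parameter count and precision. A rough back-of-the-envelope sketch (parameter counts and bit-widths here are illustrative; real runtimes add overhead for activations and the KV cache):

```python
def model_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight-storage footprint of a language model in GB."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 270-million-parameter model, like the small Gemma variant above:
print(model_memory_gb(0.27, 16))  # fp16: ~0.54 GB
print(model_memory_gb(0.27, 4))   # 4-bit quantised: ~0.135 GB

# A 70-billion-parameter model, by contrast:
print(model_memory_gb(70, 4))     # ~35 GB -- well beyond any phone
```

This is why quantisation and pruning are central to the on-device story: dropping from 16-bit to 4-bit weights cuts the footprint by four times before any architectural changes.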
Mobile applications such as PocketPal and Google AI Edge Gallery now allow users to download local models and experiment directly on smartphones. Google has also shipped on-device AI features across its Pixel lineup that do not rely on the cloud, prioritising speed and privacy.
Beyond phones, developers have experimented with modified versions of powerful open-source models running locally on MacBooks with Apple silicon or on a single consumer GPU, achieving cloud-comparable results for specialised workloads.
In the larger consumer device segment, NVIDIA is shipping compact Blackwell-based systems such as the DGX Spark workstation. PC manufacturers are also pushing AI PCs equipped with NPUs that can run AI workloads locally, albeit with limited features.
No, GPUs Will Not Die
The on-device thesis has its limits.
Research institute Epoch AI stated in a recent report that using a “top-of-the-line gaming GPU like NVIDIA’s RTX 5090 (under $2,500), anyone can locally run models matching the absolute frontier of LLM performance from just 6 to 12 months ago.”
“This relatively short and consistent lag means that the most advanced AI capabilities are becoming widely accessible for local development and experimentation in under a year,” the report added.
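The report's claim can be made concrete with a rough fit check against the RTX 5090's 32 GB of VRAM. The model sizes, quantisation levels, and overhead fraction below are illustrative assumptions, not benchmarks:

```python
RTX_5090_VRAM_GB = 32  # headline memory spec for NVIDIA's RTX 5090

def fits_in_vram(params_billions: float, bits_per_weight: int,
                 vram_gb: float = RTX_5090_VRAM_GB,
                 overhead_fraction: float = 0.2) -> bool:
    """Rough check: do the quantised weights, plus an assumed working-memory
    overhead for the KV cache and activations, fit in the card's VRAM?"""
    weights_gb = params_billions * 1e9 * bits_per_weight / 8 / 1e9
    return weights_gb * (1 + overhead_fraction) <= vram_gb

print(fits_in_vram(70, 4))   # 70B at 4-bit: ~35 GB + overhead -> False
print(fits_in_vram(32, 4))   # 32B at 4-bit: ~16 GB + overhead -> True
print(fits_in_vram(8, 16))   # 8B at fp16:  ~16 GB + overhead -> True
```

Under these assumptions, a single consumer card comfortably runs quantised models in the tens of billions of parameters, which is roughly where frontier open models sat a year ago.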

In a conversation with AIM, Sriram Subramanian, cloud computing analyst and founder of market research firm CloudDon, said he expects a mixed model, in which inference is split between the cloud and the device to improve performance. “The other angle is moving to smaller AI models where the requirements aren’t much for the user.”
“GPUs will be the larger pie definitely,” he declared, adding that powerful cloud-based compute will remain necessary for accuracy and high-demand workloads.
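A hybrid setup of the kind Subramanian describes can be sketched as a simple router that keeps lightweight, self-contained prompts on-device and escalates the rest to the cloud. The thresholds and policy here are hypothetical, purely to illustrate the split:

```python
from dataclasses import dataclass

@dataclass
class Query:
    text: str
    needs_live_data: bool = False

def route(query: Query, local_token_limit: int = 256) -> str:
    """Pick an inference target under a hypothetical hybrid policy:
    short, self-contained prompts stay on-device; long prompts or
    anything needing fresh data escalates to the cloud."""
    approx_tokens = len(query.text.split())
    if query.needs_live_data:
        return "cloud"
    if approx_tokens > local_token_limit:
        return "cloud"
    return "on-device"

print(route(Query("Summarise this note")))                      # on-device
print(route(Query("word " * 300)))                              # cloud
print(route(Query("Latest GPU prices", needs_live_data=True)))  # cloud
```

Real routers would weigh accuracy requirements, battery state, and network conditions rather than a token count, but the architectural point is the same: the device handles the common case, the data centre handles the hard one.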
But then there’s the question of performance.
If users want the most accurate and contextually relevant responses, they may continue to prefer cloud-based GPUs, which will remain more powerful than on-device systems, even as local AI proves increasingly capable.
Minh Do, co-founder at Machine Cinema, a community of AI creatives, framed the trade-off in a post on X. “You wouldn’t expect a poorly performing AI but a cheaper AI if the expensive one can accurately diagnose your grandmother or get all your math problems right.”
Moreover, frontier models will continue to improve across tasks and domains, so the capability gap between cloud and on-device AI is unlikely to close soon.
Edge Constraints
Rajesh C Subramaniam, founder and CEO of edge AI services company embedUR, told AIM that “what’s changing [with on-device AI] is where inference makes the most sense.”
He explained that many edge AI workloads are situational and triggered by on-screen context or real-world interactions. These benefit from local processing due to latency, privacy, and cost. “In those cases, pushing inference to the device is simply the more efficient architectural choice.”
“At the same time, the cloud remains essential for tasks such as large-scale model training, fleet-level analytics, coordination across devices, and continuous improvement of models,” he added.
Moreover, hardware economics remain a constraint. DRAM prices are rising, which is expected to increase the cost of smartphones and laptops equipped with cutting-edge memory components to handle AI workloads.
And eventually, memory pressure becomes acute for sensitive workloads, such as facial recognition, payments, and secure access, that involve data that should not leave the device. “In those scenarios, storing embeddings, reference data, and model parameters locally quickly becomes a challenge, especially on phones, laptops, or embedded platforms with strict memory budgets,” Subramaniam explained.
Security further shapes adoption. Carmen Li, CEO of Silicon Data, said users may eventually be concerned about where their data is processed, whether on the phone or in data centres. She noted that trust depends on hardware-backed security, such as chip-level encryption technologies, and their continued advancement.
“Without that, you wouldn’t feel that comfortable… majority of the users will be concerned,” she added.
Subramaniam also pointed to a talent bottleneck. “Modern edge hardware is capable, but extracting performance requires deep expertise in optimisation, quantisation, and hardware-aware deployment,” he said.
Small models may look feasible in theory. Deploying them reliably at scale remains a tricky proposition.
The post Is Perplexity CEO Right About the Threat to AI Data Centres? appeared first on Analytics India Magazine.


