EPAM is challenging the status quo of legacy data infrastructure, highlighting how traditional warehouses and batch-driven pipelines are becoming roadblocks to realising the full potential of generative AI.
To move from experimentation to enterprise-scale impact, organisations must rethink their data foundations for agility, real-time intelligence and AI-native design. In this shift, the company is pushing for intelligent data platforms and touchless engineering to power real-time AI agents.
At DES 2025, Srinivasa Rao Kattuboina, Senior Director and Head of the Data and Analytics Practice at EPAM Systems Inc. (EPAM), delivered a compelling session arguing that the era of agentic AI demands a radical revamp of how data platforms are architected, moving from traditional batch processing toward real-time, intelligent, and open infrastructures.
“AI is no longer just an application layer,” Kattuboina said. “We must now look at data platforms themselves as intelligent systems that integrate AI at their core.”
Why Existing Platforms Are Missing the Mark
Kattuboina noted that most current enterprise data platforms, built over decades through data warehouses, data lakes and lakehouses, are crumbling under the demands of generative AI and agentic systems.
These legacy systems, heavily reliant on batch processing, are unable to support real-time decision-making or autonomous agents that depend on fresh, clean, and reliable data to function effectively.
He described this as a transition from traditional platforms to what he calls intelligent data platforms. These systems are designed not just to store and manage data but also to automate insights, deliver real-time recommendations, and align closely with a company’s AI goals.
One of the standout points Kattuboina emphasised was “dark data”, which refers to enterprise data that has been collected but remains unused.
“Every time we build a model, we only look at a portion of our data,” he said. “Terabytes are sitting in lakes and warehouses, untouched. With agentic systems, even SQL queries can now explore that dark data.”
He argued that the advent of AI assistants and agent-based architectures means organisations can finally start tapping into this hidden potential. But to do that, the data must be real-time, accessible and intelligently integrated across the pipeline.
Rethinking the Stack: From Batch to Real-Time
The shift to agentic AI brings with it new technological imperatives. Kattuboina explained that traditional data engineering practices, like automated pipelines and metadata-driven orchestration, are no longer sufficient.
Instead, he proposed reconfiguring the data architecture, highlighting the need for real-time processing, open architectures, minimal layering and embedded intelligence across the pipeline. Technologies like Apache Iceberg, Flink and Kafka are increasingly becoming the backbone of this transformation.
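To make that concrete, the sketch below shows what a Kafka-to-Flink-to-Iceberg streaming path can look like in PyFlink SQL. It is an illustrative example rather than EPAM's implementation; the topic name, schema, catalog settings and bucket path are assumptions, and the connector jars plus Flink checkpointing (which Iceberg relies on to commit writes) are presumed to be configured separately.

```python
# Illustrative sketch: stream events from Kafka through Flink into an Iceberg table.
# All names (topic, columns, catalog, warehouse path) are hypothetical placeholders.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Source: raw order events arriving on a Kafka topic (assumed JSON schema).
t_env.execute_sql("""
    CREATE TABLE orders_raw (
        order_id STRING,
        amount   DOUBLE,
        ts       TIMESTAMP(3),
        WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'orders',
        'properties.bootstrap.servers' = 'broker:9092',
        'scan.startup.mode' = 'latest-offset',
        'format' = 'json'
    )
""")

# Sink: an Iceberg catalog backed by object storage (open table format).
t_env.execute_sql("""
    CREATE CATALOG lake WITH (
        'type' = 'iceberg',
        'catalog-type' = 'hadoop',
        'warehouse' = 's3://my-bucket/warehouse'
    )
""")
t_env.execute_sql("CREATE DATABASE IF NOT EXISTS lake.silver")
t_env.execute_sql("""
    CREATE TABLE IF NOT EXISTS lake.silver.orders (
        order_id STRING,
        amount   DOUBLE,
        ts       TIMESTAMP(3)
    )
""")

# Continuous insert: events land in the open table within seconds of arriving,
# so downstream agents query fresh data instead of yesterday's batch load.
t_env.execute_sql(
    "INSERT INTO lake.silver.orders SELECT order_id, amount, ts FROM orders_raw"
).wait()
```

Because the data lands in an open format, any engine that speaks Iceberg can query it, which is the point Kattuboina makes about open formats doing the heavy lifting.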
“With the pace at which Iceberg is evolving, you may not even need Snowflake or Databricks in the future. Open formats and compute frameworks can do much of the heavy lifting,” he added. Such platforms could dramatically reduce AI implementation timelines—from months or years to just weeks.
The intelligent data platform, as envisioned by Kattuboina, automates not just data ingestion but also transformation, feature engineering, and MLOps workflows. “You connect your source, and your data is processed to the golden layer without manual intervention,” he said.
“You don’t need to insert metadata or orchestrate flows manually. That’s the level of intelligence we’re aiming for,” he added.
Eliminating Redundant Layers with AI Assistants
Kattuboina also explored enterprise use cases driving this shift. A typical scenario is the desire to replace thousands of static reports with a single AI assistant capable of querying real-time data.
However, such a vision is only feasible if the underlying platform is intelligent, nimble and built for AI from the ground up. “Everyone wants to eliminate the reporting layer with an assistant. But that requires the right data representation and infrastructure,” he emphasised.
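What replacing the reporting layer with an assistant can mean in practice is easier to see with a small sketch. The example below is hypothetical: call_llm stands in for whichever model the organisation uses, the schema hint and table names are invented, and the connection is any DB-API handle onto the platform. The point is that the assistant queries live tables rather than a pre-built extract, which is exactly why the underlying platform has to be real-time.

```python
# Hypothetical sketch: an assistant that answers ad-hoc questions by generating
# SQL against live lakehouse tables instead of serving thousands of static reports.

def call_llm(prompt: str) -> str:
    """Placeholder for a call to whatever model provider is in use."""
    raise NotImplementedError("wire up your model provider here")

SCHEMA_HINT = """
Tables:
  sales.orders(order_id, customer_id, amount, order_ts)
  sales.customers(customer_id, region, segment)
"""

def answer(question: str, connection) -> list[tuple]:
    """Turn a natural-language question into SQL and run it on fresh data."""
    prompt = (
        "Write one ANSI SQL query answering the question below.\n"
        f"{SCHEMA_HINT}\nQuestion: {question}\nReturn only the SQL."
    )
    sql = call_llm(prompt)
    cursor = connection.cursor()   # any DB-API connection to the data platform
    cursor.execute(sql)            # the query hits live tables, not a stale report
    return cursor.fetchall()
```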
Too often, organisations treat AI initiatives as separate threads, duplicating data into new stores rather than upgrading existing platforms. The key, he argued, is to bring intelligence into existing infrastructure so that it can serve both traditional analytics and emerging AI use cases.
To illustrate what’s possible, Kattuboina described a project implemented on AWS using Snowflake, Kafka, Flink, and Iceberg. The architecture enabled “touchless” data engineering, where engineers only had to configure table names and layer targets.
The system automatically took care of ingestion, transformation, and orchestration. “You just configure what you want to process,” he said. “The entire pipeline—using Flink, Kafka, and Iceberg—runs without human touch. That’s the future,” he added.
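EPAM did not share the implementation itself, but a minimal, hypothetical sketch of configuration-only pipelines looks something like the following: the engineer declares the source topic, the target tables for each layer and a deduplication key, and the framework expands that spec into the SQL it submits to Flink. All names and configuration keys here are assumptions, not EPAM's tooling.

```python
# Hypothetical illustration of "touchless" configuration: the engineer supplies
# only table names and layer targets, and the framework derives the pipeline.

PIPELINE_SPEC = {
    "source_topic": "orders",              # Kafka-backed source table to ingest
    "bronze_table": "lake.bronze.orders",  # raw landing layer
    "silver_table": "lake.silver.orders",  # cleaned layer
    "dedupe_key": "order_id",              # key used to drop duplicate events
}

def generate_statements(spec: dict) -> list[str]:
    """Expand a declarative spec into the SQL the platform would run."""
    return [
        # Land raw events as-is in the bronze layer.
        f"INSERT INTO {spec['bronze_table']} SELECT * FROM {spec['source_topic']}",
        # Promote deduplicated records to silver without hand-written orchestration.
        f"""INSERT INTO {spec['silver_table']}
            SELECT * FROM (
                SELECT *, ROW_NUMBER() OVER (
                    PARTITION BY {spec['dedupe_key']} ORDER BY ts DESC) AS rn
                FROM {spec['bronze_table']}
            ) WHERE rn = 1""",
    ]

for stmt in generate_statements(PIPELINE_SPEC):
    print(stmt)   # in a real deployment these would be submitted to Flink
```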
Kattuboina concluded with a call for nimble, simple, and real-time platforms that can integrate across multiple AI protocols and cloud ecosystems, from AWS to Azure and GCP. “We’re seeing tremendous pressure to deliver AI fast. The question is, do you need six months to deploy a model, or can you do it in a few weeks?”
For more information on EPAM’s data & analytics capabilities, visit https://welcome.epam.in/