In a striking act of self-critique, one of the architects of the transformer technology that powers ChatGPT, Claude, and virtually every major AI system told an audience of industry leaders this week that artificial intelligence research has become dangerously narrow — and that he’s moving on from his own creation.
Llion Jones, who co-authored the seminal 2017 paper “Attention Is All You Need” and even coined the name “transformer,” delivered an unusually candid assessment at the TED AI conference in San Francisco on Tuesday: Despite unprecedented investment and talent flooding into AI, the field has calcified around a single architectural approach, potentially blinding researchers to the next major breakthrough.
“Despite the fact that there’s never been so much interest and resources and money and talent, this has somehow caused the narrowing of the research that we’re doing,” Jones told the audience. The culprit, he argued, is the “immense amount of pressure” from investors demanding returns and researchers scrambling to stand out in an overcrowded field.
The warning carries particular weight given Jones’s role in AI history. The transformer architecture he helped develop at Google has become the foundation of the generative AI boom, enabling systems that can write essays, generate images, and engage in human-like conversation. His paper has been cited more than 100,000 times, making it one of the most influential computer science publications of the century.
Now, as CTO and co-founder of Tokyo-based Sakana AI, Jones is explicitly abandoning his own creation. “I personally made a decision in the beginning of this year that I’m going to drastically reduce the amount of time that I spend on transformers,” he said. “I’m explicitly now exploring and looking for the next big thing.”
Why more AI funding has led to less creative research, according to a transformer pioneer
Jones painted a picture of an AI research community suffering from what he called a paradox: More resources have led to less creativity. He described researchers constantly checking whether they’ve been “scooped” by competitors working on identical ideas, and academics choosing safe, publishable projects over risky, potentially transformative ones.
“If you’re doing standard AI research right now, you kind of have to assume that there’s maybe three or four other groups doing something very similar, or maybe exactly the same,” Jones said, describing an environment where “unfortunately, this pressure damages the science, because people are rushing their papers, and it’s reducing the amount of creativity.”
He drew an analogy from AI itself — the “exploration versus exploitation” trade-off that governs how algorithms search for solutions. When a system exploits too much and explores too little, it finds mediocre local solutions while missing superior alternatives. “We are almost certainly in that situation right now in the AI industry,” Jones argued.
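The trade-off Jones invokes is easy to see in a toy setting. The Python sketch below is an illustration of the general concept only, not code from Jones's talk or from Sakana AI; the arm payouts and exploration rates are invented for the example. It runs a simple multi-armed bandit with an epsilon-greedy policy: an agent that never explores locks onto the first option that pays off, while one that explores a little discovers the genuinely best option.

```python
import random

# Toy multi-armed bandit illustrating the exploration/exploitation trade-off.
# (Hypothetical numbers, for illustration only.)
ARMS = [0.3, 0.5, 0.9]  # payout probability of each "idea"; the last is best


def run_bandit(epsilon: float, steps: int = 2_000, seed: int = 0) -> float:
    """Epsilon-greedy agent: explore a random arm with probability epsilon,
    otherwise exploit the arm with the best reward estimate so far."""
    rng = random.Random(seed)
    counts = [0] * len(ARMS)    # how many times each arm was pulled
    values = [0.0] * len(ARMS)  # running average reward per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(ARMS))                        # explore
        else:
            arm = max(range(len(ARMS)), key=lambda i: values[i])  # exploit
        reward = 1.0 if rng.random() < ARMS[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]       # update average
        total += reward
    return total / steps


# The purely greedy agent settles on whichever arm looks good first and
# never discovers the 0.9 arm; modest exploration finds it.
print("no exploration: ", run_bandit(epsilon=0.0))
print("10% exploration:", run_bandit(epsilon=0.1))
```

In the terms of Jones's analogy, the arms are research directions and the greedy agent is the field he describes: locked onto the architecture that paid off early.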
The implications are sobering. Jones recalled the period just before transformers emerged, when researchers were endlessly tweaking recurrent neural networks — the previous dominant architecture — for incremental gains. Once transformers arrived, all that work suddenly seemed irrelevant. “How much time do you think those researchers would have spent trying to improve the recurrent neural network if they knew something like transformers was around the corner?” he asked.
He worries the field is repeating that pattern. “I’m worried that we’re in that situation right now where we’re just concentrating on one architecture and just permuting it and trying different things, where there might be a breakthrough just around the corner.”
How the ‘Attention Is All You Need’ paper was born from freedom, not pressure
To underscore his point, Jones described the conditions that allowed transformers to emerge in the first place — a stark contrast to today’s environment. The project, he said, was “very organic, bottom up,” born from “talking over lunch or scrawling randomly on the whiteboard in the office.”
Critically, “when we did actually have a good idea, we had the freedom to actually spend time and go and work on it, and even more importantly, we didn’t have any pressure that was coming down from management,” Jones recounted. “No pressure to work on any particular project, publish a number of papers to push a certain metric up.”
That freedom, Jones suggested, is largely absent today. Even researchers recruited for astronomical salaries — “literally a million dollars a year, in some cases” — may not feel empowered to take risks. “Do you think that when they start their new position they feel empowered to try their wild ideas and more speculative ideas, or do they feel immense pressure to prove their worth and once again, go for the low hanging fruit?” he asked.
Why one AI lab is betting that research freedom beats million-dollar salaries
Jones’s proposed solution is deliberately provocative: Turn up the “explore dial” and openly share findings, even at competitive cost. He acknowledged the irony of his position. “It may sound a little controversial to hear one of the Transformers authors stand on stage and tell you that he’s absolutely sick of them, but it’s kind of fair enough, right? I’ve been working on them longer than anyone, with the possible exception of seven people.”
At Sakana AI, Jones said he’s attempting to recreate that pre-transformer environment, with nature-inspired research and minimal pressure to chase publications or compete directly with rivals. He offered researchers a mantra from engineer Brian Cheung: “You should only do the research that wouldn’t happen if you weren’t doing it.”
One example is Sakana’s “continuous thought machine,” which incorporates brain-like synchronization into neural networks. An employee who pitched the idea told Jones he would have faced skepticism and pressure not to waste time at previous employers or academic positions. At Sakana, Jones gave him a week to explore. The project became successful enough to be spotlighted at NeurIPS, a major AI conference.
Jones even suggested that freedom beats compensation in recruiting. “It’s a really, really good way of getting talent,” he said of the exploratory environment. “Think about it, talented, intelligent people, ambitious people, will naturally seek out this kind of environment.”
The transformer’s success may be blocking AI’s next breakthrough
Perhaps most provocatively, Jones suggested transformers may be victims of their own success. “The fact that the current technology is so powerful and flexible… stopped us from looking for better,” he said. “It makes sense that if the current technology was worse, more people would be looking for better.”
He was careful to clarify that he’s not dismissing ongoing transformer research. “There’s still plenty of very important work to be done on current technology and bringing a lot of value in the coming years,” he said. “I’m just saying that given the amount of talent and resources that we have currently, we can afford to do a lot more.”
His ultimate message was one of collaboration over competition. “Genuinely, from my perspective, this is not a competition,” Jones concluded. “We all have the same goal. We all want to see this technology progress so that we can all benefit from it. So if we can all collectively turn up the explore dial and then openly share what we find, we can get to our goal much faster.”
The high stakes of AI’s exploration problem
The remarks arrive at a pivotal moment for artificial intelligence, as the industry grapples with mounting evidence that simply building larger transformer models may be running into diminishing returns. Leading researchers have begun openly discussing whether the current paradigm has fundamental limitations, with some suggesting that architectural innovations, not just scale, will be needed for continued progress toward more capable AI systems.
Jones’s warning suggests that finding those innovations may require dismantling the very incentive structures that have driven AI’s recent boom. With tens of billions of dollars flowing into AI development annually and fierce competition among labs driving secrecy and rapid publication cycles, the exploratory research environment he described seems increasingly distant.
Yet his insider perspective carries unusual weight. As someone who helped create the technology now dominating the field, Jones understands both what it takes to achieve breakthrough innovation and what the industry risks by abandoning that approach. His decision to walk away from transformers — the architecture that made his reputation — adds credibility to a message that might otherwise sound like contrarian positioning.
Whether AI’s power players will heed the call remains uncertain. But Jones offered a pointed reminder of what’s at stake: The next transformer-scale breakthrough could be just around the corner, pursued by researchers with the freedom to explore. Or it could be languishing unexplored while thousands of researchers race to publish incremental improvements on an architecture that one of its creators is, in his own words, “absolutely sick of.”
After all, he’s been working on transformers longer than almost anyone. He would know when it’s time to move on.