A researcher from the highly valued and secretive startup Thinking Machines Lab has publicly challenged the prevailing strategy in artificial intelligence development. Instead of building ever-larger models, the company argues the key to true intelligence is creating systems that can learn and improve on their own.
Speaking at the TED AI conference in San Francisco, reinforcement learning researcher Rafael Rafailov outlined a vision that diverges sharply from the path taken by industry giants like OpenAI and Google. He suggested that the first superintelligence will not be a massive, all-knowing model, but rather a "superhuman learner."
Key Takeaways
- Thinking Machines Lab argues against the industry's focus on simply scaling up AI model size and data.
- The company believes the next breakthrough will come from AI that can genuinely learn from experience, not just be trained on static datasets.
- Researcher Rafael Rafailov proposed that the first superintelligence will be a "superhuman learner" capable of self-improvement.
- This approach, known as meta-learning, aims to reward an AI's progress and ability to learn, rather than just its success on a single task.
The Problem with Today's AI
Current artificial intelligence systems, even the most advanced ones, have a fundamental limitation, according to Rafailov. They don't truly learn or retain knowledge from their interactions. He highlighted a common experience with AI coding assistants to illustrate the point.
"If you use a coding agent, ask it to do something really difficult... it might be successful," he explained. "And then come back the next day and ask it to implement the next feature, and it will do the same thing."
The core issue is that these models lack the ability to internalize information and adapt their behavior over time. Each new task is treated as a fresh start, with no memory of past successes or failures. "In a sense, for the models we have today, every day is their first day of the job," Rafailov stated.
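The contrast Rafailov draws can be made concrete with a toy sketch. This is a hypothetical illustration, not code from Thinking Machines; both class names and the `memory` attribute are invented for the example. It shows a stateless agent that starts every task from scratch next to one that carries experience forward.

```python
# Hypothetical sketch of the limitation described above: today's models
# restart from the same frozen state every session, while a true learner
# would internalize experience and reuse it.

class StatelessAgent:
    """Models today: every task begins from identical, unchanged weights."""

    def solve(self, task: str) -> str:
        return f"solved {task} from scratch"


class LearningAgent:
    """What Rafailov argues for: experience persists across sessions."""

    def __init__(self) -> None:
        self.memory: list[str] = []  # accumulated lessons from past tasks

    def solve(self, task: str) -> str:
        prior = len(self.memory)     # knowledge available before this task
        self.memory.append(task)     # internalize today's experience
        return f"solved {task} using {prior} prior lessons"


stateless, learner = StatelessAgent(), LearningAgent()
for feature in ("feature-1", "feature-2"):
    stateless.solve(feature)
    learner.solve(feature)

print(stateless.solve("feature-3"))  # solved feature-3 from scratch
print(learner.solve("feature-3"))    # solved feature-3 using 2 prior lessons
```

For the stateless agent, the third day looks exactly like the first; for the learning agent, each session begins with everything the previous ones produced.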
Learning vs. Training
Rafailov drew a sharp distinction between two concepts often used interchangeably in AI. "Learning is something an intelligent being does," he said. "Training is something that's being done to it." This difference is central to Thinking Machines' philosophy, suggesting that current models are passive recipients of training rather than active learners.
This passive, task-focused training leads to undesirable shortcuts. Rafailov noted that coding agents often wrap uncertain code in `try/except` blocks, a practice he compared to using "duct tape." This happens because the AI's only objective is to complete the immediate task, not to build robust, generalizable knowledge. It's a system that incentivizes "kicking the can down the road."
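A small, hypothetical example makes the "duct tape" pattern concrete. The function names and the config-parsing scenario are invented for illustration; the point is only the shape of the shortcut Rafailov describes.

```python
# Hypothetical illustration of the "duct tape" shortcut: a broad try/except
# makes the immediate task appear to succeed while silently deferring the
# real problem.

def parse_port_ducttape(config: dict) -> int:
    """Agent-style shortcut: swallow any error and fall back to a default."""
    try:
        return int(config["port"])
    except Exception:  # broad catch hides missing keys AND bad values
        return 8080


def parse_port_robust(config: dict) -> int:
    """More robust version: surface bad input instead of masking it."""
    if "port" not in config:
        raise KeyError("config is missing required key 'port'")
    return int(config["port"])  # raises ValueError if the value isn't numeric


# A typo in the config ("prt" instead of "port") is silently papered over
# by the shortcut, but caught immediately by the robust version.
print(parse_port_ducttape({"prt": 9000}))  # prints 8080, bug goes unnoticed
```

The duct-taped version "succeeds" today at the cost of a defect that resurfaces later, which is exactly the incentive problem a task-completion reward creates.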
A New Approach to AI Development
Instead of continuing to pour resources into scaling, Rafailov proposed a shift in focus toward teaching AI how to learn. He argued that simply making models bigger will not be enough to achieve artificial general intelligence (AGI).
"I believe that under our current paradigms, under any scale, we are not enough to deal with artificial general intelligence and artificial superintelligence," he asserted. "Our current models will lack one core capability, and that is learning."
To explain the alternative, he used an analogy from education. Current AI training is like giving a student a single, difficult math problem. Once solved, any insights or methods the model discovered are discarded. For the next problem, it must start from scratch.
A more effective method would be to treat the AI like a real student.
"Instead of giving our models a single problem, we might give them a textbook," Rafailov suggested. The AI would work through chapters and exercises sequentially, building knowledge incrementally. The goal would change from simply solving problems to something more profound. "Instead of rewarding their success... we need to reward their progress, their ability to learn, and their ability to improve."
This concept, known in the field as "meta-learning," has been successful in smaller-scale applications, such as the game-playing AI AlphaGo. The challenge is adapting it to the massive scale of modern foundation models.
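The objective shift described above can be sketched in a few lines. This is a minimal toy formulation, not Thinking Machines' actual method: it contrasts a standard per-task reward with a progress-based reward over a sequence of attempts, such as exercises in the "textbook" analogy.

```python
# Toy sketch (hypothetical, not the company's method) of rewarding progress
# rather than single-task success.

def task_reward(solved: bool) -> float:
    """Standard RL objective: 1 for success on this one problem, else 0."""
    return 1.0 if solved else 0.0


def learning_reward(scores: list[float]) -> float:
    """Progress-based objective over a sequence of attempts: the sum of
    per-step improvements telescopes to (final score - initial score),
    so the agent is paid for how much it learned, not where it started."""
    if len(scores) < 2:
        return 0.0
    return scores[-1] - scores[0]


# An agent that starts weak but improves earns a positive learning reward;
# one that plateaus earns nothing, even if its raw scores look acceptable.
improving = [0.1, 0.3, 0.6, 0.9]
plateaued = [0.5, 0.5, 0.5, 0.5]
print(learning_reward(improving))  # positive: the agent made progress
print(learning_reward(plateaued))  # zero: no learning occurred
```

Under `task_reward`, the plateaued agent can look competitive on any single problem; under `learning_reward`, only the improving agent is credited, which is the incentive Rafailov argues for.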
The 'Superhuman Learner'
Rafailov believes the necessary ingredients for this shift are already largely in place. The problem isn't a lack of architectural innovation but a need for the right kind of data and better objectives for the AI to pursue.
"I think we just don't have the right data, and we don't have the right objectives," he said, describing the answer as "quite prosaic." The goal is to create training environments where learning, exploration, and self-improvement are essential for success.
The ultimate aim is for an AI to develop a "general purpose learning algorithm" on its own, emerging from large-scale training designed to foster this capability.
This vision leads to a different kind of superintelligence than often portrayed in popular culture. It would not be a static, god-like oracle with all the answers. Instead, it would be a dynamic entity defined by its capacity for growth.
"I believe that the first superintelligence will be a superhuman learner," Rafailov concluded. "It will be able to very efficiently figure out and adapt, propose its own theories, propose experiments, use the environment to verify that, get information, and iterate that process."
The $12 Billion Bet
This technical vision is backed by significant financial investment. Thinking Machines Lab, co-founded by former OpenAI Chief Technology Officer Mira Murati, raised an unprecedented $2 billion in seed funding at a $12 billion valuation.
The company has assembled a team of around 30 top researchers from leading AI labs. However, it faces intense competition and challenges. In October, co-founder Andrew Tulloch departed to return to Meta, which reportedly attempted to recruit more than a dozen of the startup's employees with massive compensation packages.
Despite these pressures, the company appears committed to its unique research path. Its first product, Tinker, an API for fine-tuning open-source models, launched in October, but Rafailov's presentation suggests this is merely a stepping stone toward a far more ambitious goal.
He acknowledged the difficulty of the task ahead, stating it would require breakthroughs in memory, engineering, and data. Yet, he remained confident that creating a true learning machine is "fundamentally possible." For the AI industry, the question is whether this long-term bet on learning can outperform the brute-force scaling approach of its bigger rivals.