LAS VEGAS – Nvidia has provided a detailed look at its next-generation artificial intelligence platform, Vera Rubin, announcing at the CES 2026 trade show that the first products will begin shipping in the second half of 2026. The platform, already in production, is engineered to address the growing complexity of AI models.
Company CEO Jensen Huang addressed the industry's rapid expansion, stating that the massive investments in AI are not a bubble but a fundamental shift in corporate spending. He explained that companies are reallocating research and development funds from traditional computing directly into artificial intelligence infrastructure.
Key Takeaways
- Nvidia's new AI platform, Vera Rubin, is in production and scheduled to ship in the latter half of 2026.
- The platform is designed to handle increasingly complex, multi-step AI tasks by focusing on advanced storage and context management.
- Major cloud providers including Microsoft, Amazon Web Services, and Google Cloud are slated to be early adopters.
- CEO Jensen Huang addressed AI bubble concerns, attributing the spending surge to a strategic shift in R&D budgets away from classical computing.
Addressing the Next AI Bottleneck
Nvidia's presentation at the Fontainebleau Las Vegas centered on a critical challenge for the future of AI: managing context. As AI evolves from simple chatbots into sophisticated assistants, or "agents," the demand on computing infrastructure is changing.
These advanced AIs must process vast amounts of information to perform multi-step tasks, a process that strains current hardware. According to Nvidia, the primary limitation is no longer raw processing power but the ability to manage and access data quickly.
"The bottleneck is shifting from compute to context management," said Dion Harris, Nvidia’s senior director of high-performance computing and AI hyperscale solutions, in a briefing. "Storage can no longer be an afterthought."
The Vera Rubin platform introduces a new type of storage system built to help AI models handle these context-heavy requests more efficiently. The company claims its flagship server rack, the Vera Rubin NVL72, will provide more bandwidth than the entire internet, a testament to its focus on data throughput.
From Chatbots to Reasoning Agents
CEO Jensen Huang, wearing his signature leather jacket, illustrated the technological leap with a video demonstration. In the presentation, a user configured a tabletop robot to act as a personal assistant by connecting it to multiple AI models running on an Nvidia DGX Spark desktop computer.
The robot performed tasks like reciting a to-do list and even instructing a dog to get off a couch. Huang noted that building such an assistant, once a monumental task, is now "utterly trivial" for developers using large language models instead of traditional code.
What is AI Inference?
Inference is the process an AI model uses to generate an answer or perform a task based on new input. Initially, this was a simple, one-step process like answering a question. Nvidia argues that for advanced AI agents, inference is becoming a multi-step "thinking process" where the model must reason through a problem, which requires a new class of hardware.
This shift toward more complex reasoning is at the core of the Vera Rubin architecture. "Instead of a one-shot answer, inference is now a thinking process," Huang explained on stage, emphasizing the need for systems that can support this continuous, iterative form of AI computation.
A Crowded and Competitive Market
Nvidia's dominance in the AI chip market has been a major story, propelling the company's valuation to briefly surpass $5 trillion last year. However, this success has invited intense competition and scrutiny.
Major technology companies, including many of Nvidia's largest customers, are developing their own custom AI chips to reduce their dependence on a single supplier. Google and OpenAI have been public about their efforts to create hardware tailored to their specific models.
According to a forecast by McKinsey & Company, global investment in data center infrastructure is expected to reach nearly $7 trillion by 2030, highlighting the immense capital flowing into the sector.
Rivals like AMD and Qualcomm are also intensifying their push into the data center market, creating a more competitive landscape. "Nobody wants to be beholden to Nvidia," commented Ben Barringer, global head of technology research at investment firm Quilter Cheviot, in a previous interview regarding the industry's diversification efforts.
Despite this, Nvidia announced a formidable list of launch partners for Vera Rubin, including:
- Cloud Providers: Microsoft, Amazon Web Services, Google Cloud, and CoreWeave.
- Computing Companies: Dell and Cisco.
- AI Labs: OpenAI, Anthropic, Meta, and xAI.
This broad adoption signals continued reliance on Nvidia's ecosystem, at least in the near term. The company also announced deeper investments into autonomous vehicles with new models like Alpamayo and a focus on "physical AI" for robotics, building on themes from its recent GTC conference.