GTC 2026 ran from March 16 to 19 in San Jose. Jensen Huang delivered the keynote on March 16 to a sold-out arena with 450 companies sponsoring the event, 2,000 speakers, and 1,000 technical sessions. The address ran just over two hours and marked CUDA’s 20th anniversary.
Six major themes ran through the keynote:
- Vera Rubin computing platform: the successor to Blackwell, in production now, designed from the ground up for inference and agentic workloads
- OpenClaw and NemoClaw: OpenClaw is an open-source agentic AI framework Jensen compared to Linux; NemoClaw is an enterprise-secure reference design built on it for corporate deployment
- The $1 trillion demand signal: Jensen raised the AI compute demand outlook from $500 billion through 2026 to $1 trillion through 2027, driven by the inference inflection
- Open frontier models: Nemotron 3, Cosmos 2, Groot 2, and Alpamayo covering language, robotics, autonomous vehicles, and physical simulation, plus a new partner coalition for Nemotron 4
- Physical AI: BYD, Hyundai, Nissan, and Geely joining the RoboTaxi platform, 110 robots on the show floor, and a live Disney Olaf robot powered by NVIDIA Newton physics simulation
- DLSS 5: AI-powered neural rendering for real-time graphics, bringing a generational leap in visual fidelity to games and simulation
Below, we cover: the Vera Rubin hardware platform, the inference economy and token factory concept, the OpenClaw and NemoClaw agentic moment, physical AI advances, and what GTC 2026 means for data leaders.
The new AI infrastructure: Vera Rubin and what comes after
Jensen opened with CUDA’s 20-year origin story as strategy, not nostalgia. The installed base of hundreds of millions of CUDA-enabled GPUs creates a self-reinforcing flywheel: new algorithms drive new markets, which grow the installed base, which attracts more developers. That flywheel is now accelerating faster than at any prior point.
This year’s hardware story centers on Vera Rubin, and what is on the roadmap after it.
1. Vera Rubin: what was announced
The Vera Rubin platform is in production now. It packages seven chip types into five rack-scale computers that operate as a single AI supercomputer: Vera CPUs, Rubin GPUs, NVLink 6 switches, ConnectX-9 NICs, BlueField 4 DPUs, Spectrum-X co-packaged optical NICs, and Groq 3 LPUs. The headline specs: 3.6 exaflops of compute and 260 terabytes per second of all-to-all NVLink bandwidth.
Third-party analysis from Semi Analysis confirmed, and in places exceeded, Jensen’s performance claims, finding roughly 50x more tokens per watt compared to Hopper H200. Jensen noted that analyst Dylan Patel “accused me of sandbagging.” Patel was right.
The system is 100% liquid-cooled with hot-water cooling at 45 degrees Celsius, which removes the cost of air-cooling infrastructure from the data center. Installation time dropped from two days to two hours.
2. The Groq 3 LPU: why NVIDIA acquired Groq
NVIDIA acquired the Groq team and technology in late 2025. The Groq 3 LPU is purpose-built for inference: deterministic, statically compiled, with massive on-chip SRAM. It excels at the decode phase of inference, where bandwidth and token generation speed constrain throughput.
Combined with Vera Rubin’s prefill strength, the two processors are tightly coupled through NVIDIA Dynamo, a software layer that disaggregates inference across the two architectures. Dynamo sends the prefill phase to Vera Rubin and the decode phase to the Groq chip. The result: 35x more throughput per megawatt compared to Blackwell alone, and a new performance tier for high-speed token generation that no single architecture could previously reach.
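The routing idea behind disaggregated inference can be sketched in a few lines. This is a hedged illustration of the concept only: every name below (`Request`, `PrefillPool`, `DecodePool`, `disaggregated_inference`) is invented for this sketch and is not the Dynamo API.

```python
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    max_new_tokens: int

class PrefillPool:
    """Illustrative stand-in for the prefill-optimized tier (e.g. Rubin GPUs)."""
    def run(self, request: Request) -> dict:
        # A real system would process the whole prompt in parallel here
        # and return the populated KV cache.
        return {"kv_cache": f"kv({len(request.prompt)} chars)"}

class DecodePool:
    """Illustrative stand-in for the decode-optimized tier (e.g. an LPU)."""
    def run(self, kv_cache: str, max_new_tokens: int) -> list:
        # A real system would generate tokens one at a time, bound by
        # memory bandwidth rather than raw compute.
        return [f"tok{i}" for i in range(max_new_tokens)]

def disaggregated_inference(request: Request,
                            prefill: PrefillPool,
                            decode: DecodePool) -> list:
    """Route the compute-bound prefill and the bandwidth-bound decode
    to different hardware pools, mirroring the routing idea described
    for Dynamo."""
    state = prefill.run(request)
    return decode.run(state["kv_cache"], request.max_new_tokens)

print(disaggregated_inference(Request("Hello", 3), PrefillPool(), DecodePool()))
# ['tok0', 'tok1', 'tok2']
```

The point of the split is that the two phases have opposite bottlenecks: prefill is compute-bound, decode is bandwidth-bound, so matching each to specialized hardware raises whole-system throughput.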
3. The Feynman roadmap
Jensen sketched the 2028 roadmap. The Feynman family brings a new GPU, the LP40 LPU built jointly with the Groq team, the Rosa CPU, BlueField 5, and Kyber-CPO scale-up using co-packaged optics. NVIDIA is also developing Vera Rubin Space-1 for orbital data centers, where thermal management works through radiation.
The pattern across generations: one new architecture per year, each maintaining full backward compatibility. The longer an installation runs, the lower the effective cost per token. This is NVIDIA’s argument for supporting every deployed GPU indefinitely through software updates.
The token economy: why Jensen called AI a new commodity
Jensen spent significant time on a single economic equation: your data center is now a token factory, and token throughput per watt is your revenue. This reframe matters because it changes how CEOs, CFOs, and data leaders should evaluate AI infrastructure.
1. The inference inflection
The reasoning behind the trillion-dollar projection is concrete. In the last two years, the compute demand of individual AI workloads increased roughly 10,000 times as reasoning models replaced retrieval-based systems. Usage simultaneously grew about 100 times. Jensen’s conclusion: total AI computing demand has increased approximately 1 million times in two years. AWS has reflected the same signal with an expanded NVIDIA partnership to meet surging inference demand.
That makes inference, not training, the dominant workload. AI systems now spend most of their compute cycles generating tokens: reasoning, using tools, writing and executing code.
2. The tiered token market
Jensen presented an inference market tiered like SaaS: free-tier tokens at one end, premium research tokens at $150 per million at the other. The upgrade from Blackwell to Vera Rubin shifts the entire portfolio up-market by 5 to 10x from the same power budget.
Platform inference providers saw token generation speed rise from roughly 700 to nearly 5,000 tokens per second after NVIDIA updated their software on existing hardware. That is a 7x revenue multiplier without procuring a single new chip.
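The multiplier claim above is easy to sanity-check with back-of-envelope arithmetic. The throughput figures come from the keynote; the $10-per-million-token price is an assumed placeholder, not a quoted rate.

```python
# Back-of-envelope check of the "7x revenue multiplier" claim.
# Assumes revenue scales linearly with token throughput at a fixed price.
before_tps = 700    # tokens/second before the software update (keynote figure)
after_tps = 5000    # tokens/second after (keynote figure)

multiplier = after_tps / before_tps
print(f"{multiplier:.1f}x")  # 7.1x

# At a hypothetical $10 per million tokens, per-accelerator revenue per hour:
price_per_million = 10.0

def revenue_per_hour(tps: float) -> float:
    return tps * 3600 / 1_000_000 * price_per_million

print(f"${revenue_per_hour(before_tps):.2f}/hr -> "
      f"${revenue_per_hour(after_tps):.2f}/hr")  # $25.20/hr -> $180.00/hr
```

The same arithmetic is why "tokens per watt" works as a revenue metric: at a fixed price tier and power budget, throughput is the only variable left.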
3. The AI factory platform: NVIDIA DSX
NVIDIA DSX is the answer to AI factory design at scale. It is an Omniverse-based digital twin that lets data center designers simulate physical, thermal, electrical, and network conditions before construction begins. DSX MaxQ then dynamically optimizes token throughput against available power once the data center is live.
Jensen argued that a factor of 2 improvement in effective token output is available inside existing data centers through better power and thermal management, without adding a single chip. At the scale of gigawatt data centers, that represents billions in recovered revenue.
OpenClaw and NemoClaw: the enterprise agent OS moment
The agentic AI announcement drew the strongest reaction. Jensen compared it directly to the launch of Linux and HTTP, the moments when entire computing eras crystallized around a single open standard.
1. What OpenClaw is
OpenClaw is an open-source agentic framework by Peter Steinberger. Jensen described it as the fastest-growing open-source project in history, surpassing Linux’s 30-year adoption in weeks. Its primitives map directly to an operating system: resource management, tool access, file system access, LLM connectivity, scheduling, and sub-agent spawning.
His summary: “OpenClaw has open-sourced essentially the operating system of agentic computers.” He compared it to Linux, HTTP, and Kubernetes: each one defined the rules of a computing era. Every enterprise now needs an OpenClaw strategy for the same reason every enterprise once needed a Kubernetes strategy.
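To make the operating-system analogy concrete, the primitives above can be sketched as a single interface. This is a hypothetical Python rendering of the idea, not the actual OpenClaw API; every name below is invented for illustration.

```python
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class AgentRuntime(Protocol):
    """Hypothetical interface for the OS-like primitives the keynote
    attributed to OpenClaw. Invented names, not the real OpenClaw API."""
    def read_file(self, path: str) -> str: ...                 # file system access
    def call_tool(self, name: str, **kwargs: Any) -> Any: ...  # tool access
    def query_llm(self, prompt: str) -> str: ...               # LLM connectivity
    def schedule(self, task: Any, when: str) -> None: ...      # scheduling
    def spawn_agent(self, goal: str) -> "AgentRuntime": ...    # sub-agents

class StubRuntime:
    """Minimal stand-in implementation, just to show the shape."""
    def read_file(self, path: str) -> str:
        return f"<contents of {path}>"
    def call_tool(self, name: str, **kwargs: Any) -> Any:
        return {"tool": name, "args": kwargs}
    def query_llm(self, prompt: str) -> str:
        return f"<llm reply to: {prompt}>"
    def schedule(self, task: Any, when: str) -> None:
        pass
    def spawn_agent(self, goal: str) -> "StubRuntime":
        return StubRuntime()

print(isinstance(StubRuntime(), AgentRuntime))  # True
```

Seen this way, the "agent OS" framing is less a metaphor than a type signature: whoever standardizes this interface sets the rules for everything built on top of it.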
2. The enterprise problem OpenClaw introduces
OpenClaw’s default capabilities present a compliance and security challenge that is not theoretical. An autonomous agent that can access sensitive information, execute code, and communicate externally is a significant risk inside a corporate network.
Jensen stated this plainly during the keynote: an agent with full OpenClaw capabilities can access employee records, supply chain data, and financial information, and send it outside the organization. For enterprises, that gap between capability and control needed to be closed before adoption could scale. The AI governance problem this creates is distinct from traditional application security.
3. NemoClaw: the enterprise reference design
NVIDIA’s answer is NemoClaw, a reference design built on OpenClaw with three security layers: OpenShell runtime sandboxing, a privacy router, and network guardrails. Every SaaS provider can connect their policy engines to the NemoClaw layer, making agent behavior configurable without rewriting the agent. The stack is hardware-agnostic and open-source.
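The pluggable-policy idea can be sketched as a wrapper that consults an external policy engine before any agent action runs. This is a hypothetical illustration of the pattern, not NemoClaw’s actual interface; all names below are invented.

```python
# Hypothetical sketch of the pluggable-policy pattern described for
# NemoClaw: agent actions pass through an externally supplied policy
# engine before executing. All names here are invented for illustration.

class PolicyViolation(Exception):
    pass

def make_guarded_action(action_fn, policy_fn):
    """Wrap an agent action so an external policy engine can veto it,
    without the agent itself being rewritten."""
    def guarded(*args, **kwargs):
        decision = policy_fn(action_fn.__name__, args, kwargs)
        if not decision["allow"]:
            raise PolicyViolation(decision["reason"])
        return action_fn(*args, **kwargs)
    return guarded

def send_external(url, payload):
    """An agent capability a corporate policy might want to restrict."""
    return f"sent to {url}"

def corporate_policy(action, args, kwargs):
    """Example policy: block all outbound network actions."""
    if action == "send_external":
        return {"allow": False, "reason": "external egress blocked"}
    return {"allow": True, "reason": ""}

guarded_send = make_guarded_action(send_external, corporate_policy)
try:
    guarded_send("https://example.com", {"field": "value"})
except PolicyViolation as e:
    print(e)  # external egress blocked
```

The design point is separation of concerns: the agent keeps its capabilities, while the enterprise swaps in its own policy function, which is what "configurable without rewriting the agent" implies.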
Jensen’s forecast: every SaaS company will become an “agentic as a service” company, and every engineer will carry an annual token budget alongside their salary. Tokens are the amplifier for human productivity.
Physical AI: autonomous vehicles, robots, and simulation
GTC 2026 showed the first signs of physical AI at commercial scale across two verticals: autonomous vehicles and general-purpose robotics.
1. The RoboTaxi Ready platform
BYD, Hyundai, Nissan, and Geely joined the existing partners (Mercedes, Toyota, and GM) on the NVIDIA RoboTaxi Ready platform. These seven manufacturers together produce roughly 18 million vehicles per year. NVIDIA also announced a partnership with Uber to deploy these vehicles across multiple cities.
The milestone Jensen emphasized: “The ChatGPT moment of self-driving cars has arrived.” NVIDIA’s Alpamayo model now gives vehicles the ability to reason, narrate their decisions in natural language, and follow passenger instructions. The keynote showed a demonstration of a vehicle describing a lane change, explaining how it handled a double-parked obstacle, and adjusting speed on request.
2. The robotics simulation stack
The robotics stack includes four open-source components: Isaac Lab (training and evaluation), Newton (GPU-accelerated differentiable physics, co-developed with DeepMind and Disney), Cosmos World Models (neural simulation for synthetic data), and Groot 2 (reasoning and action model for general-purpose robots).
There were 110 robots on the show floor. The keynote closed with Disney’s Olaf from Frozen, trained entirely inside Omniverse using Newton simulation, walking on stage and conversing with Jensen.
3. Open models for every vertical
NVIDIA released frontier models across six domains. Nemotron 3 covers language, visual understanding, RAG, safety, and speech. Cosmos 2 handles world simulation. Groot 2 addresses robotics. Alpamayo powers autonomous vehicles. BioNemo targets biology and drug discovery. Earth-2 focuses on weather and climate forecasting.
The Nemotron Coalition, including Cursor, Langchain, Mistral, Perplexity, Sarvam, and Black Forest Labs, joined NVIDIA to co-develop Nemotron 4 as a shared open foundation for domain-specific and sovereign AI.
4. DLSS 5: neural rendering goes mainstream
NVIDIA also announced DLSS 5, the next generation of its AI-powered rendering technology. DLSS 5 uses neural networks to generate entire frames rather than just upscaling them, delivering a step change in visual fidelity for games and real-time simulation. Jensen positioned it as proof that AI is not confined to data centers: it runs locally on consumer GPUs, transforming every gaming PC into an AI inference engine. For enterprises, the same neural rendering pipeline powers Omniverse-based digital twins and DSX factory simulations, connecting the consumer GPU ecosystem directly to industrial AI applications.
What GTC 2026 means for data leaders
The hardware and model announcements dominate coverage. For data and analytics leaders, though, the more consequential thread ran more quietly through the session.
1. Structured data is back at the center
Jensen’s architecture talk was also a data talk. He devoted extended time to the “five-layer cake” of AI, with structured data (SQL, Spark, Pandas, and the major cloud data warehouses) sitting at the foundation. The argument: structured data is the “ground truth of business,” and generative AI needs that ground truth to be trustworthy before it can be reliable.
The cuDF library (GPU-accelerated data frames) and cuVS library (GPU-accelerated vector stores) are NVIDIA’s infrastructure answer. The IBM watsonx.data integration showed a concrete result: Nestlé refreshed a supply chain data mart five times faster at 83% lower cost on NVIDIA GPUs. Google Cloud’s BigQuery acceleration cut a major customer’s compute costs by nearly 80%. The same pattern applies across every SQL workload currently running on CPUs.
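One reason the acceleration story lands easily is that cuDF mirrors the pandas API, so moving a workload to GPU is often little more than an import swap. The sketch below uses plain pandas so it runs anywhere; on a GPU host the same code would run under cuDF (for example via `import cudf as pd`). The data is made up.

```python
# cuDF mirrors the pandas API; on a GPU host this same code would run
# under cuDF (e.g. `import cudf as pd`). Data below is invented.
import pandas as pd

# Made-up stand-in for a supply chain data mart table.
orders = pd.DataFrame({
    "region": ["EMEA", "EMEA", "APAC", "APAC"],
    "units":  [120, 80, 200, 50],
})

# A typical data-mart aggregation of the kind GPU data frames accelerate.
summary = orders.groupby("region", as_index=False)["units"].sum()
print(summary)
```

The "acceleration, not disruption" point follows directly: the query logic and the governance around it stay the same; only the execution engine underneath changes.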
This matters for data leaders because it signals acceleration, not disruption. The SQL-based structured data layer your team manages becomes a faster, higher-frequency feed into AI systems. Data governance frameworks built around structured data remain essential; they just need to handle higher velocity.
But Jensen did not stop at structured data. He called unstructured data “useless today” because it cannot be searched, queried, or indexed the way SQL databases can. Then he made the pivot: multimodal AI changes that equation entirely. If an AI agent can read, interpret, and index video, PDFs, Slack threads, and engineering documents, then unstructured data becomes structured. The entire corpus of enterprise knowledge, not just what lives in a data warehouse, becomes queryable.
He gave a concrete example: an AI agent that can read a company’s entire document history and answer questions about it, find patterns across years of unstructured records, flag risks buried in contract language, or surface insights from customer call transcripts that no analyst has time to review manually. That is not a theoretical capability. It is what multimodal models with enterprise access will do at production scale within the next architecture generation.
For data leaders, this expands the governance surface area dramatically. The same metadata layer that tracks lineage and ownership for structured datasets needs to extend to vector stores, document indexes, and multimodal embeddings. An AI agent that can read everything is only trustworthy if you know what it read, where that data came from, and whether it was approved for AI use. Organizations building that foundation now, cataloging and classifying both structured and unstructured assets under a unified context layer, will be ready when the capability arrives. The rest will be retrofitting governance after agents are already loose in their document stores.
2. Agents will access your data, and governance is not optional
The NemoClaw announcement is the most direct signal for data teams. NemoClaw enables autonomous agents to access file systems, execute code, query databases, and communicate across enterprise applications. Jensen named the challenge plainly: agents can access employee information, supply chain data, and financial records, and send it out.
NemoClaw’s sandboxing and network guardrails address containment. They do not address the data governance question underneath: which data can an agent read? Which data lineage paths does it traverse? Who authorized it to act on a particular dataset, and when was that authorization last reviewed?
NemoClaw’s existence is itself validation. NVIDIA is telling the market that agent governance is a production requirement, not a future consideration. But network containment is only the outer wall. The semantic layer, knowing which tables carry PII, which datasets have been certified for AI consumption, and which lineage paths are trustworthy, requires a metadata foundation that agent frameworks can query at runtime.
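Reduced to its simplest form, the runtime check described above looks like this: before granting access, the framework consults a metadata layer for certification, PII, and ownership signals. Everything here is a hypothetical sketch; the catalog contents and function names are invented.

```python
# Hypothetical sketch: an agent framework queries a metadata layer at
# runtime before granting dataset access. Catalog and names are invented.

CATALOG = {
    "sales.orders": {"certified_for_ai": True,  "contains_pii": False, "owner": "data-eng"},
    "hr.employees": {"certified_for_ai": False, "contains_pii": True,  "owner": "people-ops"},
}

def can_agent_read(dataset: str, agent_clearance: str) -> tuple:
    """Answer 'may this agent read this dataset?' from metadata signals."""
    meta = CATALOG.get(dataset)
    if meta is None:
        return (False, "unknown dataset: no lineage or ownership on record")
    if not meta["certified_for_ai"]:
        return (False, f"not certified for AI use (owner: {meta['owner']})")
    if meta["contains_pii"] and agent_clearance != "pii-approved":
        return (False, "PII requires pii-approved clearance")
    return (True, "ok")

print(can_agent_read("sales.orders", "standard"))  # (True, 'ok')
print(can_agent_read("hr.employees", "standard"))
```

The deny-by-default branch for unknown datasets is the important design choice: an agent framework can only enforce policies over data the metadata layer actually knows about.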
The organization with that metadata in place is the one ready to respond when agent frameworks like NemoClaw need real policies to enforce. The organization without it will face the problem as an emergency when agents are already deployed. The common context problems data teams face when building agents do not disappear with better hardware; they become more urgent.
3. Token budgets make data quality an inference problem
Jensen predicted that engineers will soon carry an annual token budget alongside their salary. Recent McKinsey research on enterprise AI adoption reinforces this: AI compute is becoming a budgeted resource like cloud spend, not a discretionary experiment.
That reframes data quality as an inference efficiency problem. Every token an agent spends reasoning over stale, undocumented, or ungoverned data is wasted compute. An agent querying a well-documented table with clear ownership, freshness signals, and business context reaches a reliable answer in fewer reasoning steps than one navigating ambiguous schemas and conflicting definitions. Governance guardrails, semantic context, end-to-end lineage, and tribal knowledge encoded as metadata all reduce the token cost of a correct answer.
The context layer is what converts token spend into reliable output. Real-time quality signals tell an agent whether a dataset is fresh. Machine-readable governance policies tell it what data it can and cannot use. Lineage and provenance tell it where a number came from and whether the pipeline that produced it is healthy. A data catalog that surfaces business context and ownership is one component of that layer, but the broader requirement is an operational metadata fabric that gives agents continuous signals about what to trust, what to avoid, and what approvals are needed before acting. Without that fabric, token budgets fund noise rather than signal. The data governance vs. AI governance distinction matters here: you need both, and neither substitutes for the other.
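The economics above can be made concrete with hypothetical numbers: if an agent needs five times as many reasoning steps to work around ambiguous, undocumented data, the token cost of a correct answer scales by the same factor. All figures below are assumptions for illustration only.

```python
# Illustrative arithmetic for the claim that governance reduces the
# token cost of a correct answer. All numbers are hypothetical.
TOKENS_PER_STEP = 2000     # assumed avg tokens per reasoning/tool step
PRICE_PER_MILLION = 15.0   # assumed $/1M tokens

def cost_of_answer(reasoning_steps: int) -> float:
    return reasoning_steps * TOKENS_PER_STEP / 1_000_000 * PRICE_PER_MILLION

governed = cost_of_answer(4)     # well-documented table, few steps
ungoverned = cost_of_answer(20)  # ambiguous schemas, retries, dead ends
print(f"${governed:.3f} vs ${ungoverned:.3f} per answer "
      f"({ungoverned / governed:.0f}x)")  # $0.120 vs $0.600 per answer (5x)
```

Multiply the per-answer gap by millions of agent queries per year and the point stands: under a fixed token budget, data quality is an inference efficiency problem, not just a reporting one.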
4. AI factories create a governance demand problem
Jensen’s AI factory vision, where every enterprise operates a token production facility with DSX managing power, cooling, and throughput, introduces a question the factory metaphor does not answer: how do agents inside that factory know which data to use, whether to trust it, and how to stay compliant while using it?
A factory that produces tokens at gigawatt scale without a governance layer is a factory that produces errors at gigawatt scale. The faster inference runs, the more damage an ungoverned agent can do before a human catches it. DSX optimizes the supply side of token production. The demand side, ensuring that the data feeding those tokens is classified, governed, and auditable, is a separate infrastructure problem. That is the gap data leaders need to close before AI factories reach full production.
Atlan and the governance layer for the agentic era
GTC 2026 surfaced three governance problems that compound on each other. NemoClaw showed that agents need network-level containment. The unstructured data pivot showed that the scope of governable assets is about to expand beyond SQL tables. And the AI factory vision showed that ungoverned inference at scale produces errors at scale. Network guardrails address the first problem. They do not address the second or third.
The missing piece is a metadata and context layer that sits underneath the agent runtime: one that knows what data exists across the estate, who owns it, what is certified for AI use, and what lineage path connects a source table to a downstream decision. NemoClaw’s policy engine architecture is designed to query external systems for exactly these signals. The question for every enterprise is whether that metadata layer exists and is machine-readable when agents come asking.
This is the problem Atlan’s context graph and active metadata architecture are built to address. A unified graph across tables, pipelines, models, and increasingly, vector stores and document indexes, gives governance teams a single surface to manage as the scope of AI-accessible data expands. As multimodal agents begin reaching into unstructured data, the governance requirements do not change in kind; they change in surface area. Lineage tracking, classification, ownership, and trust signals need to extend from data warehouses to embedding pipelines and retrieval indexes.
The opportunity for organizations that invest in this foundation now is significant. When NemoClaw or any agent framework asks “can this agent access this dataset?”, the answer should come from classification labels, ownership records, and certification status maintained in real time, not from a spreadsheet last updated two quarters ago. AI governance tools that operate at inference speed, not audit speed, will define which organizations capture the productivity gains Jensen spent two hours describing.
Book a demo to see how Atlan helps enterprises build the governance foundation that agentic AI requires.
Conclusion
GTC 2026 was a strategic declaration as much as a product announcement. Jensen Huang spent two hours making a single argument: AI is now an industrial production system, tokens are the output, and the entire enterprise stack needs to be redesigned around that fact.
For data leaders, the most urgent takeaway is not which GPU to procure. It is that the scope of governable data is expanding (structured and unstructured), the speed of agent access is accelerating (inference-scale, not batch-scale), and the cost of ungoverned AI is now measured in wasted token budgets and compliance exposure, not just bad dashboards. NemoClaw validates that agent governance is a production requirement. The AI factory vision makes governance a prerequisite for operating at scale.
The organizations that build that governance foundation now, with classified data, clear ownership, end-to-end lineage, and a context layer that agents can query in real time, will capture the productivity gains the rest of the keynote was selling. Book a demo to see how Atlan supports enterprise AI readiness.
FAQs about NVIDIA GTC 2026 keynote recap
1. What is Vera Rubin NVIDIA?
Vera Rubin is NVIDIA’s next-generation AI computing platform, now in production. It delivers 3.6 exaflops of compute and 260 terabytes per second of NVLink 6 bandwidth across 72 GPUs, with roughly 50x more tokens per watt compared to Hopper H200.
2. What is NemoClaw?
NemoClaw is NVIDIA’s enterprise-grade reference design built on OpenClaw. It adds three security layers: OpenShell runtime sandboxing, a privacy router, and network guardrails. It is hardware-agnostic and open-source, designed to connect to any existing enterprise policy engine so organizations can govern agent behavior with their own compliance rules.
3. What is OpenClaw?
OpenClaw is an open-source agentic AI framework by Peter Steinberger. Jensen described it as the fastest-growing open-source project in history, enabling agents to access files, connect to LLMs, use tools, schedule tasks, and spawn sub-agents. He compared it to Linux as an operating system for the agentic era.
4. How does Vera Rubin compare to Blackwell?
Vera Rubin delivers roughly 50x more tokens per watt than Hopper H200, a figure confirmed by Semi Analysis. Combined with the Groq 3 LPU through NVIDIA Dynamo, the system delivers 35x more throughput per megawatt and enables a new tier of high-speed token generation for latency-sensitive workloads.
5. What did NVIDIA announce at GTC 2026 for enterprise data?
NVIDIA announced cuDF and cuVS integrations with IBM watsonx.data, Google BigQuery, and Dell AI Data Platform. The IBM partnership demonstrated a 5x speed increase and 83% cost reduction for Nestlé’s supply chain workloads. NVIDIA also announced NemoClaw for enterprise agent governance and DSX for AI factory management.
6. What is NVIDIA Feynman?
Feynman is NVIDIA’s next architecture after Vera Rubin, targeting 2028. It includes a new GPU, the LP40 LPU (built with the Groq team), the Rosa CPU, BlueField 5, and both copper and co-packaged optical scale-up networking.