NVIDIA Vera CPU: Performance compared to AMD and Intel x86 chips

HIGHLIGHTS

Vera delivers up to 1.5x higher agentic sandbox performance than x86 competitors, NVIDIA claims

Monolithic die design eliminates NUMA latency issues found in rivals

Early benchmarks show 5.5x lower streaming latency than Turin

NVIDIA already enjoys the lion’s share of the AI chip market, and it has now unveiled details of its Vera CPU, turning up the heat on AMD and Intel as they try to shore up their positions in the datacentre.

At GTC 2026, NVIDIA revealed detailed specs and early benchmarks for its much-hyped Vera CPU. Spoiler alert: there is little third-party data yet, but here’s what we know so far about how the NVIDIA Vera CPU stacks up against AMD EPYC Turin and Intel Xeon 6 Granite Rapids across early AI workloads, as revealed by NVIDIA itself.

Vera CPU architecture overview

But before we get into the comparison and performance claims, here’s a quick overview of the new NVIDIA Vera CPU. It uses 88 custom Arm v9.2 “Olympus” cores with a feature called Spatial Multithreading, which physically partitions core resources rather than time-slicing them, yielding 176 threads total, according to NVIDIA’s release. 

This is a fundamentally different approach from AMD’s 192-core EPYC Turin (Zen 5, chiplet-based) and Intel’s Xeon 6980P Granite Rapids, which offers up to 128 cores. Another key differentiator is that all 88 Vera cores sit on a single monolithic compute die, so the chip avoids the NUMA quirks of the chiplet-based x86 offerings from AMD and Intel. According to NVIDIA, that means uniform latency and bandwidth to every core.

On the memory side, the Vera CPU delivers up to 1.2TB/s of total bandwidth via LPDDR5X SOCAMM modules, which works out to roughly 14GB/s per core, about 3x the per-core bandwidth of traditional datacentre CPUs, claims NVIDIA. For comparison, AMD’s top EPYC 9965 offers about 614GB/s per socket, and Intel’s Xeon 6980P Granite Rapids lands in the same range with standard 12-channel DDR5-6400.
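The per-core figure above is simple division, and it can be checked in a few lines. This is a minimal sketch using only the vendor-supplied numbers quoted in this article. The dictionary layout and the `per_core_bandwidth` helper are illustrative names, not part of any NVIDIA or AMD tool, and the even split across cores is a simplifying assumption rather than a measured result.

```python
# Peak socket bandwidth (GB/s) and core counts as quoted in the article.
# These are vendor-supplied figures, not independent measurements.
SOCKETS = {
    "NVIDIA Vera": {"bandwidth_gb_s": 1200.0, "cores": 88},
    "AMD EPYC 9965": {"bandwidth_gb_s": 614.0, "cores": 192},
}


def per_core_bandwidth(bandwidth_gb_s: float, cores: int) -> float:
    """Return peak bandwidth per core in GB/s, assuming an even split."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return bandwidth_gb_s / cores


# Hypothetical helper output: one per-core figure for each socket.
per_core = {name: per_core_bandwidth(**spec) for name, spec in SOCKETS.items()}

for name, value in per_core.items():
    print(f"{name}: {value:.1f} GB/s per core")
```

On these numbers Vera comes out at about 13.6GB/s per core and the EPYC 9965 at about 3.2GB/s per core. The gap is driven as much by Vera’s lower core count as by its higher socket bandwidth, and a peak figure divided evenly across cores says nothing about sustained throughput under real workloads.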

Vera CPU performance in a sandbox

This is where NVIDIA makes its loudest claim: that the Vera CPU delivers up to 1.5x higher agentic sandbox performance under full-socket load than competing x86 platforms. The figure spans compilers, scripting tools, runtime engines, compression, and agentic tool calls. The benchmarks were run specifically against AMD EPYC Turin and Intel Xeon 6 Granite Rapids, as stated in the footnotes of NVIDIA’s technical blog.

Tom’s Hardware reports that NVIDIA claims a 1.5x IPC improvement over Grace, and performance gains of 1.8x to 2.2x over Grace in scripting, compilation, data analytics, graph analytics, and HPC workloads.

NVIDIA specifically names ETL, real-time analytics, and memory-bound workloads as key beneficiaries of the Vera CPU’s per-core bandwidth advantage. The company says the chip’s design maintains throughput even when every core is active. At the rack level, NVIDIA claims the Vera CPU rack delivers 2x the performance per watt of x86-based server racks for RL sandbox evaluation, ETL, and analytics under full system load.

Graph workloads

The Olympus core on the Vera CPU includes a custom prefetch engine for graph database analytics, which NVIDIA describes as a hardware-level optimization for graph traversal patterns, something neither AMD nor Intel currently offers at the ISA or prefetcher level. In NVIDIA’s benchmarks, Vera leads Grace by 1.8x to 2.2x on graph analytics workloads, and leads its AMD and Intel counterparts by roughly 1.5x.

Cross-core throughput and scaling

One area where Vera particularly shines, according to NVIDIA, is cross-core communication, which is critical for data-parallel analytics. Redpanda’s ring-shuffle benchmark found that Vera delivered up to 73% higher cross-core throughput than AMD EPYC Turin. Perhaps more interestingly, Vera continued to scale beyond 64 cores, whereas other architectures flattened out after 32 cores due to memory bandwidth saturation. This is a direct consequence of the Vera CPU’s monolithic die and high-bandwidth fabric.

A few things to remember

It’s important to remember that these results and comparisons are early NVIDIA-sourced or NVIDIA-partner benchmarks, and that independent benchmarks do not exist yet. Deployed Vera CPU cloud instances won’t be broadly available until late 2026. 

Reports suggest there is a known hardware compatibility issue in Vera’s PCIe controllers, which trigger errors when paired with non-NVIDIA GPUs or third-party accelerators. This limits the chip’s standalone appeal for any customer that wants a mixed-vendor deployment: they would have to commit fully to NVIDIA, from the hardware to the software stack.

The Vera CPU currently tops out at 88 cores, versus 192 for AMD’s Turin and 128 for Intel’s Granite Rapids. For workloads that scale purely with core count, the x86 chips still offer more raw parallelism than Vera.

Also worth noting: Olympus is NVIDIA’s first custom CPU core for the datacentre, and it may face security and maturity challenges similar to those Intel historically experienced with Hyper-Threading.

AMD’s Turin remains the throughput-per-dollar champion for general-purpose server workloads, and Intel is betting on Clearwater Forest, built on its upcoming 18A process, to challenge that position. With NVIDIA’s Vera CPU now in the mix, it will be a three-way fight worth watching closely through the rest of 2026 and into 2027.

Jayesh Shinde

Executive Editor at Digit. Technology journalist since Jan 2008, with stints at Indiatimes.com and PCWorld.in. Enthusiastic dad, reluctant traveler, weekend gamer, LOTR nerd, pseudo bon vivant.
