
Intel Xeon ‘Sapphire Rapids’ vs AMD Ryzen Threadripper Pro


Intel has launched its long-awaited ‘Sapphire Rapids’ workstation processors, but do they have enough to surpass AMD’s Ryzen Threadripper Pro? Greg Corke puts these high-end CPUs through their paces.


Ten years ago, it would have been unthinkable that Intel would today be playing catch-up with AMD in workstation processors. But the overwhelming success of AMD Ryzen Threadripper Pro, coupled with Intel’s failure to launch a true workstation-class processor since 2019, has led to precisely this situation. Intel desperately needs its new ‘Sapphire Rapids’ Xeon processors, specifically the Intel Xeon W-2400 and W-3400, to be a success.


The chip giant certainly has its work cut out. With Threadripper Pro, AMD delivered the holy grail of workstation processors, combining vast numbers of cores (up to 64) with high turbo frequencies and high memory bandwidth to deliver impressive performance wherever your workflows may take you. Whether it is single threaded CAD, multithreaded rendering or memory intensive simulation, Threadripper Pro can handle pretty much anything you throw at it.

Not surprisingly, Intel has followed a similar tack for its new ‘Sapphire Rapids’ workstation processors: up to 56 cores, up to 4.8 GHz turbo and 8-channel DDR5 memory. It also follows AMD in terms of architecture. Like Threadripper Pro, ‘Sapphire Rapids’ processors feature a ‘chiplet’ design where several smaller chips are packaged together as one. This is in contrast to traditional monolithic designs, where all cores are on a single large chip, which is more prone to manufacturing defects and therefore brings lower yields and higher cost.

Intel has a much wider workstation-focused product range than AMD, with a total of fifteen models across its Intel Xeon W-2400 and W-3400 series (see charts below). In contrast, there are only six “Zen 3” Ryzen Threadripper Pro 5000 WX-Series models, sporting 12, 16, 24, 32 or 64 cores. All have 8-channel DDR4 3200 memory.


 


 




Intel Xeon W-2400 / W-3400

Intel differentiates its Xeon W-2400 and Xeon W-3400 processor families in two main ways: by number of cores and by memory channels.

The Xeon W-2400 Series is classified as a ‘mainstream’ workstation processor with eight models ranging from 6 to 24 cores and 4-channel DDR5 4800 memory.

Meanwhile, the Intel Xeon W-3400 Series is for ‘experts’ with seven models ranging from 12 to 56 cores and 8-channel DDR5 4400/4800 memory.

The new processors consist entirely of ‘Golden Cove’ cores. They do not have the hybrid Performance Core (P-Core) / Efficiency Core (E-Core) architecture pioneered by 12th Gen and 13th Gen Intel Core processors.

‘Golden Cove’ is not Intel’s latest CPU architecture. It formed the foundation for the P-Cores in 12th Gen Intel Core.

Beyond the cores, there are some other significant differences between the two processor families. Compared to the Intel Xeon W-2400, the Intel Xeon W-3400 has more memory capacity (4 TB vs 2 TB), more PCIe lanes (112 vs 64), which means it can support more add-in GPUs, more Intel Smart Cache (L3), and a higher max base power (350W vs 225W).

As a first for Xeon processors, certain models — those with an X suffix — are unlocked so the processor can be overclocked. A range of tuning features are available through the Intel Extreme Tuning Utility (Intel XTU).

While it’s highly unlikely that major OEMs will ever go down the overclocking route, this level of control could leave the gates open for specialist workstation manufacturers to differentiate themselves by squeezing more performance out of the platform. This might be one for the future, however. Currently, there are no off-the-shelf All-in-One (AIO) water coolers that we know of for the power-hungry processors, although UK firm Armari has developed a custom liquid cooling solution for its Intel Xeon W-3400 rack workstation (see bottom of article).

Among the Intel Xeon W-2400 Series, the processors that stand out are the Xeon w7-2495X and w7-2475X which combine high core counts with the highest boost frequencies. The lower-end models may be suited to certain Finite Element Analysis (FEA) or other simulation tools that benefit from higher memory bandwidth but can’t necessarily take advantage of large numbers of cores. They can also provide a platform for multi-GPU workflows, such as GPU rendering.

There’s a similar pattern with the Intel Xeon W-3400 Series, with the higher end models featuring the largest number of cores and highest boost frequencies. The range tops out with the 56-core Intel Xeon w9-3495X with a base frequency of 1.9 GHz and a Turbo Boost Max 3.0 of 4.80 GHz.

The lower-end CPUs in the family, such as the Intel Xeon w5-3425, could offer similar potential benefits for engineering simulation, plus support for even more GPUs. You can see the full specs in the tables above.

Meanwhile, Xeon W-2400 and Xeon W-3400 support the latest technologies, including PCIe Gen 5, DDR5 4400/4800 memory (which offers more memory bandwidth than Threadripper Pro’s DDR4 3200) and Intel WiFi 6E.

While the majority of workstations focus on the single socket, high core count Intel Xeon W-2400 and Xeon W-3400 Series, ‘Sapphire Rapids’ does not spell the end for dual processor workstations.

4th Gen Intel Xeon Scalable processors, which are primarily designed for servers, have already made their way into workstations from HP and Lenovo. The top-end model, the Intel Xeon Platinum 8490H, offers 60 cores per processor, which gives you a whopping 120 cores in a dual socket workstation. However, among the major OEMs, you’ll only see this chip in the Lenovo ThinkStation PX (read our review) and, at $17,000 per processor, the market is somewhat limited. The HP Z8 G5 also comes with 4th Gen Intel Xeon Scalable processors, but only those models with up to 32 cores.

Test setup

For our testing we focused on the top end workstation processors from Intel and AMD — the 56-core Intel Xeon w9-3495X and 64-core AMD Ryzen Threadripper Pro 5995WX. We also tested the dual socket 60-core Intel Xeon Platinum 8490H.

You’ll find details of our test machines below. However, it should be noted that both Lenovo workstations were preproduction units, so they may be slightly different to the final shipping machines. Performance, for example, may increase with BIOS updates, so our test results should not be treated as gospel.


Lenovo ThinkStation P7

  • Intel Xeon w9-3495X CPU (56-cores) (1.9 GHz base, 4.80 GHz Turbo Boost 3.0)
  • 256 GB (8 x 32 GB) DDR5 4,800MHz memory
  • 4 x Nvidia RTX A4000 GPU (16 GB)
  • 2 TB Samsung PM9A1 SSD
  • Microsoft Windows 11 Pro for workstations
  • (read our review)

Lenovo ThinkStation PX

  • 2 x Intel Xeon Platinum 8490H CPUs (60-cores) (1.9 GHz base, 3.5 GHz Max Turbo)
  • 256 GB (16 x 16 GB) DDR5 4,800MHz memory
  • Nvidia RTX 6000 Ada Generation GPU (48 GB)
  • 2 TB Samsung PM9A1 SSD
  • Microsoft Windows 11 Pro for workstations
  • (read our review)

Scan 3XS GWP-ME A1128T

  • AMD Ryzen Threadripper Pro 5995WX processor (64-cores) (2.7 GHz base, 4.5 GHz boost)
  • 256 GB (8 x 32GB) Samsung ECC Registered DDR4 3200MHz memory
  • Nvidia RTX 6000 Ada Generation GPU (48 GB)
  • 2TB Samsung 990 Pro NVMe PCIe 4.0 SSD
  • Microsoft Windows 11 Pro
  • (read our review)

Power hungry

To put it bluntly, Intel’s ‘Sapphire Rapids’ processors are very power hungry. Both the Intel Xeon w9-3495X and Intel Xeon Platinum 8490H processors have a base power of 350W. But this is only part of the story.

When rendering in Cinebench, for example, we observed 530W at the socket with the ThinkStation P7 and 1,000W at the socket with the ThinkStation PX. Even when rendering with a single core, the Lenovo ThinkStation P7 drew a substantial 305W.

That’s not to say that the Threadripper Pro 5995WX is that much better. With a default TDP of 280W, the Scan 3XS GWP-ME A1128T workstation still drew 474W at the socket when rendering in Cinebench with all 64-cores.

Finally, it’s important to note that all our tests were done with the ‘ultimate performance’ Windows power plan and power draw may be different with future BIOS updates.


On test – Sapphire Rapids vs Threadripper Pro

We tested all three workstations with a range of real-world applications used in AEC and product development. We also compared performance figures from Intel’s and AMD’s ‘consumer’ processors, including 12th Gen Intel Core (Core i9-12900K), 13th Gen Intel Core (Core i9-13900K), and ‘Zen 4’ AMD Ryzen 7000 Series (AMD Ryzen 7950X), although we did not have data for all our benchmarks.


Computer Aided Design

CAD isn’t a key target workflow for Intel ‘Sapphire Rapids’ or AMD Ryzen Threadripper Pro. In fact, architects, engineers and designers who only use bread-and-butter design tools like Solidworks, Inventor and Revit will almost certainly be better served by 12th or 13th Gen Intel Core processors or AMD Ryzen 7000 (read our comparison article).

Intel and AMD’s entry-level CPU families generally have fewer cores and less memory bandwidth, but higher clock speeds and higher Instructions Per Clock (IPC), which are important for these largely single threaded applications.

But these days, CAD is often just one of many tools used by architects, engineers and designers, some of which do benefit from having more cores or higher memory bandwidth. So, it’s important to understand how ‘Sapphire Rapids’ performs in CAD.

We used Solidworks 2022 as our yardstick, a mechanical CAD application that is largely single threaded or lightly threaded, so only uses a few CPU cores.

As expected, the Intel Core i9-12900K, Intel Core i9-13900K and AMD Ryzen 7950X had a clear lead. These consumer chips have fewer cores but higher turbo frequencies and (apart from the Core i9-12900K) better IPC, so Intel and AMD’s high-end workstation processors simply can’t keep up.

The Xeon w9-3495X did show a small but significant lead over the Threadripper Pro 5995WX in the rebuild, convert and simulate tests. But the Xeon w9-3495X didn’t have things all its own way, lagging behind in the mass properties and boolean operations tests.

To get an idea of pure single threaded performance, albeit through a synthetic rendering test, we also used the Cinebench ST benchmark. Here the Xeon w9-3495X had a clear lead of 22% over the Threadripper Pro 5995WX. Interestingly, despite its significantly lower turbo frequency, the Intel Xeon Platinum 8490H wasn’t that far behind the AMD processor.


Reality modelling

Reality modelling is becoming much more prevalent in the AEC sector. Agisoft Metashape 1.73 is a photogrammetry tool that generates a mesh from multiple high-res photos. It is multi-threaded, but uses multiple CPU cores in fits and starts. It also uses some GPU processing, but to a much lesser extent.

We tested using a benchmark from specialist US workstation manufacturer Puget Systems. The Threadripper Pro 5995WX just about edged out the Xeon w9-3495X in the smaller Rock model test but was 13% faster in the larger school map test. Interestingly, the Xeon Platinum 8490H was way off the pace. We wonder if the software spreads the load across both CPUs but is not optimised for this. It’s hard to explain this by the lower frequency alone.

Point cloud processing software, Leica Cyclone Register 360, assigns threads according to the amount of system memory. On a machine with 64 GB it will run on five threads and on one with 128 GB or more it will run on six.

The Threadripper Pro 5995WX was 10% faster than the Xeon w9-3495X when registering our 99 GB dataset. Both CPUs lagged behind AMD’s and Intel’s consumer processors. Even though those test machines only had 64 GB of memory, so only ran on 5 threads, their higher frequencies and IPC gave them the lead.


Rendering

Ray trace rendering is highly scalable. Roughly speaking, double the number of CPU cores to halve the render time (if frequencies are maintained). The Threadripper Pro 5995WX significantly outperformed the Xeon w9-3495X in KeyShot and V-Ray, two of the most popular tools for design visualisation, and in Cinebench R23, the benchmark for Cinema4D. The Threadripper Pro 5995WX was 35% faster in KeyShot, 27% faster in V-Ray and 20% faster in Cinebench. This is a considerable lead.

But the advantage that AMD’s top-end workstation processor holds over the Xeon w9-3495X is not just down to it having 8 more cores. The relative energy efficiency of both processors and, therefore, the all-core frequencies they can maintain, have a major impact on performance.

In Cinebench, for example, the Threadripper Pro 5995WX maintained 3.05 GHz on all 64-cores while the Xeon w9-3495X went down to 2.54 GHz. The Xeon w9-3495X’s relationship between power, frequency and threads can be seen in more detail in the charts below.
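As a rough illustration of how core count and sustained frequency combine, the sketch below multiplies the two figures quoted above into an aggregate ‘core-GHz’ number for each processor. The model is an assumption made for illustration only: it ignores IPC, simultaneous multithreading and memory effects, and the function and variable names are hypothetical.

```python
# Rough model: render throughput ~ cores x sustained all-core clock.
# Core counts and clocks are the figures quoted in the article; the model
# itself is an assumption and ignores IPC, SMT and memory effects.

def aggregate_ghz(cores: int, all_core_ghz: float) -> float:
    """Total 'core-GHz' available when every core is busy."""
    return cores * all_core_ghz


threadripper = aggregate_ghz(64, 3.05)  # Threadripper Pro 5995WX in Cinebench
xeon = aggregate_ghz(56, 2.54)          # Xeon w9-3495X in Cinebench

clock_advantage = threadripper / xeon   # lead predicted by cores and clocks alone
measured_advantage = 1.20               # 20% Cinebench lead reported in the text

# Whatever the clock-only model over-predicts is attributed to work per clock.
implied_xeon_per_clock_edge = clock_advantage / measured_advantage

print(f"Threadripper Pro core-GHz:   {threadripper:.1f}")
print(f"Xeon core-GHz:               {xeon:.1f}")
print(f"Clock-only advantage:        {clock_advantage:.2f}x")
print(f"Implied Xeon per-clock edge: {implied_xeon_per_clock_edge:.2f}x")
```

On these numbers, cores and clocks alone would predict a lead of roughly 37% for the Threadripper Pro, which is close to the KeyShot result but larger than the 20% seen in Cinebench. One possible reading is that the Xeon gets more work done per clock in that test, but this simple model cannot confirm it.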

Meanwhile, the dual Intel Xeon Platinum 8490H beat both single socket processors considerably. But with 120 cores and 240 threads to play with this came as little surprise.


Engineering simulation

Engineering simulation includes Finite Element Analysis (FEA) and Computational Fluid Dynamics (CFD). FEA can help predict how a product reacts to real-world forces or temperatures. CFD can be used to optimise aerodynamics in cars or predict the impact of wind on buildings. Both types of software are extremely demanding computationally.

There are many different types of ‘solvers’ used in FEA and CFD and each behaves differently, as do different datasets.

In general, CFD scales very well and studies should solve much quicker with more CPU cores. Importantly, CFD can also benefit greatly from memory bandwidth, as each CPU core can be fed data quicker. This is one area in which ‘Sapphire Rapids’ can outperform Threadripper Pro. Both have 8-channel memory, but ‘Sapphire Rapids’ uses faster DDR5 4,800MHz whereas Threadripper Pro uses DDR4 3,200MHz.

For our testing we used three select workloads from the SPECworkstation 3.1 benchmark. This includes two CFD benchmarks (Rodinia, which represents compressible flow, and WPCcfd, which models combustion and turbulence) and one FEA benchmark (CalculiX, which models a jet engine turbine’s internal temperature).

In Rodinia, the Xeon w9-3495X outperformed the Threadripper Pro 5995WX by a whopping 101%. In WPCcfd, the lead was smaller but, at 13%, still significant. The performance of both processors was dwarfed by the dual Intel Xeon Platinum 8490H.
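To put those percentages in terms of waiting time, the short sketch below converts a performance lead into the share of solve time saved. It assumes the benchmark scores behave like throughput (higher is better) and that ‘outperformed by X%’ means a score ratio of 1 + X/100. The function name is hypothetical.

```python
def time_saved(percent_faster: float) -> float:
    """Fraction of run time saved for a given throughput lead (assumed model)."""
    speedup = 1.0 + percent_faster / 100.0
    return 1.0 - 1.0 / speedup


# Leads of the Xeon w9-3495X over the Threadripper Pro 5995WX, from the text.
for name, lead in (("Rodinia", 101.0), ("WPCcfd", 13.0)):
    print(f"{name}: {lead:.0f}% faster -> {time_saved(lead):.1%} less solve time")
```

Under that assumption, a 101% lead means the same job finishes in about half the time, while a 13% lead trims a little over a tenth off the run.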

Both Intel processors fared much worse in the CalculiX (FEA) test, where the Threadripper Pro 5995WX took a substantial lead.


Memory bandwidth

In addition to cores, memory bandwidth is one of the main differentiators between workstation processors and their consumer counterparts.

This is governed largely by the number of memory channels each processor supports, but also by the type of memory.

Memory channels act as pathways between the system memory and the CPU. The more channels a CPU has, the faster data can be delivered.

13th Gen Intel Core and the AMD Ryzen 7000 Series have two memory channels, while the Intel Xeon W-2400 Series has four, and Intel Xeon W-3400 Series, 4th Generation Intel Xeon Scalable and Threadripper Pro 5000 Series all have eight. To get the full memory bandwidth, all memory channels must be populated with memory modules, as was the case with all our test machines.

As mentioned earlier, ‘Sapphire Rapids’ Xeons have an advantage over the AMD Ryzen Threadripper Pro 5000 Series as they support faster memory – DDR5 4,800MHz compared to DDR4 3,200MHz.

A quick run through the SiSoft Sandra benchmark shows the comparative memory bandwidth one can expect. The Threadripper Pro 5995WX recorded 139.27 GB/sec, while the Intel Xeon w9-3495X pulled 184.64 GB/sec and the dual Intel Xeon Platinum 8490H went up to 325.6 GB/sec. These figures help explain why Sapphire Rapids does so well in our memory intensive CFD benchmarks.

To see how memory bandwidth impacts performance in different workflows, we tested the Xeon w9-3495X with a variety of different memory configurations, from 1-channel with a single 32 GB DIMM, all the way up to 8-channels with 8 x 32 GB DIMMs. Interestingly, even with 6-channels, the Xeon w9-3495X edged out the Threadripper Pro 5995WX in memory bandwidth, delivering 141.21 GB/sec in SiSoft Sandra.
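For context, the measured figures above can be set against a theoretical peak, calculated as channels x transfer rate x bytes per transfer. The sketch below is a back-of-envelope estimate under stated assumptions: a 64-bit (8-byte) data path per channel, the quoted ‘MHz’ speeds read as mega-transfers per second, and 16 channels in total for the dual socket machine (eight per processor). The names are hypothetical and real-world results depend on many other factors.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64-bit data path per memory channel


def peak_gb_per_sec(channels: int, mega_transfers: int) -> float:
    """Theoretical peak bandwidth in GB/sec (1 GB = 10**9 bytes)."""
    return channels * mega_transfers * 1e6 * BYTES_PER_TRANSFER / 1e9


# (channels, MT/s, SiSoft Sandra result in GB/sec quoted in the article)
configs = {
    "Threadripper Pro 5995WX, 8-channel DDR4 3200": (8, 3200, 139.27),
    "Xeon w9-3495X, 8-channel DDR5 4800": (8, 4800, 184.64),
    "Xeon w9-3495X, 6-channel DDR5 4800": (6, 4800, 141.21),
    "2 x Xeon Platinum 8490H, 16-channel DDR5 4800": (16, 4800, 325.6),
}

for name, (channels, rate, measured) in configs.items():
    peak = peak_gb_per_sec(channels, rate)
    share = measured / peak
    print(f"{name}: peak {peak:.1f} GB/sec, measured {measured:.2f} ({share:.0%} of peak)")
```

On paper, 8-channel DDR5 4800 offers 50% more bandwidth than 8-channel DDR4 3200. The measured gap in Sandra is closer to a third, which suggests the Threadripper Pro system gets nearer to its theoretical ceiling.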

As most of our benchmarks fit into 32 GB of memory, the fact that we reduced the capacity should have minimal impact on results, although it can’t be ignored altogether. The exception is our Leica Cyclone Register 360 test, which adjusts the number of cores used in relation to system memory. This is why performance drops off massively with 32 GB.

As you can see from the charts on page WS9, memory bandwidth in the WPCcfd benchmark has a massive impact on performance. Interestingly, even with 6-channels filled, the Intel Xeon w9-3495X outperforms the AMD Ryzen Threadripper Pro 5995WX.

Another workflow massively influenced by memory bandwidth is recompiling shaders in Unreal Engine 4.26 which uses all available cores. However, where Threadripper Pro 5995WX loses out in GB/sec it makes up for in cores and all-core frequency, as it still managed to beat the Xeon w9-3495X in our automotive benchmark.

Performance in CAD (Solidworks), ray trace rendering (V-Ray) and reality modelling (Leica Cyclone Register 360 and Agisoft Metashape Professional 1.73) appears to be virtually unaffected by memory bandwidth. There are a couple of caveats in Solidworks. In the simulation test, performance dropped a little when going from 4-channels to 1-channel. In boolean operations, 1-channel memory actually delivered marginally better results.

Conclusion

The importance to Intel of ‘Sapphire Rapids’ Xeon W-2400 and Xeon W-3400 being a success cannot be overstated. For the last few years AMD has had little in the way of competition in workflows that benefit from many cores or high memory bandwidth. Intel will have certainly felt the impact of Threadripper Pro.

From our tests, however, Sapphire Rapids is not going to be the Threadripper Pro 5000 WX-Series killer we thought it might be, at least in the broader product development sector.

In ray trace rendering, the 64-core Threadripper Pro 5995WX still has a considerable lead over the 56-core Xeon w9-3495X. And while Intel may possibly win out at certain price points, simply because it has so many different models across its Xeon W-2400 and W-3400 families, we certainly don’t expect viz specialists to move to ‘Sapphire Rapids’ en masse. Plus, as you move down the range, it will face more competition from 13th Gen Intel Core.

But ‘Sapphire Rapids’ does have some big plusses. In single threaded workflows it appears to have a lead over Threadripper Pro, which could make a real difference in some CAD/BIM applications. Better single threaded performance should also boost 3D frame rates in CPU-limited applications.

We found the biggest potential benefit for ‘Sapphire Rapids’ to come from engineering simulation, specifically CFD. Our tests show that ‘Sapphire Rapids’ can deliver a massive performance boost, largely thanks to its superior memory bandwidth. While solvers and datasets vary, serious users of tools from Ansys, Altair and others should certainly explore what the Xeon W-3400 and 4th Gen Intel Xeon Scalable processors can do for them. Extremely complex simulations can take hours, even days to run. Cutting this time in half could deliver monumental benefits to a project.

All of this is exciting, but one can’t help but keep one eye on the future. AMD is expected to launch its next generation ‘Zen 4’ Threadripper Pro CPUs later this year. And, if rumours of 96-cores and 12-channel memory (DDR5) become a reality, then any lead Intel might have could be short lived.


Overclocking ‘Sapphire Rapids’: pump up the power

Intel’s single socket ‘Sapphire Rapids’ workstation processors can be overclocked. This requires more power to be pumped into the CPU, which, of course, means more heat and therefore liquid cooling. While none of the major OEMs get involved with this, UK firm Armari is an expert.

For the Intel Xeon w9-3495X, Armari has developed a custom water-cooling solution for its 2UR56SR Node, a rack workstation available through its ‘Ripper Rentals’ cloud workstation service. It allows the CPU to support up to 500W on all-core boost — a full 150W above its default TDP.

Armari also has a similar offering for the Threadripper Pro 5995WX, the 2UR64TP-RW Node.

We put both machines through their paces in Cinebench R23.

The Intel Xeon w9-3495X machine hit 2.88 GHz on all cores, 0.3 GHz faster than the air-cooled Lenovo ThinkStation P7.

This delivered a score of 69,811, equating to a significant 19% performance uplift.

The Threadripper Pro 5995WX machine hit 3.35 GHz on all cores, 0.3 GHz faster than the Scan 3XS GWP-ME A1128T. This delivered a score of 76,117, corresponding to an 8% performance uplift.
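The overclocking figures can be put side by side with a little arithmetic. The sketch below compares the all-core frequency gain of each liquid-cooled machine with its reported Cinebench uplift, and works out the extra power budget on the Xeon. The stock clocks (2.54 GHz and 3.05 GHz) are the Cinebench figures quoted in the rendering section, the 500W value is the stated upper limit rather than a measured draw, and the helper name is hypothetical.

```python
def percent_gain(new: float, old: float) -> float:
    """Percentage increase of new over old."""
    return (new / old - 1.0) * 100.0


# All-core clocks in Cinebench: liquid-cooled Armari node vs air-cooled machine.
xeon_clock_gain = percent_gain(2.88, 2.54)         # Xeon w9-3495X
threadripper_clock_gain = percent_gain(3.35, 3.05)  # Threadripper Pro 5995WX

# Power budget on the Xeon: up to 500W vs the 350W default (a limit, not a reading).
xeon_power_gain = percent_gain(500, 350)

print(f"Xeon w9-3495X: clock +{xeon_clock_gain:.1f}%, reported uplift +19%")
print(f"Threadripper Pro 5995WX: clock +{threadripper_clock_gain:.1f}%, reported uplift +8%")
print(f"Xeon w9-3495X power budget: +{xeon_power_gain:.1f}%")
```

The frequency gains and score uplifts do not line up exactly, which is to be expected given the rounded clock figures and the different machines involved. What the numbers do suggest is that the Xeon’s extra performance comes with a considerably larger increase in power budget.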

Armari also offers an overclocked desktop Threadripper Pro workstation. Read our review here.


This article is part of DEVELOP3D’s Workstation Special Report
