Over the last 12 months AMD has emerged as a serious competitor to Intel. We’ve seen great price / performance from the consumer focused 3rd Gen AMD Ryzen, but it’s with 3rd Gen AMD Ryzen Threadripper that AMD has really turned up the heat.
With fast clock speeds, better per-GHz performance and up to 64 cores in a single-socket CPU, Threadripper leaves Intel unable to compete in highly threaded applications like ray trace rendering. This is why Threadripper has been getting so much attention from users of design viz tools like V-Ray and KeyShot.
Despite this market leading performance, AMD’s impact in the workstation market has been minimal.
Smaller workstation manufacturers like Armari, Boxx, Scan and Workstation Specialists have done well with their AMD-based machines, but without the big three on board – that is Dell, HP and Lenovo – AMD was never going to take a significant slice of a market that has been dominated by Intel for so long.
But this looks set to change. AMD has launched Threadripper Pro, a CPU designed specifically for enterprise workstations. More importantly, it has partnered with Lenovo to launch the first Threadripper Pro workstation, the ThinkStation P620.
This is massive news for the industry. It’s the first time in nearly 15 years that a major OEM has released a workstation with an AMD CPU.
When HP launched the AMD Opteron-based HP xw9400 in 2006, the iPhone didn’t exist, skinny jeans were only for goths and Kanye West had just started to express his concerns about ‘gold diggers’.
What is Threadripper Pro?
In simple terms, AMD Ryzen Threadripper is to AMD Ryzen Threadripper Pro as Intel Core is to Intel Xeon.
Both AMD CPUs share the same core silicon, but there are several features that set the workstation CPU apart from its consumer or ‘enthusiast’ focused sibling. These include more memory channels (8 vs 4), higher memory capacity (2TB vs 256GB) and additional PCIe Gen4 lanes (128 vs 64).
Memory is arguably the biggest differentiator, and this will be especially important in memory intensive applications like Computational Fluid Dynamics (CFD) or Finite Element Analysis (FEA), which are both used heavily in the automotive and aerospace industries.
Some of the more complex fluid flow or multi-physics simulations can eat up memory. By offering more capacity and feeding data into the CPU more quickly via eight channels, Threadripper Pro should deliver a big performance gain in these workloads.
The increase in memory capacity has been enabled through support for RDIMM and LRDIMM modules. Error Correcting Code (ECC) is also supported, which is important for those who run simulations over several hours or even days and want to minimise the risk of crashes. Consumer Threadripper did support ECC memory, but not on all motherboards.
Threadripper Pro also covers a wider range of core counts: 12, 16, 32 and 64-core models will be available at launch. In comparison, Threadripper comes in 24, 32 or 64-core variants, while consumer CPUs with 16 cores or fewer come under the Ryzen brand.
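To put the channel count in perspective, here is a minimal sketch of the theoretical peak memory bandwidth of four channels versus eight. It assumes DDR4-3200 memory (3,200 mega-transfers per second) on a standard 64-bit channel; neither figure comes from AMD's launch material quoted here, and the function name is our own. Real-world throughput will be lower than these peak figures.

```python
# Theoretical peak DRAM bandwidth = channels x transfers per second x bytes per transfer.
# Assumptions (not from the article): DDR4-3200 modules and a 64-bit (8-byte) channel.

def peak_bandwidth_gbs(channels: int,
                       mega_transfers_per_s: int = 3200,
                       bytes_per_transfer: int = 8) -> float:
    """Return theoretical peak bandwidth in decimal GB/s."""
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000

threadripper_bw = peak_bandwidth_gbs(4)      # consumer Threadripper: 4 channels
threadripper_pro_bw = peak_bandwidth_gbs(8)  # Threadripper Pro: 8 channels

print(f"4 channels: {threadripper_bw:.1f} GB/s")
print(f"8 channels: {threadripper_pro_bw:.1f} GB/s")
```

Under these assumptions the peak figure simply doubles with the channel count; whether an application sees that benefit depends on how memory-bound it is.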
Clock speeds for the 32-core Threadripper Pro 3975WX and 64-core 3995WX are slightly lower than consumer Threadrippers with equivalent core counts, both in terms of base and boost frequency. According to AMD, this is because Threadripper Pro offers more functionality within the same power budget – specifically referring to memory bandwidth, capacity and the number of PCIe Gen4 lanes.
While the slightly lower frequency will have an impact on performance in most applications, the benefits from increased memory bandwidth and capacity in memory intensive applications like CFD could far outweigh the loss of a couple of hundred MHz.
Consumer doesn’t mean faster frequencies across the board. The 12-core Threadripper Pro 3945WX and 16-core 3955WX actually have higher base clocks than the equivalent Ryzen CPUs, even though the boost speed is lower. This is because these CPUs have a much higher TDP than the equivalent Ryzens, so more power can be pumped in.
All Threadripper Pro CPUs are rated at 280W, while the 16-core Ryzen 9 3950X has a default TDP of 105W.
From a security and manageability perspective, Threadripper Pro comes with several features that will be really important to some enterprise customers.
For example, AMD Memory Guard allows the contents of system memory to be fully encrypted, adding a further layer of security. This is designed to reduce the threat of a physical memory attack, even if a workstation is left in standby mode.
As you would expect, there is a small overhead when using encryption but it’s only a few percent says AMD, and for those who need to protect confidential IP, it’s probably a small price to pay.
Threadripper Pro also features AMD Pro Manageability, a set of features designed to speed up and simplify deployment, imaging and manageability within an enterprise IT environment, making it easier to support remote workers. AMD Secure Boot offers boot protection to help prevent unauthorised software and malware from taking over critical system functions.
The launch of Threadripper Pro and, more importantly, the fact that one of the big three workstation manufacturers has taken it on, will certainly help open doors for AMD.
Even though 3rd Gen Threadripper was (and still is) an exceptional CPU, some enterprise customers simply wouldn’t touch it as they only buy from major OEMs.
Regional manufacturers like Scan, Armari and Workstation Specialists, simply can’t compete on a global stage when it comes to Independent Software Vendor (ISV) certification, support or manageability.
All about the software
In the run up to the launch of Threadripper Pro, AMD has been working closely with more than 60 ISVs. This is not only to get software applications certified, which is essential for some enterprise customers, but to get more performance out of the CPU.
AMD is addressing this on two fronts: first, to help ISVs take better advantage of the vast number of cores available in a Threadripper Pro CPU, and second, to help AMD processors run faster on software that relies on optimised Intel code.
Some tools already run extremely well or can be optimised very easily. Ray trace rendering is a prime example and KeyShot, V-Ray, Corona Renderer, Cinema 4D and many others already show excellent performance and scaling as core counts increase.
From a development perspective, Chaos Group only had to make a slight tweak to its V-Ray code in order to take full advantage of Threadripper’s 64 cores and 128 threads.
Simulation software represents a huge opportunity for Threadripper Pro, and Lenovo told us that some firms are now re-evaluating their workflows.
Hardcore simulation software often runs on a server or cluster, so studies are typically prepared on the workstation then added to a queue. Having instant access to huge levels of performance on the desktop could have a huge positive impact on design, test, iterate workflows, bringing simulation up front in the design process.
For some simulation software, optimisation can be quite involved, but AMD is not doing this from a standing start. It already has good relationships with many of the leading simulation ISVs through its server team, which is responsible for the AMD EPYC CPU.
In fact, as the system architectures of 2nd Gen EPYC and Threadripper Pro are very similar and both have 8-channel memory, any benefits seen in EPYC should automatically carry through to the desktop. Users of STAR-CCM+, Abaqus/Explicit, OpenFOAM for CFD, LS-DYNA, Ansys Fluent, and Ansys CFX should find some useful performance information on this AMD EPYC community page.
Of course, even relatively small amounts of development work can still take time. For example, AMD told DEVELOP3D that any software that has Intel Math Kernel Library (Intel MKL) integrated into the code needs a small tweak to get the AMD CPU to run math functions at full AVX2 speeds. AMD is working with the ISVs to make this step unnecessary, but it’s still a manual process for most applications.
Simulation software developers are very familiar with the challenge of squeezing more performance out of their software, but some software is simply easier to optimise than others.
In the past we’ve found performance in Ansys Mechanical to peak at 12 or 16 cores on a dual 14-core Intel Xeon workstation, so it will be interesting to see how software like this runs on Threadripper Pro and whether there are benefits to running everything on a single socket or from the increased memory bandwidth.
Certainly, firms need to understand exactly how their software works before investing in a 64-core CPU, and not presume that more cores always mean more performance.
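One simple way to see why extra cores do not always pay off is Amdahl's law, which caps the speedup of a job by the share of its work that can run in parallel. The sketch below is illustrative only: the parallel fractions are made-up values, not measurements of any particular solver, and the model ignores clock speed differences between CPUs as well as memory bandwidth.

```python
# Amdahl's law: speedup = 1 / (serial_fraction + parallel_fraction / cores).
# The parallel fractions used below are hypothetical, chosen only for illustration.

def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup over one core for a job with the given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

for fraction in (0.80, 0.95, 0.99):
    row = ", ".join(
        f"{cores} cores: {amdahl_speedup(fraction, cores):.1f}x"
        for cores in (12, 16, 32, 64)
    )
    print(f"parallel fraction {fraction:.2f} -> {row}")
```

In this model, a job that is only 80% parallel can never run more than five times faster than on a single core, however many cores are added, which is why profiling the actual software matters before buying.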
Software licensing costs also need to be considered. In the world of simulation, it’s not just about ultimate performance. Some of the licensing models around CFD and FEA software are still quite archaic, with some licensed by socket or number of CPU cores. With this in mind, there’s simply no point in paying for cores that won’t increase performance, or only by a little.
It’s also important to consider performance per core, which could make the 12 and 16-core models (which both have significantly higher all core frequencies than the 32 and 64-core models) much more attractive from a combined hardware and software price / performance perspective.
Beyond rendering and simulation there are many other areas of design and engineering that could benefit from Threadripper Pro, including CAM, point cloud processing or generative design.
In some of these applications the potential for optimising for more cores is huge. Point cloud processing software Leica Cyclone Register 360, for example, is only optimised to run on six threads, as explored in this AEC Magazine article, primarily because the software has only ever been tuned for a mainstream desktop workstation. Now, with 64 cores and 2TB of memory to play with, we’re excited to see what Threadripper Pro could bring to the table.
Shake up at the high-end
In the past, customers with workflows that could benefit from lots of cores have had to go for a dual socket workstation. At the top end, this meant one with two 28-core Intel Xeon Platinum 8280 processors. With Threadripper Pro, AMD can deliver even more cores in a single socket 64-core CPU.
And Threadripper Pro looks to be winning out on performance. AMD benchmark figures comparing the Threadripper Pro 3995WX to a dual Intel Xeon Platinum 8280 show superior performance in ray trace rendering software – 36% faster in KeyShot and 12% faster in V-Ray CPU.
But it’s not just in multi-threaded applications that AMD is shining. AMD reckons Threadripper Pro 3995WX is 22% faster in the single threaded Cinema 4D benchmark and also delivers faster 3D graphics performance when using the same Nvidia Quadro RTX GPU on both AMD and Intel systems.
The fact that AMD Threadripper Pro can pack so many cores in a single socket could also benefit memory hungry applications like CFD and FEA, as AMD’s Chris Hall explains. “There’s always an overhead associated with a non-uniform memory access in a dual socket system. So, either the software has to copy the data or there is a higher latency time to access data from memory, on the other socket on the motherboard. There may be some niche cases where you could see a reason to split your 64 cores into two 32 [cores] on each socket. But in general, it’s better to have a single socket with the eight channels that we have, and the 64 cores, if you want to go that high.”
Finally, there’s cost to consider. A pair of Intel Xeon Platinum 8280s will set you back a whopping $20,000. And while AMD has yet to release pricing of Threadripper Pro, we expect it to only come with a relatively small premium over Threadripper, which costs around $4,000 for the 64-core Threadripper 3990X.
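The gap is easier to appreciate on a per-core basis. The rough calculation below uses only the approximate prices quoted above; because Threadripper Pro pricing has not been announced, the consumer Threadripper 3990X stands in as a baseline, so treat the result as indicative rather than a like-for-like comparison. The function and variable names are our own.

```python
# Rough cost per core, using the approximate list prices quoted in the article.
# Threadripper Pro pricing is unknown, so the consumer 3990X is used as a stand-in.

def cost_per_core(total_price_usd: float, total_cores: int) -> float:
    """Return the hardware cost per CPU core in US dollars."""
    return total_price_usd / total_cores

xeon_pair = cost_per_core(20_000, 2 * 28)     # two 28-core Xeon Platinum 8280s
threadripper_3990x = cost_per_core(4_000, 64)  # one 64-core Threadripper 3990X

print(f"Dual Xeon Platinum 8280: ${xeon_pair:,.2f} per core")
print(f"Threadripper 3990X:      ${threadripper_3990x:,.2f} per core")
print(f"Ratio: {xeon_pair / threadripper_3990x:.1f}x")
```

Even with a workstation-class premium added on top, AMD would have plenty of headroom before it came close to Intel's per-core cost.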
Threadripper Pro: A CPU for everyone?
With the spotlight shining on the 64-core Threadripper Pro 3995WX, it’s easy to forget that AMD’s new CPU is also available in 12, 16 and 32-core models.
None of the Threadripper Pro CPUs are ideally suited to CAD alone, as CAD applications are generally single threaded, but the 12-core Threadripper Pro 3945WX and 16-core 3955WX should offer an interesting proposition for what Lenovo describes as ‘CAD plus’. These are architects, engineers and designers who primarily use 3D CAD or BIM software but also rely on a secondary tool like ray trace rendering, CAM, simulation, generative design or point cloud processing, all of which are generally multi-threaded applications.
Here, we expect the impressive all core base frequencies of the Threadripper Pro 3945WX (4.0GHz) and 3955WX (3.9GHz) to give AMD a performance advantage in multi-threaded applications over the equivalent 12-core Intel Xeon W-3235 and 16-core Xeon W-3245.
In fact, Lenovo has shared results from the multi-threaded Cinebench rendering benchmark, which show the 16-core Threadripper Pro 3955WX actually beating the 18-core Intel Xeon W-2295.
Lenovo ThinkStation P620
When the Lenovo ThinkStation P620 ships this Autumn it will be the first Threadripper Pro workstation. And it will be the only one for some time too, as Lenovo has an exclusive agreement with AMD for six months, starting today. Even if HP or Dell do decide to take on the processor, we wouldn’t expect to see those machines materialise until mid 2021 when AMD will likely do a Threadripper Pro refresh.
While the ThinkStation P620 is a new product, it has not been designed completely from scratch. It shares the same chassis as the single socket Intel Xeon W-based ThinkStation P520, although Lenovo has enhanced the cooling to accommodate the 280W Threadripper Pro CPU. The main chassis fans remain the same, but the CPU features a more substantial heatsink, custom designed by Lenovo and AMD, with two built-in fans.
Importantly, Lenovo has not opted for liquid cooling. In the enterprise space, stability and serviceability are of paramount importance. And while custom liquid cooling solutions, such as the one used by Armari in its Magnetar X64T-G3 FWL workstation, allow Threadripper to hit 4.0GHz on all 64 cores, most enterprise IT departments prefer to keep things simple.
We don’t yet know what all-core speeds we can expect to see in the 64-core ThinkStation P620, but it almost certainly will not be 4.0GHz. To hit that frequency, Armari needs between 550W and 800W of power. With Threadripper Pro, the ThinkStation P620 is locked at 280W.
But this is a first generation product. While it seems unlikely that Lenovo will turn to liquid cooling in the future, the company doesn’t rule it out completely and admits it will depend a little on what the competition does.
The ThinkStation P620 chassis is a fairly compact 33 litres. It’s perfect for mainstream users but it does mean Lenovo is not able to take full advantage of the Threadripper Pro architecture. With 8 DIMM slots, the machine is limited to 1TB of memory, although the prohibitive cost of 128GB modules means 512GB is a more realistic maximum capacity when it ships this Autumn.
The P620 can support two double height GPUs up to the Nvidia Quadro RTX 8000 or four single height GPUs up to the Quadro RTX 4000. Hosting four RTX 4000s is something that can’t be done in an Intel box. It would be an interesting proposition for GPU rendering or even virtualisation with GPU passthrough, as the P620 can be rack mounted in a data centre.
Nvidia RTX GPUs are currently PCIe Gen 3, so you’ll need to wait for the next generation to take full advantage of PCIe Gen 4, although doubling the bandwidth won’t make a difference in all applications.
To take full advantage of Threadripper Pro’s 128 PCIe Gen 4 lanes and offer support for four double height GPUs, Lenovo would need a bigger box, but as it stands the ThinkStation P620 should still satisfy the requirements of most users. Also, most firms investing in a workstation with so many CPU cores won’t necessarily have a need for such an array of high-end GPUs.
The most immediate benefit of PCIe Gen 4 will come from storage. Compared to Intel-based workstations which remain on PCIe Gen 3, you’ll get significantly faster sequential read/write performance in the ThinkStation P620. This can give a real benefit when working with large datasets in post-production, simulation or point cloud processing.
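As a rough guide to what that means for a single SSD, the sketch below estimates the theoretical link bandwidth of a four-lane connection, which is the usual maximum for an M.2 NVMe drive. It uses the standard per-lane transfer rates (8 GT/s for Gen 3, 16 GT/s for Gen 4) and 128b/130b encoding; it does not account for protocol overhead or the drive itself, so real sequential speeds will be lower. The names are our own and the x4 link width is an assumption.

```python
# Theoretical PCIe link bandwidth, before protocol overhead.
# Per-lane rates: Gen 3 = 8 GT/s, Gen 4 = 16 GT/s, both with 128b/130b encoding.
# Assumption: the SSD uses a four-lane (x4) link, typical for M.2 NVMe drives.

GIGATRANSFERS_PER_LANE = {3: 8.0, 4: 16.0}
ENCODING_EFFICIENCY = 128 / 130  # 128 payload bits per 130 bits on the wire

def pcie_bandwidth_gbs(generation: int, lanes: int) -> float:
    """Return theoretical one-direction bandwidth in GB/s for a PCIe link."""
    gigabits_per_s = GIGATRANSFERS_PER_LANE[generation] * ENCODING_EFFICIENCY * lanes
    return gigabits_per_s / 8  # bits to bytes

gen3_x4 = pcie_bandwidth_gbs(3, 4)
gen4_x4 = pcie_bandwidth_gbs(4, 4)

print(f"PCIe Gen 3 x4: {gen3_x4:.2f} GB/s")
print(f"PCIe Gen 4 x4: {gen4_x4:.2f} GB/s")
```

The ceiling for a Gen 4 drive is therefore twice that of a Gen 3 drive, though only workloads dominated by large sequential reads and writes are likely to get near it.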
The motherboard can host two Samsung PM9A1 PCIe Gen 4 M.2 NVMe SSDs, which can be configured in a RAID array for performance or redundancy. There’s an optional add in board which can host four M.2s, but this is PCIe Gen 3 so you miss out on the bandwidth benefits. For those just interested in capacity, the ThinkStation P620 can host up to four 3.5-inch Hard Disk Drives (HDDs).
There have been some additional tweaks to the P520 chassis. There’s 10 Gigabit on board rather than the standard 1 Gigabit, which will be useful for shifting large design viz, simulation or point cloud datasets quickly across the network. There are also two USB Type C ports (the P520 only supports USB Type A).
Importantly, the ThinkStation P620 will support Linux (Ubuntu and Red Hat Enterprise) as well as Windows 10 Pro. While most software used in product development and Architecture, Engineering and Construction (AEC) runs on Windows, Linux is widely used in simulation, so this should help Lenovo break the Intel monopoly in this space.
Lenovo only introduced Linux support to all of its ThinkStation and ThinkPad products in June 2020, so the timing of this is perfect.
Threadripper Pro is an exceedingly important release for AMD. It’s not just a new processor, it’s a gateway into the highly lucrative workstation market which has been dominated by Intel for so long.
Bringing a tier-one OEM on-board was essential in order for AMD to get its technology into the hands of enterprise customers. After all, an engine without a car can’t go anywhere fast.
For Lenovo, it’s a very smart move. Yes, the ThinkStation P620 promises impressive performance, but bringing Threadripper Pro into the fold is also about satisfying the diverse requirements of enterprise customers.
For years the tier one manufacturers have held niche machines in their product ranges simply to secure big deals with enterprise customers. The 17-inch mobile workstation is a case in point. It’s never sold in big numbers, but there’s always a handful of users that want one.
The 64-core Threadripper Pro will certainly grab all the headlines, and it offers an excellent proposition for users of ray trace rendering or highly threaded simulation software.
However, it’s the 12- and 16-core models that will likely get most attention from manufacturing and AEC firms, offering good single threaded performance and excellent multi-threaded performance at what we expect to be an attractive price point.
While this will appeal to so-called ‘CAD plus’ users, Threadripper Pro doesn’t currently pose a real threat to Intel at the entry-level, where the majority of architects and engineers just need a high frequency CPU with a few cores to run CAD or BIM software. But, if AMD can improve its single threaded performance, who knows where Threadripper Pro (or Ryzen Pro) might end up.
Lenovo is certainly in for the long haul. At the moment the ThinkStation P620 covers the middle ground but there’s certainly room for a higher-end machine, with more memory and more GPUs.
Lenovo hinted that it will expand its Threadripper portfolio over time, just as it has done with Intel.
In bringing AMD on board, Lenovo didn’t just need to think about technology. The move is sure to have had some impact on its relationship with Intel. Now that Lenovo has taken those first big steps, it will be very interesting to see if HP and Dell follow suit.
2021 is going to be a very important year for AMD in the workstation market. If Threadripper Pro is received well by enterprise customers and taken on by other OEMs, then Intel will surely start to get worried.