We’re winding down coverage of Vega at this point, but we’ve got a couple more curiosities to explore. This content piece looks at clock scalability for Vega across a few key clocks (for core and HBM2), and attempts to control for the CU difference, to some extent. We obviously can’t fully control down to the shader level (as CUs carry more than just shaders), but we can get close to it. Note that the video content generally refers to the V56 & V64 difference as one of shaders, but more than shaders are contained in the CUs.
In our initial AMD shader comparison between Vega 56 and Vega 64, we saw nearly identical performance between the cards when clock-matched to roughly 1580~1590MHz core and 945MHz HBM2. We’re now exploring performance across a range of frequency settings, from ~1400MHz core to ~1660MHz core, and from 800MHz HBM2 to ~1050MHz HBM2.
This content piece was originally written and filmed about ten days ago, ready to go live, but we then decided to put a hold on the content and update it. Against initial plans, we ended up flashing V64 VBIOS onto the V56 to give us more voltage headroom for HBM2 clocks, allowing us to get up to 1020MHz easily on V56. There might be room in there for a bit more of an OC, but 1020MHz proved stable on both our V64 and V56 cards, making it easy to test the two comparatively.
We’ve talked about this in the past, but it’s worth reviving: The reason for keeping motherboard consistency during CPU testing is the inherent variance, particularly when running auto settings. Auto voltage depends on a lookup table that’s built on a per-EFI basis for the motherboards, which means auto VIDs vary not only between motherboard vendors, but between EFI revisions. As voltage changes, so does power consumption, since the two are directly related. As a function of volts and amps, watts consumed by the CPU will increase on motherboards that push more volts to the CPU, regardless of whether the CPU needs that voltage to be stable.
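To put the volts-and-amps relationship in concrete terms, here is a minimal sketch of the arithmetic. The VID and current figures are hypothetical placeholders, not measurements from our testing, and holding current constant is a simplification (in practice, current draw tends to rise with voltage as well, which widens the gap).

```python
def cpu_power_watts(vcore_volts: float, current_amps: float) -> float:
    """Electrical power as the product of voltage and current (P = V * I)."""
    return vcore_volts * current_amps


# Hypothetical values for illustration only; not measured data.
ASSUMED_CURRENT_A = 70.0   # assumed CPU current draw, held constant for simplicity
board_a_auto_vid = 1.20    # assumed auto VID on one motherboard
board_b_auto_vid = 1.30    # assumed auto VID on a board that overvolts

power_a = cpu_power_watts(board_a_auto_vid, ASSUMED_CURRENT_A)
power_b = cpu_power_watts(board_b_auto_vid, ASSUMED_CURRENT_A)
extra_watts = power_b - power_a

print(f"Board A: {power_a:.1f}W, Board B: {power_b:.1f}W, delta: {extra_watts:.1f}W")
```

Even with identical workloads, the board that pushes the higher auto VID shows higher CPU power in this simplified model, which is why we hold the motherboard constant.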
We previously found that Gigabyte’s Gaming 7 Z270 motherboard supplied way too much voltage to the 7700K when in auto settings, something that the company later resolved. The fix was good enough that we now use the Gaming 7 Z270 for all of our GPU tests.
Today, we’re looking at the impact of motherboards on Intel i9-7960X thermals primarily, though the 7980XE makes some appearances in our liquid metal testing. Unless otherwise noted, a Kraken X62 was used at max fan + pump RPMs.
Running through the entire Skylake X lineup with TIM vs. liquid metal benchmarking means we’ve picked up some very product-specific experience. Skylake X has a unique substrate composition wherein the upper substrate houses the silicon and some SMDs, with the lower substrate hosting the pads and some traces. This makes delidding unique as well, made easier with Der8auer’s Delid DieMate X (available in the US soon). This tutorial shows how to delid Intel Skylake X CPUs using the DieMate X, then how to apply liquid metal. We won't be covering re-sealing today.
Still, given the $1000-$2000 cost of these CPUs, an error is an expensive one. We’ve put together a tutorial on the delid and liquid metal application process.
Disclaimer: This is done entirely at your own risk. You assume all responsibility for any damage done to CPUs. We will do our best to detail this process so that you can safely follow our steps, and following carefully will minimize risk. Ultimately, the risk exists primarily in (1) applying too much force or failing to level the CPU, both easily solved, or (2) applying liquid metal in a way that shorts components.
There are many reasons that Intel may have opted for TIM with their CPUs, and given that the company hasn’t offered a statement of substance, we really have no exact idea of why different materials are selected. Using TIM could be a matter of cost and spend, as seems to be the default assumption; it could be an undisclosed engineering challenge to do with yields (with solder); it could be for government or legal grants pertaining to environmental conscientiousness; it could be related to conflict-free advertisements; or it could be any number of other things. We don’t know. What we do know, and what we can test, is the efficacy of the TIM as opposed to alternatives. Intel’s statement pertaining to usage of TIM on HEDT (or any) CPUs effectively paraphrases as “as this relates to manufacturing process, we do not discuss it.” Intel sees this as a proprietary process, and so the subject matter is sensitive to share.
With an i7-7700K, TIM is perhaps more defensible – it’s certainly cheaper, and that’s a cheaper part. Once we start looking at the 7900X and other CPUs of a similar class, the ability to argue in favor of Dow Corning’s TIM weakens. To the credit of both Intel and Dow Corning, the TIM selected is highly durable to thermal cycling – it’ll last a long time, won’t need replacement, and shouldn’t exhibit any serious cracking or aging issues in any meaningful amount of time. The usable life of the platform will expire prior to the CPU’s operability, in essence.
But that doesn’t mean there aren’t better solutions. Intel has used solder before – there’s precedent for it – and certainly there exist thermal solutions with greater transfer capabilities than what’s used on most of Intel’s CPUs.
Today's video showed some of the process of delidding the i9-7900X -- again, following our Computex delid -- and learning how to use liquid metal. It's a first step, and one that we can learn from. The process has already been applied toward dozens of benchmarks, the charts for which are in the creation stage right now. We'll be working on the 7900X thermal and power content over the weekend, leading to a much greater content piece thereafter. It'll all be focused on thermals and power.
As for the 7900X, the delid was fairly straightforward: We used Der8auer's same Delid DieMate tool that we used at Computex, but now with updated hardware. A few notes on this: After the first delid, we learned that the "clamp" (pressing vertically) is meant to reseal and hold the IHS + substrate still. It is not needed for the actual delid process, so that's one of the newly learned aspects of this. The biggest point of education was the liquid metal application process, as LM gets everywhere and spreads sufficiently without anything close to the size of 'blob' you'd use for TIM.
Taking apart EVGA's GTX 1080 Ti FTW3 Hybrid isn't too different from the process for all the company's other cards: Two types of Phillips head screws are used in abundance for the backplate, the removal of which effectively dismantles the entire card. Wider-thread screws are used for the shroud, with thinner screws used for areas where the backplate is secured to front-side heatsinks (rather than the plastic shroud).
That's what we did when we got back from our PAX trip -- we dismantled the FTW3 Hybrid. We don't have any immediate plans to review this card, particularly since its conclusions -- aside from thermals -- will be the same as our FTW3 review, but we wanted to at least have a look at the design.
Before PAX Prime, we took apart the Logitech G903 mouse and wireless charging station, known as “Powerplay.” The G903 mouse can socket a “Powerplay module” into the weight slot, acting as one of two coils to engage the magnetic resonance charging built into the underlying Powerplay mat. Magnetic resonance and inductive charging have been around since Nikola Tesla was alive, so it’s not new technology, but it hasn’t been deployed in a mainstream peripheral implementation. Laptops have attempted various versions of inductive charging in the past (to varying degrees of success), and phones now do “Qi” charging, but a mouse is one of the most sensible applications. A mouse also consumes far less power than something like a laptop, and so doesn’t suffer as much from the inefficiencies inherent to wireless charging.
AMD’s architecture hasn’t generally shown a large gain from increasing CU count between top-tier and second-to-top cards. The Fury and Fury X, for instance, could be made to match with an overclock on the lower-tiered card. Additional gains on the higher-tiered card often come from the increased power limit and clock, not from a straight shader increase. We’re putting that knowledge to the test on Vega architecture, equalizing the Vega 56 & Vega 64 clocks (and 945MHz HBM2 clocks) to determine how much of a difference emerges between the 4096 shaders on V64 and the 3584 shaders on V56. Purely counting shaders, that’s roughly a 14% increase for V64, but like most performance metrics, that won’t result in a linear performance increase.
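For reference, the shader math works out as sketched below. The shader counts come from the paragraph above; the 64-shaders-per-CU figure is our assumption based on the cards' names (56 and 64 CUs), and the helper name is just for illustration.

```python
V64_SHADERS = 4096
V56_SHADERS = 3584
SHADERS_PER_CU = 64  # assumption: 64 stream processors per compute unit


def percent_increase(baseline: float, value: float) -> float:
    """Relative increase of `value` over `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


shader_uplift_pct = percent_increase(V56_SHADERS, V64_SHADERS)
v64_cus = V64_SHADERS // SHADERS_PER_CU
v56_cus = V56_SHADERS // SHADERS_PER_CU

print(f"V64 has {shader_uplift_pct:.1f}% more shaders than V56")
print(f"CU counts under the assumed layout: V64={v64_cus}, V56={v56_cus}")
```

This is only the theoretical ceiling on the shader advantage. Clock-matched testing is what shows how much of it actually turns into performance.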
We were able to crush Vega 64’s performance with our heavily modded Vega 56 card, using powerplay tables and liquid to jump to 1742MHz clock speeds. That's with modding, though, and isn't out-of-box performance -- it also doesn't give us any indication as to shader differences. Going less crazy about overclocking and limiting clocks to matched speeds, we can reveal the shader count difference.
Thermals and noise to align with final launch.
There were a lot of challenges going into this build: A lack of magnetism, a lack of lighting on the show floor of a convention center, and some surprises in between. Cooler Master allowed us to build in the brand-new Cosmos C700P case – a modular chassis with an invertible or rotatable motherboard tray – live at PAX West. After being faced with some challenges along the way, we recruited Cooler Master’s Wei Yang to turn it into a collaborative team build. It was one of the most fun builds we’ve done in a while, and the pressure of time meant that we were both taking turns dropping screws and reworking our aspects of the build. This was a real PC build. There were unplanned changes, parts that GN hasn’t used before, and sacrifices made along the way.
All said and done, the enclosure is exceptionally easy to work within: Every single panel can be removed with relative ease, so we were able to strip down the case to barebones for the build. Our biggest timesink was asking to invert the motherboard tray to face the other side, since that’d add some flair to the build. This process isn’t intrinsically difficult, but it does require removal of a lot of screws – after all, the entire case can be flipped, and there are a lot of structural elements there. The motherboard tray detaches by removing 4-6 screws on the back-side, followed by six screws in the rear of the case, followed by a few more screws for the shrouds. We got some help for this process, as the case is one of the first working samples of the Cosmos C700P and there’s not yet a manual for which screws have to be removed.
(The video for this one is a read-through of this article -- same content, just read to you.)
Everyone talks game about how they don’t care about power consumption. We took that comment to the extreme, using a registry hack to give Vega 56 enough extra power to kill the card, if we wanted, and a Floe 360mm CLC to keep temperatures low enough that GPU diode reporting inaccuracies emerge. “I don’t care about power consumption, I just want performance” is now met with that – 100% more power and an overclock to 1742MHz core. We've got room to do 200% power, but things would start popping at that point. The Vega 56 Hybrid mod is our most modded version of the Hybrid series to date, and leverages powerplay table registry changes to provide that additional power headroom. This is an alternative to BIOS flashing, which is limited to signed BIOS versions (like V64 on V56, though we had issues flashing V64L onto V56). Last we attempted it, a modified BIOS did not work. Powerplay tables do, though, and mean that we can modify power target to surpass V56’s artificial power limitation.
The limitation on power provisioned to the V56 core is, we believe, fully to prevent V56 from too easily outmatching V64 in performance. The card’s BIOS won’t allow greater than 300-308W down the PCIe cables natively, even though official BIOS versions for V64 cards can support 350~360W. The VRM itself easily sustains 360W, and we’ve tested it as handling 406W without a FET popping. 400W is probably pushing what’s reasonable, but to limit V56 to ~300W, when an additional 60W is fully within the capabilities of the VRM & GPU, is a means to cap V56 performance to a point of not competing with V64.
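To frame how much headroom is being left on the table, here is a rough sketch of the percentages using the figures above. We take the low end of the V56 cap (300W) and the high end of the V64 BIOS range (360W) as inputs; those endpoint choices and the helper name are ours, for illustration.

```python
V56_BIOS_CAP_W = 300.0  # low end of the 300-308W native V56 limit
V64_BIOS_CAP_W = 360.0  # high end of the 350~360W V64 BIOS range
VRM_TESTED_W = 406.0    # highest load we tested without a FET popping


def headroom(cap_watts: float, target_watts: float) -> tuple[float, float]:
    """Return (extra watts, extra percent) of target over the cap."""
    extra = target_watts - cap_watts
    return extra, extra / cap_watts * 100.0


v64_extra_w, v64_extra_pct = headroom(V56_BIOS_CAP_W, V64_BIOS_CAP_W)
tested_extra_w, tested_extra_pct = headroom(V56_BIOS_CAP_W, VRM_TESTED_W)

print(f"V64 BIOS limit vs. V56 cap: +{v64_extra_w:.0f}W ({v64_extra_pct:.1f}%)")
print(f"Tested VRM load vs. V56 cap: +{tested_extra_w:.0f}W ({tested_extra_pct:.1f}%)")
```

Under those assumptions, the 60W gap is a 20% increase over the V56 cap, which is the margin the stock BIOS withholds.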
We fixed that.
AMD’s CU scaling has never been that impactful to performance – clock speed closes most gaps with AMD hardware. Even without the extra shaders of V64, we can outperform V64’s stock performance, and we’ll soon find out how we do versus V64’s overclocked performance. That’ll have to wait until after PAX, but it’s something we’re hoping to further study.