Items to Consider & Methods
Prior to diving into this headlong, there are a few important disclaimers to make:
Not all cards are the same. Ours may undervolt better or worse than yours. If yours is better, that’s fantastic – run with it, but ours couldn’t handle anything lower than the 1090mv number we’re primarily publishing. If it’s worse, just increase voltage until the card is stable. We’d advise stepping voltage down in small increments and testing stability at each step rather than just copying our numbers, as it’s likely that your GPU core will respond differently than ours.
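The incremental approach can be sketched as a simple loop. This is only an illustration: `is_stable` is a hypothetical callback standing in for whatever stress test you run by hand at each step, and the stand-in below simply pretends the card behaves like ours (stable at 1090mv and above). Neither WattMan nor Wattool is being scripted here.

```python
from typing import Callable


def find_lowest_stable_mv(start_mv: int, floor_mv: int, step_mv: int,
                          is_stable: Callable[[int], bool]) -> int:
    """Step the voltage down from start_mv and return the last value that passed."""
    if step_mv <= 0:
        raise ValueError("step_mv must be positive")
    if not is_stable(start_mv):
        raise ValueError("starting voltage is not stable")
    last_good = start_mv
    candidate = start_mv - step_mv
    while candidate >= floor_mv:
        if not is_stable(candidate):
            break  # first failure: stop and keep the previous step
        last_good = candidate
        candidate -= step_mv
    return last_good


def fake_stress_test(mv: int) -> bool:
    # Hypothetical stand-in for a real stability run (assumption, not a real API).
    return mv >= 1090


lowest = find_lowest_stable_mv(start_mv=1200, floor_mv=1000, step_mv=10,
                               is_stable=fake_stress_test)
print(f"Lowest stable voltage found: {lowest}mv")
```

Smaller steps take longer but land closer to the real limit; a coarse 25mv step against the same stand-in would stop at 1100mv instead.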
Next, note that the software is still buggy. Conflicting reports abound, but at the end of the day, WattMan and Wattool are both imperfect solutions to this problem. At present, both have an HBM2 underclocking bug that rears its head only under certain conditions. We are not 100% positive when this bug emerges, but we think we’ve pinpointed it to manually configuring all 7 DPM states and their corresponding voltages on the volt-frequency curve. This seems particularly true when attempting to set all 7 to equal or nearly equal values. That’s not to say it’s impossible to make these tools work; we’re pointing it out because an HBM2 downclock that you didn’t initiate could inadvertently cut off a huge amount of performance. Keep an eye on HBM2 frequency while performing these tweaks. Note also that fan RPM targets are inaccurate: the fan spins up to about 200RPM higher than the target you set.
Finally, for our testing, we care about constraining variables between two of our main configurations. We’ve got three configurations in total: Stock – Auto, with no changes whatsoever; Stock +50% PWR, with no voltage or DPM changes; and Stock +50% PWR & undervolted. This is all done with the air card (hence “stock”), not the Hybrid mod. We’ve additionally set the +50% PWR and the +50% PWR & undervolted configurations to a fixed fan speed of 3400RPM, which is a bit aggressive – but the point is to ensure we’re not thermally throttling. Thus, the latter two configurations are directly thermally comparable, but the Stock – Auto configuration (which runs a fan curve that’s way too limp) will not run the same fan RPM, and is thus not 100% comparable.
| GN Test Bench 2017 | Name | Courtesy Of | Cost |
| --- | --- | --- | --- |
| Video Card | This is what we're testing | - | - |
| CPU | Intel i7-7700K 4.5GHz locked | GamersNexus | $330 |
| Memory | GSkill Trident Z 3200MHz C14 | Gskill | - |
| Motherboard | Gigabyte Aorus Gaming 7 Z270X | Gigabyte | $240 |
| Power Supply | NZXT 1200W HALE90 V2 | NZXT | $300 |
| Case | Top Deck Tech Station | GamersNexus | $250 |
| CPU Cooler | Asetek 570LC | Asetek | - |
Vega: Frontier Edition Undervolting Power Consumption
Starting with current draw at the PCIe cables only, the completely stock card starts off drawing about 268W, but as we approach the 400-second mark, the card starts spiking hard between 17.7A and 23A. This behavior correlates with clock throttling – which we’ll show in a moment – and is precisely why we’ve been saying that Vega: FE Air can’t hold its advertised 1600MHz boost clock out of the box. Its power limit and cooler are simply insufficient. The cooler can keep up if you abandon the default fan profile and accept high dBA levels, but this is where the card sits out of the box.
The next move is to get the frequency to hit 1600MHz constantly, so we increase the power target by 50% and set a fixed fan speed to solve for the thermal limit, removing both of Vega’s limitations at once. The red line is the result. A new problem emerges: Thermals and frequency are now under control, but PCIe cable current is hitting 30A at times, averaging about 28-29A. That’s about 344-370W down the PCIe cables, and that extra power is going to generate a lot more heat.
Finally, our undervolted line emerges: The blue line represents an undervolt of -110mv, dropping us from 1200mv to 1090mv. Current is now 23A, for a power consumption of about 283W at the PCIe cables. That’s about 15W more than the stock setup that struggled to maintain 1600MHz, about 87W lower than the power-offset setup that sustains 1600MHz, and should lower thermals as well. Let’s go to that chart.
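The power comparison is simple subtraction, shown here as a quick check using only the PCIe cable figures quoted above. Note that the 87W saving is measured against the upper end (370W) of the 344-370W range; the variable names are ours, chosen for illustration.

```python
# PCIe cable power figures quoted in this section, in watts.
STOCK_W = 268        # stock card, before the throttling spikes begin
OFFSET_LOW_W = 344   # +50% power target, lower end of the quoted range
OFFSET_HIGH_W = 370  # +50% power target, upper end of the quoted range
UNDERVOLT_W = 283    # +50% power target with the -110mv undervolt

extra_vs_stock = UNDERVOLT_W - STOCK_W
saved_vs_offset_high = OFFSET_HIGH_W - UNDERVOLT_W
saved_vs_offset_low = OFFSET_LOW_W - UNDERVOLT_W

print(f"Undervolt vs stock: +{extra_vs_stock}W")
print(f"Undervolt vs +50% offset: -{saved_vs_offset_low}W to -{saved_vs_offset_high}W")
```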
Undervolting Impact on Thermals – Vega: Frontier Edition
As a note, read the “Items to Consider & Methods” section for clarity on when the red, blue, and orange lines are comparable. The fan speed differences make temperature between the ‘auto’ configuration and other two configurations an indirect comparison; we’re mostly interested in red and blue for this one.
Our orange line again represents the Stock – Auto configuration, which runs a fan curve that isn’t aggressive enough, a voltage that’s too high, and a power budget that’s too low. It’s the worst of all options. The result is constantly hitting the thermal limit and throttling, observed at the 85C mark – though we sometimes observe spikes to nearly 90C.
Applying a 50% power offset and fixing the frequency to 1600MHz, temperatures are about 73C – but the fan is at 70% to control the thermal variable for undervolting. The result is a noise level at a somewhat unbearable 60dBA versus the auto noise level of roughly 50dBA. There’s room to drop the fan speed with the lower voltage, though, because less heat is being generated as less power is consumed. Our point was just to eliminate the thermal concern for our A/B undervolting test; if you were to do this on your own, we’d suggest min-maxing the fan curve to reduce noise. Ours was unnecessarily high, but was a safety to control the thermal variable.
The more appropriate comparison would be our blue line versus our red line, as these two were tested with the same settings aside from just one variable: Voltage to the core. With the exact same fan speed, the same +50% power offset, and with voltage lowered by 110mv, the Vega undervolted card performs at around 63-66C, for about a 7-10C reduction from the card operating at 1200mv.
Pretty good so far. The last question is of frequency.
Vega FE’s Struggle to Maintain Frequency
Plotting frequency, the orange line shows the stock, out of box configuration for the Vega: Frontier Edition air-cooled card. We’re throttling hard, and only rarely achieving 1600MHz; the regularity with which 1600MHz is achieved diminishes significantly as time goes on, largely due to thermal constraints with the default fan curve. We tend to be operating at DPM power state 5-6 rather than state 7, which would give us full performance.
The red and blue lines converge on this chart, as increasing the power target and removing the thermal limit gives us a perfectly flat 1600MHz frequency – closer to what’s advertised on the box. That said, the red line is pulling 344-370W through the PCIe cables, so that’s a little aggressive and may not be worth the power and thermal load over stock. Undervolting, however, permits 1600MHz and draws 87W less power than the red line, but 15W more than the orange line. That’s a damn good trade.
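If you log core clock while testing your own card, one rough way to quantify “how often is 1600MHz achieved” is the share of samples at or above the target. The sketch below assumes you already have a list of clock samples from your monitoring tool of choice; the sample values are made up for illustration (using the 1440-1528MHz range we see at DPM5-6) and are not our logged data.

```python
from typing import Sequence


def share_at_or_above(samples_mhz: Sequence[int], target_mhz: int = 1600) -> float:
    """Return the fraction of clock samples that reached the target frequency."""
    if not samples_mhz:
        raise ValueError("no samples provided")
    hits = sum(1 for mhz in samples_mhz if mhz >= target_mhz)
    return hits / len(samples_mhz)


# Made-up logs for illustration only (assumption, not measured data).
throttling_log = [1600, 1528, 1440, 1528, 1600, 1440, 1528, 1440]
flat_log = [1600] * 8

print(f"Throttling card at 1600MHz: {share_at_or_above(throttling_log):.0%} of samples")
print(f"Flat 1600MHz card: {share_at_or_above(flat_log):.0%} of samples")
```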
That covers the theory, and our data backs it up: undervolting works once you learn to work with the applications. The next challenge is to determine whether this impacts actual performance.
We’re keeping these tests limited, as 1600MHz sustained will clearly perform better than 1440 to 1528MHz at DPM5-6. 3DMark starts us off, then we’ll look at two gaming workloads. We’ll leave SPECviewperf out for today, as we sort of already showcased that performance ceiling in our Vega Hybrid results.
FireStrike Ultra – Vega Undervolted Benchmark
FireStrike Ultra starts us out. The completely stock Vega FE Air card ran a graphics score of 4906, with both of our 50% power offset configurations operating at around 5370. This includes the undervolted configuration, which manages about a 9-10% performance uplift over stock. Here’s the crazy thing: Again, we’re not overclocking to achieve this. All we’re doing is making more power available while reducing the voltage, which nets a marginal power consumption increase in exchange for more consistent and faster frametimes. That’s a pretty good trade for 15W, and is far better than the power offset without undervolting, which draws another 87W on top of that.
For point of reference, our Hybrid FE overclock performed at 5774, which is about 7.5% faster than the undervolted card. That puts into perspective just how far undervolting and over-powering will get you.
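For anyone checking the percentages, the uplift is just the ratio of the two graphics scores. A minimal sketch using the FireStrike Ultra scores above (the helper name `uplift_pct` is ours):

```python
def uplift_pct(new_score: float, old_score: float) -> float:
    """Percentage gain of new_score over old_score."""
    if old_score <= 0:
        raise ValueError("old_score must be positive")
    return (new_score / old_score - 1.0) * 100.0


STOCK = 4906       # Vega FE Air, completely stock
UNDERVOLT = 5370   # +50% power target, undervolted
HYBRID_OC = 5774   # Hybrid FE overclock

undervolt_gain = uplift_pct(UNDERVOLT, STOCK)
hybrid_gain = uplift_pct(HYBRID_OC, UNDERVOLT)

print(f"Undervolted vs stock: {undervolt_gain:.1f}%")
print(f"Hybrid OC vs undervolted: {hybrid_gain:.1f}%")
```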
TimeSpy gives us a gain of about 7.6% from the undervolted card over the stock card, with our Hybrid OC gaining another 9.6% on top of that – though drawing significantly more power at around 33A.
Here’s FireStrike Extreme:
As for games, some experienced instability at 1090mv and had to be moved up to 1100mv; For Honor was particularly unstable, and required a core voltage of about 1120mv.
Ghost Recon: Wildlands – Vega FE Undervolted
Let’s look at Ghost Recon first.
At 4K and with VH settings, the undervolted AMD Vega Air card performed at 41FPS AVG, with lows close by at 37 and 36. The stock card with no modifications operated at 37.7FPS AVG, putting the undervolted card 8.8% ahead of stock. This uplift is because the stock card cannot maintain 1600MHz without a power offset – but again, a power offset without undervolting increases your power consumption by 80-90W, thereby increasing the heat that the card has to deal with. This undervolting and over-powering appears to be the best approach to extracting more performance.
DOOM – Vega FE Undervolted
With DOOM using Vulkan, Async Compute, and rendered at 4K, the Vega undervolted card operates at 71.6FPS AVG, with low-end frametimes also improved over the stock card. Our AVG FPS improvement is about 11.5% in DOOM, following the trend of DOOM being a somewhat best-case scenario for AMD on a routine basis. The performance uplift is tremendous when considering our minimal power consumption increase and better overall control on the card.
Conclusion: Overpower, Undervoltage Far Better than Stock Config
But not without their caveats.
The trouble with this solution is that it is imperfect by nature. First, not every chip is made the same; ours may undervolt better or worse than others out there, and that means there’s no easy “use these numbers” method. You’ll ultimately have to guess and check at stability to find the numbers that work, which means more work is involved in getting this solution to be rock-steady. That’s not to say it’s difficult work, but it’s certainly not as easy as plugging a card in and using it. We found that some games required 1120mv to remain stable, while others were fine at 1090mv. Ideally, you’d make a profile for each application – but that’s a bit annoying, and becomes difficult to maintain. The next option would be to choose the lowest voltage that is stable in all applications (in our case, that might be 1120mv). You lose some of the efficiency argument when doing this, as the bottom-end is cut off, but still gain overall.
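Picking one voltage for everything means taking the highest of the per-application minimums. A small sketch of that choice follows; the For Honor figure and the 1200mv stock voltage are from our testing, while the other two entries are placeholder labels for titles that were stable at 1100mv and 1090mv.

```python
STOCK_MV = 1200

# Lowest stable core voltage per application, in mv.
per_app_min_stable_mv = {
    "For Honor": 1120,
    "title_stable_at_1100": 1100,  # placeholder label (hypothetical)
    "title_stable_at_1090": 1090,  # placeholder label (hypothetical)
}

# A single profile has to satisfy the most demanding application.
global_mv = max(per_app_min_stable_mv.values())
print(f"Single profile: {global_mv}mv ({global_mv - STOCK_MV:+d}mv vs stock)")

# Headroom each application gives up by sharing that profile.
for app, mv in per_app_min_stable_mv.items():
    print(f"{app}: gives up {global_mv - mv}mv of undervolt")
```

Per-application profiles recover that lost headroom, at the cost of maintaining them.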
A straight +50% overpower configuration is a huge waste of power down the PCIe cables, which results in running hotter than necessary and thereby louder.
Software is also buggy and frustrating. No, not everyone sees the same issues – that’s the nature of buggy software. It is difficult to precisely pinpoint the issue causing HBM2’s brutal downclocking by 445MHz, but we have seen it happen routinely, on multiple systems and in multiple environments. We think that this has to do with manually configuring all 7 DPM states and their corresponding voltage states; when we only configured DPMs 4-7, the downclocking issue did not occur. Fan speed curves are also inaccurate, and the fan runs about 200RPM higher than what the user requests. Wattool has the same bugs as WattMan, and Afterburner can’t adjust voltage (yet). The point isn’t to say that it’s impossible to undervolt like we did; it’s to say that you should really be aware of all the different variables when tweaking. It’s possible to inadvertently hinder performance (in major ways) if HBM2 underclocks without the user’s knowledge. Keep an eye on it.
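One way to keep an eye on HBM2 is to log memory clock during a test run and scan the log afterward for unexpected drops. The sketch below is an assumption-laden illustration: it expects `(timestamp_seconds, memory_mhz)` pairs exported from whatever monitoring tool you use, the 945MHz expected clock and the 25MHz tolerance are parameters you should set to match your own card at stock, and the log itself is made up.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[float, int]  # (timestamp in seconds, memory clock in MHz)


def find_hbm2_downclocks(samples: Iterable[Sample], expected_mhz: int,
                         tolerance_mhz: int = 25) -> List[Sample]:
    """Return samples where memory clock fell below expected_mhz - tolerance_mhz."""
    threshold = expected_mhz - tolerance_mhz
    return [(t, mhz) for t, mhz in samples if mhz < threshold]


# Made-up log for illustration only: a 445MHz drop partway through the run.
log = [(0.0, 945), (1.0, 945), (2.0, 500), (3.0, 500), (4.0, 945)]

flagged = find_hbm2_downclocks(log, expected_mhz=945)
for timestamp, mhz in flagged:
    print(f"t={timestamp:.0f}s: HBM2 at {mhz}MHz, check your DPM settings")
```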
As for the task at hand, it seems the best possible configuration is to overpower the core (+50%), undervolt the core (roughly -110 to -90mv), and run a fan RPM that keeps temperatures at or below 80C. That’ll depend on your cooling solution, case, and case/room ambient temperatures.
This yields a decent boost to application performance (professional and gaming) without costing the insane +90W draw of a straight +50% overpower configuration.
Editorial: Steve Burke
Video: Andrew Coleman