We set aside time at AMD’s Threadripper & Vega event to speak with leading architects and engineers at the company, including Corporate Fellow Mike Mantor. The conversation eventually became one that we figured we’d film, as we delved deeper into discussion on small primitive discarding and methods to cull unnecessary triangles from the pipeline. Some of the discussion is generic – rules and concepts applied to rendering overall – while some gets more specific to Vega’s architecture.
The interview was sparked from talk about Vega’s primitive shader (or “prim shader”), draw-stream binning rasterization (DSBR), and small primitive discarding. We’ve transcribed large portions of the first half below, leaving the rest in video format. GN’s Andrew Coleman used Unreal Engine and Blender to demonstrate key concepts as Mantor explained them, so we’d encourage watching the video to better conceptualize the more abstract elements of the conversation.
Our recent R7 1700 vs. i7-7700K streaming benchmarks came out in favor of the 1700, as the greater core count made it far easier to handle the simultaneous demands of streaming and gameplay without any overclocking or fiddling with process priority. Streaming isn’t the whole story, of course, and there are many situations (e.g. plain old gaming) where speed is a more valuable resource than sheer number of threads, as seen in our original 1700 review.
Today, we’re testing the R7 1700 and i7-7700K at 1440p 144Hz. We know the i7-7700K is a leader in gaming performance from our earlier CPU-bottlenecked 1080p testing; that isn’t the point here. We’ve also pitted these chips against each other in VR testing, where our conclusion was that GPU choice mattered far more, since both CPUs can deliver 90FPS equally well (and were effectively identical). This newest test is less of a competition and more of a “can the 1700 do it too” scenario. The 1700 has features that make it attractive for casual streaming or rendering, but that doesn’t mean customers want to sacrifice smooth 144Hz in pure gaming scenarios. As we explain thoroughly in the below video, there are different uses for different CPUs; it’s not quite as simple as “that one’s better,” and more accurately boils down to “that one’s better for this specific task, provided said task is your biggest focus.” Maybe that’s the R7 1700 for streaming while gaming, maybe that’s the 7700K for gaming -- but what we haven’t tested is if the 1700 can keep up at 144Hz with higher quality settings. We put to test media statements (including our own) that the 1700 should be “better at streaming,” finding that it is. It is now time to put to test the statements that the 7700K is “better at 144Hz” gaming.
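For a sense of what those targets mean per frame, here is a minimal sketch that converts a refresh or framerate target into a frame-time budget. The function name is our own illustration, not part of any benchmark tooling, and the only inputs are the 144Hz and 90FPS figures mentioned above.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Return the time available to produce one frame, in milliseconds."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


# Targets taken from the text: 144Hz monitors and the 90FPS VR target.
for fps in (144, 90):
    print(f"{fps} FPS leaves about {frame_budget_ms(fps):.2f} ms per frame")
```

A CPU that holds 90FPS comfortably has roughly 11ms per frame to work with; sustaining 144Hz cuts that to just under 7ms, which is why the question is worth testing separately.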
This series is an ongoing venture in our follow-up tests to illustrate that, yes, the two CPUs can both exist side-by-side and can be good at different things. There’s no shame in being a leader in one aspect but not the other, and leading in both is generally impossible given current manufacturing and engineering limitations, anyway. The 7700K was the challenger in the streaming benchmarks, and today it will be challenged by the inbound R7 1700 for 144Hz gaming.
People like to make things a bloodbath, but just again to remind everyone: This is less of a “versus” scenario and more of a “can they both do it?” scenario.
X299 VRM thermals have been a topic of interest in the lab lately, as we’ve continued to learn how to work with our new power testing tools and have fully revamped CPU thermal testing. The time will come eventually, but for now, we’ve worked with Buildzoid to run some calculations on VRM thermals with the Gigabyte X299 Gaming 9 motherboard. These numbers are based on GN testing for this video, where we overclocked the CPU to 4.5~4.6GHz and checked for power consumption at the 8-pin headers (of which there are two).
The Gigabyte X299 Gaming 9 motherboard makes some interesting choices with its VRM components, ultimately balancing between “ridiculous overkill,” to quote Buildzoid, and mere adequacy. The board is one of the higher quality motherboards out there right now, and so is worth a watch on the PCB break-down:
“Good for streaming” – a phrase almost universally attributed to the R7 series of Ryzen CPUs, like the R7 1700 ($270 currently), but with limited data-driven testing to definitively prove the theory. Along with most other folks in the industry, we supported Ryzen as a streamer-oriented platform in our reviews, but we based this assessment on an understanding of Ryzen’s performance in production workloads. Without actual game stream benchmarking, it was always a bit hazy just how the R7 1700 and the i7-7700K ($310 currently) would perform comparatively in game live-streaming.
This new benchmark looks at the AMD R7 1700 vs. Intel i7-7700K performance while streaming, including stream output/framerate, dropped frames, streamer-side FPS, power consumption, and some brief thermal data. The goal is to determine not only whether one CPU is better than the other, but whether the difference is large enough to be potentially paradigm-shifting. The article explores all of this, though we’ve also got an embedded video below. If video is your preferred format, consider checking the article conclusion section for some additional thoughts.
This feature benchmark dives into one of the top requests we received from our Patreon backers: Undervolt Vega: Frontier Edition and determine its peak power/performance configuration. The test roped us in immediately, yielding performance uplift largely across the board from preliminary settings tuning. As we dug deeper, once past all the anomalous software issues, we managed to improve Vega: FE Air’s power available to the core, reduce power consumption relative to this, and improve performance in non-trivial ways.
Although power target and core voltage are somewhat joined at the hip, both being tools for overclocking, they don’t govern one another. Power target offset dictates how much additional power budget we’re willing to provide the GPU core (from the power supply) in order to stabilize its clock. GPU Vcore governs the voltage supplied, and will generally range from 900 to 1250mV on Vega: FE cards.
Vega’s native DPM configuration runs its final three states at 1440MHz, 1528MHz, and 1600MHz for the P-states, with DPM7 at 1600MHz/1200mV. This configuration is unsustainable in stock settings, as the core is both power-starved and thermally throttled (we’ll show this in a moment). The thermal limiter on Vega: FE is ~85C, at which point the power and clock will fluctuate hard to try and maintain control of the core temperature. The result is (1) spiky frequencies and frametime latencies, worsening perceived performance, and (2) reduced overall performance as frequency struggles to maintain even 1528MHz (let alone the advertised 1600MHz). To resolve the thermal issue, we can either configure a more intelligent fan curve than AMD’s stock configuration or create a Hybrid card; unfortunately, we’re still left with a new problem – a power limit.
The power limit can be resolved in large part by offsetting power target by +50%. Making this modification is easy and “fixes” the issue of clock-dropping, but introduces (1) new thermal issues – resolvable by configuring a higher fan RPM, of course, and (2) absurdly high power consumption for a non-linear scaling in performance. In order to truly get value out of this approach, undervolting seems the next appropriate measure. AMD’s native core voltage is far higher than necessary for the card to operate at its 1600MHz target, and so lowering voltage improves performance from the out-of-box config. This is for thermal and power reasons alike. We ultimately see significantly reduced power consumption, to the tune of ~90W in some cases, a more stable core clock and thereby higher performance, and lower temperature – and thereby controllable noise.
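As a rough illustration of why undervolting pays off, the sketch below uses the common first-order approximation that dynamic power scales with voltage squared times frequency. That model is our assumption here, not a GN measurement: it ignores leakage and static power, and the 1100mV target is a hypothetical value chosen only for the example. The 1200mV and 1600MHz figures are the stock DPM7 values quoted above.

```python
def relative_dynamic_power(v_new_mv: float, v_old_mv: float,
                           f_new_mhz: float, f_old_mhz: float) -> float:
    """Ratio of new to old dynamic power, assuming P ~ C * V^2 * f.

    First-order approximation only: leakage and static power are ignored.
    """
    return (v_new_mv / v_old_mv) ** 2 * (f_new_mhz / f_old_mhz)


# Stock DPM7 from the text: 1600MHz at 1200mV.
# 1100mV is a hypothetical undervolt target, not a tested setting.
ratio = relative_dynamic_power(1100, 1200, 1600, 1600)
print(f"Estimated dynamic power vs. stock: {ratio:.1%}")
```

Under those assumptions, an undervolt of about 8% at an unchanged clock trims dynamic power by roughly 16%, which is the general shape of the saving; real results depend on the individual card and on how stable the clock remains.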
We can’t get all the way down to the inner workings of the pump on this one, unfortunately, as all of our source images for the Vega: Frontier Edition – Watercooled card are from a reader. The reader was kind enough to remove the shroud from their new WC version of Vega: FE so that we could get an understanding of the basics, leading us to the conclusion that AMD has built one of the most expensive pre-built liquid cooling solutions for a graphics card.
The video tear-down goes into detail on the images we received, but we’ll revisit most of it here. The card uses the same base PCB, same VRM, same GPU/HBM layout and positioning, and same everything as the air-cooled card. The difference is entirely in the cooling solution, where the Delta VRM fan goes away and is replaced with an additional reservoir (more on that in a moment), while the GPU/VRM cooling is handled by liquid plates and a pump. The die-cast finstack atop the I/O is also now gone, and the baseplate is simplified to an aluminum plate with no protrusions.
Liquid-cooling the AMD Vega: Frontier Edition card has proven an educational experience for us, yielding new information about power leakage and solidifying beliefs of a power wall. We also learned that overclocking without thermal barriers (or thermal-induced power barriers) grants significant performance uplift in some scenarios, including gaming and production, though this comes at the cost of ~33A drawn from the PSU at 12V.
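To put the ~33A figure in more familiar terms, current at a known voltage converts to watts with P = V × I. The helper below is a minimal sketch of that arithmetic; the function name is ours, and it assumes the full current is drawn at a nominal 12V.

```python
def rail_power_watts(current_a: float, voltage_v: float = 12.0) -> float:
    """Convert current at a given voltage to power in watts (P = V * I)."""
    return current_a * voltage_v


# ~33A at 12V, as quoted in the text.
print(f"{rail_power_watts(33):.0f} W")
```

That works out to roughly 396W, before accounting for any deviation from the nominal 12V.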
Our results for the AMD Vega: Frontier Edition liquid-cooling hybrid mod are in, and this review covers the overclocking scalability, power limits, thermal change, and more.
The Hybrid mod was detailed in build log form over in part 1 of the endeavor. This mod wasn’t as straightforward as most, seeing as we didn’t have any 64x64mm brackets for securing the liquid cooler to the card. Drilling through an Intel mounting plate for an Asetek cooler, we were ultimately able to get an Asetek 570LC onto the card, which we later equipped with a Gentle Typhoon 120mm fan. VRM FET cooling was handled by aluminum finstacks secured by thermal adhesive, cooled with 1-2x Corsair ML120 fans. That said, this VRM cooling solution also wasn’t necessary – we could have operated with just the fans, and did at one point operate with just the heatsinks (and indirect airflow).
Our newest revisit could also be considered our oldest: the Nehalem microarchitecture is nearly ten years old now, having launched in November 2008 after an initial showing at Intel’s 2007 Developer Forum, and we’re back to revive our i7-930 in 2017.
The sample chosen for these tests is another from the GN personal stash, a well-traveled i7-930 originally from Steve’s own computer that saw service in some of our very first case reviews, but has been mostly relegated to the shelf o’ chips since 2013. The 930 was one of the later Nehalem CPUs, released in Q1 2010 for $294, exactly one year ahead of the advent of the still-popular Sandy Bridge architecture. That includes the release of the i7-2600K, which we’ve already revisited in detail.
Sandy Bridge was a huge step for Intel, but Nehalem processors were actually the first generation to be branded with the now-familiar i5 and i7 naming convention (no i3s, though). A couple features make these CPUs worth a look today: Hyperthreading was (re)introduced with i7 chips, meaning that even the oldest generation of i7s has 4C/8T, and overclocking could offer huge leaps in performance often limited by heat and safe voltages rather than software stability or artificial caps.
We’ve already endured one launch of questionable competence this quarter, looking at X299 and Intel’s KBL-X series, and we nearly escaped Q2 without another. Vega: Frontier Edition has its ups and downs – many of which we’ll discuss in a feature piece next week – but we’re still learning about its quirks. “Gaming Mode” and “Pro Mode” toggling is one of those quirks; leading into this article, it was our understanding – from both AMD representatives and from AMD marketing – that the switch would hold a relevant impact on performance. For this reason, we benchmarked for our review in the “appropriate” mode for each test: Professional applications used pro mode, like SPECviewperf and Blender. Gaming applications used, well, gaming mode. Easy enough, and we figured that was a necessary methodological step to ensure data accuracy to the card’s best abilities.
Turns out, there wasn’t much point.
A quick note, here: The immediate difference when switching to “Gaming Mode” is that WattMan, with all its bugginess, becomes available. Pro Mode does not support WattMan, though you can still overclock through third-party tools – and probably should, anyway, seeing as WattMan presently downclocks memory to Fury X speeds, as it seems to have some leftover code from the Fury X drivers.
That’s the big difference. Aside from WattMan, Gaming Mode technically also offers AMD Chill, something that Pro Mode doesn’t offer a button to use. Other than these interface changes, the implicit, hidden change would be an impact to gaming or to production performance.
Let’s briefly get into that.
Reader and viewer requests piled high after our Vega: Frontier Edition review, so we pulled the most popular one from the stack to benchmark. In today’s feature benchmark, we’re testing Vega: FE vs. the R9 Fury X at equal core clocks, resulting in clock-for-clock testing that could be loosely referred to as an “IPC” test – that’s not exactly the most correct phrasing, but does most quickly convey the intent of the endeavor. We’ll use the phrase “academic exercise” a few times in this piece, as it’s difficult to draw strong conclusions about other Vega products from this test; ultimately, GPUs simply have too many moving parts to simulate easier IPC benchmarks like you’d find on a CPU. As one limitation is resolved, another emerges – and they’re likely different on each architecture.
Regardless, we’re testing the two GPUs clock for clock to see how Vega: FE responds with the Fury X in the ring.