The first and last of AMD’s Polaris GPUs hit the market last year, among them the RX 460 and subsequent Sapphire RX 460 Nitro 4GB, a card that underwhelmed us with unimpressive performance and an ambitious price. Just a few months later, overclocker der8auer implemented a BIOS flash to unlock additional stream processors on some RX 460 cards, bringing the count from 896 to 1024.
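For a rough sense of what that unlock represents, the arithmetic can be sketched as below. The 896 and 1024 figures come from the text above; the 64 stream processors per compute unit is our assumption based on the usual GCN layout, and the variable names are purely illustrative.

```python
# Rough arithmetic for the RX 460 BIOS unlock described above.
# ASSUMPTION: 64 stream processors per compute unit (typical GCN layout).
SP_PER_CU = 64

stock_sps = 896      # stream processors before the flash (from the text)
unlocked_sps = 1024  # stream processors after the flash (from the text)

extra_sps = unlocked_sps - stock_sps   # additional stream processors
extra_cus = extra_sps // SP_PER_CU     # compute units implied by that gain
relative_gain = extra_sps / stock_sps  # fractional increase in shader count

print(f"Extra stream processors: {extra_sps}")
print(f"Implied extra compute units: {extra_cus}")
print(f"Shader count increase: {relative_gain:.1%}")
```

A larger shader count does not translate one-to-one into frame rate, so this is only the upper bound on what the flash could offer on paper.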
AMD has shared cursory details of its Vega GPU architecture, pertaining to high-bandwidth caching, an iterative step to CUs (NCUs), and a unified-but-not-unified memory configuration.
Going into this, note that we’re still not 100% briefed on Vega. We’ve worked with AMD to try and better understand the architecture, but the details aren’t fully organized for press just yet; we’re also not privy to product details at this time, which would be those more closely associated with shader counts, memory capacity, and individual SKUs. Instead, we have some high-level architecture discussion. It’s enough for a start.
The second card in our “revisit” series – sort of semi-re-reviews – is the GTX 780 Ti from November of 2013, which originally shipped for $700. This was the flagship of the Kepler architecture, followed later by Maxwell architecture on GTX 900 series GPUs, and then the modern Pascal. The 780 Ti was in competition with AMD’s R9 200 series and (a bit later) R9 300 series cards, and was accompanied by the expected 780, 770, and 760 video cards.
Our last revisit looked at the GTX 770 2GB card, and our next one plans to look at an AMD R9 200-series card. For today, we’re revisiting the GTX 780 Ti 3GB card for an analysis of its performance in 2016, as pitted against the modern GTX 1080, 1070, 1060, 1050 Ti, and RX 480, 470, and others.
Two EVGA GTX 1080 FTW cards have now been run through a few dozen hours of testing, each passing through real-world, synthetic, and torture testing. We've been following this story since its onset, initially validating preliminary thermal results with thermal imaging, but later stating that we wanted to follow up with direct thermocouple probes to the MOSFETs and PCB. The goal with which we set forth was to create the end-all, be-all set of test data for VRM thermals. We have tested every reasonable scenario for these cards, including SLI, and have even intentionally attempted to incinerate the cards by running ridiculous use scenarios.
Thermocouples were attached directly to the back-side of the PCB (hotspot previously discovered), the opposing MOSFET (#2, from bottom-up), and MOSFET #7. The seventh and second MOSFETs are those which seem to be most commonly singed or scorched in user photos of allegedly failed EVGA 10-series ACX 3.0 cards, including the GTX 1060 and GTX 1070. Our direct probe contact to these MOSFETs will provide more finality to testing results, with significantly greater accuracy and understanding than can be achieved with a thermal imager pointed at the rear-side of the PCB. Even just testing with a backplate isn't really ideal with thermal cameras, as the emissivity of the metal begins to make for questionable results -- not to mention the fact that the plate visually obstructs the actual components. And, although we did mirror EVGA & Tom's DE's testing methodology when checking the impact of thermal pads on the cards, even this approach is not perfect (it does turn out that we were pretty damn accurate, though. More on that later.). The pads act as an insulator, again hiding the components and assisting in the spread of heat across a larger surface area. That's what they're designed to do, of course, but for a true reading, we needed today's tests.
AMD issued a preemptive response to nVidia's new GTX 1050 and GTX 1050 Ti, and they did it by dropping the RX 460 MSRP to $100 and RX 470 MSRP to $170. The price reduction is meant to battle the GTX 1050, a $110 MSRP card, and GTX 1050 Ti, a $140-$170 card. These new Pascal-family devices are targeted most appropriately at the 1080p crowd, where the GTX 1060 and up were all capable performers for most 1440p gaming scenarios. AMD has held the sub-$200 market since the launch of its RX 480 4GB, RX 470, and RX 460 through the summer months, and is just now seeing its competition's gaze shift from the high-end.
Today, we've got thermal, power, and overclocking benchmarks for the GTX 1050 and GTX 1050 Ti cards. Our FPS benchmarks look at the GTX 1050 OC and GTX 1050 Ti Gaming X cards versus the RX 460, RX 470, GTX 950, 750 Ti, and 1060 devices. Some of our charts include higher-end devices as well, though you'd be better off looking at our GTX 1060 or RX 480 content for more on that. Here's a list of recent and relevant articles:
Tuesday, upon its institution on the Gregorian calendar, was deemed “product release day” by our long dead-and-rotted ancestors. Today marks the official announcement of the nVidia GTX 1050 and GTX 1050 Ti cards on the GP107 GPU, though additional product announcements will go live on our site by 10AM EST.
The GTX 1050 and 1050 Ti video cards are based on the GP107 GPU with Pascal architecture, sticking to the same SM layout as on previous Pascal GPUs (exception: GP100). Because this is a news announcement, we won't have products in hand for at least another day – but we can fly through the hard specs today and then advise that you return this week for our reviews.
Abstraction layers that sit between the game code and hardware create transactional overhead that worsens software performance on CPUs and GPUs. This has been a major discussion point as DirectX 12 and Vulkan have rolled out to the market, particularly with DOOM's successful implementation. Long-standing API incumbent Dx 11 sits unmoving between the game engine and the hardware, preventing developers from leveraging specific system resources to efficiently execute game functions or rendering.
Contrary to this, it is possible, for example, to optimize tessellation performance by making explicit changes in how its execution is handled on Pascal, Polaris, Maxwell, or Hawaii architectures. A developer could accelerate performance by directly commanding the GPU to execute code on a reserved set of compute units, or could leverage asynchronous shaders to process render tasks without getting “stuck” behind other instructions in the pipeline. This can't be done with higher level APIs like Dx 11, but DirectX 12 and Vulkan both allow this lower-level hardware access; you may have seen this referred to as “direct to metal,” or “programming to the metal.” These phrases reference that explicit hardware access, and have historically been used to describe what Xbox and Playstation consoles enable for developers. It wasn't until recently that this level of support came to PC.
In our recent return trip to California (see also: Corsair validation lab tour), we visited AMD's offices to discuss shader intrinsic functions and performance acceleration on GPUs by leveraging low-level APIs.
This episode of Ask GN (#28) addresses the concept of HBM in non-GPU applications, primarily concerning its imminent deployment on CPUs. We also explore GPU Boost 3.0 and its variance within testing when working on the new GTX 1080 cards. The question of Boost's functionality arose as a response to our EVGA GTX 1080 FTW Hybrid vs. MSI Sea Hawk 1080 coverage, and asked why one 1080 was clock-dropping differently from another. We talk about that in this episode.
Discussion begins with proof that the Cullinan finally exists and has been sent to us – because it was impossible to find, after Computex – and carries into Knights Landing (Intel) coverage for MCDRAM, or “CPU HBM.” Testing methods are slotted in between, for an explanation of why some hardware choices are made when building a test environment.
The GTX 1060 3GB ($200) card's existence is curious. The card was initially rumored to exist prior to the 1060 6GB's official announcement, and was quickly debunked as mythological. Exactly one month later, nVidia did announce a 3GB GTX 1060 variant – but with one fewer SM, reducing the core count by 10%. That drops the GTX 1060 from 1280 CUDA cores to 1152 CUDA cores (128 cores per SM), alongside 8 fewer TMUs. Of course, there's also the memory reduction from 6GB to 3GB.
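The core-count math above is simple enough to sanity-check in a few lines. This is a minimal sketch: the 128 cores per SM and the one-SM difference come from the text, while the 10-SM and 9-SM totals and the 8-TMUs-per-SM figure are inferred from the stated 1280/1152 core counts and the "8 fewer TMUs" note. The function and variable names are ours.

```python
# Sanity check of the GTX 1060 6GB vs. 3GB shader math described above.
CORES_PER_SM = 128  # from the text
TMUS_PER_SM = 8     # inferred: one disabled SM removes 8 TMUs


def cuda_cores(sm_count: int) -> int:
    """Total CUDA cores for a given number of enabled SMs."""
    return sm_count * CORES_PER_SM


sms_6gb = 10            # inferred from 1280 cores / 128 cores per SM
sms_3gb = sms_6gb - 1   # "one fewer SM"

cores_6gb = cuda_cores(sms_6gb)
cores_3gb = cuda_cores(sms_3gb)
core_reduction = (cores_6gb - cores_3gb) / cores_6gb
tmu_delta = (sms_6gb - sms_3gb) * TMUS_PER_SM

print(f"6GB model: {cores_6gb} CUDA cores")
print(f"3GB model: {cores_3gb} CUDA cores")
print(f"Core reduction: {core_reduction:.0%}, TMUs removed: {tmu_delta}")
```

The point of spelling it out is that the cut is a fixed 10% of shader resources on paper, which is the number any measured FPS difference should be compared against.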
The rest of the specs, however, remain the same. The clock-rate has the same baseline 1708MHz boost target, the memory speed remains 8Gbps effective, and the GPU itself is still a declared GP106-400 chip (rev A1, for our sample). That makes this most of the way toward a GTX 1060 as initially announced, aside from the disabled SM and halved VRAM. Still, nVidia's marketing language declared a 5% performance loss from the 6GB card (despite a 10% reduction in cores), and so we decided to put those claims to the test.
In this benchmark, we'll be reviewing the EVGA GTX 1060 3GB vs. GTX 1060 6GB performance in a clock-for-clock test, with 100% of the focus on FPS. The goal here is not to look at the potential for marginally changed thermals (which hinges more on AIB cooler than anything) or potentially decreased power, but to instead look strictly at the impact on FPS from the GTX 1060 3GB card's changes. In this regard, we're very much answering the “is a 1060 6GB worth it?” question, just in a less SEF fashion. The GTX 1060s will be clocked the same, within normal GPU Boost 3.0 variance, and will only be differentiated in the SM & VRAM count.
For those curious, we previously took this magnifying glass to the RX 480 8GB & 4GB cards, where we pitted the two against one another in a versus. In that scenario, AMD also reduced the memory clock of the 4GB models, but the rest remained the same.
Buildzoid of “Actually Hardcore Overclocking” joined us to discuss the new EVGA GTX 1080 FTW PCB, as found on the Hybrid that we reviewed days ago. The PCB analysis goes into the power staging, and spends a few minutes explaining the 10-phase VRM, which is really a doubled 5-phase VRM. Amperage supported by the VRM and demanded by the GPU are also discussed, for folks curious about the power delivery capabilities of the FTW PCB, and so is the memory power staging.
If you're curious about the thermal solution of the EVGA FTW Hybrid, check out the review (page 1 & 3) for that. EVGA is somewhat uniquely cooling the VRAM by sinking it to a copper plate, then attaching that to the CLC coldplate. We say “somewhat” because Gigabyte also does this, and we hope to look at their unit soon.