Tuesday, upon its institution on the Gregorian calendar, was deemed “product release day” by our long dead-and-rotted ancestors. Today marks the official announcement of the nVidia GTX 1050 and GTX 1050 Ti cards on the GP107 GPU, though additional product announcements will go live on our site by 10AM EST.
The GTX 1050 and 1050 Ti video cards are based on the GP107 GPU with Pascal architecture, sticking to the same SM layout as on previous Pascal GPUs (exception: GP100). Because this is a news announcement, we won't have products in hand for at least another day – but we can fly through the hard specs today and then advise that you return this week for our reviews.
Abstraction layers that sit between the game code and hardware create transactional overhead that worsens software performance on CPUs and GPUs. This has been a major discussion point as DirectX 12 and Vulkan have rolled out to the market, particularly with DOOM's successful Vulkan implementation. Long-standing API incumbent Dx 11 sits unmoving between the game engine and the hardware, preventing developers from leveraging specific system resources to efficiently execute game functions or rendering.
By contrast, it is possible, for example, to optimize tessellation performance by making explicit changes in how its execution is handled on Pascal, Polaris, Maxwell, or Hawaii architectures. A developer could accelerate performance by directly commanding the GPU to execute code on a reserved set of compute units, or could leverage asynchronous shaders to process render tasks without getting “stuck” behind other instructions in the pipeline. This can't be done with higher-level APIs like Dx 11, but DirectX 12 and Vulkan both allow this lower-level hardware access; you may have seen this referred to as “direct to metal,” or “programming to the metal.” These phrases reference that explicit hardware access, and have historically been used to describe what Xbox and PlayStation consoles enable for developers. It wasn't until recently that this level of support came to PC.
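To make the queue-contention point concrete, here's a toy scheduling model in Python. This is purely illustrative – the task names and durations are made up, and this is not actual DirectX 12 or Vulkan code – but it shows why a second, independent compute queue shortens total frame time versus one serialized queue:

```python
# Hypothetical workloads, durations in milliseconds (made-up numbers).
graphics_work = [("shadow pass", 3.0), ("g-buffer", 4.0), ("lighting", 5.0)]
compute_work = [("particle sim", 2.0), ("post-process AA", 4.0)]

def serial_makespan(graphics, compute):
    """High-level API model: compute tasks wait in line behind graphics
    tasks, so total time is the sum of everything."""
    return sum(t for _, t in graphics) + sum(t for _, t in compute)

def async_makespan(graphics, compute):
    """Low-level API model: a second queue runs compute concurrently,
    so total time is bounded by the longer of the two queues."""
    return max(sum(t for _, t in graphics), sum(t for _, t in compute))

# Serialized: 12ms graphics + 6ms compute = 18ms.
# Async queues: max(12ms, 6ms) = 12ms; the compute work "hides" behind
# the graphics work instead of extending the frame.
```

Real async compute only wins when the GPU has idle execution resources for the second queue to fill, but the queuing model is the core of the idea.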
In our recent return trip to California (see also: Corsair validation lab tour), we visited AMD's offices to discuss shader intrinsic functions and performance acceleration on GPUs by leveraging low-level APIs.
This episode of Ask GN (#28) addresses the concept of HBM in non-GPU applications, primarily concerning its imminent deployment on CPUs. We also explore GPU Boost 3.0 and the variance we've observed when testing the new GTX 1080 cards. The question of Boost's functionality arose as a response to our EVGA GTX 1080 FTW Hybrid vs. MSI Sea Hawk 1080 coverage, and asked why one 1080 was clock-dropping differently from another. We talk about that in this episode.
Discussion begins with proof that the Cullinan finally exists and has been sent to us – because it was impossible to find after Computex – and carries into Knights Landing (Intel) coverage for MCDRAM, or “CPU HBM.” Testing methods are slotted in between, for an explanation of why some hardware choices are made when building a test environment.
The GTX 1060 3GB ($200) card's existence is curious. The card was initially rumored to exist prior to the 1060 6GB's official announcement, and was quickly debunked as mythological. Exactly one month later, nVidia did announce a 3GB GTX 1060 variant – but with one fewer SM, reducing the core count by 10%. That drops the GTX 1060 from 1280 CUDA cores to 1152 CUDA cores (128 cores per SM), alongside 8 fewer TMUs. Of course, there's also the memory reduction from 6GB to 3GB.
The rest of the specs, however, remain the same. The clock-rate has the same baseline 1708MHz boost target, the memory speed remains 8Gbps effective, and the GPU itself is still a declared GP106-400 chip (rev A1, for our sample). That makes this most of the way toward a GTX 1060 as initially announced, aside from the disabled SM and halved VRAM. Still, nVidia's marketing language declared a 5% performance loss from the 6GB card (despite a 10% reduction in cores), and so we decided to put those claims to the test.
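The spec deltas follow directly from the SM count, and are easy to sanity-check. A quick Python pass, using the per-SM core and TMU counts stated above:

```python
# GP106 per-SM resources, as stated in the article.
cores_per_sm = 128
tmus_per_sm = 8

sms_6gb, sms_3gb = 10, 9  # GTX 1060 6GB vs. 3GB (one SM disabled)

cores_6gb = sms_6gb * cores_per_sm   # 10 * 128 = 1280 CUDA cores
cores_3gb = sms_3gb * cores_per_sm   # 9 * 128 = 1152 CUDA cores
tmu_loss = (sms_6gb - sms_3gb) * tmus_per_sm  # 8 fewer TMUs

core_reduction = 1 - cores_3gb / cores_6gb    # 0.10, i.e. 10% fewer cores
```

So nVidia's 5% performance claim amounts to the 3GB card losing half as much performance as it loses cores – which is plausible in shader-light games, and exactly what our benchmarks set out to verify.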
In this benchmark, we'll be reviewing the EVGA GTX 1060 3GB vs. GTX 1060 6GB performance in a clock-for-clock test, with 100% of the focus on FPS. The goal here is not to look at the potential for marginally changed thermals (which hinges more on AIB cooler than anything) or potentially decreased power, but to instead look strictly at the impact on FPS from the GTX 1060 3GB card's changes. In this regard, we're very much answering the “is a 1060 6GB worth it?” question, just in a less search-engine-friendly fashion. The GTX 1060s will be clocked the same, within normal GPU Boost 3.0 variance, and will only be differentiated by SM & VRAM count.
For those curious, we previously took this magnifying glass to the RX 480 8GB & 4GB cards, where we pitted the two against one another in a versus. In that scenario, AMD also reduced the memory clock of the 4GB models, but the rest remained the same.
Buildzoid of “Actually Hardcore Overclocking” joined us to discuss the new EVGA GTX 1080 FTW PCB, as found on the Hybrid that we reviewed days ago. The PCB analysis goes into the power staging, and spends a few minutes explaining the 10-phase VRM, which is really a doubled 5-phase VRM. Amperage supported by the VRM and demanded by the GPU are also discussed, for folks curious about the power delivery capabilities of the FTW PCB, and so is the memory power staging.
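As a rough illustration of why a doubled 5-phase matters, here's some back-of-napkin math in Python. The 200A load figure is hypothetical for illustration only, not a measured value for the FTW PCB:

```python
# Hypothetical GPU core current draw -- NOT a measured FTW figure.
gpu_current_a = 200.0

true_phases = 5          # the controller actually drives 5 phases
doublers = 2             # each phase is doubled into two power stages
stages = true_phases * doublers  # 10 power stages visible on the PCB

# Doubling splits each phase's current across two stages, halving the
# per-stage current (and roughly quartering per-stage resistive losses,
# since conduction loss scales with I^2 * R).
current_per_stage = gpu_current_a / stages            # 20 A per stage
current_per_stage_undoubled = gpu_current_a / true_phases  # 40 A per stage
```

The trade-off, as Buildzoid notes in the video, is that doubling adds stages and cost without adding the transient-response benefit of true additional phases.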
If you're curious about the thermal solution of the EVGA FTW Hybrid, check out the review (page 1 & 3) for that. EVGA is somewhat uniquely cooling the VRAM by sinking it to a copper plate, then attaching that to the CLC coldplate. We say “somewhat” because Gigabyte also does this, and we hope to look at their unit soon.
There were rumors of a GTX 1060 3GB card, but the launch of the GTX 1060 featured a single 6GB model. Almost exactly one month later, nVidia has announced its 3GB GTX 1060 with 1152 CUDA Cores, down from 1280, and a halved framebuffer. The card will also run fewer TMUs as a result of disabling 1 SM, for a total of 9 streaming multiprocessors versus the 10 SMs on the GTX 1060 6GB. This brings down TMU count from 80 to 72 (at 8 TMUs per SM), making for marginally reduced power coupled with a greatly reduced framebuffer.
(Update: The card is already available on etailers, see here.)
In theory, this will most heavily impact 0.1% low and 1% low frame performance, as we showed in the AMD RX 480 8GB vs. 4GB comparison. Games which rely less upon Post FX and more heavily upon large resolution textures and maps (as in shadow, normal, specular – not as in levels) will most immediately show the difference. Assassin's Creed, Black Ops III (in some use cases), and Mirror's Edge Catalyst are poised to show the greatest differences between the two. nVidia has advertised an approximate 5% performance difference when looking at the GTX 1060 3GB vs. GTX 1060 6GB, but that number will almost certainly be blown out when looking at VRAM-stressing titles.
Pascal has mobilized, officially launching in notebooks today. The GTX 1080, 1070, and 1060 full desktop GPUs will be available in Pascal notebooks, similar to the GTX 980 non-M launch from last year. Those earlier 980 laptops were a bit of an experiment, from what nVidia's laptop team told us, and led to wider implementation of the line-up for Pascal.
We had an opportunity to perform preliminary benchmarks using some of our usual test suite while at the London unveil event, including frametime analysis (1% / 0.1% lows) with Shadow of Mordor. Testing was conducted using the exact same settings as we use in our own benchmarks, and we used some of our own software to validate that results were clean.
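For readers unfamiliar with the metric, here is a minimal Python sketch of how 1% and 0.1% lows can be computed from raw frametimes. This is one common definition (average the slowest slice of frames, then convert to FPS); exact methodology varies by outlet:

```python
def low_fps(frametimes_ms, fraction):
    """Average the slowest `fraction` of frametimes, convert to FPS.
    E.g. fraction=0.01 for 1% lows, 0.001 for 0.1% lows."""
    slowest = sorted(frametimes_ms, reverse=True)
    n = max(1, int(len(slowest) * fraction))
    avg_slow_ms = sum(slowest[:n]) / n
    return 1000.0 / avg_slow_ms

# Synthetic capture: 990 frames at ~60 FPS (16.7ms) plus 10 hitches
# at ~30 FPS (33.3ms). Average FPS barely moves, but the 1% low
# drops to ~30 FPS -- which is why lows expose stutter that
# average FPS hides.
frames = [16.7] * 990 + [33.3] * 10
one_pct_low = low_fps(frames, 0.01)
```

The point of reporting lows alongside averages is exactly what the synthetic capture shows: intermittent frame drops are nearly invisible in an average, but dominate the slowest-1% slice.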
Before getting to preliminary GTX 1080 & GTX 1070 notebook FPS benchmarks on the Clevo P775 and MSI GT62, we'll run through laptop overclocking, specification differences in the GTX 1070, and 120Hz display updates. Note also that we've got at least three notebooks on the way for testing, and will be publishing reviews through the month. Our own initial benchmarks are further down.
The theoretical end of AMD's Polaris desktop GPU line has just begun shipment, and that's in the form of the RX 460. Back at the pre-Computex press event, AMD informed us that the Polaris line would primarily consist of two GPUs on the Polaris architecture – Polaris 10 & 11 – and that three cards would ship on this platform. Two of the three have already shipped and been reviewed, including the ~$240 RX 480 8GB cards (review here) and ~$180-$200 RX 470 cards (review here). The next architecture will be Vega, in a position to potentially be the first consumer GPU to use HBM2.
Today, we're looking at Polaris 11 in the RX 460. The review sample received is Sapphire's RX 460 Nitro 4GB card, pre-overclocked to 1250MHz. The RX 460, like the 470, is a “partner card,” which means that no reference model will be sold by AMD for rebrand by its partners. AMD has set the MSRP to $110 for the RX 460, but partner pricing will vary widely depending on VRAM capacity (2GB or 4GB), cooler design, pre-overclocks, and component selection. At time of writing, we did not have a list of AIB partner prices and cards available.
As always, we'll be reviewing the Sapphire RX 460 4GB with extensive thermal testing, FPS testing in Overwatch, DOTA2, GTA V, and more, and overclock testing. Be sure to check page 1 for our new PCB analysis and cooler discussion, alongside the in-depth architecture information.
We liked the RX 470 well enough, which, for our site, is certainly considerable praise; we tend to stick just with the numbers and leave most of the decision-making to the reader, but the RX 470 did receive some additional analysis. As we stated in the review, the RX 470 makes good sense as a card priced around $180, but not more than that. That's the key point: Our entire analysis was written on the assumption of a $180 video card, presently fielded only by PowerColor and its Red Devil RX 470. Exceeding the $180 mark on a 4GB 470 immediately invalidates the card, as it enters competition with AMD's own RX 480 4GB model (see: 4GB vs. 8GB VRAM benchmark). Granted, it's still far enough away from the RX 480 8GB & GTX 1060 that the 470 may exist in some isolation. For now, anyway.
But as seems to be the trend with both nVidia and AMD for this generation of graphics cards, the RX 470 has some pricing that at times seems almost silly. Take, for instance, the $220 XFX RX 470 RS Black Edition True OC card: it's $20 more than a 4GB RX 480, it's clocked to where we overclocked our RX 470, and it will perform about 3-5% slower in AVG FPS than the RX 480 4GB reference card. And let's not start on the seemingly irrelevant $240 8GB RX 470 Nitro+, effectively an RX 480 8GB card (even in clock-rate) with four fewer CUs, fewer TMUs (from 144 to 128), and slower memory – though it does have a better cooling solution, to Sapphire's credit.
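That CU-to-TMU relationship is straightforward to verify, assuming GCN's four texture units per CU:

```python
tmus_per_cu = 4  # GCN architecture: four texture mapping units per CU

cus_480, cus_470 = 36, 32  # RX 480 vs. RX 470 compute units

tmus_480 = cus_480 * tmus_per_cu  # 36 * 4 = 144 TMUs
tmus_470 = cus_470 * tmus_per_cu  # 32 * 4 = 128 TMUs

cu_deficit = 1 - cus_470 / cus_480  # ~0.111, about 11% fewer CUs
```

An 8GB RX 470 at RX 480 clocks is therefore giving up roughly 11% of the shader and texture hardware for only a $10-20 discount over the card it imitates, which is the core of the pricing complaint.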
AMD's RX 470 has been on our time table since May, when the pre-Computex press event informed us of a “mid-July” release. Well, it's mid-July – wait.
August 4th. It's August 4th. The RX 470 is available effective today, coinciding with embargo lift on reviews, and we've had time to thoroughly analyze the card's performance. The RX 470 is a partner card and will not be available as a reference model, though some partner cards may as well be reference models; they're using the reference RX 480 cooler, just with new colors, back-plates, or LEDs.
AMD has positioned its RX 470 in the sub-$200 market, listing its MSRP as $180. AIB partners will price their cards according to any custom coolers or pre-overclocks applied, though the floor has been set, more or less. That plants the 470 in a presently unchallenged market position: AMD's biggest current-gen competition in this price-range is its own RX 480 4GB card, the GTX 1060 being nVidia's lowest-tier offering.
Before our deep-dive review on the Sapphire RX 470 Platinum, card architecture, thermal & endurance throttles, power, and FPS, let's run through the specs.