Our thermal benchmarking has expanded to the point that the tests form our most comprehensive section of any review. For this content, we dig deep into endurance testing with nVidia's just-launched GeForce GTX 1060 Founders Edition card, comparing it to the MSI GTX 1060 Gaming X. The validation testing yields interesting results, particularly with regard to potential throttle points and dips in clock-rate. More on that in a bit.
Today marks the launch of the GTX 1060 ($250-$300), announced about ten days ago. The GTX 1060 fills the mid-range of the market as a 6GB solution on the 16nm FinFET process node debuted in Pascal, and that's done with GP106.
Our GTX 1060 Founders Edition & MSI 1060 Gaming X review looks at FPS (particularly vs. the 1070 and RX 480), Vulkan & Dx12 performance, thermals, noise, power, and overclocking results.
NVIDIA GeForce GTX 1060 Specs vs. GTX 1070, GTX 1080, GTX 960
NVIDIA Pascal vs. Maxwell Specs Comparison
Specification | GTX 1080 | GTX 1070 | GTX 1060 | GTX 980 Ti | GTX 980 | GTX 960 |
GPU | GP104-400 Pascal | GP104-200 Pascal | GP106 Pascal | GM200 Maxwell | GM204 Maxwell | GM206 Maxwell |
Transistor Count | 7.2B | 7.2B | 4.4B | 8B | 5.2B | 2.94B |
Fab Process | 16nm FinFET | 16nm FinFET | 16nm FinFET | 28nm | 28nm | 28nm |
CUDA Cores | 2560 | 1920 | 1280 | 2816 | 2048 | 1024 |
GPCs | 4 | 3 | 2 | 6 | 4 | 2 |
SMs | 20 | 15 | 10 | 22 | 16 | 8 |
TPCs | 20 | 15 | 10 | - | - | - |
TMUs | 160 | 120 | 80 | 176 | 128 | 64 |
ROPs | 64 | 64 | 48 | 96 | 64 | 32 |
Core Clock | 1607MHz | 1506MHz | 1506MHz | 1000MHz | 1126MHz | 1126MHz |
Boost Clock | 1733MHz | 1683MHz | 1708MHz | 1075MHz | 1216MHz | 1178MHz |
FP32 TFLOPs | 9TFLOPs | 6.5TFLOPs | 3.85TFLOPs | 5.63TFLOPs | 5TFLOPs | 2.4TFLOPs |
Memory Type | GDDR5X | GDDR5 | GDDR5 | GDDR5 | GDDR5 | GDDR5 |
Memory Capacity | 8GB | 8GB | 6GB | 6GB | 4GB | 2GB, 4GB |
Memory Clock | 10Gbps GDDR5X | 4006MHz | 8Gbps | 7Gbps GDDR5 | 7Gbps GDDR5 | 7Gbps |
Memory Interface | 256-bit | 256-bit | 192-bit | 384-bit | 256-bit | 128-bit |
Memory Bandwidth | 320.32GB/s | 256GB/s | 192GB/s | 336GB/s | 224GB/s | 115GB/s |
TDP | 180W | 150W | 120W | 250W | 165W | 120W |
Power Connectors | 1x 8-pin | 1x 8-pin | 1x 6-pin | 1x 8-pin, 1x 6-pin | 2x 6-pin | 1x 6-pin |
Release Date | 5/27/2016 | 6/10/2016 | 7/19/2016 | 6/01/2015 | 9/18/2014 | 01/22/15 |
Release Price | Reference: $700, MSRP: $600 | Reference: $450, MSRP: $380 | Reference: $300, MSRP: $250 | $650 | $550 | $200 |
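As a rough cross-check on the FP32 row in the table, peak single-precision throughput can be estimated as cores * 2 * clock. This is a minimal sketch that assumes each CUDA core retires one fused multiply-add (two floating-point operations) per clock; the helper name `fp32_tflops` is our own, and this is not necessarily how the published figures were derived.

```python
def fp32_tflops(cuda_cores: int, clock_mhz: float) -> float:
    """Estimate peak FP32 throughput in TFLOPs.

    Assumption: 2 floating-point operations (one FMA) per core per clock.
    """
    flops = cuda_cores * 2 * clock_mhz * 1e6  # MHz -> Hz
    return flops / 1e12


# Core counts and clocks are taken from the spec table.
cards = {
    "GTX 1080 (boost)": (2560, 1733),
    "GTX 1070 (boost)": (1920, 1683),
    "GTX 1060 (base)": (1280, 1506),
    "GTX 1060 (boost)": (1280, 1708),
}

for name, (cores, mhz) in cards.items():
    print(f"{name}: {fp32_tflops(cores, mhz):.2f} TFLOPs")
```

Under that assumption, the table's 3.85 TFLOPs for the GTX 1060 lines up with the base clock, while the GTX 1080 and GTX 1070 figures sit closer to what their boost clocks would produce.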
Known GTX 1060 Models & Prices
Below is a list of known vendors, card models, and prices. We are also aware that a few other vendors, like Colorful, will be shipping models shortly.
Vendor | Card | Price |
MSI | MSI GTX 1060 Gaming X | $290 |
MSI | MSI GTX 1060 Gaming | $280 |
MSI | MSI GTX 1060 Armor | $260 |
MSI | MSI GTX 1060 6GT | $250 |
EVGA | EVGA GTX 1060 Stock | $250 |
EVGA | EVGA GTX 1060 SC | $260 |
PNY | PNY GTX 1060 | $260 |
ASUS | ASUS GTX 1060 Strix Gaming | $330 |
ASUS | ASUS GTX 1060 Turbo | $250 |
Gigabyte | Gigabyte GTX 1060 Gaming 6GD | $290 |
Pascal thus far has struggled to maintain both availability and AIB partner prices within the suggested range. As we're writing this review prior to the GTX 1060's public launch, we are not yet sure whether the products listed above will be immediately available at the listed prices. This launch is supposed to be accompanied by immediate availability, though.
Previous Pascal Architecture & Review Content
- GTX 1080 Review & GP104-400 Deep-Dive (incl. color compression, compute pre-emption, async compute)
Architecture Revisit (Again) – GP106 Streaming Multiprocessors, GPCs, TPCs
Because this is now the fourth Pascal chip that we've written about, we won't be going as deep on the architecture as in previous content. For the P100 Accelerator (and the introduction of Pascal), check this deep-dive with several pages of architectural exploration. To catch up on Pascal as it pertains to gaming cards (GeForce GTX devices), view the first page of our GTX 1080 Founders Edition review.
The GTX 1060 uses a different graphics processor than what we saw with the GTX 1080 and 1070. nVidia's 1060 deploys GP106, a slimmed-down Pascal variant that follows the 1070's GP104-200 and 1080's GP104-400. The block diagram tells most of the story on its own. Starting with the GTX 1080 block diagram:
The GTX 1080 GP104-400 GPU runs four GPCs and 20 SMs, equating to 2560 CUDA cores. Similar to Maxwell's big GM204 chip, GP104 partitions its instruction cache into two effective shared caches, each owning a dedicated instruction buffer, a dedicated warp scheduler, and two dedicated dispatch units (also, as a note, a dedicated register file of 16,384 x 32-bit per SM). PolyMorph Engine 4.0 sits on top of everything and can be canonically visualized as resting between the raster engine and the SM.
Let's put that into perspective with the biggest Pascal chip presently known, the GP100:
GP100 can host up to 60 SMs, with a 2:1 ratio of FP32:FP64 cores. The chip is split into six GPCs of ten SMs each, with every pair of SMs sharing a single TPC. The Tesla P100 will utilize 56 of those SMs, for 3584 FP32 CUDA cores. A card utilizing the full 60 SM count is not yet officially known. This is not a gaming chip, but it did debut Pascal.
If not already obvious, the biggest move has been to cut GPCs (Graphics Processing Clusters – explained in our P100 article), effectively containers for the TPCs and raster engines, down to two.
Here's the GTX 1060 GP106 block diagram:
GP106 follows all the existing rules and divisions of Pascal, so we see a return of the 128-core streaming multiprocessor (SM) and a return of the 5 SM allocation per GPC. This lands us at 1280 CUDA Cores on the GTX 1060's GP106 GPU, or 128 cores * 10 SMs. Like the other GTX chips, GP106 dedicates itself to FP32 single precision compute, leaving double precision FP64 CUDA Cores for the science-class GPUs.
Also like the rest of the GeForce Pascal architecture, GP106 runs 8 TMUs per SM, yielding 80 TMUs on the GTX 1060 (alongside 48 ROPs).
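The core and TMU arithmetic above can be sketched in a few lines. This is only an illustration: the class name `PascalConfig` and its fields are hypothetical, and the per-SM figures (128 cores, 8 TMUs) and the 5 SMs per GPC are the GeForce Pascal numbers quoted in this article, not values read from any driver or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PascalConfig:
    """Hypothetical container for a GeForce Pascal GPU layout."""
    gpcs: int
    sms_per_gpc: int = 5      # per the article: 5 SMs per GPC
    cores_per_sm: int = 128   # per the article: 128-core SM
    tmus_per_sm: int = 8      # per the article: 8 TMUs per SM

    @property
    def sms(self) -> int:
        return self.gpcs * self.sms_per_gpc

    @property
    def cuda_cores(self) -> int:
        return self.sms * self.cores_per_sm

    @property
    def tmus(self) -> int:
        return self.sms * self.tmus_per_sm


GP106 = PascalConfig(gpcs=2)       # GTX 1060
GP104_400 = PascalConfig(gpcs=4)   # GTX 1080

for name, gpu in (("GP106", GP106), ("GP104-400", GP104_400)):
    print(f"{name}: {gpu.sms} SMs, {gpu.cuda_cores} CUDA cores, {gpu.tmus} TMUs")
```

Halving the GPC count from four to two is what halves every downstream figure: SMs, CUDA cores, and TMUs all scale directly with it.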
The GTX 1060's memory subsystem features the same algorithmic approach to compression as previous nVidia GPUs, including Maxwell's earlier evolution. Color compression and datapath optimizations are popular with nVidia and AMD right now (see: RX 480 review), leveraged as a means to reduce memory power consumption by a non-trivial amount; AMD and nVidia both report upwards of 40% memory reduction per bit transacted, depending on the level of compression possible within a given scene.
Memory capacity on the GTX 1060 is a hard 6GB – only one SKU presently exists – and operates at 8Gbps. There is presently no 3GB SKU, nor any indication that one may legitimately exist. Memory bandwidth can sustain 192GB/s on the 192-bit wide interface (192-bit / 8 [bits to bytes] * 2000MHz [memory clock] * 2 [DDR] * 2 [GDDR5] = 192,000MB/s, or 192GB/s). The GTX 1060 sticks with GDDR5 memory to help manage cost, and because GDDR5X/HBM would offer no meaningful performance benefit in the face of the weaker chip. Six 1GB dies are present on the GTX 1060 PCB, which is a truncated 6.75” long (full card length: 9.5”).
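For anyone who wants to rerun the bandwidth math for other cards in the table, here is a minimal sketch of the same calculation. The function names are our own, and the two x2 multipliers simply mirror the DDR and GDDR5 factors in the formula above.

```python
def gddr5_data_rate_gbps(memory_clock_mhz: float) -> float:
    """Effective per-pin data rate in Gbps from the memory clock.

    Mirrors the article's formula: clock * 2 (DDR) * 2 (GDDR5).
    """
    return memory_clock_mhz * 2 * 2 / 1000


def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width in bytes times data rate."""
    return bus_width_bits / 8 * data_rate_gbps


rate = gddr5_data_rate_gbps(2000)          # GTX 1060: 2000MHz memory clock
print(rate)                                # effective Gbps per pin
print(memory_bandwidth_gbs(192, rate))     # GTX 1060, 192-bit bus
print(memory_bandwidth_gbs(256, 8.0))      # GTX 1070, 256-bit bus at 8Gbps
```

The same two inputs (interface width and effective data rate) reproduce the 192GB/s and 256GB/s entries in the spec table.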
At a higher level, the GTX 1060 Founders Edition runs a clock-rate of 1506MHz base, 1708MHz boost. The MSI Gaming X card that we've got seems to bounce between 1584MHz (base) and 1999.5MHz (boosted in OC mode).
Let's test this thing. See the next page for methodology.