The GTX 1080's epochal launch all but overshadowed its cut-down counterpart – that is, until the price was unveiled. NVIDIA's GTX 1070 is promised at an initial $450 price point for the Founders Edition (explained here), or an MSRP of $380 for board partner models. The GTX 1070 replaces the GTX 970 in the vertical, but promises superior performance to previous high-end models like the 980 and 980 Ti; we'll validate those claims in our testing below, following an initial architecture overview.
The GeForce GTX 1070 ($450) uses a Pascal GP104-200 chip. The architecture is identical to the GTX 1080 and its GP104-400 GPU, but cuts down on SM presence (and core count) to create a mid-range version of the new 16nm FinFET design. This new node from TSMC is nearly half the size of Maxwell's 28nm planar process, and switches the company over to FinFET transistor architecture for reduced power leakage and improved performance-per-watt efficiency. The change is symptomatic of an industry moving toward ever-smaller devices with a greater focus on the power envelope, and has been reflected in NVIDIA's architectures since Fermi (the GTX 400 series ran notoriously hot) and AMD's since Fiji (sort of – Polaris claims to make a bigger push in this direction). On the CPU side, Intel has been driving this trend for several generations now, its 10nm process promising to further extend mobile device endurance and transistor density.
Before getting started, here's a list of relevant Pascal & GTX 1080 articles we wrote:
- GTX 1080 Founders Edition Review & In-Depth Benchmark (9,000 words!)
- Building Our Own GTX 1080 Hybrid – Tearing Apart the Founders Edition
Our review of the GTX 1070 Founders Edition graphics card will compare it against the GTX 1080, 980 Ti, and 970 (and more), plus AMD's R9 Fury X & R9 390X. We'll look at performance (FPS) benchmarks and our specialized thermal testing with endurance (clock-rate vs. time) burn-in. These tests explore major points of issue for new GPUs, and provide a closer look at real-world gaming performance. We're in Taipei, Taiwan right now for Computex, so we've culled a few tests due to travel complications; power draw did not make it into this review.
NVIDIA GeForce GTX 1070 vs. GTX 1080, 980 Ti, 970, & 390X [Video]
NVIDIA GeForce GTX 1070 Specs vs. 1080, 970, 980 Ti
NVIDIA Pascal vs. Maxwell Specs Comparison

| | Tesla P100 | GTX 1080 | GTX 1070 | GTX 980 Ti | GTX 980 | GTX 970 |
|---|---|---|---|---|---|---|
| GPU | GP100 Cut-Down Pascal | GP104-400 Pascal | GP104-200 Pascal | GM200 Maxwell | GM204 Maxwell | GM204 Maxwell |
| Transistor Count | 15.3B | 7.2B | 7.2B | 8B | 5.2B | 5.2B |
| Fab Process | 16nm FinFET | 16nm FinFET | 16nm FinFET | 28nm | 28nm | 28nm |
| CUDA Cores | 3584 | 2560 | 1920 | 2816 | 2048 | 1664 |
| GPCs | 6 | 4 | 3 | 6 | 4 | 4 |
| SMs | 56 | 20 | 15 | 22 | 16 | 13 |
| TPCs | 28 | 20 | 15 | - | - | - |
| TMUs | 224 | 160 | 120 | 176 | 128 | 104 |
| ROPs | 96 (?) | 64 | 64 | 96 | 64 | 56 |
| Core Clock | 1328MHz | 1607MHz | 1506MHz | 1000MHz | 1126MHz | 1050MHz |
| Boost Clock | 1480MHz | 1733MHz | 1683MHz | 1075MHz | 1216MHz | 1178MHz |
| FP32 TFLOPs | 10.6TFLOPs | 9TFLOPs | 6.5TFLOPs | 5.63TFLOPs | 5TFLOPs | 3.9TFLOPs |
| Memory Type | HBM2 | GDDR5X | GDDR5 | GDDR5 | GDDR5 | GDDR5 |
| Memory Capacity | 16GB | 8GB | 8GB | 6GB | 4GB | 4GB |
| Memory Clock | ? | 10Gbps GDDR5X | 8Gbps (4006MHz) | 7Gbps GDDR5 | 7Gbps GDDR5 | 7Gbps GDDR5 |
| Memory Interface | 4096-bit | 256-bit | 256-bit | 384-bit | 256-bit | 256-bit |
| Memory Bandwidth | ? | 320.32GB/s | 256GB/s | 336GB/s | 224GB/s | 224GB/s |
| TDP | 300W | 180W | 150W | 250W | 165W | 148W |
| Power Connectors | ? | 1x 8-pin | 1x 8-pin | 1x 8-pin + 1x 6-pin | 2x 6-pin | 2x 6-pin |
| Release Date | 4Q16-1Q17 | 5/27/2016 | 6/10/2016 | 6/01/2015 | 9/18/2014 | 9/19/2014 |
| Release Price | TBD (several thousand) | Reference: $700, MSRP: $600 | Reference: $450, MSRP: $380 | $650 | $550 | $330 |
Architecture Revisit – GP104-200 Streaming Multiprocessors, GPCs, TPCs
The GTX 1070 utilizes the same Pascal GP104 architecture found on the GTX 1080, though the *-200 subversion (rather than *-400) does bring some changes, mostly to core count and clock speed.
The silicon is the same and the architecture is mostly the same, but the die has been somewhat simplified on the GTX 1070 to reduce cost. The heart of the chip is still a 16nm FinFET design, which operates at slightly lower voltage and exhibits less power leakage than the planar process. Datapath optimizations are also in place for performance improvements, something we spent a few thousand words on in our GTX 1080 review.
(Above: The GP104-400 block diagram. Remove one GPC -- that's basically GP104-200.)
(Above: SM architecture on Pascal / GP104.)
NVIDIA's GTX 1070 runs 15 SMs rather than the 20 SMs of GP104-400, reducing core count to 1920 and TMUs to 120 (capable of 202GT/s). The clock boosts to 1683MHz, but has OC headroom that we play with later in this review.
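Those headline figures fall out of simple arithmetic on core counts and clock speed. As a quick sanity check, here's a sketch using the boost clock and unit counts from the spec table above:

```python
# Peak throughput estimates for the GTX 1070 at its rated boost clock.
# FP32: each CUDA core retires 2 FLOPs per clock (one fused multiply-add).
cuda_cores = 1920
tmus = 120
boost_clock_hz = 1683e6  # 1683MHz

fp32_tflops = cuda_cores * 2 * boost_clock_hz / 1e12
texture_rate_gts = tmus * boost_clock_hz / 1e9  # texels per second

print(f"FP32: {fp32_tflops:.2f} TFLOPs")            # ~6.46 TFLOPs (table rounds to 6.5)
print(f"Texture fill: {texture_rate_gts:.0f} GT/s")  # ~202 GT/s
```

Note that these are theoretical peaks at the rated boost clock; GPU Boost routinely runs Pascal cards above that figure, so real-world throughput shifts with thermals and power headroom.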
Another major change from the GTX 1080 is the GTX 1070's use of GDDR5 8Gbps memory, rather than the new GDDR5X 10Gbps memory of the GTX 1080. This reduces the cost of the card by mounting a more ubiquitous memory platform to the device.
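The bandwidth cost of that memory choice is easy to quantify: peak bandwidth is just bus width (in bytes) multiplied by the effective per-pin data rate. A minimal sketch:

```python
# Peak memory bandwidth = (bus width in bytes) x (effective data rate per pin).
bus_width_bits = 256
data_rate_gbps = 8  # GDDR5 at 8Gbps effective on the GTX 1070

bandwidth_gbs = (bus_width_bits / 8) * data_rate_gbps
print(f"{bandwidth_gbs:.0f} GB/s")  # 256 GB/s, matching the spec table
```

Running the same math with the GTX 1080's 10Gbps GDDR5X on the same 256-bit bus yields 320GB/s, which is where the 1070's 20% bandwidth deficit comes from.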
To learn about asynchronous compute and memory subsystems on Pascal, check out our 9000-word review on the GTX 1080.
Game Test Methodology
We tested using our GPU test bench, detailed in the table below. Our thanks to supporting hardware vendors for supplying some of the test components.
The latest AMD drivers (16.15.2.1 Doom-ready) were used for testing. NVIDIA's unreleased press drivers were used for game (FPS) testing. Game settings were manually controlled for the DUT. All games were run at the presets defined in their respective charts. We disable brand-supported technologies in games, like The Witcher 3's HairWorks and HBAO. All other game settings are defined in respective game benchmarks, which we publish separately from GPU reviews. Our test courses, in the event manual testing is executed, are also uploaded within that content. This allows others to replicate our results by studying our bench courses.
Windows 10-64 build 10586 was used for testing.
Each game was tested for 30 seconds in an identical scenario, then repeated three times for parity. Some games have multiple settings or APIs under test, leaving our test matrix to look something like this:
| | Ashes | Talos | Tomb Raider | Division | GTA V | MLL | Mordor | BLOPS3 | Thermal | Power | Noise |
|---|---|---|---|---|---|---|---|---|---|---|---|
| **NVIDIA CARDS** | | | | | | | | | | | |
| GTX 1080 | 4K Crazy, 4K High, 1080p High (Dx12 & Dx11) | 4K Ultra, 1440p Ultra, 1080p Ultra (Vulkan & Dx11) | 4K VH, 1440p VH, 1080p VH (Dx12 & Dx11) | 4K High, 1440p High, 1080p High | 4K VHU, 1080p VHU | 4K HH, 1440p VHH, 1080p VHH | 4K Ultra, 1440p Ultra, 1080p Ultra | 4K High, 1440p High, 1080p High | Yes | Yes | Yes |
| GTX 980 Ti | 4K Crazy, 4K High, 1080p High (Dx12 & Dx11) | 4K Ultra, 1440p Ultra, 1080p Ultra (Vulkan & Dx11) | 4K VH, 1440p VH, 1080p VH (Dx12 & Dx11) | 4K High, 1440p High, 1080p High | 4K VHU, 1080p VHU | 4K HH, 1440p VHH, 1080p VHH | 4K Ultra, 1440p Ultra, 1080p Ultra | 4K High, 1440p High, 1080p High | Yes | Yes | Yes |
| GTX 980 | 4K Crazy, 4K High, 1080p High (Dx12 & Dx11) | 4K Ultra, 1440p Ultra, 1080p Ultra (Vulkan & Dx11) | 4K VH, 1440p VH, 1080p VH (Dx12 & Dx11) | 4K High, 1440p High, 1080p High | 4K VHU, 1080p VHU | 4K HH, 1440p VHH, 1080p VHH | 4K Ultra, 1440p Ultra, 1080p Ultra | 4K High, 1440p High, 1080p High | Yes | Yes | Yes |
| **AMD CARDS** | | | | | | | | | | | |
| AMD R9 390X | 4K Crazy, 4K High, 1080p High (Dx12 & Dx11) | 4K Ultra, 1440p Ultra, 1080p Ultra (Vulkan & Dx11) | 4K VH, 1440p VH, 1080p VH (Dx12 & Dx11) | 4K High, 1440p High, 1080p High | 4K VHU, 1080p VHU | 4K HH, 1440p VHH, 1080p VHH | 4K Ultra, 1440p Ultra, 1080p Ultra | 4K High, 1440p High, 1080p High | Yes | Yes | No |
| AMD Fury X | 4K Crazy, 4K High, 1080p High (Dx12 & Dx11) | 4K Ultra, 1440p Ultra, 1080p Ultra (Vulkan & Dx11) | 4K VH, 1440p VH, 1080p VH (Dx12 & Dx11) | 4K High, 1440p High, 1080p High | 4K VHU, 1080p VHU | 4K HH, 1440p VHH, 1080p VHH | 4K Ultra, 1440p Ultra, 1080p Ultra | 4K High, 1440p High, 1080p High | Yes | Yes | Yes |
Average FPS, 1% low, and 0.1% low values are measured. We do not measure maximum or minimum FPS results, as we consider these numbers to be pure outliers. Instead, we take an average of the lowest 1% of results (1% low) to show real-world, noticeable dips; we then take an average of the lowest 0.1% of results to expose severe frametime spikes.
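The 1% low and 0.1% low metrics reduce to averaging the worst slices of the sorted sample set. A minimal sketch (our capture tooling differs, but the math is the same idea):

```python
def low_metrics(fps_samples):
    """Average FPS plus 1% low and 0.1% low averages.

    1% low = mean of the slowest 1% of samples; 0.1% low = mean of the
    slowest 0.1%. Unlike a raw minimum, these resist one-off outliers
    while still surfacing noticeable dips and severe spikes."""
    s = sorted(fps_samples)            # ascending: slowest frames first
    n_1pct = max(1, len(s) // 100)     # at least one sample in each slice
    n_01pct = max(1, len(s) // 1000)
    avg = sum(s) / len(s)
    low_1 = sum(s[:n_1pct]) / n_1pct
    low_01 = sum(s[:n_01pct]) / n_01pct
    return avg, low_1, low_01

# Example: 1000 samples at a steady 60 FPS with ten dips to 30 FPS.
avg, low1, low01 = low_metrics([60.0] * 990 + [30.0] * 10)
print(avg, low1, low01)  # 59.7 30.0 30.0
```

The example shows why the lows matter: the average barely moves, but the 1% low immediately exposes the stutter a player would actually feel.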
| GN Test Bench 2015 | Name | Courtesy Of | Cost |
|---|---|---|---|
| Video Card | This is what we're testing! | - | - |
| CPU | Intel i7-5930K CPU | iBUYPOWER | $580 |
| Memory | Corsair Dominator 32GB 3200MHz | Corsair | $210 |
| Motherboard | EVGA X99 Classified | GamersNexus | $365 |
| Power Supply | NZXT 1200W HALE90 V2 | NZXT | $300 |
| SSD | HyperX Savage SSD | Kingston Tech. | $130 |
| Case | Top Deck Tech Station | GamersNexus | $250 |
| CPU Cooler | NZXT Kraken X41 CLC | NZXT | $110 |
For Dx12 and Vulkan API testing, we use built-in benchmark tools and rely upon log generation for our metrics. That data is reported at the engine level.
Video Cards Tested
- NVIDIA GTX 1080 Founders Edition ($700)
- NVIDIA GTX 980 Ti Reference ($650)
- NVIDIA GTX 980 Reference ($460)
- NVIDIA GTX 980 2x SLI Reference ($920)
- AMD R9 Fury X 4GB HBM ($630)
- AMD MSI R9 390X 8GB ($460)
Thermal Test Methodology
We strongly believe that our thermal testing methodology is the best on this side of the tech-media industry. We've validated our testing methodology with thermal chambers and have proven near-perfect accuracy of results.
Conducting thermal tests requires careful measurement of temperatures in the surrounding environment. We control for ambient by constantly measuring temperatures with K-Type thermocouples and infrared readers. We then produce charts using a Delta T(emperature) over Ambient value. This value subtracts the thermo-logged ambient value from the measured diode temperatures, producing a delta report of thermals. AIDA64 is used for logging thermals of silicon components, including the GPU diode. We additionally log core utilization and frequencies to ensure all components are firing as expected. Voltage levels are measured in addition to fan speeds, frequencies, and thermals. GPU-Z is deployed for redundancy and validation against AIDA64.
All open bench fans are configured to their maximum speed and connected straight to the PSU. This ensures minimal variance when testing, as automatically controlled fan speeds would reduce reliability of benchmarking. The CPU fan is set to use a custom fan curve that was devised in-house after a series of tests. We use a custom-built open air bench that mounts the CPU radiator out of the way of the airflow channels influencing the GPU, so the CPU heat is dumped where it will have no measurable impact on GPU temperatures.
We use an AMPROBE multi-diode thermocouple reader to log ambient actively. This ambient measurement is used to monitor fluctuations and is subtracted from absolute GPU diode readings to produce a delta value. For these tests, we configured the thermocouple reader's logging interval to 1s, matching the logging interval of GPU-Z and AIDA64. Data is calculated using a custom, in-house spreadsheet and software solution.
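The delta computation itself is a sample-by-sample subtraction of the two 1s logs. A minimal sketch (the tuple layout is illustrative, not the actual AIDA64/AMPROBE log format):

```python
# Delta T over ambient: subtract the logged ambient temperature from the
# GPU diode reading at each matching 1s timestamp. Samples missing from
# either log are dropped rather than interpolated.
def delta_over_ambient(gpu_log, ambient_log):
    """gpu_log / ambient_log: lists of (seconds, temp_C) at 1s intervals."""
    ambient = dict(ambient_log)
    return [(t, gpu - ambient[t]) for t, gpu in gpu_log if t in ambient]

gpu = [(0, 52.0), (1, 53.0), (2, 55.0)]
amb = [(0, 22.0), (1, 22.0), (2, 22.5)]
print(delta_over_ambient(gpu, amb))  # [(0, 30.0), (1, 31.0), (2, 32.5)]
```

Reporting the delta rather than absolute diode temperature is what makes results comparable across test days with different room temperatures.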
Endurance tests are conducted for new architectures or devices of particular interest, like the GTX 1080, R9 Fury X, or GTX 980 Ti Hybrid from EVGA. These endurance tests report temperature versus frequency (sometimes versus FPS), providing a look at how cards interact in real-world gaming scenarios over extended periods of time. Because benchmarks do not inherently burn-in a card for a reasonable play period, we use this test method as a net to isolate and discover issues of thermal throttling or frequency tolerance to temperature.
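A simple log-mining pass can flag throttle candidates in an endurance run by correlating clock drops with diode temperature. This is an illustrative sketch, not our actual tooling; the 82C default mirrors common NVIDIA reference throttle points and is an assumption here:

```python
def find_throttle_events(log, rated_boost_mhz, temp_limit_c=82.0):
    """Flag samples where the clock falls below the rated boost while the
    diode sits at or above the throttle temperature.

    log: list of (seconds, clock_mhz, temp_c) tuples at a fixed interval.
    The temp_limit_c default is an assumed reference throttle point."""
    return [(t, clk, temp) for t, clk, temp in log
            if clk < rated_boost_mhz and temp >= temp_limit_c]

# Example endurance log: stable early, then a clock dip as heat soaks in.
log = [(0, 1683, 60.0), (600, 1695, 80.0), (1200, 1620, 83.0)]
print(find_throttle_events(log, rated_boost_mhz=1683))  # [(1200, 1620, 83.0)]
```

Correlating the two signals matters: a clock dip at low temperature points to a power or driver limit, while a dip that only appears above the thermal threshold is the throttling behavior the endurance test is designed to catch.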
Our test starts with a two-minute idle period to gauge non-gaming performance. A script automatically triggers the beginning of a GPU-intensive benchmark running MSI Kombustor – Titan Lakes for 1080s. Because we use an in-house script, we are able to perfectly execute and align our tests between passes.