Our GTX 1070 SLI benchmarking endeavor began with an amusing challenge – one which we've captured well in our forthcoming video: The new SLI bridges are all rigid, which means they only work well with same-height cards. After some failed attempts to hack something together, and after researching the usage of two ribbon cables (don't do this – more below), we ultimately realized that a riser cable would work. It's not ideal, but the latency impact should be minimal and the performance is more-or-less representative of real-world framerates for dual GTX 1070s in SLI.
Definitely a fun challenge. Be sure to subscribe for our video later today.
The GTX 1070 SLI configuration teetered in our test rig, no screws holding the second card, but it worked. We've been told that there aren't any plans for ribbon cable versions of the new High Bandwidth Bridges (“HB Bridge”), so this new generation of Pascal GPUs – if using the HB Bridge – will likely drive users toward same-same video card arrays. This step coincides with other simplifications to the multi-GPU process with the 10-series, like a reduction from triple- and quad-SLI to focus just on two-way SLI. We explain nVidia's decision to do this in our GTX 1080 review and mention it in the GTX 1070 review.
This GTX 1070 SLI benchmark tests the framerate of two GTX 1070s vs. a GTX 1080, 980 Ti, 980, 970, Fury X, R9 390X, and more. We briefly look at power requirements as well, helping to provide a guideline for power supply capacity. The joint cost of two GTX 1070s, if buying the lowest-cost GTX 1070s out there, would be roughly $760 – $380*2. The GTX 1070 scales up to $450 for the Founders Edition and likely for some aftermarket AIB partner cards as well.
Full GTX 1070 Specs Listing
NVIDIA Pascal vs. Maxwell Specs Comparison
Spec | Tesla P100 | GTX 1080 | GTX 1070 | GTX 980 Ti | GTX 980 | GTX 970 |
GPU | GP100 Cut-Down Pascal | GP104-400 Pascal | GP104-200 Pascal | GM200 Maxwell | GM204 Maxwell | GM204 |
Transistor Count | 15.3B | 7.2B | 7.2B | 8B | 5.2B | 5.2B |
Fab Process | 16nm FinFET | 16nm FinFET | 16nm FinFET | 28nm | 28nm | 28nm |
CUDA Cores | 3584 | 2560 | 1920 | 2816 | 2048 | 1664 |
GPCs | 6 | 4 | 3 | 6 | 4 | 4 |
SMs | 56 | 20 | 15 | 22 | 16 | 13 |
TPCs | 28 TPCs | 20 TPCs | 15 | - | - | - |
TMUs | 224 | 160 | 120 | 176 | 128 | 104 |
ROPs | 96 (?) | 64 | 64 | 96 | 64 | 56 |
Core Clock | 1328MHz | 1607MHz | 1506MHz | 1000MHz | 1126MHz | 1050MHz |
Boost Clock | 1480MHz | 1733MHz | 1683MHz | 1075MHz | 1216MHz | 1178MHz |
FP32 TFLOPs | 10.6TFLOPs | 9TFLOPs | 6.5TFLOPs | 5.63TFLOPs | 5TFLOPs | 3.9TFLOPs |
Memory Type | HBM2 | GDDR5X | GDDR5 | GDDR5 | GDDR5 | GDDR5 |
Memory Capacity | 16GB | 8GB | 8GB | 6GB | 4GB | 4GB |
Memory Clock | ? | 10Gbps GDDR5X | 4006MHz | 7Gbps GDDR5 | 7Gbps GDDR5 | 7Gbps |
Memory Interface | 4096-bit | 256-bit | 256-bit | 384-bit | 256-bit | 256-bit |
Memory Bandwidth | ? | 320.32GB/s | 256GB/s | 336GB/s | 224GB/s | 224GB/s |
TDP | 300W | 180W | 150W | 250W | 165W | 148W |
Power Connectors | ? | 1x 8-pin | 1x 8-pin | 1x 8-pin, 1x 6-pin | 2x 6-pin | 2x 6-pin |
Release Date | 4Q16-1Q17 | 5/27/2016 | 6/10/2016 | 6/01/2015 | 9/18/2014 | 9/19/2014 |
Release Price | TBD (Several thousand) | Reference: $700, MSRP: $600 | Reference: $450, MSRP: $380 | $650 | $550 | $330 |
For a more complete specs table and architecture tear-down, check our GTX 1070 review.
Complexities of this Test
This is not a perfect test. We know that, but we feel confident that it's an accurate and good representation of GTX 1070s in SLI. Still, it's our policy to completely explain our test methodology and highlight points where we had to be flexible with our hardware.
There are a few bridge types that exist for SLI, as of this generation:
Bridge Type | 1920x1080 | 2560x1440 (60Hz) | 2560x1440 (120Hz, 144Hz) | 4K | 5K | Surround |
Ribbon Bridge | Yes | Yes | No | No | No | No |
LED Bridge | Yes | Yes | Yes | Yes | No | No |
HB Bridge | Yes | Yes | Yes | Yes | Yes | Yes |
This table is pulled from nVidia whitepapers. The company's shift to focus on high-bandwidth bridges for Pascal means that ribbon cables are no longer capable of delivering the throughput (on Pascal architecture) that is demanded for some higher resolution applications. We will validate this independently, but that's the information we have right now. We also asked if two ribbon bridges would provide enough bandwidth to effectively match a rigid HB bridge, and the answer was a firm “no.”
That's because the HB Bridge and LED Bridge both operate at 650MHz, while the ribbon cable bridge has a clock-rate of 400MHz. Running two ribbon bridges doesn't double the clock-rate; there are some gains, but that clock is still oscillating at a slower overall frequency, and so the bandwidth gains over a single ribbon cable on Pascal should be limited.
LED Bridges are high-bandwidth, comparatively, and operate at the faster clock-rate. These are bridges which were issued in the last generation of cards, often by AIB partners who wanted their own branding (with LEDs) on the bridge. We've got MSI bridge kits from the Maxwell generation, for instance, and used these in our SLI testing. The LED bridge kits are officially rated as supporting up to 4K, but are not recommended for 5K or Surround. Luckily, we're testing neither.
So that's one aspect covered – we're using an MSI LED Bridge kit, which is rated for the tests performed.
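As a quick sanity check, the bridge table above can be written down as a lookup and queried for the resolutions under test. This is only an illustrative sketch: the dictionary name, the resolution labels, and the `bridge_covers` helper are our own assumptions for the example, and the data is simply the table as printed above.

```python
# Bridge support as listed in the table above. The keys and labels are
# hypothetical names chosen for this example.
BRIDGE_SUPPORT = {
    "Ribbon Bridge": {"1920x1080", "2560x1440@60Hz"},
    "LED Bridge": {"1920x1080", "2560x1440@60Hz", "2560x1440@120/144Hz", "4K"},
    "HB Bridge": {
        "1920x1080", "2560x1440@60Hz", "2560x1440@120/144Hz",
        "4K", "5K", "Surround",
    },
}


def bridge_covers(bridge, required_modes):
    """Return True if every required display mode is listed for the bridge."""
    return set(required_modes) <= BRIDGE_SUPPORT[bridge]


# Resolutions used in the game tests: 1080p, 1440p, and 4K.
tested = ["1920x1080", "2560x1440@60Hz", "4K"]
print(bridge_covers("LED Bridge", tested))
print(bridge_covers("Ribbon Bridge", tested))
```

Under the table as given, the LED Bridge covers all three tested resolutions and the ribbon bridge does not, which is why the ribbon option was ruled out for these tests.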
But it's a rigid kit, which means that cards must be the same height. Because the GTX 1070 is still new, we're not overflowing with the card – we've only got two, and they're different heights. We currently possess the GTX 1070 FE (reference) and MSI GTX 1070 Gaming X; MSI uses a non-reference PCB for its Gaming X, and has a significantly taller SLI footing as a result.
We tried to figure out how to make the bridge fit, but it just wasn't going to happen. The only good solution was to run a PCI-e riser cable and elevate the reference card, thereby equalizing the heights between the two devices. It worked perfectly, and so we sat the GTX 1070 FE card in the lower slot, using a PCI-e riser cable to buffer its height slightly. We then allowed the LED Bridge to secure the card enough that it wouldn't go anywhere – not a realistic scenario for anyone building within a case, which is effectively everyone, but one which works for our test purposes. There is theoretically some marginal latency impact from the PCI-e riser cable, but that impact is negligible.
We are left with results that provide an accurate, detailed look into SLI GTX 1070 performance, but in a configuration that users won't be replicating without some sort of ribbon cable instead. Going forward, it will be much easier to use cards of equal heights.
Note also that, as always, SLI operates at a joint frequency. Our SLI configuration was operating at a boosted clock-rate of 1898MHz (in-game clock-rate). This is about the same as the in-game, boosted frequency of the Founders Edition card. The advertised clock-rate is 1683MHz, though it tends to burst between 1683MHz and 1880-1890MHz. We did not overclock any devices for this particular test (see 1070 Review – Overclocking page). The MSI GTX 1070 Gaming X and Founders Edition 1070 were running at the same, stock FE frequency for this test.
Game Test Methodology
We tested using our GPU test bench, detailed in the table below. Our thanks to supporting hardware vendors for supplying some of the test components.
The latest AMD drivers (16.6.1) were used for testing. nVidia's 368.39 drivers were used for game (FPS) testing. Game settings were manually controlled for the DUT (device under test). All games were run at presets defined in their respective charts. We disable brand-supported technologies in games, like The Witcher 3's HairWorks and HBAO. All other game settings are defined in respective game benchmarks, which we publish separately from GPU reviews. Our test courses, in the event manual testing is executed, are also uploaded within that content. This allows others to replicate our results by studying our bench courses.
Windows 10-64 build 10586 was used for testing.
Each game was tested for 30 seconds in an identical scenario, then repeated three times for parity. Some games have multiple settings or APIs under test, leaving our test matrix to look something like this:
Card | Ashes | Talos | Tomb Raider | Division | GTA V | MLL | Mordor | BLOPS3 | Thermal | Power | Noise |
NVIDIA CARDS |
GTX 1080 | 4K Crazy, 4K High, 1080 High (Dx12 & Dx11) | 4K Ultra, 1440p Ultra, 1080p Ultra (Vulkan & Dx11) | 4K VH, 1440p VH, 1080p VH (Dx12 & Dx11) | 4K High, 1440p High, 1080p High | 4K VHU, 1080 VHU | 4K HH, 1440p VHH, 1080p VHH | 4K Ultra, 1440p Ultra, 1080p Ultra | 4K High, 1440p High, 1080p High | Yes | Yes | Yes |
GTX 980 Ti | 4K Crazy, 4K High, 1080 High (Dx12 & Dx11) | 4K Ultra, 1440p Ultra, 1080p Ultra (Vulkan & Dx11) | 4K VH, 1440p VH, 1080p VH (Dx12 & Dx11) | 4K High, 1440p High, 1080p High | 4K VHU, 1080 VHU | 4K HH, 1440p VHH, 1080p VHH | 4K Ultra, 1440p Ultra, 1080p Ultra | 4K High, 1440p High, 1080p High | Yes | Yes | Yes |
GTX 980 | 4K Crazy, 4K High, 1080 High (Dx12 & Dx11) | 4K Ultra, 1440p Ultra, 1080p Ultra (Vulkan & Dx11) | 4K VH, 1440p VH, 1080p VH (Dx12 & Dx11) | 4K High, 1440p High, 1080p High | 4K VHU, 1080 VHU | 4K HH, 1440p VHH, 1080p VHH | 4K Ultra, 1440p Ultra, 1080p Ultra | 4K High, 1440p High, 1080p High | Yes | Yes | Yes |
AMD CARDS |
AMD R9 390X | 4K Crazy, 4K High, 1080 High (Dx12 & Dx11) | 4K Ultra, 1440p Ultra, 1080p Ultra (Vulkan & Dx11) | 4K VH, 1440p VH, 1080p VH (Dx12 & Dx11) | 4K High, 1440p High, 1080p High | 4K VHU, 1080 VHU | 4K HH, 1440p VHH, 1080p VHH | 4K Ultra, 1440p Ultra, 1080p Ultra | 4K High, 1440p High, 1080p High | Yes | Yes | No |
AMD Fury X | 4K Crazy, 4K High, 1080 High (Dx12 & Dx11) | 4K Ultra, 1440p Ultra, 1080p Ultra (Vulkan & Dx11) | 4K VH, 1440p VH, 1080p VH (Dx12 & Dx11) | 4K High, 1440p High, 1080p High | 4K VHU, 1080 VHU | 4K HH, 1440p VHH, 1080p VHH | 4K Ultra, 1440p Ultra, 1080p Ultra | 4K High, 1440p High, 1080p High | Yes | Yes | Yes |
Average FPS, 1% low, and 0.1% low times are measured. We do not measure maximum or minimum FPS results as we consider these numbers to be pure outliers. Instead, we take an average of the lowest 1% of results (1% low) to show real-world, noticeable dips; we then take an average of the lowest 0.1% of results for severe spikes.
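For readers who want to reproduce these metrics from their own logs, here is a minimal sketch of the calculation described above. It assumes the input is a plain list of FPS samples from one test pass; the function name `fps_metrics` and the choice to always keep at least one sample in each "low" bucket are our assumptions for the example, not a description of the in-house tooling.

```python
def fps_metrics(fps_samples):
    """Return (average, 1% low, 0.1% low) for a list of FPS samples.

    Assumption: each entry is one FPS reading from a single test pass.
    """
    if not fps_samples:
        raise ValueError("fps_samples must not be empty")

    ordered = sorted(fps_samples)  # ascending, so the slowest results come first

    def low_average(fraction):
        # Average the slowest `fraction` of samples, keeping at least one.
        count = max(1, int(len(ordered) * fraction))
        worst = ordered[:count]
        return sum(worst) / len(worst)

    average = sum(ordered) / len(ordered)
    return average, low_average(0.01), low_average(0.001)


# Synthetic example data: mostly 60 FPS with a handful of dips.
samples = [60] * 990 + [30] * 9 + [10]
avg, low_1, low_01 = fps_metrics(samples)
print(avg, low_1, low_01)
```

Averaging the slowest slice, rather than reporting the single lowest value, keeps one stray frame from defining the result, which matches the reasoning above for not reporting minimum FPS.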
GN Test Bench 2015 | Name | Courtesy Of | Cost |
Video Card | This is what we're testing! | - | - |
CPU | Intel i7-5930K CPU | iBUYPOWER | $580 |
Memory | Corsair Dominator 32GB 3200MHz | Corsair | $210 |
Motherboard | EVGA X99 Classified | GamersNexus | $365 |
Power Supply | NZXT 1200W HALE90 V2 | NZXT | $300 |
SSD | HyperX Savage SSD | Kingston Tech. | $130 |
Case | Top Deck Tech Station | GamersNexus | $250 |
CPU Cooler | NZXT Kraken X41 CLC | NZXT | $110 |
For Dx12 and Vulkan API testing, we use built-in benchmark tools and rely upon log generation for our metrics. That data is reported at the engine level.
Video Cards Tested
- NVIDIA GTX 1080 Founders Edition ($700)
- NVIDIA GTX 1070 Founders Edition ($450)
- MSI GTX 1070 Gaming X
- NVIDIA GTX 980 Ti Reference ($650)
- NVIDIA GTX 980 Reference ($460)
- NVIDIA GTX 980 2x SLI Reference ($920)
- AMD R9 Fury X 4GB HBM ($630)
- AMD MSI R9 390X 8GB ($460)
Thermal Test Methodology
We strongly believe that our thermal testing methodology is the best on this side of the tech-media industry. We've validated our testing methodology with thermal chambers and have proven near-perfect accuracy of results.
Conducting thermal tests requires careful measurement of temperatures in the surrounding environment. We control for ambient by constantly measuring temperatures with K-Type thermocouples and infrared readers. We then produce charts using a Delta T(emperature) over Ambient value. This value subtracts the thermo-logged ambient value from the measured diode temperatures, producing a delta report of thermals. AIDA64 is used for logging thermals of silicon components, including the GPU diode. We additionally log core utilization and frequencies to ensure all components are firing as expected. Voltage levels are measured in addition to fan speeds, frequencies, and thermals. GPU-Z is deployed for redundancy and validation against AIDA64.
All open bench fans are configured to their maximum speed and connected straight to the PSU. This ensures minimal variance when testing, as automatically controlled fan speeds will reduce reliability of benchmarking. The CPU fan is set to use a custom fan curve that was devised in-house after a series of tests. We use a custom-built open air bench that mounts the CPU radiator out of the way of the airflow channels influencing the GPU, so the CPU heat is dumped where it will have no measurable impact on GPU temperatures.
We use an AMPROBE multi-diode thermocouple reader to log ambient actively. This ambient measurement is used to monitor fluctuations and is subtracted from absolute GPU diode readings to produce a delta value. For these tests, we configured the thermocouple reader's logging interval to 1s, matching the logging interval of GPU-Z and AIDA64. Data is calculated using a custom, in-house spreadsheet and software solution.
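The delta-over-ambient calculation can be sketched as follows. This is an illustration only: it assumes both logs have already been parsed into dictionaries keyed by elapsed seconds (which the matching 1s logging intervals make straightforward), and the function and variable names are hypothetical rather than taken from the in-house spreadsheet and software.

```python
def delta_over_ambient(diode_log, ambient_log):
    """Return a list of (elapsed_seconds, delta_T) pairs.

    Assumption: both arguments map elapsed seconds -> temperature in deg C.
    Only timestamps present in both logs are used.
    """
    shared_times = sorted(set(diode_log) & set(ambient_log))
    return [(t, diode_log[t] - ambient_log[t]) for t in shared_times]


# Synthetic example data, not measured results.
diode = {0: 30.0, 1: 45.5, 2: 60.0}
ambient = {0: 21.0, 1: 21.5, 3: 22.0}
print(delta_over_ambient(diode, ambient))
```

Subtracting the ambient reading taken at the same timestamp, rather than a single room temperature for the whole run, is what lets the delta value absorb fluctuations in the environment during the test.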
Endurance tests are conducted for new architectures or devices of particular interest, like the GTX 1080, R9 Fury X, or GTX 980 Ti Hybrid from EVGA. These endurance tests report temperature versus frequency (sometimes versus FPS), providing a look at how cards interact in real-world gaming scenarios over extended periods of time. Because benchmarks do not inherently burn-in a card for a reasonable play period, we use this test method as a net to isolate and discover issues of thermal throttling or frequency tolerance to temperature.
Our test starts with a two-minute idle period to gauge non-gaming performance. A script automatically triggers the beginning of a GPU-intensive benchmark running MSI Kombustor – Titan Lakes for 1080 seconds. Because we use an in-house script, we are able to perfectly execute and align our tests between passes.
Power Testing Methodology
Power consumption is measured at the system level. You can read a full power consumption guide and watt requirements here. When reading power consumption charts, do not read them as GPU-specific requirements – this is a system-level power draw.
Power draw is measured during the thermal burn-in. We use a logging wall meter that sits between the PSU and the system and logs power consumption over the test period. We select the final 200s of data and average the data points.
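A minimal sketch of that averaging step is below. It assumes the wall meter's log has been parsed into (elapsed seconds, watts) pairs; the meter's actual logging interval and file format are not stated here, so the function name, the parameter names, and the input shape are all assumptions made for the example.

```python
def average_final_window(samples, window_s=200):
    """Average the power readings from the last `window_s` seconds of a log.

    Assumption: `samples` is a list of (elapsed_seconds, watts) pairs.
    """
    if not samples:
        raise ValueError("samples must not be empty")

    end_time = max(t for t, _ in samples)
    # Keep readings strictly inside the final window, e.g. t = 800..999
    # for a log ending at t = 999 with a 200s window.
    tail = [watts for t, watts in samples if t > end_time - window_s]
    return sum(tail) / len(tail)


# Synthetic example: a lower draw early in the run, steady draw at the end.
log = [(t, 100.0) for t in range(800)] + [(t, 300.0) for t in range(800, 1000)]
print(average_final_window(log))
```

Using only the tail of the run means the reported figure reflects the system after it has settled under load, not the ramp-up at the start of the burn-in.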
We use a different bench platform for power measurements; see below:
GN Z97 Bench | Name | Courtesy Of | Cost |
Video Card | This is what we're measuring! | - | - |
CPU | Intel i7-4790K CPU | CyberPower | $340 |
Memory | 32GB 2133MHz HyperX Savage RAM | Kingston Tech. | $300 |
Motherboard | Gigabyte Z97X Gaming G1 | GamersNexus | $285 |
Power Supply | Enermax Platimax 1350W | Enermax | $272 |
SSD | HyperX Predator PCI-e SSD, Samsung 850 Pro 1TB | Kingston Tech., Samsung | |
Case | Top Deck Tech Station | GamersNexus | $250 |
CPU Cooler | Be Quiet! Dark Rock 3 | Be Quiet! | ~$60 |
Continue to the next page for Dx12, OpenGL, & Vulkan SLI benchmarks on 2x 1070s.