PCIe 3.0 x8 vs. x16: Does It Impact GPU Performance?

Published June 22, 2016 at 12:28 pm

This is a test that's been put through the paces for just about every generation of PCI Express, and it's worth refreshing now that the newest line of high-end GPUs has hit the market. The curiosity is this: Will a GPU be bottlenecked by PCI-e 3.0 x8, and how much impact does PCI-e 3.0 x16 have on performance?

We decided to test that question for internal research, but ended up putting together a small report for publication.

PCI Express Theoretical Max Bandwidth

PCI-e 3.0 signals at 8GT/s per lane, which works out to a theoretical maximum bandwidth of nearly 1GB/s per lane after 128b/130b encoding:

Lanes   PCI-e 1.0   PCI-e 2.x   PCI-e 3.0   PCI-e 4.x
x1      250MB/s     500MB/s     985MB/s     1969MB/s
x4      1000MB/s    2000MB/s    3940MB/s    7876MB/s
x8      2000MB/s    4000MB/s    7880MB/s    15752MB/s
x16     4000MB/s    8000MB/s    15760MB/s   31504MB/s

For our test, we're looking at PCI-e Gen3 x8 vs. PCI-e Gen3 x16 performance. On paper, x16 offers double the bandwidth of x8: a 100% increase from x8 to x16, or a 66.7% difference when measured against the midpoint of the two figures. But there's a lot more to it than interface bandwidth: the device itself must exceed the saturation point of x8 (7880MB/s, before overhead is removed) in order to show any meaningful advantage in x16 (15760MB/s, before overhead is removed).
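As a rough check on the table and percentages above, the figures can be reproduced from each generation's per-lane signaling rate and line encoding. This is a minimal sketch, not our test tooling: the function names are hypothetical, the per-lane value is rounded before multiplying by lane count (which is what the table appears to do), and reading the 66.7% figure as a midpoint-relative difference is an assumption.

```python
# Per-lane signaling rate (GT/s) and line-encoding efficiency per generation.
# Gen1/Gen2 use 8b/10b encoding; Gen3/Gen4 use 128b/130b.
GENERATIONS = {
    "PCI-e 1.0": (2.5, 8 / 10),
    "PCI-e 2.x": (5.0, 8 / 10),
    "PCI-e 3.0": (8.0, 128 / 130),
    "PCI-e 4.x": (16.0, 128 / 130),
}


def per_lane_mb_s(gt_per_s, efficiency):
    """Theoretical MB/s for one lane: one bit per transfer, 8 bits per byte."""
    return round(gt_per_s * 1000 * efficiency / 8)


def link_mb_s(generation, lanes):
    """Theoretical MB/s for a link, before protocol overhead is removed."""
    gt_per_s, efficiency = GENERATIONS[generation]
    return per_lane_mb_s(gt_per_s, efficiency) * lanes


def percent_increase(low, high):
    """Increase from low to high, relative to low."""
    return (high - low) / low * 100


def midpoint_difference(a, b):
    """Difference between a and b, relative to their midpoint (assumed reading)."""
    return abs(a - b) / ((a + b) / 2) * 100


if __name__ == "__main__":
    x8 = link_mb_s("PCI-e 3.0", 8)
    x16 = link_mb_s("PCI-e 3.0", 16)
    print(x8, x16)
    print(round(percent_increase(x8, x16), 1))
    print(round(midpoint_difference(x8, x16), 1))
```

Note that these are interface ceilings only; they say nothing about whether a given GPU workload actually moves that much data across the bus.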

Use Cases, Future Tests, & Test Setup

The use cases here are limited. Maybe you've got a thermal concern, a card that butts up against the CPU cooler, or some sort of liquid routing challenge that pushes the GPU into an x8 slot. HSIO lanes are assigned to ancillary devices, like PCIe SSDs, and won't eat into the CPU lanes available to the GPU. We're also not testing multiple GPUs, which is where we'd like to go next once we've got two of the same GTX 1080 in the lab. Ideally, we'd test in x16/x16, x16/x8, and x8/x8, but that's not possible right now. We're also hoping to test dual-GPU, single-card configurations between an x8 and an x16 slot, as those may put more load on the interface.

For the time being, this test strictly looks at a single-GPU, single-card GTX 1080 Gaming X as it passes between x8 and x16 slots. If, for whatever reason, you're debating the performance reduction from moving to an x8 PCI-e slot with a single card, that's what this test looks into.

We used our normal test bench (detailed below) for this research. The EVGA X99 Classified motherboard is picky with its PCI-e slot utilization, and its UEFI clearly reports whether the connected device is receiving 1, 4, 8, or 16 lanes. We switched between the first x16 slot and the first x8 slot for these numbers, then validated the lane count in BIOS and software.
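For readers who want to do a similar software-side check, one option on NVIDIA cards is to query the current link generation and width through nvidia-smi. This is a sketch under assumptions: it is not necessarily the tool we used for validation, the helper names are hypothetical, and it expects nvidia-smi to be on the PATH.

```python
import subprocess

# nvidia-smi query fields for the currently negotiated PCI-e link.
QUERY = "pcie.link.gen.current,pcie.link.width.current"


def parse_link_status(csv_line):
    """Parse one 'gen, width' line of nvidia-smi CSV output (no header)."""
    gen, width = (field.strip() for field in csv_line.split(","))
    return {"gen": int(gen), "width": int(width)}


def read_link_status():
    """Return one {'gen', 'width'} dict per GPU reported by nvidia-smi."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [
        parse_link_status(line)
        for line in result.stdout.splitlines()
        if line.strip()
    ]


if __name__ == "__main__":
    for index, status in enumerate(read_link_status()):
        print(f"GPU {index}: Gen{status['gen']} x{status['width']}")
```

The reported link generation can drop while the card idles for power saving, so it is worth reading these values with a load running.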

PCI-e generations can also be forced in the EVGA UEFI, but we did not explore the impact of PCI-e 2.x on the GTX 1080 at this time, as it seemed an even less likely use case.

Game Test Methodology

We tested using our GPU test bench, detailed in the table below. Our thanks to supporting hardware vendors for supplying some of the test components.

NVIDIA's 368.39 drivers were used for game (FPS) testing. Game settings were manually controlled for the DUT. All games were run at presets defined in their respective charts. We disable brand-supported technologies in games, like The Witcher 3's HairWorks and HBAO. All other game settings are defined in respective game benchmarks, which we publish separately from GPU reviews. Our test courses, in the event manual testing is executed, are also uploaded within that content. This allows others to replicate our results by studying our bench courses.

Windows 10-64 build 10586 was used for testing.

Each game was tested for 30 seconds in an identical scenario, then repeated multiple times for parity.

Average FPS, 1% low, and 0.1% low values are measured. We do not measure maximum or minimum FPS results as we consider these numbers to be pure outliers. Instead, we take an average of the lowest 1% of results (1% low) to show real-world, noticeable dips; we then take an average of the lowest 0.1% of results (0.1% low) to capture severe dips.
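The three metrics described above can be sketched as follows. This is an illustration rather than our actual tooling: the function names are hypothetical, and it assumes the input is a flat list of per-sample FPS values from a single test pass.

```python
def average_of_lowest(fps_samples, divisor):
    """Average the lowest 1/divisor share of samples (at least one sample)."""
    ordered = sorted(fps_samples)
    count = max(1, len(ordered) // divisor)
    lowest = ordered[:count]
    return sum(lowest) / len(lowest)


def summarize(fps_samples):
    """Return average FPS plus 1% low and 0.1% low for one test pass."""
    return {
        "avg": sum(fps_samples) / len(fps_samples),
        "low_1": average_of_lowest(fps_samples, 100),     # lowest 1%
        "low_0_1": average_of_lowest(fps_samples, 1000),  # lowest 0.1%
    }


if __name__ == "__main__":
    # Hypothetical pass: mostly ~96 FPS with a handful of dips.
    samples = [96.0] * 990 + [70.0] * 9 + [40.0]
    print(summarize(samples))
```

Averaging a slice of the worst samples, rather than reporting the single minimum, keeps one stray frame from defining the result while still exposing repeated dips.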

GN Test Bench 2015   Name                          Courtesy Of      Cost
Video Card           This is what we're testing!   -                -
CPU                  Intel i7-5930K CPU            iBUYPOWER        $580
Memory               Corsair Dominator 32GB 3200MHz   Corsair       $210
Motherboard          EVGA X99 Classified           GamersNexus      $365
Power Supply         NZXT 1200W HALE90 V2          NZXT             $300
SSD                  HyperX Savage SSD             Kingston Tech.   $130
Case                 Top Deck Tech Station         GamersNexus      $250
CPU Cooler           NZXT Kraken X41 CLC           NZXT             $110

For Dx12 and Vulkan API testing, we use built-in benchmark tools and rely upon log generation for our metrics. That data is reported at the engine level.

Video Cards Tested

PCI-e 3.0 x8 vs. x16 FPS Performance

Let's just post all the charts first, then talk numbers – they're similar enough that this is the easiest way to read the data.

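Metro: Last Light

[Chart: Metro: Last Light, 1440p (pcie-lanes-mll-1440)]

[Chart: Metro: Last Light, 4K (pcie-lanes-mll-4k)]

Shadow of Mordor

[Chart: Shadow of Mordor, 1440p (pcie-lanes-mordor-1440)]

[Chart: Shadow of Mordor, 4K (pcie-lanes-mordor-4k)]

Call of Duty: Black Ops 3

[Chart: Call of Duty: Black Ops 3, 1440p (pcie-lanes-blops-1440)]

[Chart: Call of Duty: Black Ops 3, 4K (pcie-lanes-blops-4k)]

GTA V

[Chart: GTA V, 4K (pcie-lanes-gtav-4k)]

Ashes of the Singularity (Dx12)

[Chart: Ashes of the Singularity, Dx12 (pcie-lanes-ashes-dx12)]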

Here's what we've got for performance:

Between AVG FPS metrics in Metro: Last Light, we're seeing a 1.05% gap (1440p) and 0% gap (4K). Between 1% low metrics, that difference is 0.95% (1440p) and 0% (4K).

For Shadow of Mordor, the numbers are similar – we're seeing a 0.93% performance difference between AVG FPS metrics (or ~1% for 4K).

Black Ops 3, in the cases where it shows a difference at all, also lands just below 1%.

GTA V shows a difference of 0.52%. The gap in Ashes of the Singularity is similarly small.

Inconsequential Differences & Margins for Error

These numbers are close enough in some instances – like the GTA V 58.3 vs. 58 FPS output – that they're effectively within margin of test error and do not definitively show a performance gap. When a reasonable performance gap is shown – like the ~1% difference in Metro: Last Light numbers – it is imperceptible to the user but measurable with our tools. And we do mean imperceptible – we're talking 96FPS vs. 95FPS, for Metro.
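The percentage gaps quoted in this piece follow from the FPS pairs above. A minimal sketch, with a hypothetical function name; dividing by the lower figure is an assumption, though it does reproduce the 1.05% and 0.52% figures:

```python
def percent_gap(higher_fps, lower_fps):
    """Gap between two FPS results, relative to the lower result."""
    return (higher_fps - lower_fps) / lower_fps * 100


if __name__ == "__main__":
    print(round(percent_gap(96, 95), 2))    # Metro: Last Light
    print(round(percent_gap(58.3, 58), 2))  # GTA V
```

At these frame rates, a gap of this size is one frame per second or less, which is why it only shows up in tooling.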

Metro, by the way, is the most reliable FPS benchmarking tool we have ever used. The game produces almost precisely the same AVG, 1% low, and 0.1% lows with every single test pass, and so we trust these metrics as being outside of test variance.

From a quick look, there is a little under a 1% performance difference between PCI-e 3.0 x16 and PCI-e 3.0 x8 slots. The difference is not even close to perceptible and should be ignored as inconsequential by users fretting over potential slot or lane limitations. We are not sure how this scales with SLI (particularly MDA 'mode') or dual-GPU cards, but hope to research it once we've got more hardware in the lab.

We are also currently investigating the impact of PCI-e lanes on cards with lower VRAM capacity, like 4GB. Hits to system resources may stress the interface more.

Editorial: Steve “Lelldorianx” Burke
Video: Andrew “ColossalCake” Coleman

Last modified on June 22, 2016 at 12:28 pm
