Consoles don’t offer many upgrade paths, but HDDs, like the ones that ship in the Xbox One X, are one of the few items that can be exchanged for a standard part with higher performance. Since 2013, there have been quite a few benchmarks done with SSDs vs. HDDs in various SKUs of the Xbox One, but not so many with the Xbox One X--so we’re doing our own. We’ve seen some abysmal load times in Final Fantasy and some nasty texture loading in PUBG, so there’s definitely room for improvement somewhere.
The 1TB drive that shipped in our Xbox One X is a Seagate 2.5” hard drive, model ST1000LM035. This is absolutely, positively a 5400RPM drive, as we said in our teardown, and not a 7200RPM drive (as some suggest online). Even taking the 140MB/s peak transfer rate listed in the drive’s data sheet completely at face value, it’s nowhere near bottlenecking on the internal SATA III interface. The SSD is up against SATA III (or USB 3.0, also known as USB 3.1 Gen1) limitations, but will still give us a theoretical sequential performance uplift of 4-5x, and that’s assuming peak burst speeds on the hard drive.
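As a rough sanity check on that uplift figure, here is a minimal back-of-envelope sketch. It assumes the standard line rates (6Gbit/s for SATA III, 5Gbit/s for USB 3.0) and 8b/10b encoding, and it ignores protocol overhead, so these are interface ceilings rather than measured SSD speeds. The function and variable names are ours, for illustration only.

```python
# Back-of-envelope interface ceilings versus the HDD's data-sheet peak.
# Assumptions: standard line rates, 8b/10b encoding, no protocol overhead.

HDD_PEAK_MB_S = 140.0  # peak transfer rate from the ST1000LM035 data sheet

# Raw line rates in Gbit/s (assumed standard figures)
INTERFACES = {"SATA III": 6.0, "USB 3.0": 5.0}


def usable_mb_per_s(line_rate_gbit: float) -> float:
    """Convert a raw line rate in Gbit/s to MB/s after 8b/10b encoding."""
    payload_bits_per_s = line_rate_gbit * 1e9 * 8 / 10  # 8 data bits per 10 line bits
    return payload_bits_per_s / 8 / 1e6                 # bits -> bytes -> MB


def uplift_over_hdd() -> dict:
    """Ratio of each interface ceiling to the HDD's peak transfer rate."""
    return {
        name: usable_mb_per_s(rate) / HDD_PEAK_MB_S
        for name, rate in INTERFACES.items()
    }


if __name__ == "__main__":
    for name, ratio in uplift_over_hdd().items():
        print(f"{name}: {usable_mb_per_s(INTERFACES[name]):.0f} MB/s ceiling, "
              f"{ratio:.1f}x the HDD peak")
```

Under these assumptions, the SATA III ceiling works out to about 4.3x the hard drive's peak, which sits at the low end of the quoted range. The USB 3.0 ceiling is somewhat lower, at about 3.6x.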
This benchmark tests game load times on an external SSD for the Xbox One X, versus internal HDD load times for Final Fantasy XV (FFXV), Monster Hunter World, PUBG (incl. texture pop-in), Assassin's Creed: Origins, and more.
Even when using supposedly “safe” voltages as a maximum input limit for overclocking via BIOS, it’s possible that the motherboard is feeding a significantly different voltage to the CPU. We’ve demonstrated this before, like when we talked about the Ultra Gaming’s Vdroop issues. The opposite of Vdroop would be overvoltage, of course, which is also quite common. Inputting a value of 1.3V SOC, for instance, could yield a socket-side voltage measurement of ~1.4V. This difference is significant enough that you may leave “reasonably usable” territory and enter “will definitely degrade the IMC over time.”
But software measurements won’t help much, in this regard. HWINFO is good, AIDA also does well, but both are relying on the CPU sensors to deliver that information. The pin/pad resistances alone can cause that number to underreport in software, whereas measuring the back of the socket with a digital multimeter (DMM) could tell a very different story.
CPUs with integrated graphics always make memory interesting. Memory’s commoditization, ignoring recent price trends, has made it an item where you sort of pick what’s cheap and just buy it. With something like AMD’s Raven Ridge APUs, that memory choice could have a lot more impact than a budget gaming PC with a discrete GPU. We’ll be testing a handful of memory kits with the R5 2400G in today’s content, including single- versus dual-channel testing where all timings have been equalized. We’re also testing a few different motherboards with the same kit of memory, useful for determining how timings change between boards.
We’re splitting these benchmarks into two sections: First, we’ll show the impact of various memory kits on performance when tested on a Gigabyte Gaming K5 motherboard, and we’ll then move over to demonstrate how a few popular motherboards affect results when left to auto XMP timings. We are focusing on memory scalability performance today, with a baseline provided by the G4560 and R3 1200 tests (each paired with a GT 1030) that we ran a week ago. We’ll get to APU overclocking in a future content piece. For single-channel testing, we’re benchmarking the best kit – the Trident Z CL14 3200MHz option – with one channel in operation.
Keep in mind that this is not a straight frequency comparison, e.g. not a 2400MHz vs. 3200MHz comparison. That’s because we’re changing timings along with the kits; basically, we’re looking at the whole picture, not just frequency scalability. The idea is to see how XMP with stock motherboard timings (where relevant) can impact performance, not just straight frequency with controls, as that is likely how users would be installing their systems.
We’ll show some of the memory/motherboard auto settings toward the end of the content.
As part of our new and ongoing “Bench Theory” series, we are publishing a year’s worth of internal-only data that we’ve used to drive our 2018 GPU test methodology. We haven’t yet implemented the 2018 test suite, but will be soon. The goal of this series is to help viewers and readers understand what goes into test design, and we aim to underscore the level of accuracy that GN demands for its publication. Our first information dump focused on benchmark duration, addressing when it’s appropriate to use 30-second runs, 60-second runs, and more. As we stated in the first piece, we ask that any content creators leveraging this research in their own testing properly credit GamersNexus for its findings.
Today, we’re looking at standard deviation and run-to-run variance in tested games. Games on bench cycle regularly, so the purpose is less for game-specific standard deviation (something we’re now addressing at game launch) and more for an overall understanding of how games deviate run-to-run. This is why conducting multiple, shorter test passes (see: benchmark duration) is often preferable to conducting fewer, longer passes; after all, we are all bound by the laws of time.
Looking at statistical dispersion can help understand whether a game itself is accurate enough for hardware benchmarks. If a game is inaccurate or varies wildly from one run to the next, we have to look at whether that variance is driver-, hardware-, or software-related. If it’s just the game, we must then ask the philosophical question of whether it’s the game we’re testing, or if it’s the hardware we’re testing. Sometimes, testing a game that has highly variable performance can still be valuable – primarily if it’s a game people want to play, like PUBG, despite having questionable performance. Other times, the game should be tossed. If the goal is a hardware benchmark and a game is behaving in outlier fashion, and also largely unplayed, then it becomes suspect as a test platform.
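To make the dispersion idea concrete, here is a minimal sketch of how run-to-run variance could be summarized for one game. The run values are hypothetical placeholders, not GN data, and the function name is ours. It uses the sample standard deviation and expresses it as a percentage of the mean (coefficient of variation) so that games running at different framerates can be compared.

```python
from statistics import mean, stdev


def run_to_run_summary(avg_fps_per_run):
    """Summarize dispersion across repeated benchmark passes of one game.

    Returns the mean, the sample standard deviation, and the coefficient
    of variation (standard deviation as a percentage of the mean).
    """
    if len(avg_fps_per_run) < 2:
        raise ValueError("need at least two runs to estimate deviation")
    m = mean(avg_fps_per_run)
    s = stdev(avg_fps_per_run)  # sample standard deviation (n - 1)
    return {"mean": m, "stdev": s, "cv_pct": 100.0 * s / m}


if __name__ == "__main__":
    # Hypothetical average-FPS results from repeated passes (not real data)
    consistent_game = [98.0, 100.0, 102.0]
    erratic_game = [80.0, 100.0, 120.0]
    for label, runs in (("consistent", consistent_game), ("erratic", erratic_game)):
        summary = run_to_run_summary(runs)
        print(f"{label}: mean {summary['mean']:.1f} FPS, "
              f"stdev {summary['stdev']:.1f}, CV {summary['cv_pct']:.1f}%")
```

Both hypothetical games average 100FPS, but the second has ten times the deviation. That is the kind of result that would prompt a closer look at whether the variance comes from the driver, the hardware, or the game itself.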
We already have a dozen or so content pieces showing that delidding can improve thermal performance of Intel CPUs significantly, but we’ve always put the stock Intel IHS back in place. Today, we’re trying a $20 accessory – it’s a CNC-machined copper IHS from Rockit Cool, which purportedly increases surface area by 15% and smooths out points of contact. Intel’s stock IHS is a nickel-plated copper block, but is smaller in exposed surface area than the Rockit Cool alternative. The Intel IHS is also a non-flat surface – some coldplates are made concave to match the convex curvature of the Intel IHS (depending on your perspective of the heat spreader, granted), whereas the Rockit Cool solution is nearly perfectly flat. Most coolers have some slight conformity to mounting tension, flattening out coldplates atop a non-flat CPU IHS. For this reason and the increased surface (and contact) area, it was worth trying Rockit Cool’s solution.
At $14 to $20, this was worth trying. Today, we’re looking at whether there’s any meaningful thermal improvement from a custom copper IHS for Intel CPUs, using an i7-8700K and Rockit Cool LGA115X heat spreader.
Delidding the AMD R3 2200G wasn’t as clean as using pre-built tools for Intel CPUs, but we have a separate video that’ll show the delid process to expose the APU die. The new APUs use thermal paste, rather than AMD’s usual solder, which is likely a cost-saving measure for the low-end parts. We ran stock thermal tests on our 2200G using the included cooler and a 280mm X62 liquid cooler, then delidded it, applied Thermal Grizzly Conductonaut liquid metal, and ran the tests again. Today, we’re looking at that thermal test data to determine what sort of headroom we gain from the process.
Delidding the AMD R3 2200G is the same process as for the 2400G, and liquid metal application follows our same guidelines as for Intel CPUs. This isn’t something we recommend for the average user. As far as we’re aware, one of Der8auer’s delid kits does work for Raven Ridge, but we went the vise & razor route. This approach, as you might expect, is a bit riskier to the health of the APU. It wouldn’t be difficult to slide the knife too far and destroy a row of SMDs (surface-mount devices), so we’d advise not following our example unless willing to risk the investment.
APU reviews have historically proven binary: Either it’s better to buy a dGPU and dirt-cheap CPU, or it’s actually a good deal. There is zero room for middle-ground in a market that’s targeting $150-$180 purchases. There’s no room to be wishy-washy, and no room for if/but/then arguments: It’s either better value than a dGPU + CPU, or it’s not worthwhile.
Preceding our impending Raven Ridge 2400G benchmarks, we decided to test the G4560 and R3 1200 with the best GPU money can buy – because it’s literally the only GPU you can buy right now. That’d be the GT 1030. Coupled with the G4560 (~$72), we land at ~$160 for both parts, depending on the momentary fluctuations of retailers. With the R3 1200, we land at about $180 for both. The 2400G is priced at $170, or thereabouts, and lands between the two.
(Note: The 2400G & 2200G appear to already be listed at retailers, despite the fact that, at time of writing, the embargo is still in effect.)
We’re revisiting one of the best ~200mm-ish fans that existed: The SilverStone Air Penetrator 180, or AP181, that was found in the chart-topping Raven02 case that we once held in high regard. We dug these fans out of our old Raven, still hanging around post-testing from years ago, and threw them into a test bench versus the Noctua 200mm and Cooler Master 200mm RGB fans (the latter coming from the H500P case).
These three fans, two of which are advertised as 200mm, all have different mounting holes. This is part of the reason that 200mm fans faded from prominence (the other being the replacement of mesh side panels with sheets of glass), as companies were all fighting over a non-standardized fan size. Generally speaking, buying a case with 200mm fans did not – and still does not – guarantee that other 200mm fans will work in that case. The screw hole spacing is different, the fan size could be different, and there were about four types of 200mm-ish fans at the time: 180mm, 200mm, 220mm, and 230mm.
That’s a large part of the vanishing act of the 200mm fans, although a recent revival by Cooler Master has resurrected some interest in them. It’s almost like a fashion trend: All the manufacturers saw at Computex that 200mm fans were “in” again, and immediately, we started seeing CES 2018 cases making a 200mm push.
The short answer to the headline is “sometimes,” but it’s more complicated than just FPS over time. To really address this question, we have to first explain the oddity of FPS as a metric: Frames per second is inherently an average. If we tell you something is operating at a variable framerate, but is presently at 60FPS, what does that really mean? Because framerate is an average over a period of time, a spot measurement in frames per second at any given millisecond is flawed by definition. All this stated, the industry has accepted frames per second as the standard measure of game performance, and it is one of the most user-friendly means to convey the actual, underlying metric: frametime, or the frame-to-frame interval, measured in milliseconds.
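A short sketch illustrates why an FPS average can hide what frametimes reveal. The frametime lists are hypothetical, constructed so that both one-second windows contain exactly 60 frames, and the function names are ours.

```python
def average_fps(frametimes_ms):
    """Average FPS over a window: frames rendered divided by elapsed seconds."""
    return 1000.0 * len(frametimes_ms) / sum(frametimes_ms)


def worst_frametime_ms(frametimes_ms):
    """Longest frame-to-frame interval in the window, in milliseconds."""
    return max(frametimes_ms)


if __name__ == "__main__":
    # Hypothetical one-second windows, each containing exactly 60 frames
    steady = [1000.0 / 60.0] * 60          # every frame takes ~16.7 ms
    stutter = [10.0] * 58 + [210.0] * 2    # 58 fast frames plus two long hitches

    for label, window in (("steady", steady), ("stutter", stutter)):
        print(f"{label}: {average_fps(window):.1f} FPS average, "
              f"worst frame {worst_frametime_ms(window):.1f} ms")
```

Both windows report a 60FPS average, but the second includes two 210ms frames that would be plainly visible as stutter. This is why frame-to-frame intervals are the underlying metric, and FPS is only a convenient way to express them.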
Today, we’re publicly releasing some internal data that we’ve collected for benchmark validation. This data looks specifically at benchmark duration, with optimization tests meant to maximize accuracy and card count while finding the minimum time required to retain that accuracy.
Before we publish any data for a benchmark – whether that’s gaming, thermals, or power – we run internal-only testing to validate our methods and thought process. This is often where we discover flaws in methods, which allow us to then refine them prior to publishing any review data. There are a few things we traditionally research for each game: Benchmark duration requirements, load level of a particular area of the game, the best- and worst-case performance scenarios in the game, and then the average expected performance for the user. We also regularly find shortcomings in test design – that’s the nature of working on a test suite for a year at a time. As with most things in life, the goal is to develop something good, then iterate on it as we learn from the process.
To everyone’s confusion, a review copy of Dragon Ball FighterZ for Xbox One showed up in our mailbox a few days ago. We’ve worked with Bandai Namco in the past, but never on console games. They must have cast a wide net with review samples--and judging by the SteamCharts stats, it worked.
It’d take some digging through the site archives to confirm, but we might never have covered a real fighting game before. None of us play them, we’ve tapered off doing non-benchmark game reviews, and they generally aren’t demanding enough to be hardware testing candidates (recommended specs for FighterZ include a 2GB GTX 660). For the latter reason, it’s a good thing they sent us the Xbox version. It’s “Xbox One X Enhanced,” but not officially listed as 4K, although that’s hard to tell at a glance: the resolution it outputs on a 4K display is well above 1080p, and the clear, bold lines of the cel-shaded art style make it practically indistinguishable from native 4K even during gameplay. Digital Foundry claims it’s 3264 x 1836 pixels, or 85% of 4K in height/width.
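For reference, the per-axis figure can be checked with a few lines of arithmetic. This sketch assumes “4K” means 3840 x 2160 and takes the 3264 x 1836 figure attributed to Digital Foundry at face value. The function name is ours.

```python
UHD_4K = (3840, 2160)  # assumed definition of "4K" for this comparison


def scale_vs_4k(width, height, reference=UHD_4K):
    """Return per-axis scale and total pixel-count ratio versus a reference."""
    ref_w, ref_h = reference
    return {
        "width_scale": width / ref_w,
        "height_scale": height / ref_h,
        "pixel_ratio": (width * height) / (ref_w * ref_h),
    }


if __name__ == "__main__":
    result = scale_vs_4k(3264, 1836)
    print(f"width: {result['width_scale']:.0%}, "
          f"height: {result['height_scale']:.0%}, "
          f"pixels: {result['pixel_ratio']:.1%}")
```

The distinction matters when comparing against PC results: 85% on each axis works out to roughly 72% of the pixels in a native 4K frame.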
Today, we’re using Dragon Ball FighterZ to test our new console benchmarking tools, and further iterate upon them for -- frankly -- bigger future launches. This will enable us to run console vs. PC testing in greater depth going forward.