VR Benchmarking Tackled: 5 Months in the Making

Published March 01, 2017 at 8:30 am

Not long ago, we opened discussion about AMD’s new OCAT tool, a software overhaul of PresentMon that we beta tested for AMD pre-launch. For the past five or so months, we’ve also been quietly testing a new version of FCAT that adds functionality for VR benchmarking. This benchmark suite tackles the significant challenges of intercepting VR performance data, and offers new means of analyzing warp misses and drop frames. Finally, after several months of testing, we can talk about the new FCAT VR hardware and software capture utilities.

The tool functions in two pieces: software capture and hardware capture.

 

The hardware capture is conceptually easy, but does require the correct connection order of about 8 cables for the Vive. Hardware capture requires a secondary capture system equipped with high-speed storage (we’re using an Intel 750 1TB SSD provided by BS Mods) and a Vision SC-HD4+ capture card that’s capable of accepting the high-bandwidth signal from VR headsets. The video card (DUT) on the test system outputs via DisplayPort to the monitor, then via HDMI to a splitter box. One output of that box goes to the link box for the HMD, which then forwards to the VR headset. The other output goes to the capture system’s splitter, which feeds into the capture card. The capture system then uses VirtualDub to record the display output that the VR headset receives.

vr-benchmarking-fcat-2

vr-benchmarking-fcat-3

If you’re still following, that’s quite a few splitters. Connection order also matters, and is something we’ll discuss in our full content piece later this month. That content piece will also include benchmarks, as you’d expect, with a visual walkthrough of the setup behind the scenes.

While all of this is going on, we run color overlays that allow us to analyze frame-by-frame and ensure that no frames are lost as a result of the capture methodology. A second overlay is used to validate the framerate output to the headset, which would be exactly 90FPS in a perfect world given the forced V-sync of VR. We can then record these videos, analyze them frame-by-frame, and use a GN-made compression script to shrink the files down. Uncompressed, each captured file can easily run upwards of 50GB for about a minute of footage. Our compression routine retains the quality you see in our video explaining this tech (fairly high), while also shrinking file sizes by more than 80%.
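To put those file sizes in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it takes the figures above at face value (roughly 50GB per minute of capture, a reduction of 80%), treats 1GB as 1000MB, and uses function names that are ours for illustration rather than anything from the capture tooling.

```python
def sustained_write_mb_per_s(capture_gb: float, duration_s: float) -> float:
    """Average write rate (MB/s) needed to store a capture of the given size.

    Assumes 1 GB = 1000 MB and a constant data rate over the capture.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return capture_gb * 1000 / duration_s


def compressed_size_gb(capture_gb: float, reduction: float) -> float:
    """Size after shrinking a file by `reduction` (0.8 means 80% smaller)."""
    if not 0 <= reduction <= 1:
        raise ValueError("reduction must be between 0 and 1")
    return capture_gb * (1 - reduction)


if __name__ == "__main__":
    rate = sustained_write_mb_per_s(capture_gb=50, duration_s=60)
    size = compressed_size_gb(capture_gb=50, reduction=0.8)
    print(f"Sustained write rate: {rate:.0f} MB/s")
    print(f"Compressed size: {size:.0f} GB or less")
```

At those numbers, the capture system has to sustain writes in the neighborhood of 830MB/s, which is why the high-speed SSD matters, and a one-minute capture lands at about 10GB or less once compressed.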

vr-benchmarking-fcat-data

Software capture, meanwhile, captures dozens of variables in flight so that we can look at the many latencies and hitch points in VR frame output. We can also extrapolate the maximum FPS capabilities were V-sync disabled. Keep in mind that VR does hard cap at 90FPS, so an algorithm must be used to extrapolate synthetic FPS if in excess of 90. Capture is primarily useful for FPS below the 90FPS range, where we’re able to leverage tools to monitor warp misses and frame drops. As a reminder from previous content, warp misses are when an old frame is replayed without modification. This would happen under circumstances where the runtime doesn’t receive the frame in time (about 11ms at 90Hz), and so must decide what to do next. Replaying an old frame without modification is analogous to a stutter in traditional gameplay, with the added detriment that it can contribute to user nausea or instability. Drop frames are when a new frame must be synthesized, taking an old frame and updating the head position, but leaving animation the same as the previous frame. We lose updates to game events in this way, but retain head movement, which should reduce the chance of VR sickness.

Both of these have to be monitored, and traditional benchmarking tools do not allow it. SteamVR has some utilities that can do this monitoring, just without the robust capabilities we require.
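As a simple illustration of the frame budget and of what "synthetic FPS" means, the sketch below works from per-frame render times. This is not FCAT VR's algorithm, which we haven't detailed here; it assumes the simplest possible estimate (1000ms divided by the average render time), and the function names and sample timings are made up for the example.

```python
REFRESH_HZ = 90
# Time available to deliver each frame at 90Hz: about 11.1 ms.
FRAME_BUDGET_MS = 1000 / REFRESH_HZ


def unconstrained_fps(render_times_ms):
    """Naive estimate of FPS if the 90FPS cap were lifted.

    Assumption: throughput is simply 1000 ms divided by the mean render time.
    """
    if not render_times_ms:
        raise ValueError("need at least one frame time")
    average_ms = sum(render_times_ms) / len(render_times_ms)
    return 1000 / average_ms


def missed_budget(render_times_ms):
    """Return the render times that exceeded the per-frame budget."""
    return [t for t in render_times_ms if t > FRAME_BUDGET_MS]


if __name__ == "__main__":
    sample_ms = [8.0, 9.0, 10.0, 13.0]  # hypothetical render times
    print(f"Frame budget: {FRAME_BUDGET_MS:.1f} ms")
    print(f"Estimated unconstrained FPS: {unconstrained_fps(sample_ms):.0f}")
    print(f"Frames over budget: {missed_budget(sample_ms)}")
```

With these sample timings the average frame takes 10ms, so the naive estimate is 100FPS even though the headset would never display more than 90. The single 13ms frame is the one that would force the runtime to either synthesize a frame or replay an old one.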

The software has come a long way since we first started work with it. nVidia’s team has done well to iterate in a way that makes FCAT VR software capture more intuitive to use, whereas early iterations required hand-tuning of Perl scripts, careful file structuring, and regex (regular expression) editing for data alignment and cropping. Where we once had to manually align data by modifying Perl scripts, we are now able to drag-and-drop logged CSV files into a Python interface. New delay and expiry options have also been added to VR software benchmarking, features which GN requested for faster test execution.

Traditional benchmarking tools like FRAPS and PresentMon do not work as stand-ins for VR benchmarking, as neither accounts for the unique behaviors of VR. Frame reprojection, warp misses, and drop frames mean that we need to know when a frame has completely stuttered and frozen, when head tracking was updated, and when animation was updated. All told, this provides a more complete understanding of the user experience from within the headset. We’re able to quantify the data using these methods, and are therefore better able to demonstrate the type of framerate behavior that causes user uneasiness.
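To make those three questions concrete, here is a minimal sketch of tallying a per-interval log. The CSV layout, the column names, and the `new` / `dropped` / `warp_miss` labels are hypothetical and are not the FCAT VR log format; the only thing taken from the definitions above is which kind of frame updates head tracking and which updates animation.

```python
import csv
import io
from collections import Counter

# Hypothetical log: one row per 90Hz refresh interval.
SAMPLE_LOG = """interval,frame_type
0,new
1,new
2,dropped
3,new
4,warp_miss
5,new
"""


def summarize(log_text):
    """Count what the user saw in each refresh interval.

    new       -> freshly rendered frame (head tracking and animation update)
    dropped   -> synthesized frame (head tracking updates, animation does not)
    warp_miss -> old frame replayed unmodified (nothing updates)
    """
    rows = csv.DictReader(io.StringIO(log_text))
    counts = Counter(row["frame_type"] for row in rows)
    return {
        "total": sum(counts.values()),
        "head_tracking_updated": counts["new"] + counts["dropped"],
        "animation_updated": counts["new"],
        "fully_stalled": counts["warp_miss"],
    }


if __name__ == "__main__":
    for name, value in summarize(SAMPLE_LOG).items():
        print(f"{name}: {value}")
```

A single FPS number would hide the difference between the synthesized frame in interval 2 and the replayed frame in interval 4, which is exactly the distinction that matters inside the headset.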

This is still an nVidia-made tool, of course, so we must validate it to ensure fairness in benchmarking. Hardware capture aids in this regard, as we can align our SW + HW data to determine accuracy in benchmarking. HW capture is a direct recording of what’s presented on the HMD, and is then fed into VirtualDub (and other tools) to cross-reference for fairness. nVidia plans to release FCAT VR in open source format, and will release the software capture tools to the public post-reviewer analysis.
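Conceptually, the cross-check comes down to comparing what the software log claims against what the hardware recording shows. The sketch below assumes both captures have already been time-aligned and reduced to one label per refresh interval; the labels and the `compare_captures` helper are hypothetical and only illustrate the idea, not our actual validation workflow.

```python
def compare_captures(software, hardware):
    """Compare per-interval frame labels from two already-aligned captures.

    Returns the fraction of intervals that agree and the indices that differ.
    """
    if len(software) != len(hardware):
        raise ValueError("captures must cover the same number of intervals")
    mismatches = [
        index
        for index, (sw_label, hw_label) in enumerate(zip(software, hardware))
        if sw_label != hw_label
    ]
    agreement = 1 - len(mismatches) / len(software) if software else 1.0
    return agreement, mismatches


if __name__ == "__main__":
    sw = ["new", "dropped", "new", "warp_miss"]  # from the software log
    hw = ["new", "dropped", "new", "new"]        # from the hardware overlay
    agreement, mismatches = compare_captures(sw, hw)
    print(f"Agreement: {agreement:.0%}, mismatched intervals: {mismatches}")
```

Any interval where the two sources disagree is a place to go back to the recorded video and check the overlay by eye.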

We can’t release all of our data just yet, but will be doing so soon. Expect a full benchmark of several video cards using the new VR tools within the next few weeks. We have been working with VR benchmarking for around 5 months now, and just have to make the final push to validate all the data for accuracy. Of course, nVidia has been working on this for years -- but we've had a glimpse at some of the development process in recent history.

Check the video for some additional discussion on how this works, alongside captured VR data (with the overlays on screen) and a screen cap of the tool.

Regarding both OCAT and FCAT VR, we will continue working with both AMD and nVidia on their tools as they progress.

Editorial: Steve Burke
Video: Keegan Gallick

Last modified on February 28, 2017 at 8:30 am
Steve Burke

Steve started GamersNexus back when it was just a cool name, and now it's grown into an expansive website with an overwhelming amount of features. He recalls his first difficult decision with GN's direction: "I didn't know whether or not I wanted 'Gamers' to have a possessive apostrophe -- I mean, grammatically it should, but I didn't like it in the name. It was ugly. I also had people who were typing apostrophes into the address bar - sigh. It made sense to just leave it as 'Gamers.'"

First world problems, Steve. First world problems.
