Recent advancements in graphics processing technology have permitted software and hardware vendors to collaborate on real-time ray tracing, a long-standing “holy grail” of computer graphics. Ray tracing has been in use for a couple of decades now, but always in pre-rendered graphics – often in movies or other video playback that doesn’t require on-the-fly processing. The difference with going real-time is that we’re dealing with sparse data, and making fewer rays look good (better than standard rasterization, especially) is difficult.
NVidia has been beating this drum for a few years now. We covered nVidia’s ray-tracing keynote at ECGC a few years ago, when the company’s Tony Tamasi projected 2015 as the year for real-time ray-tracing. That obviously didn’t fully materialize, but the company wasn’t too far off. Volta ended up providing some additional leverage to make 60FPS, real-time ray-tracing a reality. Even so, we’re not quite there with consumer hardware. Epic Games and nVidia have lately been demonstrating real-time ray-traced rendering with four V100 GPUs, functionally $12,000 worth of Titan Vs, and that’s to achieve a playable real-time framerate with the ubiquitous “Star Wars” demo.
At GTC 2018, we learned that SK Hynix’s GDDR6 memory is bound for mass production in 3 months, and will be featured on several upcoming nVidia products. Some of these include autonomous vehicle components, but we also learned that we should expect GDDR6 on most, if not all, of nVidia’s upcoming gaming architecture cards.
Given a mass production timeline of June-July for GDDR6 from SK Hynix, and assuming Hynix is a launch-day memory provider, we can expect next-generation GPUs to become available after this timeframe. There still needs to be enough time to mount the memory to the boards, after all. We don’t have a hard date for when the next-generation GPU lineup will ship, but from this information, we can assume it’s at least 3 months away -- possibly more.
NVidia today announced what it calls “the world’s largest GPU,” the gold-painted and reflective GV100, undoubtedly a call to its ray-tracing target market. The Quadro GV100 combines 2x V100 GPUs via NVLink2, with 32GB of HBM2 per GPU and a combined 10,240 CUDA cores. NVidia advertises 236 TFLOPS of Tensor Core performance in addition to the power afforded by the 10,240 CUDA cores.
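The combined and per-GPU figures quoted for the paired GV100 can be sanity-checked with simple arithmetic. The sketch below is only an illustration: the `PairedConfig` class and its field names are hypothetical, and it assumes the 236 TFLOPS figure applies to the two-GPU pair, as the 10,240 CUDA core count does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairedConfig:
    """Hypothetical container for the paired-GPU figures quoted in the text."""
    gpu_count: int
    hbm2_gb_per_gpu: int
    total_cuda_cores: int
    total_tensor_tflops: float  # assumed to be the figure for the whole pair

    @property
    def total_hbm2_gb(self) -> int:
        # Memory is quoted per GPU, so multiply up for the pair.
        return self.gpu_count * self.hbm2_gb_per_gpu

    @property
    def cuda_cores_per_gpu(self) -> int:
        # The core count is quoted for the pair, so divide it back down.
        return self.total_cuda_cores // self.gpu_count

    @property
    def tensor_tflops_per_gpu(self) -> float:
        return self.total_tensor_tflops / self.gpu_count


gv100_pair = PairedConfig(
    gpu_count=2,
    hbm2_gb_per_gpu=32,
    total_cuda_cores=10_240,
    total_tensor_tflops=236.0,
)

print(f"Total HBM2: {gv100_pair.total_hbm2_gb} GB")
print(f"CUDA cores per GPU: {gv100_pair.cuda_cores_per_gpu}")
print(f"Tensor TFLOPS per GPU: {gv100_pair.tensor_tflops_per_gpu}")
```

Under those assumptions, the pair works out to 64GB of HBM2 in total, with 5,120 CUDA cores and 118 Tensor TFLOPS per GPU.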
Additionally, nVidia has upgraded its Tesla V100 products to 32GB, adding to the HBM2 stacks on the interposer. The V100 is nVidia’s accelerator card, primarily meant for scientific and machine learning workloads, and later gave way to the Titan V(olta). The V100 was the first GPU to use nVidia’s Volta architecture, shipping initially at 16GB – just like the Titan V – but with more targeted use cases. NVidia's first big announcement for GTC was to add 16GB of VRAM to the V100, along with a new “NV Switch” (no, not that one) to increase the coupling capabilities of Tesla V100 accelerators. Now, the V100 can be bridged with a 2-billion-transistor switch, offering 18 ports to scale up the GPU count per system.
The latest Ask GN brings us to episode #70. We’ve been running this series for a few years now, but the questions remain top-notch. For this past week, viewers asked about nVidia’s “Ampere” and “Turing” architectures – or the rumored ones, anyway – and what we know of the naming. For other core component questions, Raven Ridge received a quick note on out-of-box motherboard support and BIOS flashing.
Non-core questions pertained to cooling, like the “best” CLCs when normalizing for fans, or hybrid-cooled graphics VRM and VRAM temperatures. Mousepad engineering got something of an interesting sideshoot, for which we recruited engineers at Logitech for insight on mouse sensor interaction with surfaces.
More at the video below, or find our Patreon special here.
Hardcore overclocker "Buildzoid" just finished his VRM and PCB analysis of the Titan V, released on the GN channel moments ago. The Titan V uses a 16-phase VRM from nVidia with an interesting design, including some "mystery" in-line phases that we think are used to drop 12v. This VRM is one of the best that nVidia has built on a 'reference' card, and that makes sense, seeing as there won't be other Titan V cards from board partners. We do think the cooling solution needs work, and we've done a hybrid mod to fix that, but the VRM and PCB put us in a good place for heavier modding, including shunt modding.
Shunt modding is probably the most interesting, as that's what will give a bit more voltage headroom for overclocking, and should help trick the card's regulation into giving us more power to play with. Buildzoid talks about this mod during the video, for any willing to attempt it. We may attempt the mod on our own card.
We took our nVidia Titan V Volta card apart when we first received it, following our gaming benchmarks, and are now embarking on a mission to take some Top 10 scores in HWBot Firestrike rankings. Admittedly, we can only get close to the top 10 because of early access – we bought the card early, and so it’s a bit of an unfair advantage – but we’re confident that the top 10 slots will soon belong entirely to the XOC community.
For now, though, we can have a moment of glory. If only a moment.
Getting there will require better cooling, as we just aren’t as good at CPU overclocking as some of the others in the top 10. To make up for our skill and LN2 deficit, we can throw more cooling at the Titan V and put up a harder fight. Liquid cooling the V is the first step, and will help us stabilize higher clocks at lower temperatures. Volta, like Pascal, increases its clock (and the stability of that clock) as the GPU core temperature decreases. Driving temperatures down under 60C will help tremendously in stability, and driving them under 40C – if possible – will be even better. We’ll see how far we get. Our Top 10 efforts will be livestreamed at around 5 or 6PM EST today, December 16, 2017.
This test is another in a series of studies to learn more about nVidia’s new Volta architecture. Although Volta in its present form is not the next-generation gaming architecture, we would anticipate that key performance metrics can be stripped from Volta and used to extrapolate future behavior of nVidia’s inevitable gaming arch, even if named differently. One example would be our gaming benchmarks, where we observed significant performance uplift in games leveraging asynchronous compute pipelines and low-level APIs. Our prediction is that nVidia is moving toward a future of heavily supported asynchronous compute job queuing, where the company is presently disadvantaged versus its competition; that’s not to say that nVidia doesn’t do asynchronous job queuing on Pascal (it does), but that AMD has, until now, put greater emphasis on that particular aspect of development.
This, we think, may also precipitate more developer movement toward these more advanced programming techniques. With the only two GPU vendors in the market supporting lower level APIs and asynchronous compute with greater emphasis, it would be reasonable to assume that development would follow, as would marketing development dollars.
In this testing, we’re running benchmarks on the nVidia Titan V to determine whether GPU core or memory (HBM2) overclocks have greater impact on performance. For this test, we’re only using a few key games, as selected from our gaming benchmarks:
- Sniper Elite 4: DirectX 12, asynchronous compute-enabled, and showed significant performance uplift in Volta over Pascal. Sniper responds to GPU clock changes in drastic ways, we find. This represents our async titles.
- Ashes of the Singularity: DirectX 12, but less responsive than Sniper. We were seeing ~10% uplift over the Titan Xp, whereas Sniper showed ~30-40% uplift. This gives us a middle-ground.
- Destiny 2: DirectX 11, not very responsive to the Titan V in general. We saw ~4% uplift over the Titan Xp at some settings, though other settings combinations did produce greater change. This gives us a look at games that don’t necessarily care for Volta’s new async capabilities.
We are also using Firestrike Ultra and Superposition, the latter of which is also fairly responsive to the Titan’s dynamic ray-casting performance.
We are running the fan at 100% for all tests, with the power offset at 120% (max) for all tests. Clocks are changed according to their numbers in the charts.
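One way to compare core and memory overclocks on equal footing is to divide the percent FPS gain by the percent clock gain for each run. The sketch below is a minimal illustration of that idea, not the method used for the charts: the function names are hypothetical, and every number in it is a placeholder rather than a measured result.

```python
def percent_change(baseline: float, value: float) -> float:
    """Return the percent change from baseline to value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (value - baseline) / baseline * 100.0


def scaling_efficiency(base_clock: float, oc_clock: float,
                       base_fps: float, oc_fps: float) -> float:
    """FPS gain (%) divided by clock gain (%); 1.0 means perfectly linear scaling."""
    clock_gain = percent_change(base_clock, oc_clock)
    if clock_gain == 0:
        raise ValueError("overclock must differ from the baseline clock")
    return percent_change(base_fps, oc_fps) / clock_gain


# Placeholder values for illustration only; these are NOT benchmark results.
core = scaling_efficiency(base_clock=1000.0, oc_clock=1100.0,
                          base_fps=100.0, oc_fps=108.0)
memory = scaling_efficiency(base_clock=1000.0, oc_clock=1100.0,
                            base_fps=100.0, oc_fps=103.0)

print(f"core scaling: {core:.2f}, memory scaling: {memory:.2f}")
```

A higher ratio means the game responds more strongly to that clock domain, which makes it easier to compare titles whose absolute framerates differ widely.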
As we work toward our inevitable hybrid mod on the nVidia Titan V, we must visit the usual spread of in-depth thermal, power, and clock behavior testing. The card uses a slightly modified Titan Xp cooler, with primary modifications found in the vapor chamber’s switch to copper heatfins. That’s the primary change, and not one that’s necessarily all that meaningful. Still, the card needs whatever it can get, and short of a complete cooler rework, this is about the most that can fit on the current design.
In this Titan V benchmark, we’ll be looking at the card’s power consumption during various heavy workloads, thermal behavior of the MOSFETs and GPU core, and how frequency scales with thermals and power. The frequency scaling is the most important: We’ve previously found that high-end nVidia cards leave noteworthy performance (>100MHz boost) on the table with their stock coolers, and suspect the same to remain true on this high-wattage GPU.
The nVidia Titan V is not a gaming card, but gives us some insights as to how the Volta architecture could react to different games and engines. The point here isn’t to look at raw performance in a hundred different titles, but to think about what the performance teaches us for future cards. This will teach us about the Volta architecture; obviously, you shouldn’t be spending $3000 to use a scientific card on gaming, but that doesn’t mean we can’t learn from it. Our tear-down is already online, but now we’re focusing on Titan V overclocking and FPS benchmarks, and then we’ll move on to production, power, and thermal content.
This nVidia Titan V gaming benchmark tests the Volta architecture versus Pascal architecture across DirectX 11, DirectX 12, Vulkan, and synthetic applications. We purchased the Titan V for editorial purposes, and will be dedicating the next few days to dissecting every aspect of the card, much like we did for Vega: Frontier Edition in the summer.
This episode of Ask GN headlines with answering the most common question we’ve seen in the past 24 hours: Should I buy now or wait for Volta? That’ll start us off for this episode, followed by clarification of VRM quality, a history lesson on AM4 motherboards at launch and HIS existence, and silicon death from overclocking. This episode runs about 25 minutes, with each question timestamped within the video. We also have the timestamps and questions marked below, if you’d like to see when a particular topic of interest appears.
The Volta topic, we think, is among the most interesting and common for questions right now. This seems to come around for every new architecture, and our answers are generally the same. Find out more below!