At last week’s GTC, the nVidia-hosted GPU Technology Conference, nVidia CEO Jensen Huang showcased the new Titan W graphics card, revealed to press under embargo. The Titan W is nVidia’s first dual-GPU card in many years, and comes after the compute-focused Titan V GPU from 2017.
The nVidia Titan W graphics card hosts two V100 GPUs and 32GB of HBM2 memory, claiming a TDP of 500W and a price of $8,000.
“I’m really just proving to shareholders that I’m healthy,” Huang laughed after his fifth consecutive hour of talking about machine learning. “I could do this all day – and I will,” the CEO said, with a nod to PR, who immediately locked the doors to the room.
At GTC 2018, we learned that SK Hynix’s GDDR6 memory is bound for mass production in 3 months, and will be featured on several upcoming nVidia products. Some of these include autonomous vehicle components, but we also learned that we should expect GDDR6 on most, if not all, of nVidia’s upcoming gaming architecture cards.
Given a mass production timeline of June-July for GDDR6 from SK Hynix, and assuming Hynix is a launch-day memory provider, we can expect next-generation GPUs to become available after this timeframe. There still needs to be enough time to mount the memory to the boards, after all. We don’t have a hard date for when the next-generation GPU lineup will ship, but from this information, we can assume it’s at least 3 months away -- possibly more.
NVidia today announced what it calls “the world’s largest GPU,” the gold-painted and reflective GV100, undoubtedly a nod to its ray-tracing target market. The Quadro GV100 combines 2x V100 GPUs via NVLink2, with 32GB of HBM2 per GPU and 10,240 CUDA cores in total. NVidia advertises 236 TFLOPS of Tensor Core performance in addition to the power afforded by the 10,240 CUDA cores.
Additionally, nVidia has upgraded its Tesla V100 products to 32GB, adding to the HBM2 stacks on the interposer. The V100 is nVidia’s accelerator card, primarily meant for scientific and machine learning workloads, and later gave way to the Titan V(olta). The V100 was the first GPU to use nVidia’s Volta architecture, shipping initially at 16GB – just like the Titan V – but with more targeted use cases. NVidia's first big announcement for GTC was the addition of 16GB of VRAM to the V100, alongside a new “NV Switch” (no, not that one) that increases the coupling capabilities of Tesla V100 accelerators. Now, the V100 can be bridged with a 2-billion-transistor switch, offering 18 ports to scale up the GPU count per system.
NVidia’s Volta GV100 GPU and Tesla V100 Accelerator were revealed yesterday, delivering on a 2015 promise of Volta’s arrival by 2018. The initial DGX servers will ship by 3Q17, containing multiple V100 Accelerator cards at a cost of $150,000, with individual units priced at $18,000. These devices are obviously for enterprise, machine learning, and compute applications, but will inevitably work their way into gaming through subsequent V102 (or equivalent) chips. This is similar to the GP100 launch, where we get the Accelerator server-class card prior to consumer availability, which ultimately helps consumers by recouping some of the initial R&D cost through major B2B sales.
We've not been shy in our fierce criticisms of VR from a gaming perspective, but the maturation of development has yielded increasingly more mechanically-focused titles targeted at gamers. Mars 2030 aims to be more than a “VR Experience,” as most titles are, and we had the opportunity to get hands-on with the new game at GTC 2016.
Mars 2030 is developed by Fusion and was first shown at the GTC keynote, with the Mars rover helmed by industry icon Steve Wozniak. The open-world game takes place on the surface of Mars and deploys unique techniques to match surface color, heights, and physical interaction with terrain. It's playable on non-VR displays as well (and it does look good on 21:9 aspect ratios, based on the keynote), but hopes to stake its flag into the VR market with an agnostic disposition toward the Vive and Rift. Mars 2030 will work on both major devices.
Our hands-on time with Mars 2030 left us reasonably impressed with the early demonstration of Fusion's attempt to cast players as astronauts.
Pascal is the imminent GPU architecture from nVidia, poised to compete (briefly) with AMD's Polaris, which will later turn into AMD Vega and Navi. Pascal will shift nVidia onto the new memory technologies introduced on AMD's Fury X, but with the updated HBM2 (High Bandwidth Memory, version 2); Intel is expected to debut HBM2 on its Xeon Phi HPC CPUs later this year. View previous GTC coverage of Mars 2030 here.
HBM2 operates on a 4096-bit memory bus with a maximum theoretical throughput of 1TB/s. HBM version 1, for reference, operated at 128GB/s per stack on a 1024-bit wide memory bus. On the Fury X – again, for reference – this worked out to approximately 512GB/s across its stacks. HBM2 will double the theoretical memory bandwidth of HBM1.
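For readers who want to check the math, here is a rough sketch of where those figures come from. Peak theoretical bandwidth is just bus width multiplied by the per-pin data rate, divided by 8 to convert bits to bytes. The per-pin rates (1Gbit/s for HBM1, 2Gbit/s for HBM2) and the four-stack count are our assumptions, inferred from the 128GB/s, 512GB/s, and 1TB/s figures above; the function and variable names are purely illustrative.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, gbit_per_pin: float) -> float:
    """Peak theoretical bandwidth in GB/s: bus width * per-pin rate / 8 bits per byte."""
    return bus_width_bits * gbit_per_pin / 8


# Assumed per-pin data rates, inferred from the figures quoted in the article.
HBM1_GBIT_PER_PIN = 1.0  # 1024-bit stack at 128GB/s implies 1 Gbit/s per pin
HBM2_GBIT_PER_PIN = 2.0  # 4096-bit bus at 1TB/s implies 2 Gbit/s per pin

# Assumed stack count for the Fury X: 512GB/s total / 128GB/s per stack.
FURY_X_STACKS = 4

hbm1_per_stack = peak_bandwidth_gb_s(1024, HBM1_GBIT_PER_PIN)
fury_x_total = hbm1_per_stack * FURY_X_STACKS
hbm2_total = peak_bandwidth_gb_s(4096, HBM2_GBIT_PER_PIN)

print(f"HBM1 per stack: {hbm1_per_stack:.0f} GB/s")
print(f"Fury X (4 stacks of HBM1): {fury_x_total:.0f} GB/s")
print(f"HBM2 on a 4096-bit bus: {hbm2_total:.0f} GB/s")
print(f"HBM2 vs. HBM1 at the same bus width: {hbm2_total / fury_x_total:.0f}x")
```

Note that the bus width is the same in both full configurations (4 x 1024 = 4096 bits), so under these assumptions the doubling comes entirely from the faster per-pin rate. These are theoretical peaks, not measured throughput.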
After offering reddit's computer hardware & buildapc sub-reddits the opportunity to ask us about our nVidia GTC keynote coverage, an astute reader ("asome132") noticed that the new Pascal roadmap had a key change: Maxwell's "unified virtual memory" line-item had been replaced with a very simple, vague "DirectX 12" item. We investigated the change while at GTC, speaking to a couple of CUDA programmers and Maxwell architecture experts; I sent GN's own CUDA programmer and 30+ year programming veteran, Jim Vincent, to ask nVidia engineers about the change in the slide deck. Below is the official stance along with our between-the-lines interpretation and analysis.
In this article, we'll look at the disappearance of "Unified Virtual Memory" from nVidia's roadmap, discuss an ARM/nVidia future that challenges existing platforms, and look at NVLink's intentions and compatible platforms.
(This article has significant contributions from GN Staff Writer & CUDA programmer Jim Vincent).
I figured I’d write a quick blog-style post for those of you who check the site regularly for convention coverage and new games/technology analysis. We’re coming off the tail-end of GDC, the Game Developers Conference, and GTC, the GPU Technology Conference, and will be heading to PAX East next. We had some of our best convention coverage yet at these two events, so as a quick recap, here are the must-read articles from each:
Day one of GTC saw the presentation of nVidia’s refreshed lineup of VisualFX SDK tools, including GameWorks, FaceWorks, HairWorks, WaterWorks, and other *Works software. These Software Development Kits are used to aid the development of games, including the optimization of graphically-intensive dynamic elements, like fur, fire, and smoke. Graphics and CPU technology companies are often behind many of the game industry’s visual advancements, but as PC gamers, we don’t see much of it actually in-game for several years. Development time is part of this, adoption is part of this, and consoles are responsible in part.
Let’s look at some of nVidia’s more recent changes for character faces, real-time smoke and fire effects, pre-baked lighting effects, subsurface scattering, deep surface scattering, and fur/hair technology. It seems pertinent to recommend watching Epic’s Unreal Engine tech demo as well, since it utilizes many of these technologies and render methods; you can read our full article on UE4 here.
In a somewhat tricksy move today, AMD hosted a press conference a couple of miles down the road from nVidia’s active GTC event. In yesterday’s keynote by nVidia CEO Jen-Hsun Huang, we saw the introduction of the new Titan Z video card, Pascal architecture, machine learning, and other upcoming GPU technologies. Now, less than 24 hours later, AMD has invited us to look at its new high-end workstation solution – the W9100 FirePro GPU.
The presentation was pretty quick compared to what we got with nVidia, but the primary focus was on computationally-intensive OpenCL tasks, real-time color correction and editing playback in full 4K resolution, and “enabling content creation.”
Let’s start with the obvious.