What We've Been Told: PS4's Top-Level Hardware Specs
The PlayStation 4's announcement a few days ago came with a surprising amount of hardware-related information, even if it was light on deep technical detail -- we normally don't get much hardware info at all at these press conferences. So far, we've been told that the PS4 will utilize an 8-core Jaguar APU that has been modified by AMD to fit Sony's spec (the main customization being the core count: the PS4 gets eight cores, where Jaguar is normally a 4-core unit), 8GB of VRAM-speed GDDR5 memory (we'll talk about latency below), a dedicated chip for background processes, "at least" a spindle hard drive (perhaps hinting at the potential for a hybrid drive or SSD), and Blu-ray optical capabilities.
On the ACPI front, Sony's biggest gains have been in the implementation of custom S3/S4 advanced power-saving states, yielding advantages similar to those of modern PC ACPI implementations. The big difference here is that Sony's custom sleep states are intended to preserve a game state without requiring a save/power-down (though we'd obviously still recommend saving), which should theoretically bypass loading and logo splash-screens upon resume. So it's the more traditional usage of "suspend," though the PS4 will also continue its downloads -- and presumably uploads -- while in this hybrid sleep state by leaning on a background processor.
Downloading has had several tweaks to ensure a consistent data stream and reduced wait times for users, like the intelligent patching we see in modern MMORPGs (Rift and WoW are both examples). In a game like Rift, large patches can be broken into two chunks -- "required to play" and "everything else" -- with, say, add-on textures and models split apart from required zone elements and ability icons/text. This allows the player to download only a portion of an 8GB patch, play the game, and then download the rest at a later, more convenient time (like overnight). Sony's doing something similar, allowing players to download what's required to play the first "X chapters," we'll call it, of the game and grab the rest in the background.
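To make the chunking concrete, here's a minimal sketch in Python of how a patcher might partition a manifest into "required to play" and "everything else" sets and order the download accordingly. All names and fields here are hypothetical -- neither Trion nor Sony has published such an API.

```python
# Hypothetical sketch of chunked patching: split a patch manifest into a
# "required to play" set and a deferred set, and fetch the required
# chunks first so the player can start before the full patch lands.

def split_patch(manifest):
    """Partition patch entries by whether they gate initial play."""
    required = [e for e in manifest if e["required"]]
    deferred = [e for e in manifest if not e["required"]]
    return required, deferred

def plan_download(manifest):
    """Return the ordered download plan: required chunks first."""
    required, deferred = split_patch(manifest)
    return required + deferred

# Example manifest, loosely modeled on the Rift scenario described above.
manifest = [
    {"name": "zone_geometry.pak",  "size_mb": 900,  "required": True},
    {"name": "ability_icons.pak",  "size_mb": 50,   "required": True},
    {"name": "hires_textures.pak", "size_mb": 6000, "required": False},
    {"name": "cinematics.pak",     "size_mb": 1050, "required": False},
]

plan = plan_download(manifest)
playable_after = sum(e["size_mb"] for e in plan if e["required"])
print(playable_after)  # 950 -- the player starts after 950MB, not 8GB
```

The real logic would obviously be driven by the game's asset dependency graph rather than a boolean flag, but the ordering principle is the same.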
The secondary chip that's dedicated to suspend-state maintenance and background processing should, when coupled with the partial download requirements, allow for a more steady transition into digital large-format media consumption in the next console era. Physical media will stick around for a while yet, and Sony has already announced that it has no intention to "block used game sales," but we suspect that downloads will rule the day soon enough. Just look at the PC platform - we've been stuck with Steam, GamersGate, GoG, and the like for years now, whether we like it or not. There are still a few discs in the world, but they're increasingly difficult to procure.
Time to talk about Jaguar. We normally focus on hardware specs that are slated for PC consumption, but the PS4 (being the first serious attempt of the next-gen consoles) is an important discussion topic for everyone: There's no debate that console and PC gaming are intertwined to the point of impacting one another, and the way next-gen units will handle gaming tasks will dictate heavily the efficiency with which PCs can run ported games -- and vice-versa.
Jaguar: A Modern x86 CISC APU... in a Console
Console manufacturers have historically (almost) always taken a loss on hardware sales, opting instead to recoup their losses through licensing fees on game software sales (cue the rise of game prices across markets). Sony looks like it's trying to be a bit more conservative this time with its hardware losses, and while most console CPUs have never been all that impressive, the PS4 really doesn't deliver much that we haven't already seen. The Jaguar is an impressive APU, to be sure, but the GPU component is perhaps a bit more noticeable than the CPU component.
First, it's important to note that Jaguar fits in with modern desktop architecture by running on an x86 CISC (complex instruction-set computing) platform (64-bit ready) rather than the less-impressive, low transistor-count RISC (reduced instruction-set computing) chips we saw in the PS3 and Xbox 360 (the original Xbox had an x86 CPU). The slimmed-down RISC chips can be optimized to produce reliable performance for gaming applications at higher profits, but as consoles merge use-case scenarios with traditional PC uses and trend toward more universally-standardized gaming, the need for a complex CPU architecture becomes more evident. This is beneficial for PC gamers for a number of reasons, all of which we'll discuss below in a dedicated tear-down section.
The custom Jaguar APU is an interesting chip in its own right -- not necessarily powerful, but interesting, though we're still awaiting a die-shot. There have been a lot of changes between Jaguar and its Brazos predecessors -- including technologies leveraged from the more expensive Llano chips -- but the most notable improvement is in the FPU realm: Jaguar's 128-bit FPU is double the width of Bobcat's 64-bit unit, providing a wider data path for floating-point calculations and more tangible computational prowess.
The traditional Jaguar unit can process two instructions per clock cycle, though we're unsure of whether this remains true in the modified PS4 Jaguar; it also has 4x32-byte loop buffer modules that work to reduce strain placed on the decoders by processing repeat tasks without the need for additional decode cycles. These items, along with several others of similar caliber, are what make Jaguar such an efficient chip. Jaguar is also slated to operate somewhere within the relatively vague 4.5W - 17W TDP range (we suspect sub-10W in the PS4) and will run on an external clock of about 2GHz, though the relatively low speed may prove to be largely irrelevant for gaming.
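As a back-of-envelope illustration of what those numbers imply -- assuming the dual-issue design survives into the PS4's semi-custom variant, which Sony hasn't confirmed -- peak instruction throughput works out as:

```python
# Theoretical peak instruction throughput for the PS4's CPU, assuming
# the traditional Jaguar's dual-issue front end carries over unchanged.
cores = 8           # Sony's custom eight-core configuration
clock_hz = 2.0e9    # ~2GHz external clock
ipc = 2             # two instructions per clock cycle (traditional Jaguar)

peak = cores * clock_hz * ipc
print(peak / 1e9)   # 32.0 -- billion instructions/second, theoretical peak
```

Real-world throughput will of course be far lower -- cache misses, branch mispredictions, and dependency stalls all eat into that ceiling -- but it frames how much the eight-core layout compensates for the modest clock.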
The keyword here is "traditional," as we're unsure of how much of this has changed in the "semi-custom" PS4 version of the APU.
Perhaps more immediately interesting is the GPU component of the custom Jaguar APU, which is promised to pump 1.84TFLOPS of raw compute power (18 compute units) into the PS4's veins. That should place the PS4's raw-power equivalence somewhere between the 7850 and 7870 desktop GPUs (~1.75TFLOPS and ~2.56TFLOPS, respectively). Granted, it's not quite as simple as comparing spec-for-spec when considering the inherent direct-to-metal advantage that consoles hold over PCs (discussed below).
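The 1.84TFLOPS figure itself checks out against AMD's GCN layout of 64 shaders per compute unit, with a fused multiply-add counting as two floating-point ops per clock -- a quick sanity check, noting that the 800MHz clock is inferred rather than confirmed by Sony:

```python
# Sanity-checking Sony's quoted compute figure against GCN's layout.
cus = 18               # unified compute units in the PS4's GPU component
shaders_per_cu = 64    # standard shader count per CU on AMD's GCN parts
flops_per_clock = 2    # a fused multiply-add counts as two FP operations
clock_mhz = 800        # inferred GPU clock; not officially confirmed

gflops = cus * shaders_per_cu * flops_per_clock * clock_mhz / 1000
print(gflops)          # 1843.2 -- i.e., ~1.84 TFLOPS, matching Sony's figure
```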
The Jaguar GPU runs at 800MHz (extrapolated from memory metrics), and the 8GB of GDDR5 memory functions at an effective 5.5GHz frequency; GDDR5's higher latency is offset by its increased 176GB/s bandwidth, effectively eliminating the latency's relevance (to be fair, old DDR modules had something in the range of 2-2-2-5 timings, but no sane person would trade DDR3 for DDR over latency concerns). Also packed into the GPU are 18 unified compute units that can be allocated wherever developers choose, with no commanding Sony requirements for splitting CUs bearing down on the devs. On traditional APUs, the GPU's effectiveness is largely dependent upon the frequency and capacity of system memory; unlike a dedicated video card, the APU loses the option of on-board, high-speed, close-proximity memory and must instead hit system memory for every instance where VRAM would normally be accessed. The PS4 eliminates system memory in the standard sense and instead opts for unified, high-speed, shared resources, mitigating one of the most concerning obstacles of APUs.
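The 176GB/s figure, in turn, implies a 256-bit memory interface at the 5.5GT/s effective data rate -- the bus width is our inference from the quoted numbers, not a published spec:

```python
# Deriving the quoted 176GB/s memory bandwidth from the GDDR5 specs.
effective_rate = 5.5e9   # 5.5GT/s effective GDDR5 data rate (per pin)
bus_width_bits = 256     # inferred: this width reproduces Sony's figure

bandwidth = effective_rate * bus_width_bits / 8  # bits -> bytes
print(bandwidth / 1e9)   # 176.0 -- GB/s, matching the announced spec
```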
With a somewhat mid-range GPU component, decent memory, and a reasonable CPU, the biggest bottleneck for the PS4 will be its optical and non-volatile storage media. Optical is far-and-away the slowest link in the chain (8x DVDR, 6x BDR) -- most games will likely be installed to the system for this reason, with the dedicated background processor managing file allocation and installation procedures, hopefully freeing up the primary unit for gameplay. With optical media bypassed as a bottleneck, we're next left with the storage unit slowing down the components. As all of us here know, storage only "slows" things down insofar as it requires more time to perform necessary IO operations, meaning load times and access requests will take as long as we've become accustomed to on spindle drives. A hybrid drive or an SSD would drastically reduce overall transactional latency, diminishing load times and improving the system's capacity for level streaming (rather than load screens) by speeding up the storage hits that refresh memory with new, relevant data before caches run dry.
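To illustrate why access latency matters so much for seek-heavy level loads, here's a rough, order-of-magnitude model; the drive figures are typical 2013-era ballpark numbers for a 7200RPM spindle drive and a SATA SSD, not Sony specifications.

```python
# Illustrative only: a crude load-time model where total time is seek
# cost (per file accessed) plus raw transfer cost. Figures are typical
# ballpark numbers for the era, not measured PS4 behavior.
def load_time(file_count, total_mb, seek_ms, throughput_mbps):
    seek_cost = file_count * seek_ms / 1000.0   # seconds spent seeking
    transfer_cost = total_mb / throughput_mbps  # seconds spent streaming
    return seek_cost + transfer_cost

# A level touching 2,000 small asset files totaling 1.5GB:
hdd = load_time(2000, 1500, seek_ms=12.0, throughput_mbps=100)
ssd = load_time(2000, 1500, seek_ms=0.1,  throughput_mbps=500)
print(round(hdd, 1), round(ssd, 1))  # 39.0 3.2 -- seconds
```

Note that the spindle drive spends more time seeking than transferring in this scenario, which is exactly the transactional latency an SSD or hybrid drive would attack.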
Unfortunately, as it stands now, it looks like Sony has decided to simply state "at least a hard drive" as its storage device, potentially teasing upgrades in the future. We'd say a future SSD option is highly probable, namely because it wouldn't impact the system's ability to play games or render graphics beyond the stock configuration, but would reduce read/write time requirements. This means there wouldn't be a gameplay disparity between PS4 users, but there would certainly be gains to be had in an SSD option if one were made available.
All-in-all, the specs seem reasonable and about on-par with what we'd expect from a console in this generation. The development lifecycle of a new CPU architecture can be upwards of 36 months (two years of development/design, the rest in fabrication -- which takes surprisingly long -- and testing/regression), and the development of a new console takes at least as long, so it makes sense that the PS4 will be running on what will have become last-gen graphics technology by its launch (December, by which point we'll see new GPUs on the market in addition to the Titan). The CPU's speed isn't impressive, but it shouldn't pose too large a threat to quality with the GPU effectively becoming the centerpiece of game processing.
The Trouble with Console and PC Co-Optimization
We're not here to judge whether you prefer consoles or PCs (and we firmly believe both have different applications), but we are here to analyze the implications of the hardware used in both options as it pertains to gaming. There's no argument that cross-platform titles need to be ported one way or another -- it's not a two-way street -- and the platform that receives the "port" version is often hindered in performance, controls, and/or visuals.
The main reason for this is the multifarious hardware configurations out there. With consoles, developers have one subset of hardware to optimize for; they know what gamers will be playing on, and they learn tricks to fit their games into that equipment. Devs have deployed new file compression algorithms on the limited optical media of the past few console generations, used objects in unintended ways (example: Skyrim's use of a bookshelf as a table) to limit resource consumption, and improved prebaked environmental FX to appear falsely dynamic. Limits will be overcome when there's a guarantee that all of your target market is running the same hardware.
The PS4 -- or any console, for that matter -- cannot be linearly compared spec-for-spec with PCs for this reason. Due to vast differences in programming methodologies and the API bypass granted by the direct-to-metal access present on consoles, console software will simply always be better optimized for its hardware, to the point where same-generation PC hardware struggles to keep pace. Any software (not just games) developed for a specific video card, a specific CPU, and a predetermined amount of RAM will perform significantly better on those components than on less-optimized-for alternatives. This is as true for PCs as it is for consoles -- if Autodesk developed Maya exclusively spec'd for Quadro FX and FirePro video cards (which operate on silicon similar to their GTX and Radeon counterparts, albeit refined for professional tasks), then Maya should theoretically run disproportionately better (a non-linear gain) on those cards than on other, even similarly-architected cards.
Game developers for consoles have direct access to the hardware and run much closer to the wire, allowing them to better tweak the execution of their games on that platform; further, they're able to bypass the rather clunky and obstructive operating system and API overhead present on desktop systems (Direct3D and Windows do wonders to degrade game performance, for example, and the Intel compiler used to restrict AMD CPUs to sub-optimal codepaths that artificially degraded performance). Until Microsoft grants developers the same direct-to-metal access, we're stuck relying on nVidia and AMD to release support patches.
With that, let's take drivers as an example to put this in perspective: updating your video card's drivers may result in measurable performance gains in certain games, sometimes to the point of resolving performance differentials that entirely prevented smooth framerates (and playability, as a result). The significance of this is fairly clear: just having a ton of raw power doesn't necessarily translate into utilized power, especially with so many diverse hardware arrays on the market (Steam's hardware survey illustrates the massive disparity -- just 0.8% of participants run a 7850, which we consider standard). That's where driver updates come in: for popular games -- Skyrim and Crysis 3 being recent examples -- driver devs can often squeeze out unexpectedly high-yield performance improvements on specific arrays; just recently, nVidia overhauled its driver support for Crysis 3 and improved SLI performance in the game by roughly 30%. Single-card performance was also improved, though to a lesser degree.
Cards eventually get tossed into the "legacy" category and updates cease to roll out -- though typically not for many, many years -- which isn't something that happens with static console hardware. Steam Box could potentially present a threat to the stratification in the current console market for this reason (modularity is king, yet it still retains the simplicity that most console advocates desire), but that's a completely different story for a different post.
The task of optimization and game-specific support is often relegated to the hardware development and driver teams (as opposed to in-house game dev teams). We simply can't expect game developers to test and develop for every card out there, and when it's so easy to look at a console and its uniformity, can you really blame them? A large portion of the oft-multi-year development is spent achieving feats on aging hardware that would never be thought possible otherwise, yet the team responsible for porting will test for the most recent two (maybe three) generations of PC hardware, so the age of equipment shows much more rapidly than its raw power would suggest. Just because a semiconductor has a lot of transistors doesn't guarantee high-yield performance, though obviously on some levels brute-force is relevant.
It all boils down to time and money, but there's good news for everyone—yes, even for PC gamers (to whom we obviously dedicate almost all of our content)—and it comes in the form of the x86 architecture and higher RAM capacity. Consoles are computers, after all, and the intertwined relationship between the PC and its console brethren means the two impact each other in very direct ways.
The Good News: x86 Chips Provide More Universally-Supported Games
Now that we're finally moving away from the PowerPC RISC architecture and into a more complex framework -- similar to what we already use on all gaming PCs -- developers will be working much closer to what PC gamers use natively. That means we'll benefit indirectly from optimization tricks played on the Jaguar platform (things are looking promising for AMD users), and console-bound game devs should be closer to "our reality" rather than occupying their own, entirely different realm of computing.
The higher RAM capacity also brings great promise to the merging worlds of gaming: 8GB of unified (UMA) GDDR5 memory (effective 5.5GHz clock) will accommodate higher-resolution textures (like those found in Skyrim's 4K texture mods, for instance, or natively in Crysis 3), larger cells / "zones" within games for a more seamless experience, and more objects, and will nicely complement the 1.84 TFLOPS of raw compute power. Most immediately noticeable will certainly be the high-resolution textures, which require the GPU to draw millions of pixels per texture (good thing we're running in parallel here), though high-poly models should, to a lesser degree, also become more prevalent. The relatively high power of the GPU component also means better post-processing capabilities and utilization of modern graphics technologies like SSAO / HDAO, tessellation, and volumetric particle effects -- the works.
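For a sense of scale on those high-resolution textures, a single 4K texture works out as follows (assuming uncompressed 4-byte RGBA for illustration; real assets are block-compressed and mipmapped, so actual footprints are smaller):

```python
# Pixel count and raw memory footprint of a single "4K" texture,
# the kind found in Skyrim's 4K texture mods.
side = 4096                        # 4096 x 4096 texels
pixels = side * side
bytes_uncompressed = pixels * 4    # 4 bytes/pixel for uncompressed RGBA8

print(pixels)                      # 16777216 -- ~16.8 million pixels
print(bytes_uncompressed // 2**20) # 64 -- MiB before compression/mipmaps
```

A few dozen such textures resident at once would already dwarf the PS3's 256MB of video memory, which is why the jump to 8GB of unified GDDR5 matters so much here.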
Positioned between the 7850 and 7870, 1.84TFLOPS isn't as bleeding-edge as we'd like to see as PC gamers (because this directly impacts the level of graphics we'll see in ported games), but it's expected given the development lifecycle of the system.
An eight-core AMD processor should prove promising for the advancement of multithreading optimization and, if the silicon stars align, maybe even AMD's desktop chips -- though we don't have much to base that optimism upon at the moment. Regardless, it is likely that multicore CPUs will become much more important as gaming evolves on the upcoming generation of consoles, especially with games like Star Citizen pushing the PC side of things.
It'll be a few years yet before developers have come to grips with the new hardware, but there's certainly a promising joint future ahead for PC gaming and console gaming. We look forward to running refreshed performance benchmarks on future cross-platform games to follow up on these expectations.
If you have questions about the architecture or have specific questions pertaining to the implications for PC gaming, please feel free to comment below and we'll help you out. We also welcome prospective system builders to our forums.
- Steve "Lelldorianx" Burke with analytical support from Tim "Space_man" Martin.