Let's Settle This: Kingston V300 Asynchronous vs. Synchronous NAND Benchmark

Written by Steve "Lelldorianx" Burke | Monday, 17 March 2014 11:07

Test Methodology and Platform

We have a brand new test bench that we assembled for the 2013-2014 period! Having moved away from our trusty i7-930 and GTX 580, the new bench includes the following components:

GN Test Bench 2013 | Name                                               | Courtesy Of    | Cost
Video Card         | XFX Ghost 7850                                     | GamersNexus    | ~$160
CPU                | Intel i5-3570k CPU                                 | GamersNexus    | ~$220
Memory             | 16GB Kingston HyperX Genesis 10th Anniv. @ 2400MHz | Kingston Tech. | ~$117
Motherboard        | MSI Z77A-GD65 OC Board                             | GamersNexus    | ~$160
Power Supply       | NZXT HALE90 V2                                     | NZXT           | Pending
SSD                | Kingston 240GB HyperX 3K SSD                       | Kingston Tech. | ~$205
Optical Drive      | ASUS Optical Drive                                 | GamersNexus    | ~$20
Case               | NZXT Phantom 820                                   | NZXT           | ~$220
CPU Cooler         | Thermaltake Frio Advanced Cooler                   | Thermaltake    | ~$60

We used a full suite of timed, real-world, trace-based tests designed in-house; we also used a suite of standard synthetic benchmarking applications, all designed to hit different parts of the device for a better understanding of its ideal use-case scenarios. All power saving, ACPI, and Intel SpeedStep / clock-altering BIOS and Windows settings were disabled for this test. Windows 7 Professional 64-bit was used as the host OS. All devices were secure erased and preconditioned prior to testing. The CPU was clocked at 4.4GHz with a 1.265 vCore.

I'm not going to go as heavy on the methodology as I did with our dual- vs. single-channel RAM platform test, but that's because we've got several very detailed, dense articles coming about SSD testing. It's also because I've admittedly got a flight to catch for GDC & GTC in the next few hours. Still, I do want to at least explain what each test means to you as a buyer or onlooker.

Note: All tests were run a minimum of 3 times for averaging and normalization of data; outliers, if present, were eliminated and retested.

Preconditioning is a Must 

We've discussed preconditioning and overprovisioning previously with Kent Smith of LSI and it's back for this post. It is of paramount importance that a tester precondition the SSD for the incoming onslaught of tests. Benchmarking an SSD out-of-box or after a secure erase will produce numbers that are wildly optimistic and entirely unsustainable as the drive gets "worn-in." The SandForce controller in these devices will react to the type of data being written to them, so we preconditioned for random and sequential testing separately.

In order to ensure the entire device is hit during our preconditioning round, we use a very simple formula to calculate the duration of the initial IOMETER 08 pass: drive capacity (in GB) * 0.5 = minutes. For these tests, we used 120GB SSDs with identical variables aside from the NAND (in theory, unless something else was switched that I don't know about). That means preconditioning was run for 120 * 0.5 = 60 minutes. The 100% random write preconditioning is run with 4K files at a queue depth (QD) of 32 with 4K alignment.
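To make the rule concrete, here's a minimal sketch of that duration calculation in Python (the function name is mine, not part of our tooling):

```python
# Preconditioning duration rule described above: capacity (GB) * 0.5 = minutes.
def precondition_minutes(capacity_gb: float) -> float:
    return capacity_gb * 0.5

print(precondition_minutes(120))  # 60.0 minutes for the 120GB V300s tested here
```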

I want to illustrate a point here, so I'm going to show the out-of-box results for the V300 against the preconditioned results. The second screenshot is from an in-progress random preconditioning pass, the first is from an initial 5-minute test before any preconditioning took place.

[Screenshots: gn-pre-precon & gn-precon -- IOMETER results before preconditioning vs. during the random preconditioning pass]

Pretty substantial differences. If I took the first pass for each drive, we'd have inconsistent results that are not reproducible after burn-in. In a real-world use case, you'll see your performance fall off a cliff after the initial hour-or-so of use, and this is why that happens; it applies to all SSDs in some capacity. With Kingston and Samsung -- and possibly others, though I haven't talked to them -- all advertised IOPS and MB/s ratings are reported after preconditioning, so your mind can be put somewhat at ease.

After preconditioning, all IOMETER tests were run for 5 minutes with the settings shown in the respective charts.

What is Incompressible Data? 

You'll see the term 'incompressible data' thrown around a lot as it pertains to testing using Crystal Disk Mark or AS-SSD. Let's define it.

Compression algorithms are used in almost every application a user would be faced with and are used constantly within Windows. These compression algorithms work by looking for redundancies and patterns within data. Less random (more predictable) data is more compressible, because the algorithms can identify patterns and losslessly compress the stream. Some controllers amplify the effectiveness of this, like the Gen 2 and Gen 3 SandForce controllers, which reduce the Write Amplification Factor by compressing data before writing it to the disk, sometimes pushing the factor below 1x. Every time you write to NAND, you're killing the device in a very small way by using program cycles. Using such heavy compression on the controller side means that each write leaves a smaller footprint, extending the endurance of the device because less data is written to the disk.

More random data is difficult or impossible to compress; the nature of randomness is that it has no pattern that can be modeled, and thus it cannot be mapped to a probability distribution. An example of (hopefully) purely random-looking data would be an encrypted file. Encryption by design produces no identifiable pattern, and so encrypted data generally cannot be compressed by compression algorithms or SSD controller technology.
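To make the distinction concrete, here's a minimal Python sketch using the standard zlib module; it compresses 1MB of highly patterned data and 1MB of random data and prints the resulting sizes:

```python
import os
import zlib

SIZE = 1024 * 1024  # 1MB test buffers

patterned = b"ABCD" * (SIZE // 4)  # highly redundant, like logs or plain text
random_data = os.urandom(SIZE)     # patternless, like encrypted data

for label, data in (("patterned", patterned), ("random", random_data)):
    compressed = zlib.compress(data)
    print(f"{label}: {len(data)} -> {len(compressed)} bytes "
          f"({len(compressed) / len(data):.2%} of original)")

# The patterned buffer compresses to a tiny fraction of its original size;
# the random buffer stays at roughly 100% (or slightly larger, due to
# compression framing overhead) -- the same behavior a SandForce controller
# sees with compressible vs. incompressible writes.
```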

This isn't where incompressible data stops, though. Data that has already been heavily-compressed is seen by the SSD as "incompressible," because its redundancies have already been eliminated in the earlier compression pass to make the file smaller. A real-world example could be textures in a video game, where the game's creators have already compressed the data to allow for the smallest possible distribution size; as far as your SSD is concerned, this is incompressible data that cannot be made any (or much) smaller than the source file. Similarly, media formats like MP3s and MP4s or AVIs are almost always incompressible given the nature of how they are written when originally created.

If you're copying, writing, and reading certain types of incompressible media files on a massive scale (MP3, MP4, AVI, especially), you should care more about incompressible metrics. If you're a media production professional, you should be hugely concerned about sequential and incompressible benchmarks. For most users, we care about random operations with roughly 46% compression (to account for media files and similar data), since most operations will be compressible.

And all of this is why we never rely upon a single synthetic benchmark for analysis. Even if you're constantly moving music and streaming videos or playing games, it is probable that the majority of your data use will be of a compressible nature. Just look at Windows alone: the constant background logging and processes generally land in the 1K-10K filesize range and are highly-compressible; almost all of your browser's cached data will be compressible -- images and locally-retained code (CSS, js) can be compressed significantly. Documents, most types of photos/images, and things like save game files are also compressible.

If you're experiencing semantic satiation of the word "compressible" or "compression," you're not the only one. Moving on.

Test Suite Composition: 

As this isn't meant to be a full review and is more of a performance experiment, I used the following tools:

Synthetic:

  • Iometer 08.
  • ATTO (results unpublished as they mirrored what we saw in the other tests).
  • Anvil's Storage Utilities.

Real-World:

  • Adobe Premiere encoding pass.
  • Handbrake transcoding pass.
  • WinRAR (compression/archival).
  • WinRAR (extraction).
  • TranStat (in-house; file transfer utility).
  • Windows 7 boot time.

Each type of test will hit the drive in a different way.

Iometer 08: 

Iometer was utilized for preconditioning, 4K random testing, and a 1K random test. The tests conducted were of varying queue depths (QD), set for QD1, QD4, and QD32. In trace testing, I've found that gaming and everyday use (read: browsing, music playback, games, Steam) tends to rest in the QD1 and QD2 range; pure gaming is almost always a queue depth of 1 with a read size of 1KB. This means that the games I've traced will call for 1KB files most frequently and in a "shallow" fashion. I want to make clear that this isn't a definitive rule for all games and that we use it strictly as a guideline; a 1KB QD1 test is not an end-all test for "gaming performance."

Very few users will ever queue up more than 2-4 files to the device, and those who push more than that likely already know who they are.
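For illustration, here's a rough Python sketch of what a 1KB QD1 random-read loop looks like. The file path is hypothetical, and unlike Iometer this doesn't bypass the OS page cache, so treat it as a demonstration of the access pattern rather than a real benchmark:

```python
import os
import random
import time

TEST_FILE = "testfile.bin"  # hypothetical: a large pre-written file on the drive under test
BLOCK = 1024                # 1KB reads, mirroring the "gaming" trace profile
DURATION = 10               # seconds to run

blocks = os.path.getsize(TEST_FILE) // BLOCK
ops = 0
deadline = time.time() + DURATION
with open(TEST_FILE, "rb", buffering=0) as f:
    while time.time() < deadline:
        f.seek(random.randrange(blocks) * BLOCK)  # random 1KB-aligned offset
        f.read(BLOCK)  # QD1: each read completes before the next is issued
        ops += 1
print(f"~{ops / DURATION:.0f} IOPS at QD1 (OS page cache not bypassed)")
```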

Anvil's Storage Utilities: 

I first discovered Anvil on the XtremeSystems forums, where he's been building and maintaining an up-and-coming (free) SSD benchmark utility. Anvil's Storage Utilities can perform endurance testing, threaded I/O testing, and variable-compression testing, can interpret Iometer results, and offers great customization for the tester. We primarily used the core SSD benchmark tool with an 8GB test size and 46% compression, which is regarded as a "real-world scenario."

Adobe Premiere Encoding Pass: 

As I mentioned in the already-linked dual-channel vs. single-channel memory platform test, I maintain the site's YouTube channel and perform almost all of the video editing. Video editing and encoding are intensive on all system resources -- the CPU, RAM, the GPU in modern tools, and the disk are all hit heavily during rendering and file writing. For this test, I rendered out a 60s clip at 28Mbps using the H.264 specification. The test was primarily to look at write performance at the end of the encoding pass.

Handbrake: 

Handbrake is heavier on RAM and the CPU than on the disk, as it is a transcoding tool, but for the sake of completeness I threw it into this benchmark. Settings used were 28Mbps output at 60FPS constant with a 5.1 H.264 profile.

WinRAR: 

This is where it starts to become relatable for a lot of users -- especially anyone doing home media server work handling regular log compression (note that the V300 is obviously not an enterprise SSD by any stretch of the imagination). We used WinRAR to compress a 10.3GB collection of music and movies into a single archive and timed its performance on each device. We later extracted the archive and timed the extraction ('uncompression') process.
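For the curious, here's a minimal sketch of how such a pass could be timed, assuming the rar command-line tool is on the PATH (paths are stand-ins; this is not our exact procedure):

```python
import subprocess
import time

SOURCE = "media_folder"       # stand-in for the 10.3GB music/movie set
ARCHIVE = "test_archive.rar"

start = time.perf_counter()
subprocess.run(["rar", "a", ARCHIVE, SOURCE], check=True)  # 'a' = add to archive
print(f"Compression took {time.perf_counter() - start:.1f}s")
```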

TranStat (GamersNexus In-House Tool): 

Several years ago we had this simple utility developed in-house for the site. All it does is execute file transfers using standard Windows functions -- no functionality is changed within Windows, it just logs the elapsed time accurately and tells us individual file transfer times. I used this to transfer 10.3GB of mostly incompressible music and movie files, which gives us some insight as to how long a file copy would take with each SSD in a real-world scenario involving media.
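TranStat itself isn't public, but a hypothetical sketch of what such a utility does might look like this -- standard copy calls wrapped in per-file timers (names and paths here are mine, not the tool's):

```python
import os
import shutil
import time

def timed_copy(src_dir: str, dst_dir: str) -> None:
    """Copy every file in src_dir to dst_dir, logging per-file elapsed time."""
    os.makedirs(dst_dir, exist_ok=True)
    total_start = time.perf_counter()
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue
        start = time.perf_counter()
        shutil.copy2(src, os.path.join(dst_dir, name))  # standard copy; preserves metadata
        print(f"{name}: {time.perf_counter() - start:.2f}s")
    print(f"Total: {time.perf_counter() - total_start:.2f}s")

timed_copy("media_source", "media_dest")  # e.g., a 10.3GB media set
```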

Boot Times:

Simple: Just a warm boot with a stop-watch.

Once and For All: Kingston's V300S vs. V300A Benchmark Performance

That flight I mentioned is getting closer with each paragraph. Time to let the graphs do the talking:

In-House Trace-Based Real-World Tests (that's a lot of hyphens) 

[Chart: v300synch-bench1 -- in-house trace-based real-world results (boot, WinRAR, Handbrake, Premiere, media copy); seconds, lower is better]

In this chart, we see the results of our Windows boot test, WinRAR compression & extraction tests, Handbrake transcoding & Premiere encoding passes, and the media copy pass. Going down the line:

Windows boot times are largely unaffected on our Win7 64-bit Pro install, which is somewhat expected, since the biggest claim against asynchronous NAND thus far has been its incompressible performance (Windows' own data is highly compressible). No noteworthy difference here; the result is within the margin of error and can, for all intents and purposes, be considered "identical."

WinRAR extraction and compression showed pretty noticeable differences. There's almost a 30% difference in extraction time (20s vs. 27s) and a ~15% difference in compression time. That's only a few seconds in these tests, but when dealing with larger files on a more regular basis, it could be important for some users to consider. For most, I'd imagine, this isn't all that common an occurrence unless dealing with large media downloads.

Handbrake transcoding was identical; this is because it is bound by other resources more than the disk.

The media copy saw a pretty large performance differential favoring the V300S -- the drive that is no longer available. The presently-shipping V300A, as I've called it, operates 51.7% slower than the device with synchronous NAND. Although this might not seem like a big deal when you're looking at ~100s vs. ~170s for a 10.3GB transfer, it'll matter a whole lot more when you're moving 100GB of data on anything remotely resembling a regular basis.

Premiere's encoding difference was much smaller, landing at around 6.3% in favor of the V300S. Keep in mind that in all the above bars, lower is better because we're dealing with seconds.

Anvil's Storage Utilities: Sequential & Random Tests - Asynchronous vs. Synchronous 

[Chart: v300synch-bench2 -- Anvil's Storage Utilities results]

[Charts: v300synch-bench3 & v300synch-bench4 -- Anvil's Storage Utilities results]

Higher is now better.

It seems that the device with asynchronous NAND suffers severely in large sequential operations and in operations at high queue depths. Most end-users should focus heavily on the 4K random metrics in the QD1-4 range. Sequential transfers include large file reads and writes, like writing out a movie file (rendering) or copying one elsewhere. These were somewhat touched on in the Premiere and copy tests.

The V300A (current version of the V300) performs just fine for low queue-depth random operations like the 4K QD1 and 32K QD1 passes. Even at QD4 for 4K files, we see reasonable performance, though the gap is more noticeable here.

Iometer reinforces these findings.

Iometer 08 - Random Read & Sequential Tests 

[Chart: v300synch-bench5 -- Iometer results]

[Chart: v300synch-bench6 -- Iometer results]

We see similar results in Iometer as we did in Anvil's utility, though the numbers differ largely due to the compression settings (Anvil was 46% compressed) and test method. Again, we see the V300A perform acceptably in random read operations and in our 1K QD1 "gaming" simulation, but it starts suffering in high-QD write operations.

Conclusion: The Impact of a NAND Switch & Lessons Learned

In talking with Kingston, we were told the company felt it did "nothing wrong" by switching NAND suppliers -- and I'd agree with that at a top level. It happens regularly in the industry; it's just supply and demand: one supply dries up, is depleted, or gets too expensive, and you've got to keep your existing product lines alive. Kingston also stated, however, that it has learned to be more vocal and public about such changes in the future. It was the relative silence of the change that made purchasers feel cheated or otherwise left with an inferior product, and that feeling is understandable. Purchasers were likely referencing professional benchmarks conducted on the original V300 SSD -- a device that performs significantly better in some applications than the current model of the same name and appearance -- and were expecting very similar results.

With all of this in the open, the question remains whether the V300 in its current state is a worthwhile purchase. Here's the thing: the 120GB V300 is available for $75 via Newegg and $70 via Amazon, making it one of the lowest cost-per-GB SSDs on the market (matched against the M500 right now). Its performance was acceptable -- comparatively good, even -- for every-day use-case scenarios. If you're just browsing, doing office work, and playing a few games, the performance impact of the NAND switch will likely go largely unnoticed. Anyone on a budget who can't tolerate another $10 should still consider the V300, and laptop users not doing heavy-duty work might look into it for a basic speed boost.

That said, Kingston's own HyperX 3K SSD of identical capacity runs $85 on Newegg and Amazon -- $10 more. Without even looking at other brands (Samsung's EVO is within spitting distance; Crucial and SanDisk have nearby offerings), Kingston's very own HyperX largely invalidates the existence of its V300 for our audience. A $10 increase in favor of a significantly faster device is worth it for the power users who are likely reading this article; the minute you start dealing heavily in incompressible data, rendering videos (that ~6% difference looks good when you're encoding 20 hours a day), or constantly moving files around, it's worth the $10.

For an "every-day user," sure, I guess you can save ten bucks with the V300, and that seems to be the audience that Kingston is targeting with the device; I can't really fault them for that. It's targeted as a device that's still several times faster than a traditional 7200RPM HDD but is more affordable than higher-end SSDs, so they succeed in landing within that market segment. I can't speak to the endurance of the drive since we haven't had it long enough to run those tests, but it should theoretically be identical to the first iteration of the V300.

So then, ignoring that aspect and looking strictly at performance, it's still a good product for some users -- it's just gotten a lot of flak for the silent NAND switch, some of which was perhaps undeserved. But when you see a 50% swing in performance for certain data types -- media especially -- the fierce backlash is understandable. The end-all verdict: power users, get something better; $10 more buys a big difference in performance. Mainstream / every-day users, if cost is an obstacle, don't get too hung up on this.

Please let us know in the comments below if you like this level of depth and research or if you'd prefer shorter analysis in the future. If you have other thoughts, would like me to add to this test suite, or have suggestions for the next research/experiment article, please post them!

- Steve "Lelldorianx" Burke.


Last modified on Monday, 17 March 2014 11:32
