A dictionary of the most pertinent GPU / video card terminology. Learn how video cards work with this list of definitions.
Core Clock (GPU) -- We sometimes refer to this as the Base Clock or BCLK of the GPU, a habit carried over from working with the CPU's BCLK; the two terms are often used interchangeably, though "Core Clock" is more correct when referring specifically to a GPU. The Core Clock is the operating frequency of the graphics processing chip found on the video card. This is sometimes "SuperClocked" or pre-overclocked, depending on the manufacturer.
CUDA Cores -- Just a small part of the larger whole when it comes to an nVidia GPU. A "CUDA Core" is nVidia's equivalent to AMD's "Stream Processors." CUDA (Compute Unified Device Architecture) is nVidia's proprietary parallel computing platform and programming model, which can leverage the GPU in specific ways to perform tasks with greater performance. Each GPU can contain hundreds to thousands of CUDA cores. Architectures change between generations in a fashion that often makes cross-generation comparisons non-linear, but generally speaking (within a generation), more CUDA cores will equate to more raw compute power from the GPU. The Kepler to Maxwell architecture jump saw nearly a 40% efficiency gain in CUDA core processing ability, illustrating the difficulty of drawing linear comparisons without proper benchmarks.
Graphics Processing Unit -- The physical "chip" (a semiconductor) that performs all graphics computations in a system. A GPU is comprised of a silicon processor -- not unlike a CPU -- atop a substrate, which is then mounted to the host video card (or directly to the motherboard, in the case of laptops, tablets, and phones). Modern CPUs will also contain an "IGP," or Integrated Graphics Processor, which is effectively a compartmentalized GPU that shares chip space with the CPU.
Memory Bandwidth (GPU) -- Memory bandwidth is one of the most frequently showcased stats for any new GPU, often rated in hundreds of gigabytes per second of throughput potential. Memory Bandwidth is the theoretical maximum amount of data that the bus can handle at any given time, playing a determining role in how quickly a GPU can access and utilize its framebuffer. Memory bandwidth is best explained by the formula used to calculate it: the effective memory clock multiplied by the memory interface width (in bits), divided by 8 to yield bytes per second.
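That calculation can be sketched in a few lines of Python; the figures used below (a 256-bit bus with 7000 MHz effective GDDR5, roughly a reference GTX 980) are an assumed illustration, not part of the definition:

```python
def memory_bandwidth_gbs(effective_clock_mhz, bus_width_bits):
    """Theoretical peak memory throughput in GB/s.

    Bandwidth = effective memory clock x bytes transferred per cycle.
    """
    bytes_per_cycle = bus_width_bits / 8              # 8 bits per Byte
    return effective_clock_mhz * bytes_per_cycle / 1000  # MB/s -> GB/s

# Assumed example: 7000 MHz effective GDDR5 on a 256-bit interface
print(memory_bandwidth_gbs(7000, 256))  # -> 224.0 (GB/s)
```

Note that widening the bus (say, to 384-bit) raises bandwidth proportionally at the same memory clock, which is why the memory interface width appears so prominently in spec sheets.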
Memory Interface -- There are several memory interfaces throughout a computer system. As it pertains to the GPU, a Memory Interface is the physical bit-width of the memory bus. Every clock cycle (billions per second), data is transferred along a memory bus to and from the on-card memory. The width of this interface, normally defined as "384-bit" or similar, is the physical count of bits that can fit down the bus per clock cycle. A device with a 384-bit memory interface would be able to transfer 384 bits of data per clock cycle (there are 8 bits in a Byte). The memory interface is also a critical component of the memory bandwidth calculation in determining maximum memory throughput on a GPU.
Power Target % - The target wattage over maximum TDP (represented as a percentage) to feed the GPU. Power Target was introduced with Maxwell overclocking on NVIDIA devices as a means to increase power consumption over the lower base TDP, thus allowing further headroom for overvolting and overclocking. A higher power budget gives the GPU more room for overclocking. If users attempt to overclock at the stock 100% TDP on, for example, a reference GTX 980, they will eventually run into limitations with the 180W power draw; increasing the Power Target to 125% (180 * 1.25 = 225W) grants an additional 45W for higher clocks and voltage.
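The arithmetic above generalizes to a one-line calculation; the 180W figure is the example wattage from the text, used here only for illustration:

```python
def power_budget_w(base_tdp_w, power_target_pct):
    """Wattage ceiling after applying a Power Target percentage."""
    return base_tdp_w * power_target_pct / 100

base = 180                      # example wattage from the text
cap = power_budget_w(base, 125)
print(cap, cap - base)          # -> 225.0 45.0 (225W cap, 45W of headroom)
```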
PowerTune (AMD) - AMD's proprietary PowerTune technology has existed since December 2010 (debuting with the HD 6900 series), bringing dynamic frequency scaling to AMD graphics processors. "Dynamic frequency scaling" is the act of modifying the core clock speed (frequency) on-the-fly, and can be thought of as similar in concept to Intel's Turbo Boost or nVidia's GPU Boost. PowerTune scales the frequency based upon performance demands, thermals, and watt draw (using P-states).
Render Output Units (ROPs) - Also known as the "Raster Operations Pipeline," a name that better describes what a ROP actually does. As with a Texture Mapping Unit or a CPU's Floating Point Unit, a Render Output Unit is a specific component on a GPU that is responsible for processing final pixel values prior to drawing them to the screen. ROPs perform pixel read/write tasks that include processing pixel and texel (see: Texture Fill-Rate) data; Render Output Units interpret the final depth of pixels before rendering them to the screen. More ROPs increase the speed and bandwidth with which images are rasterized and/or rendered to the screen.
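As a rough sketch of why ROP count matters, the theoretical pixel fill-rate is commonly computed as ROP count multiplied by core clock. The figures below (64 ROPs at a 1126 MHz core clock, roughly a reference GTX 980) are an assumed illustration:

```python
def pixel_fill_rate_gps(rop_count, core_clock_mhz):
    """Theoretical pixel fill-rate in gigapixels per second."""
    return rop_count * core_clock_mhz / 1000  # MP/s -> GP/s

# Assumed example: 64 ROPs at 1126 MHz
print(pixel_fill_rate_gps(64, 1126))  # -> 72.064 (~72.1 GP/s)
```

This is a theoretical ceiling; real-world rasterization throughput also depends on memory bandwidth and the workload itself.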
Stream Processors -- Like nVidia's "CUDA Cores," AMD's Stream Processors are a core component of the GPU. Stream Processors and CUDA Cores are not linearly comparable due to vast architectural differences, but can be thought of as similar in primary function. Stream Processing is focused intensely on parallelism of datasets to ensure efficient processing when performing tasks that are better-suited for parallel processing. These tasks can include aspects of audio processing (stream processors are used in DSP-equipped devices, i.e. digital signal processors).
Texture Fill-Rate -- The texture fill-rate of the GPU is representative of how many pixels (specifically 'texels') the GPU can render per second. This value is always represented as a measurement over time (1s). A 144.1GT/s texture fill-rate comes out to 144.1 billion texels (textured picture elements) per second. At its inception, the fill-rate was a simpler spec to define: It represented the count of "complete" pixels (that is, pixels that have been completely filtered) that could be stored in the framebuffer (GPU memory). To this end, the texture fill-rate was strictly representative of the number of on-screen pixels that were filtered and written to the buffer.
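The modern figure is typically derived as TMU count multiplied by core clock. The 144.1GT/s figure mentioned above happens to match 128 TMUs at 1126 MHz (roughly a reference GTX 980), assumed here for illustration:

```python
def texture_fill_rate_gts(tmu_count, core_clock_mhz):
    """Theoretical texture fill-rate in gigatexels per second."""
    return tmu_count * core_clock_mhz / 1000  # MT/s -> GT/s

# Assumed example: 128 TMUs at 1126 MHz
print(texture_fill_rate_gts(128, 1126))  # -> 144.128 (~144.1 GT/s)
```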
Texture Mapping Unit (TMU) - A low-level GPU component that operates with some independence, entirely dedicated to manipulating bitmaps and texture filtration. TMUs modify bitmaps (resize, rotate, scale, skew, fit) for placement onto objects and filter textures; in video games, this would be represented as placing a texture onto an object, whereupon we have now created "texels" (textured pixels). Textured pixels require a unique approach to graphics processing, as explained in our "What is Texture Fill-Rate?" article.
TrueAudio (AMD) - Using a discrete DSP (Digital Signal Processor), AMD GPUs perform standalone audio processing for multi-channel simulation in gaming. TrueAudio has questionable use cases in the real world, but in theory, it is capable of producing true 3D audio with upwards of 30 channels simulated through audio playback devices. TrueAudio uses positional output through all of these channels to better simulate audio source locations in games.
Video Codec Engine (VCE) - AMD's on-die solution for live encoding of captured footage, including gameplay capture. The VCE is exploited by software solutions to live-capture gameplay in a similar fashion to nVidia's ShadowPlay, which uses a Kepler (now Maxwell) encoder. The VCE is included on most modern AMD GPUs and enables live H.264, MPEG-4 ASP, MPEG-2, VC-1, and Blu-Ray 3D encoding. The purpose of the VCE is to reduce load on the CPU during gameplay capture by instead offloading the encoding to a more powerful, specialized component (the VCE), reducing performance overhead from capture. AMD's VCE and nVidia's Kepler/Maxwell encoders both work phenomenally well at retaining native performance during capture, as seen below.