The Navi 21 GPU at the heart of the RX 6800-series is manufactured on TSMC’s now-mature 7nm process. AMD’s experience with this node in both CPU and GPU design has fed into the optimisations made to RDNA 2 for high-frequency operation at low power, as well as architectural tweaks that could give it an edge. And as it turns out, that’s not the only design aspect they’re using as inspiration for this generation of graphics.
Improvements made over the first generation of RDNA have led to a very significant frequency uplift of 1.3x over the RX 5700-series, and that frequency improvement can be leveraged more widely across the die silicon. The all-important performance per watt metric has also been improved by as much as 54%, allowing AMD to squeeze more out of the silicon at all power levels. This could spell interesting things for inveterate undervolters keen to turn their new behemoth into a tame lapdog that merely sips when drawing power.
RDNA 2 performance enhancements are more than just skin deep, however.
COMPUTE UNIT ENHANCED
A core building block of AMD’s RDNA GPU architecture, the Compute Unit incorporates key component units such as the stream processors (64 per CU), registers, scalar units and local caches. Improving the operation of this component will have performance ramifications throughout the rendering pipeline, and as noted RDNA 2 optimises it for much higher operating frequency.
The RDNA 2 CU is also more capable in its approach to mixed-precision workloads. FP32, FP64, Int32 and Int64 remain native to the chip, but other data formats are more widely accepted as mixed-precision structures. These structures have applications beyond standard rasterisation and compute, including tensor math, which is increasingly important for a next-gen approach to rendering.
Speaking of next-gen, RDNA 2 finally implements AMD’s Ray Accelerator for hybrid ray tracing, to the tune of one per CU. The Ray Accelerator has a very specific purpose: calculating the intersection of light rays with bounding boxes and triangles for the algorithms central to Microsoft’s DirectX Raytracing (DXR). The performance of this unit will largely determine ray tracing performance on the RX 6000-series going forward.
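To give a feel for the kind of calculation involved, here is a minimal software sketch of a ray-versus-box test using the common "slab" method. It is an illustration only: the function name and arguments are made up for this example, and nothing here describes how the Ray Accelerator is actually implemented in hardware.

```python
import math

def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: True if the ray origin + t * direction (t >= 0)
    passes through the axis-aligned box [box_min, box_max]."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray runs parallel to this pair of planes, so it can only
            # hit the box if it already starts between them.
            if o < lo or o > hi:
                return False
            continue
        # Distances along the ray to the two planes on this axis.
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Shrink the interval in which the ray is inside every slab so far.
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True
```

A ray fired from (0, 0, -5) along +Z hits a unit-sized box centred on the origin, while the same ray fired along -Z or shifted three units sideways misses it. A ray tracer repeats tests like this an enormous number of times per frame, which is why dedicated hardware for them matters.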
ADVANCED RENDER TECHNIQUE SUPPORT
Individual elements within the RDNA pipeline have been tweaked in this generation to support new advanced render techniques set to be leveraged by DirectX 12 and console analogues.
Perhaps the most important from an immediate standpoint is Variable Rate Shading. This technique splits a scene into tiled regions and adjusts the shading rate of each tile, devoting fewer resources to less complex areas of a scene and more effectively balancing available processing power. RDNA 2 supports 2x1, 1x2 and 2x2 VRS modes, which define the size and orientation of the pixel blocks that share a single shading result.
First introduced as part of NVIDIA’s RTX toolkit in advance of broader adoption in the industry, forms of VRS have now been incorporated into Microsoft’s DirectX 12 Ultimate specification. The impact on performance varies from game to game, with specific genres such as driving sims seemingly able to use it exceptionally effectively. Initial figures put the performance benefit comfortably in the double-digit percentages, but thus far VRS hasn’t been put into widespread use.
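As a rough illustration of why coarser shading rates save work, the sketch below counts pixel-shader invocations for a frame split into tiles, each shaded at its own rate. The function names, the 8-pixel tile size and the tiny 16x16 frame are assumptions made for the example; they are not details of RDNA 2 or of the DirectX 12 API.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)

def shader_invocations(width, height, tile_size, rate_for_tile):
    """Count pixel-shader invocations when each tile has its own shading rate.

    rate_for_tile(tile_x, tile_y) returns (rx, ry): one shading result is
    shared by a block of rx by ry pixels, so (1, 1) is full rate.
    """
    total = 0
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            # Tiles on the right and bottom edges may be partial.
            tile_w = min(tile_size, width - x)
            tile_h = min(tile_size, height - y)
            rx, ry = rate_for_tile(x // tile_size, y // tile_size)
            total += ceil_div(tile_w, rx) * ceil_div(tile_h, ry)
    return total

# Hypothetical 16x16 frame of four 8x8 tiles, one per rate discussed above.
example_rates = {(0, 0): (1, 1), (1, 0): (2, 1), (0, 1): (1, 2), (1, 1): (2, 2)}
mixed = shader_invocations(16, 16, 8, lambda tx, ty: example_rates[(tx, ty)])
full = shader_invocations(16, 16, 8, lambda tx, ty: (1, 1))
```

In this toy case the mixed-rate frame needs 144 invocations against 256 at full rate. Real savings depend entirely on how much of a scene a game can shade coarsely without visible loss of quality.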
Two other technologies discussed during DirectX 12 Ultimate’s launch were Sampler Feedback and Mesh Shaders, and both are supported in RDNA 2 hardware:
1. Sampler Feedback offers developers information on what parts of a texture would have needed to be sampled in order to process a sample request. The information can then be fed back into the shader’s data streaming processes to more accurately assess which data needs to be streamed to memory, optimising both the memory footprint and bandwidth use of processing and storing texture data.
2. Traditional rendering pipelines process a polygonal mesh as one coherent group, utilising novel techniques to optimise but essentially operating serially. By using a compute programming model, the Mesh Shader can process chunks of a given polygonal mesh of triangles, known as “meshlets”, in parallel. Important operations such as culling can then also be applied to meshlets as a whole to further optimise the pipeline.
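The Sampler Feedback idea in the list above can be sketched in a few lines: record the most detailed mip level that was actually requested for each region of a texture, then stream in only the regions where that request exceeds what is already in memory. This is a simplified CPU-side model; the function names, the region size and the (u, v, mip) sample format are assumptions for the example rather than the DirectX 12 API.

```python
import math

def record_feedback(samples, texture_size, region_size, mip_count):
    """Build a min-mip feedback map for a square texture.

    samples is an iterable of (u, v, mip) with u and v in [0, 1].
    Each map entry holds the most detailed (lowest) mip requested in that
    region; mip_count marks a region that was never sampled.
    """
    regions = math.ceil(texture_size / region_size)
    feedback = [[mip_count] * regions for _ in range(regions)]
    for u, v, mip in samples:
        rx = min(int(u * texture_size) // region_size, regions - 1)
        ry = min(int(v * texture_size) // region_size, regions - 1)
        feedback[ry][rx] = min(feedback[ry][rx], mip)
    return feedback

def regions_to_stream(feedback, resident_mip):
    """List (x, y, mip) for regions that need more detail than is resident."""
    return [(x, y, mip)
            for y, row in enumerate(feedback)
            for x, mip in enumerate(row)
            if mip < resident_mip]

# Hypothetical 1024x1024 texture tracked in 256-pixel regions.
samples = [(0.1, 0.1, 2), (0.1, 0.1, 0), (0.9, 0.9, 3)]
feedback = record_feedback(samples, 1024, 256, mip_count=5)
needed = regions_to_stream(feedback, resident_mip=3)
```

With mip 3 already resident, only the top-left region needs its full-detail data streamed in; the other fifteen regions can be left alone, which is where the memory and bandwidth savings come from.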
In the mould of DirectX 12 as a whole, both Sampler Feedback and Mesh Shaders expose more aspects of the rendering pipeline to developer eyes rather than abstracting through drivers. They also leverage Compute techniques that have become important design aspects of the GPU since entering the server space and have been crying out for a compelling application in real-time 3D rendering (i.e. gaming).
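For readers curious how the meshlet approach looks in practice, here is a minimal sketch that chops a triangle list into fixed-size meshlets and then discards whole meshlets that sit entirely behind a plane. The 64-triangle limit, the function names and the brute-force visibility check are all assumptions for illustration; a real engine would typically precompute a bounding volume per meshlet and run the test on the GPU.

```python
def build_meshlets(triangles, max_triangles=64):
    """Split a triangle list into consecutive chunks of at most max_triangles."""
    return [triangles[i:i + max_triangles]
            for i in range(0, len(triangles), max_triangles)]

def meshlet_visible(meshlet, vertices, plane_normal, plane_offset):
    """True if any vertex of the meshlet lies on or in front of the plane."""
    nx, ny, nz = plane_normal
    for triangle in meshlet:
        for index in triangle:
            x, y, z = vertices[index]
            if nx * x + ny * y + nz * z + plane_offset >= 0.0:
                return True
    return False

def cull_meshlets(meshlets, vertices, plane_normal, plane_offset):
    """Keep only meshlets that are at least partly in front of the plane.

    Each meshlet is tested independently of the others, which is what
    makes this kind of work easy to spread across many compute threads.
    """
    return [m for m in meshlets
            if meshlet_visible(m, vertices, plane_normal, plane_offset)]

# Hypothetical mesh: two triangles in front of the z = 0 plane, two behind.
vertices = [(0, 0, 1), (1, 0, 1), (0, 1, 1), (1, 1, 2),
            (0, 0, -1), (1, 0, -1), (0, 1, -1), (1, 1, -2)]
triangles = [(0, 1, 2), (1, 2, 3), (4, 5, 6), (5, 6, 7)]
meshlets = build_meshlets(triangles, max_triangles=2)
visible = cull_meshlets(meshlets, vertices, (0, 0, 1), 0.0)
```

Here the mesh becomes two meshlets and the one behind the plane is thrown away in a single test, so none of its triangles need to be processed further.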
The MSDN DirectX 12 Ultimate announcement goes into greater detail on these technologies and more, with additional reading for those interested.