AMD looks set to break the price/performance barrier and return to its glory days with the new Polaris architecture, beginning with today’s launch of the much-awaited Radeon RX 480. Priced at $199 for the 4 GB variant and $239 for the 8 GB variant, the Radeon RX 480 promises to democratise Virtual Reality (VR) by making it far more affordable to set up a VR-ready PC capable of driving the likes of the Oculus Rift and HTC Vive on a single graphics card. In India, the Radeon RX 480 is priced at Rs. 28,990, which is more expensive than a GTX 970, the bare minimum for VR. That’s Indian pricing for you!
However, that’s not all that AMD has in store with the Polaris architecture. There’s support for a much wider colour gamut for future HDR monitors, and more features have been added to the GPUOpen initiative. Let’s take an in-depth look at the Polaris architecture.
Polaris Architecture and its Features
Polaris is AMD’s first architecture to move to 14nm FinFET technology from the 28nm process that they’ve used since the HD 7000 series. Polaris brings to the fore AMD’s 4th Gen Graphics Core Next (GCN) technology with the Polaris 10 and Polaris 11 GPUs. The RX 480 is their higher mid-range card, with two more to follow at lower price points. The Radeon RX 470 and the Radeon RX 460 will be based on the Polaris 10 and 11 GPUs, respectively.
Let’s take a closer look at the RX 480 to know more about Polaris 10.
The RX 480 has 4 Shader Engines, each containing 9 compute units (CUs), for a total of 36 CUs. Each Shader Engine has 1 geometry processor, and each CU houses 4 texture units and 64 Stream Processors, which brings the total to 2304 Stream Processors and 144 texture units. Add 32 ROPs and you’ve got a brief overview of the RX 480 GPU. But that’s not all: there are 4 Asynchronous Compute Engines and 2 Hardware Schedulers as well, the former being the bigger piece of the pie.
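The totals above follow directly from the per-unit counts. As a quick sanity check, here is a minimal sketch of the arithmetic; the function and constant names are ours, purely for illustration.

```python
# Per-unit figures as quoted above for the RX 480 (Polaris 10).
SHADER_ENGINES = 4
CUS_PER_ENGINE = 9
STREAM_PROCESSORS_PER_CU = 64
TEXTURE_UNITS_PER_CU = 4


def gpu_totals(engines, cus_per_engine, sps_per_cu, tus_per_cu):
    """Return (compute units, stream processors, texture units)."""
    cus = engines * cus_per_engine
    return cus, cus * sps_per_cu, cus * tus_per_cu


cus, sps, tus = gpu_totals(SHADER_ENGINES, CUS_PER_ENGINE,
                           STREAM_PROCESSORS_PER_CU, TEXTURE_UNITS_PER_CU)
print(cus, sps, tus)  # 36 2304 144
```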
On the memory side, the RX 480 comes in two variants, one with 4 GB of VRAM and another with 8 GB. Eight 32-bit memory controllers together form a 256-bit wide data bus that can support 8 GB of memory with ease. Now let’s take a closer look at an individual Compute Unit.
A Compute Unit is the fundamental unit in AMD’s GCN architecture and its core structure has remained more or less unchanged. There are 4 vector units, each of which is a 16-wide SIMD with a 64 KB register file, and 1 scalar unit with a 4 KB register file. Essentially, a CU can work on 4 vector instructions at once, one per SIMD. The Branch and Message Unit at the very beginning fetches and decodes instructions, which are then scheduled for processing. At the end you have the Texture Fetch and Load/Store Units, which bring in the texture data needed by whichever instruction is awaiting processing in the pipeline. One change we noted is that the scalar unit has a 4 KB register file in GCN 4, while GCN 1 had an 8 KB one. Perhaps there simply isn’t enough use for that much memory, as most operations executed by the GPU are vector operations and the scalar unit only handles the once-in-a-blue-moon independent calculations.
Improvements include better prefetch efficiency to reduce instruction pipeline stalls, a larger instruction buffer, grouping of cache requests and native FP16 / INT16 support.
Primitive Discard Accelerator
A new feature with Polaris is its Primitive Discard Accelerator. Consider a scene with way too much tessellation. At a pixel level, this translates to a lot of triangles which need to be plotted and then have textures applied. The Primitive Discard Accelerator looks at the triangles in the pipeline and removes any which are redundant. By redundant, AMD means triangles with zero area or those which don’t cover any sample points. So the GPU doesn’t spend precious cycles on triangles that are never going to be seen, which saves a lot of resources. The performance gain from this feature grows with higher levels of MSAA (Multi-Sampling Anti-Aliasing).
AMD claims the RX 480 delivers a 15% improvement in performance per CU over the R9 290.
Also, the GPU is better optimised for close-to-the-metal APIs like DirectX 12 and Vulkan, so the time spent rendering each frame is significantly lower, resulting in lower latencies and higher frame rates.
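To make the two discard conditions concrete, here is a rough software sketch of the test. It is only an illustration of the idea, not AMD’s hardware logic: the function names are hypothetical, coordinates are in pixel units, and we assume a single sample at each pixel centre unless other offsets are passed in.

```python
import math


def signed_area2(a, b, c):
    """Twice the signed area of triangle abc (the classic edge function)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def covers_any_sample(tri, offsets=((0.5, 0.5),)):
    """True if any sample point in the triangle's bounding box lies inside it."""
    a, b, c = tri
    area = signed_area2(a, b, c)
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    for py in range(math.floor(min(ys)), math.floor(max(ys)) + 1):
        for px in range(math.floor(min(xs)), math.floor(max(xs)) + 1):
            for ox, oy in offsets:
                s = (px + ox, py + oy)
                w0 = signed_area2(b, c, s)
                w1 = signed_area2(c, a, s)
                w2 = signed_area2(a, b, s)
                # Inside when all edge functions share the sign of the area.
                if area > 0 and w0 >= 0 and w1 >= 0 and w2 >= 0:
                    return True
                if area < 0 and w0 <= 0 and w1 <= 0 and w2 <= 0:
                    return True
    return False


def should_discard(tri, offsets=((0.5, 0.5),)):
    """Discard zero-area triangles and triangles that hit no sample point."""
    if signed_area2(*tri) == 0:
        return True
    return not covers_any_sample(tri, offsets)


print(should_discard(((0, 0), (1, 1), (2, 2))))              # True
print(should_discard(((0.6, 0.6), (0.9, 0.6), (0.6, 0.9))))  # True
print(should_discard(((0, 0), (4, 0), (0, 4))))              # False
```

The first triangle is degenerate (all three points on one line), the second is a sliver that sits between pixel centres, and the third is large enough to cover samples. Passing several offsets per pixel gives a crude model of MSAA; a real rasteriser also applies fill rules for samples that land exactly on an edge, which this sketch ignores.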
Memory and Delta Colour Compression Engines
Polaris supports lossless Delta Colour Compression (DCC) at ratios of 2:1, 4:1 and 8:1. This makes for significant bandwidth savings, allowing more efficient utilisation of the memory data bus.
With double the L2 cache, there is less need to go out to the VRAM and more requests can be served from the cache itself. Since cache memory is much closer to the CUs, this results in lower latencies, which in turn helps with power efficiency and faster DCC. AMD claims this results in an overall power saving of 40% on memory transactions.
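AMD has not detailed the actual DCC format here, so the following is only a toy sketch of the general idea behind delta compression: store one anchor value per block and then small differences from it. The block size, the 8-bit values and every function name are our own assumptions; only the 2:1, 4:1 and 8:1 ratios come from the text above.

```python
def delta_encode(block):
    """Store the first value as an anchor, then each value's difference from it."""
    anchor = block[0]
    return anchor, [v - anchor for v in block[1:]]


def delta_decode(anchor, deltas):
    """Rebuild the original block exactly (the scheme is lossless)."""
    return [anchor] + [anchor + d for d in deltas]


def bits_for_deltas(deltas):
    """Bits per delta needed to hold every delta as a signed integer."""
    if not any(deltas):
        return 0  # a flat block needs no delta bits at all
    span = max(max(deltas), -min(deltas) - 1)
    return span.bit_length() + 1


def best_ratio(block, bits_per_value=8, ratios=(8, 4, 2)):
    """Pick the largest ratio whose size budget the encoded block fits into."""
    raw_bits = len(block) * bits_per_value
    anchor, deltas = delta_encode(block)
    packed_bits = bits_per_value + len(deltas) * bits_for_deltas(deltas)
    for ratio in ratios:
        if packed_bits * ratio <= raw_bits:
            return ratio
    return 1  # not compressible under this toy scheme


flat_sky = [200] * 16              # identical pixels
dither = [100, 99] * 8             # values one step apart
gradient = [100, 101, 102, 103] * 4
noise = [0, 255] * 8               # large swings between neighbours

print(best_ratio(flat_sky), best_ratio(dither),
      best_ratio(gradient), best_ratio(noise))  # 8 4 2 1
```

Smooth regions such as sky or flat UI compress best because their deltas are tiny, while noisy regions fall back to uncompressed storage.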
We have not received a review unit as of now. Benchmarks will be added as soon as the review sample is tested.
AMD claims that Polaris provides significantly greater performance per watt and greater performance per CU.
As mentioned earlier, the AMD Radeon RX 480 is going to be priced at Rs. 28,990 in India for the 8 GB variant. The 4 GB variants are yet to arrive and we will bring you the official pricing as and when we get them.
Preliminary benchmarks that have surfaced online put the Polaris-based RX 480 slightly ahead of the GTX 970 in quite a few gaming benchmarks and slightly behind in some. The GTX 970 costs roughly Rs. 24,500, and with the Radeon RX 480 priced at Rs. 28,990, AMD’s #BetterRed Revolution might not make a mark in India at all unless the price is brought down to Rs. 19,000 - Rs. 20,000.
UPDATE: AMD announces price drop for the Radeon RX 480.