In the past couple of years we've seen the creation of a number of new low level graphics APIs. Arguably the first major initiative was AMD's Mantle API, which promised to improve performance on any GPUs that used their Graphics Core Next (GCN) architecture. Microsoft followed suit in March of 2014, with the announcement of DirectX 12 at the 2014 Game Developers Conference. While both of these APIs promise to give developers more direct access to graphics hardware in the PC space, there was still no low level graphics API for mobile devices, with the exception of future Windows tablets. That changed in the middle of 2014 at WWDC, where Apple surprised a number of people by revealing a new low level graphics and compute API that developers could use on iOS. That API is called Metal.

The need for a low level graphics API in the PC space has been fairly obvious for some time now. The level of abstraction in earlier versions of DirectX and OpenGL allows them to work with a wide variety of graphics hardware, but this comes with a significant amount of overhead. One of the biggest issues caused by this is reduced draw call throughput. A simple explanation of a draw call is that it is the command sent by the CPU which tells the GPU to render an object (or part of an object) in a frame. CPUs are already hard-pressed to keep up with high-end GPUs even with a low level API, and the increased overhead of a high level graphics API further reduces the number of draw calls that can be issued in a given period of time. This overhead mainly exists because most graphics APIs will do shader compilation and state validation (ensuring API use is valid) when a draw call is made, which takes up valuable CPU time that could be used to do other things like physics processing or drawing more objects.
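To make the effect of per-call overhead concrete, here is a minimal sketch of a toy CPU budget model. The frame rate, the time reserved for other CPU work, and both per-call costs are assumed values chosen purely for illustration; they are not measurements of any real API or device.

```python
# Toy model: how many draw calls fit in one frame's CPU budget.
# Every cost below is a made-up illustrative value, not a measurement.

def max_draw_calls(frame_budget_us: float, other_cpu_work_us: float,
                   cost_per_call_us: float) -> int:
    """Return how many draw calls the CPU can issue in one frame."""
    available_us = frame_budget_us - other_cpu_work_us
    if available_us <= 0:
        return 0
    return int(available_us // cost_per_call_us)

FRAME_BUDGET_US = 1_000_000 / 60   # about 16,667 us per frame at 60 fps
OTHER_WORK_US = 6_000              # assumed game logic, physics, audio

# Hypothetical per-call CPU costs for a high- and a low-overhead API.
high_overhead = max_draw_calls(FRAME_BUDGET_US, OTHER_WORK_US, 20.0)
low_overhead = max_draw_calls(FRAME_BUDGET_US, OTHER_WORK_US, 5.0)

print("high-overhead API:", high_overhead, "draw calls per frame")
print("low-overhead API: ", low_overhead, "draw calls per frame")
```

In this model the number of draw calls per frame scales inversely with the per-call cost, which is why moving work such as validation out of the draw call path matters so much.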

Because a draw call involves the CPU preparing materials to be rendered, developers can use tricks such as batching, which involves grouping together items of the same type to be rendered with a single draw call. Even this can present its own issues, such as objects not being culled when they are out of the frame. Another trick is instancing, which involves making a draw call for a single object that appears many times, and having the GPU duplicate it to various coordinates in the frame. Despite this, the overhead of the graphics API combined with the time that it takes the CPU itself to issue a draw call ultimately limits how many draw calls can be made. This reduces the number of unique objects developers can put on screen, as well as the amount of CPU time that is available to perform other tasks. Low level graphics APIs aim to address this by removing much of the overhead that exists in current graphics APIs.
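The batching idea can be sketched without any graphics API at all. The example below is a simplified illustration in Python: the `SceneObject` type, the material names, and the vertex counts are all hypothetical, and a "draw call" is just a tuple rather than a real GPU command.

```python
from collections import defaultdict
from typing import NamedTuple

class SceneObject(NamedTuple):
    name: str
    material: str      # objects sharing a material can share GPU state
    vertex_count: int

def naive_draw_calls(objects):
    """One draw call per object: (material, vertices) for each."""
    return [(obj.material, obj.vertex_count) for obj in objects]

def batched_draw_calls(objects):
    """Merge objects that share a material into a single draw call."""
    batches = defaultdict(int)
    for obj in objects:
        batches[obj.material] += obj.vertex_count
    return list(batches.items())

# Hypothetical scene contents, for illustration only.
scene = [
    SceneObject("rock_a", "stone", 300),
    SceneObject("rock_b", "stone", 450),
    SceneObject("tree_a", "bark", 1200),
    SceneObject("rock_c", "stone", 250),
    SceneObject("tree_b", "bark", 900),
]

print(len(naive_draw_calls(scene)), "draw calls without batching")
print(len(batched_draw_calls(scene)), "draw calls with batching")
```

The same number of vertices reaches the GPU either way; only the number of CPU-issued commands changes. The sketch also hints at the culling trade-off: once several objects are merged into one batch, the batch is submitted as a whole, so an individual off-screen object inside it can no longer be skipped on its own.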

The question to ask is why Apple and iOS developers need a low level graphics API for their mobile games. The answer ends up being the same as in the PC space. While the mobile space has seen tremendous improvements in both CPU and GPU processing power, the pace of CPU improvements is slowing when compared to GPU improvements. In addition, the increases in GPU processing power were always of a greater magnitude than the CPU increases. You can see this in the chart above, which shows the level of the CPU and GPU performance of the iPad relative to its original model. Having CPU performance improve by a factor of twelve in less than five years is extremely impressive, yet it pales in comparison to the GPU performance which, in the case of the iPad Air 2, is 180 times faster than its original version.

Because of this widening gap between CPU and GPU speeds, it appears that even mobile devices have begun to experience the issue of the GPU being able to draw things much faster than the CPU can issue commands to do so. Metal aims to address this issue by cutting through much of the abstraction that exists in OpenGL ES, and this is possible in part because of Apple's control over both the hardware and the software in their devices. Apple designs their own CPU architectures, and while they don't design the GPU architecture, it's clear they're free to do what they wish with the IP to create the GPUs they need.

The other side of the discussion is compatibility. Much of the abstraction in higher level graphics APIs is done to support a wide variety of hardware. Low level graphics APIs often are not as portable or widely compatible as high level ones, and this is also true of Metal. The iOS Metal API currently only works on devices that use GPUs based on Imagination Technologies' Rogue architecture, which limits it to devices that use Apple's A7, A8, and A8X SoCs.

This can pose a dilemma for developers, as programming only for Metal limits the number of users they can target with their application. The number of older iPads and iPhones still in use, as well as Apple's insistence on selling the original iPad Mini and iPod Touch which use their A5 SoC from 2011, can limit the market for games that use Metal. If I were to make a prediction, it would be that Metal's adoption among iOS developers will grow substantially in the next year or two when devices that use the A5 and A6 chips are retired from sale.

Kishonti Informatics, the developer of the GFXBench GPU benchmarking application, have released a new version of their benchmark. The new benchmark is called GFXBench Metal, and it's essentially the same benchmark as the normal GFXBench 3.0 / 3.1. The difference is that this version of the benchmark has been built to use Apple's Metal API rather than OpenGL ES. Although it's not one of the first Metal applications, it's one of the first benchmarks that can give some insight into what improvements developers and users can see when games and other 3D applications are built using Metal rather than OpenGL ES.

Before getting into the results, I did want to address one disparity that may be noticed about the non-Metal iPad Air 2 results. It appears that Apple has been making some driver optimizations for the A8X GPU with iOS releases that have come out since our original review. Because of this, the iPad Air 2's performance in the OpenGL version of GFXBench 3.0 is noticeably improved over our original results. To avoid incorrectly characterizing the improvements that Metal brings to the table, all of the iPad tests for the OpenGL and Metal versions of the benchmark were re-run on iOS 8.3. Those are the results that are used here. Testing with the iPhone 5s and 6 revealed that there are no notable improvements to the performance of Apple A7 and A8 devices.

GFXBench 3.0 Driver Overhead Test (Offscreen)

GFXBench 3.0's driver overhead test is one we don't normally publish, but in this circumstance it's one of the most important tests to examine. What this test does is render a large number of very simple objects. While that sounds like an easy task, the test renders each object one by one, and issues a separate draw call for each. This is essentially the most inefficient way possible to render the scene, as the GPU will be limited by the draw call throughput of the CPU and the graphics API managing them.

In this test, it's clear that Metal provides an enormous increase in performance. Even the lowest performance improvement for a device on Metal compared to OpenGL is still well over a 3x increase. While this test is obviously very artificial, it's an indication that Metal does indeed provide an enormous improvement in draw call throughput for developers to take advantage of.

GFXBench 3.0 Manhattan (Offscreen)

GFXBench 3.0 T-Rex HD (Offscreen)

While the driver overhead test is an interesting way of looking at how Metal allows for more draw call throughput, it's important to look at how it performs with actual graphics tests that simulate the type of visuals you would see in a 3D game. In both the Manhattan and T-Rex HD parts of GFXBench we do see an improvement when using Metal instead of OpenGL ES, but the gains are not enormous. The iPad Air 2 shows the greatest improvement, with an 11% increase in frame rate in T-Rex HD, and an 8.5% increase in Manhattan.

The relatively small improvements in these real world benchmarks illustrate an important point about Metal, which is that it is not a magic bullet to boost graphics performance. While there will definitely be small improvements due to general API efficiency and lower overhead, Metal's real purpose is to enable new levels of visual fidelity that were previously not possible on mobile devices. An example of this is the Epic Zen Garden application from Epic Games. The app renders at 1440x1080 with 4x MSAA on the iPad, and it displays 3500 animated butterflies on the screen at the same time. This scene has an average of 4000 draw calls per frame, which is well above what can currently be achieved with OpenGL ES on mobile hardware.
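To put the 4000 draw calls per frame figure in perspective, the following sketch works out how little CPU time each call can take. The frame rates of 30 and 60 fps are assumptions for illustration (the target frame rate of the demo is not stated here), and the model generously assumes the CPU does nothing except issue draw calls.

```python
DRAW_CALLS_PER_FRAME = 4000  # figure quoted above for Epic Zen Garden

def per_call_budget_us(fps: float, calls_per_frame: int) -> float:
    """CPU microseconds available per draw call if the CPU did nothing else."""
    frame_time_us = 1_000_000 / fps
    return frame_time_us / calls_per_frame

# Assumed frame rates; the real target is not given in the article.
for fps in (30, 60):
    calls_per_second = DRAW_CALLS_PER_FRAME * fps
    budget = per_call_budget_us(fps, DRAW_CALLS_PER_FRAME)
    print(f"{fps} fps: {calls_per_second} calls/s, {budget:.2f} us per call")
```

Under these assumptions each draw call has only a few microseconds of CPU time available, and in practice less, since game logic and other work share the same frame budget.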

I think that Metal and other low level graphics APIs have a bright future. The introduction of Metal on OS X can simplify the process of bringing games to both Apple's desktop and mobile platforms. In the mobile space, developers of the most complicated 3D applications and games will be eager to adopt Metal as they begin to hit the limits of what visuals can be accomplished under OpenGL ES. While there are titles like Modern Combat 5 which use both Metal and OpenGL ES depending on the device, that method of development prevents you from using any of Metal's advantages effectively, as they will not scale to the OpenGL ES version. I cannot stress enough how much the continued sale of Apple A5 and A6 devices impedes the transition to using Metal only, and I hope that by the time Apple updates their product lines again those devices will be gone from sale, and eventually gone from use. Until that time, we'll probably see OpenGL ES continue to be used in most mobile game titles, with Metal serving as a glimpse of the mobile games that are yet to come.

34 Comments

  • jameskatt - Friday, June 26, 2015 - link

    NEVER. Why follow when Apple can lead? Metal is superior since it is included with EVERY iPhone and Mac. And it is optimized for iOS and OS X. Vulkan will suffer the fate that Java apps suffer - pandering to the lowest common denominator. That would be unacceptable to Apple's customers.
  • xdrol - Tuesday, June 16, 2015 - link

    Don't assume Apple started working on Metal last year. It's a multi-year process, started when Khronos was not even thinking about Vulkan. Now Vulkan is there, but there is no implementation from any vendor. Metal is on the other hand a working solution as of now. Should they just ditch it to a not-yet-working API just to support open standards?
  • jwcalla - Monday, June 15, 2015 - link

    I'm surprised by the obsession with draw calls in the industry.
  • Agent_007 - Monday, June 15, 2015 - link

    It is obsession for some, but main reason for this draw call hype is current sad state of PC hardware evolution. NVIDIA and AMD GPU's are limited by 28nm process, and Intel doesn't even bother with their CPUs. So only way to increase GFX quality in games nowadays is to draw more objects with same hardware resources.

    For PC that works quite well, but for mobile it just causes more issues since many mobile devices are VERY fillrate limited, and additional overdraw caused by more objects drawn to the screen just kills the performance completely.
  • defaultluser - Monday, June 15, 2015 - link

    Not for Apple, which has maintained an unbroken chain of PowerVR devices.

    Thanks to tile based deferred rendering (TBDR), you get effectively zero overdraw for opaque objects.

    http://www.anandtech.com/show/4686/samsung-galaxy-...

    Overdraw used to be a major problem for Nvidia and AMD, but they introduced Early Z for Occlusion Culling, which gets some of the benefits of TBDR for less triangle sorting overhead.

    So yeah, even immediate mode architectures will benefit from this to a degree. Overdraw is not the massive problem it was back on the GeForce 2 folks.
  • WinterCharm - Monday, June 15, 2015 - link

    Discussions like the 3 comments above are why I read Anandtech! Thank you for educating me! :)
  • stephenbrooks - Monday, June 15, 2015 - link

    Draw calls: the reason it's become a big thing is that draw call efficiency was neglected for some time. Instead, raw pixel/texel fill and polygon rate were the benchmarks.

    However, if you want to draw many *independently moving* objects rather than one large detailed fixed one, draw calls are actually now the bottleneck. And the way to optimise draw calls actually required some API changes.

    In my tests, an OpenGL draw call is about a 600-polygon overhead. So a few big detailed models = fine, whereas many small things (butterflies in the example above) = slow.
  • kyuu - Wednesday, June 17, 2015 - link

    Nothing surprising about it if you have followed game development in the past decade or so. Consoles have had a huge advantage over PCs in draw calls, for example, due to their driver APIs being lower-level than DirectX and OpenGL (until DX12 and Vulkan come out, of course). Star Citizen's developers were struggling with it quite a bit and making customizations and supporting Mantle specifically due to draw call limitations. Needless to say, the upcoming releases of DX12 and Vulkan are going to make their lives much easier.
  • Mercadian - Monday, June 15, 2015 - link

    "Having CPU performance improve by a factor of twelve in less than five years is extremely impressive" - it is 16x rather than 12x.
  • Brandon Chester - Monday, June 15, 2015 - link

    No it isn't. The 12x figure is directly from Apple's keynote.