Optimizations

Just like on PCs, mobile platforms like iOS and Android have devices of various levels of performance. You can easily find a phone that’s 10x more powerful for rendering than some other phone. A fairly easy way of scaling across them:

  1. Make sure it runs okay on baseline configuration
  2. Use more eye-candy on higher performing configurations:
    • Resolution
    • Post-processing
    • MSAA
    • Anisotropy
    • Shaders
    • Fx/particles density, on/off
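
The scaling list above could be driven by a small tier table. The sketch below is only an illustration under stated assumptions: `QualityTier`, `TierSettings`, `QualityScaler` and the frame-time thresholds are hypothetical names and values, not Unity API. In a Unity project the chosen values would typically be applied through `Screen.SetResolution` and the `QualitySettings` properties.

```csharp
// Hypothetical tiers; the names and thresholds are illustrative assumptions.
public enum QualityTier { Baseline, Medium, High }

public struct TierSettings
{
    public float ResolutionScale;  // fraction of native resolution per axis
    public int MsaaSamples;        // 0 means MSAA is off
    public bool Anisotropic;
    public bool PostProcessing;
    public float ParticleDensity;  // 0..1 multiplier on particle emission
}

public static class QualityScaler
{
    // Pick a tier from the average frame time (in ms) measured while the
    // game runs with the baseline configuration on the device.
    public static QualityTier PickTier(float baselineFrameMs)
    {
        if (baselineFrameMs <= 8f) return QualityTier.High;     // lots of headroom
        if (baselineFrameMs <= 16f) return QualityTier.Medium;  // some headroom
        return QualityTier.Baseline;                            // no headroom
    }

    // Map a tier to the eye-candy switches from the list above.
    public static TierSettings SettingsFor(QualityTier tier)
    {
        switch (tier)
        {
            case QualityTier.High:
                return new TierSettings { ResolutionScale = 1f, MsaaSamples = 4,
                    Anisotropic = true, PostProcessing = true, ParticleDensity = 1f };
            case QualityTier.Medium:
                return new TierSettings { ResolutionScale = 1f, MsaaSamples = 2,
                    Anisotropic = true, PostProcessing = false, ParticleDensity = 0.5f };
            default:
                return new TierSettings { ResolutionScale = 0.75f, MsaaSamples = 0,
                    Anisotropic = false, PostProcessing = false, ParticleDensity = 0.25f };
        }
    }
}
```

For example, `QualityScaler.SettingsFor(QualityScaler.PickTier(12f))` selects the medium tier. The thresholds would need tuning per game on real devices.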

Focus on GPUs

Graphics performance is bound by fillrate, pixel complexity and geometric complexity (vertex count). All three of these can be reduced if you can find a way to cull more renderers. Occlusion culling could help here. Unity will automatically cull objects outside the viewing frustum.

On mobiles you’re essentially fillrate bound (fillrate = screen pixels * shader complexity * overdraw), and over-complex shaders are the most common cause of problems. So use the mobile shaders that come with Unity, or design your own but make them as simple as possible. If possible, simplify your pixel shaders by moving code to the vertex shader.
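
To make the fillrate formula concrete, here is a minimal sketch that evaluates it as a relative cost. `FillrateEstimate` is a hypothetical helper, and `shaderCost` is an arbitrary per-pixel cost unit assumed for illustration. The result is only meaningful when comparing one configuration against another.

```csharp
public static class FillrateEstimate
{
    // Relative fill cost per frame, following the formula in the text:
    // screen pixels * shader complexity * overdraw.
    // shaderCost is an arbitrary per-pixel cost unit (an assumption);
    // overdraw is the average number of times each pixel is shaded.
    public static double Cost(int width, int height, double shaderCost, double overdraw)
    {
        return (double)width * height * shaderCost * overdraw;
    }
}
```

With the same shader and overdraw, `Cost(2048, 1536, 1.0, 2.0)` is four times `Cost(1024, 768, 1.0, 2.0)`, which is why doubling the resolution on each axis is so expensive. Halving either the overdraw or the per-pixel shader cost halves the estimate.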

If reducing the Texture Quality in Quality Settings makes the game run faster, you are probably limited by memory bandwidth. So compress textures, use mipmaps, reduce texture size, etc.
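
As a rough illustration of why compression and smaller textures help, the sketch below estimates texture storage. `TextureMemory` is a hypothetical helper. The bits-per-pixel values in the usage note are the nominal rates of uncompressed RGBA32 (32) and of 4 bits-per-pixel compressed formats such as PVRTC 4bpp or ETC1 (4). A full mipmap chain adds roughly one third to storage, but lets distant surfaces sample the smaller levels.

```csharp
public static class TextureMemory
{
    // Approximate bytes for a texture at the given bits per pixel.
    // A full mipmap chain adds roughly one third on top of the base level.
    public static long Bytes(int width, int height, int bitsPerPixel, bool mipmaps)
    {
        long baseBytes = (long)width * height * bitsPerPixel / 8;
        return mipmaps ? baseBytes * 4 / 3 : baseBytes;
    }
}
```

For a 1024x1024 texture without mipmaps, `Bytes(1024, 1024, 32, false)` gives 4194304 bytes, while `Bytes(1024, 1024, 4, false)` gives 524288 bytes, an eightfold reduction.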

LOD (Level of Detail) – make objects simpler or eliminate them completely as they move further away. The main goal would be to reduce the number of draw calls.

Good practice

Mobile GPUs have huge constraints on how much heat they produce, how much power they use, and how large or noisy they can be. So compared to the desktop parts, mobile GPUs have far less bandwidth, ALU performance and texturing power. The architectures of the GPUs are also tuned to use as little bandwidth and power as possible.

Unity is optimized for OpenGL ES 2.0, which uses the GLSL ES shading language (similar to HLSL). Built-in shaders are most often written in HLSL (also known as Cg), which is cross-compiled into GLSL ES for mobile platforms. You can also write GLSL directly if you want to, but doing that limits you to OpenGL-like platforms (e.g. mobile + Mac) since there currently are no GLSL->HLSL translation tools. When you use float/half/fixed types in HLSL, they end up as highp/mediump/lowp precision qualifiers in GLSL ES.

Here is the checklist for good practice:

  1. Keep the number of materials as low as possible. This makes it easier for Unity to batch stuff.
  2. Use texture atlases (large images containing a collection of sub-images) instead of a number of individual textures. These are faster to load, have fewer state switches, and are batching friendly.
  3. Use Renderer.sharedMaterial instead of Renderer.material if using texture atlases and shared materials.
  4. Forward rendered pixel lights are expensive.
    • Use light mapping instead of realtime lights wherever possible.
    • Adjust pixel light count in quality settings. Essentially only the directional light should be per pixel, everything else - per vertex. Certainly this depends on the game.
  5. Experiment with Render Mode of Lights in the Quality Settings to get the correct priority.
  6. Avoid Cutout (alpha test) shaders unless really necessary.
  7. Keep Transparent (alpha blend) screen coverage to a minimum.
  8. Try to avoid situations where multiple lights illuminate any given object.
  9. Try to reduce the overall number of shader passes (Shadows, pixel lights, reflections).
  10. Rendering order is critical. In general case:
    • fully opaque objects roughly front-to-back.
    • alpha tested objects roughly front-to-back.
    • skybox.
    • alpha blended objects (back to front if needed).
  11. Post Processing is expensive on mobiles, use with care.
  12. Particles: reduce overdraw, use the simplest possible shaders.
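  13. Double buffer for Meshes modified every frame:
// Assumed fields: meshA and meshB are two Mesh instances with the same
// topology, and vertices holds the positions computed for this frame.
Mesh meshA, meshB, bufferMesh;
Vector3[] vertices;
MeshFilter meshFilter;
bool on;

void Update () {
  // flip between meshes so the one written now is not the one used last frame
  bufferMesh = on ? meshA : meshB;
  on = !on;
  bufferMesh.vertices = vertices; // modification to mesh
  meshFilter.sharedMesh = bufferMesh;
}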
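
Item 10 of the checklist above can be made concrete with a sort comparator. The engine normally takes care of this ordering itself, so the sketch below is only meant to spell out the rule. `DrawKind`, `DrawItem` and `DrawOrder` are hypothetical types, not Unity API.

```csharp
using System.Collections.Generic;

// Kinds are numbered in the order they should be drawn.
public enum DrawKind { Opaque = 0, AlphaTest = 1, Skybox = 2, AlphaBlend = 3 }

public struct DrawItem
{
    public string Name;
    public DrawKind Kind;
    public float Distance; // distance from the camera
}

public static class DrawOrder
{
    public static void Sort(List<DrawItem> items)
    {
        items.Sort((a, b) =>
        {
            // Group by kind first: opaque, alpha tested, skybox, alpha blended.
            int byKind = ((int)a.Kind).CompareTo((int)b.Kind);
            if (byKind != 0) return byKind;
            // Alpha blended: back to front. Everything else: front to back.
            return a.Kind == DrawKind.AlphaBlend
                ? b.Distance.CompareTo(a.Distance)
                : a.Distance.CompareTo(b.Distance);
        });
    }
}
```

Drawing opaque geometry front-to-back lets nearer surfaces hide farther ones before they are shaded, while blended surfaces need the farther ones drawn first to composite correctly.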

Shader optimizations

Checking if you are fillrate-bound is easy: does the game run faster if you decrease the display resolution? If yes, you are limited by fillrate.

Try reducing shader complexity by the following methods:

Focus on CPUs

It is often the case that games are limited by the GPU on pixel processing. So they end up having unused CPU power, especially on multicore mobile CPUs. So it is often sensible to pull some work off the GPU and put it onto the CPU instead (Unity does all of these): mesh skinning, batching of small objects, particle geometry updates.

These should be used with care, not blindly. If you are not bound by draw calls, then batching is actually worse for performance, as it makes culling less efficient and makes more objects affected by lights!

Good practice

Physics

Physics can be CPU heavy. It can be profiled via the Editor profiler. If Physics appears to take too much time on CPU:

Android

GPU

These are the popular mobile architectures. The hardware vendors are different from those in the PC/console space, and the GPU architectures are very different from the “usual” GPUs.

  • ImgTec PowerVR SGX - Tile based, deferred: render everything in small tiles (such as 16x16), shade only visible pixels
  • NVIDIA Tegra - Classic: render everything
  • Qualcomm Adreno - Tiled: render everything in tiles, engineered for large tiles (such as 256k). Adreno 3xx can switch to traditional.
  • ARM Mali - Tiled: render everything in tiles, engineered for small tiles (such as 16x16)

Spend some time looking into different rendering approaches and design your game accordingly. Pay special attention to sorting. Define the lowest-end supported devices early in the dev cycle. Test on them with the profiler on as you design your game.

Use platform-specific texture compression.

Further reading

Screen resolution

Android version

iOS

GPU

The only architecture to be concerned about is PowerVR (tile based, deferred).

  • ImgTec PowerVR SGX. Tile based, deferred: render everything in tiles, shade only visible pixels
  • ImgTec PowerVR MBX. Tile based, deferred, fixed function - pre iPhone 4/iPad 1 devices

This means:

  • Mipmaps are not so necessary.
  • Antialiasing and aniso are cheap enough, not needed on iPad 3 in some cases

And cons:

  • If vertex data per frame (number of vertices * storage required after vertex shader) exceeds the internal buffers allocated by the driver, the scene has to be “split” which costs performance. The driver might allocate a larger buffer after this point, or you might need to reduce your vertex count. This becomes apparent on iPad2 (iOS 4.3) at around 100 thousand vertices with quite complex shaders.
  • TBDR needs more transistors allocated for the tiling and deferred parts, leaving conceptually fewer transistors for “raw performance”. It’s very hard (i.e. practically impossible) to get GPU timing for a draw call on TBDR, making profiling hard.
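
The per-frame vertex data figure from the first point above is simple arithmetic, sketched here. `VertexBudget` is a hypothetical helper. The bytes-per-vertex value and the buffer size are assumptions to be replaced by measurement, since this sketch has no way to query what the driver actually allocates.

```csharp
public static class VertexBudget
{
    // Per-frame vertex data as described in the text:
    // number of vertices * bytes stored per vertex after the vertex shader.
    public static long FrameBytes(int vertexCount, int bytesPerVertexOut)
    {
        return (long)vertexCount * bytesPerVertexOut;
    }

    // True if the frame's vertex data fits in a buffer of the given size.
    // bufferBytes is a guess supplied by the caller, not a queried value.
    public static bool Fits(int vertexCount, int bytesPerVertexOut, long bufferBytes)
    {
        return FrameBytes(vertexCount, bytesPerVertexOut) <= bufferBytes;
    }
}
```

For instance, 100 thousand vertices at an assumed 64 bytes each give `FrameBytes(100000, 64)` = 6400000 bytes per frame. Simplifying the shader output shrinks the per-vertex figure just as reducing the vertex count shrinks the other factor.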

Further reading

Screen resolution

iOS version

Dynamic Objects

Asset Bundles

Is there a limit on the number of AssetBundles that can be downloaded at the same time on iOS? (e.g. can we safely download more than 10 AssetBundles at the same time, or start one every frame?)

Downloads are implemented via an async API provided by the OS, so the OS decides how many threads need to be created for downloads. When launching multiple concurrent downloads you should keep in mind the total bandwidth the device can support and the amount of free memory. Each concurrent download allocates its own temporary buffer, so you should be careful not to run out of memory.
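
One way to act on this is to cap the number of simultaneous downloads by a memory budget. The sketch below is an assumption-based illustration: `DownloadBudget` is a hypothetical helper, and the per-download buffer size is a value you would have to estimate or measure, since the text does not state it.

```csharp
public static class DownloadBudget
{
    // How many downloads to run at once, given the memory you are willing to
    // spend on download buffers, an assumed buffer size per download, and a
    // hard cap chosen with the device's bandwidth in mind.
    public static int MaxConcurrent(long memoryBudgetBytes, long bufferBytesPerDownload, int hardCap)
    {
        if (bufferBytesPerDownload <= 0 || hardCap <= 0) return 0;
        long byMemory = memoryBudgetBytes / bufferBytesPerDownload;
        if (byMemory < 0) byMemory = 0;
        return (int)System.Math.Min(byMemory, (long)hardCap);
    }
}
```

With a 32 MB budget, an assumed 8 MB per download and a cap of 10, `MaxConcurrent(32L << 20, 8L << 20, 10)` returns 4, so the remaining bundles would wait in a queue until a slot frees up.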

Resources

Silly issues checklist

Sometimes there’s nothing in the console, just a random crash

Page last updated: 2013-07-18