Profiling
Practical guide to optimization for mobiles

Optimizations

Just like on PCs, mobile platforms like iOSApple’s mobile operating system. More info
See in Glossary
and Android have devices of various levels of performance. You can easily find a phone that’s 10x more powerful for renderingThe process of drawing graphics to the screen (or to a render texture). By default, the main camera in Unity renders its view to the screen. More info
See in Glossary
than some other phone. Quite easy way of scaling:

  1. Make sure it runs okay on baseline configuration
  2. Use more eye-candy on higher performing configurations:
    • Resolution
    • Post-processing
    • MSAA
    • Anisotropy
    • Shaders
    • Fx/particles density, on/off

Focus on GPUs

Graphics performance is bound by fillrate, pixelThe smallest unit in a computer image. Pixel size depends on your screen resolution. Pixel lighting is calculated at every screen pixel. More info
See in Glossary
and geometric complexity (vertex count). All three of these can be reduced if you can find a way to cull more renderers. Occlusion cullingA Unity feature that disables rendering of objects when they are not currently seen by the camera because they are obscured (occluded) by other objects. More info
See in Glossary
could help here as Unity will automatically cull objects outside the viewing frustum.

On mobiles you’re essentially fillrate bound (fillrate = screen pixels * shader complexity * overdraw), and over-complex shadersA small script that contains the mathematical calculations and algorithms for calculating the Color of each pixel rendered, based on the lighting input and the Material configuration. More info
See in Glossary
is the most common cause of problems. So use mobile shaders that come with Unity or design your own but make them as simple as possible. If possible simplify your pixel shaders by moving code to vertex shaderA program that runs on each vertex of a 3D model when the model is being rendered. More info
See in Glossary
.

If reducing the Texture Quality in Quality settings makes the game run faster, you are probably limited by memory bandwidth. So compress textures, use mipmaps, reduce texture size, etc.

LOD (Level of Detail) - make objects simpler or eliminate them completely as they move further away.

Good practice

Mobile GPUs have huge constraints in how much heat they produce, how much power they use, and how large or noisy they can be. So compared to the desktop parts, mobile GPUs have way less bandwidth, low ALU performance and texturing power. The architectures of the GPUs are also tuned to use as little bandwidth & power as possible.

Unity is optimized for OpenGL ES 2.0, it uses GLSL ES (similar to HLSL) shading language. Built in shaders are most often written in HLSL (also known as Cg). This is cross compiled into GLSL ES for mobile platforms. You can also write GLSL directly if you want to, but doing that limits you to OpenGL-like platforms (e.g. mobile + Mac) since there currently are no GLSL->HLSL translation tools. When you use float/half/fixed types in HLSL, they end up highp/mediump/lowp precision qualifiers in GLSL ES.

Here is the checklist for good practice:

  1. Keep the number of materials as low as possible. This makes it easier for Unity to batch stuff.
  2. Use texture atlases (large images containing a collection of sub-images) instead of a number of individual textures. These are faster to load, have fewer state switches, and are batching friendly.
  3. Use Renderer.sharedMaterial instead of Renderer.material if using texture atlases and shared materials.
  4. Forward rendered pixel lights are expensive.
    • Use light mapping instead of realtime lights where ever possible.
    • Adjust pixel light count in Quality settings. Essentially only the directional light should be per pixel, everything else - per vertex. Certainly this depends on the game.
  5. Experiment with Render Mode of Lights in the Quality settings to get the correct priority.
  6. Avoid Cutout (alpha test) shaders unless really necessary.
  7. Keep Transparent (alpha blend) screen coverage to a minimum.
  8. Try to avoid situations where multiple lights illuminate any given object.
  9. Try to reduce the overall number of shader passes (Shadows, pixel lights, reflections).
  10. Rendering order is critical. In general case:
    • fully opaque objects roughly front-to-back.
    • alpha tested objects roughly front-to-back.
    • skyboxA special type of Material used to represent skies. Usually six-sided. More info
      See in Glossary
      .
    • alpha blended objects (back to front if needed).
  11. Post ProcessingA process that improves product visuals by applying filters and effects before the image appears on screen. You can use post-processing effects to simulate physical camera and film properties, for example Bloom and Depth of Field. More info
    See in Glossary
    is expensive on mobiles, use with care.
  12. Particles: reduce overdraw, use the simplest possible shaders.
  13. Double buffer for Meshes modified every frame:
void Update (){
  // flip between meshes
  bufferMesh = on ? meshA : meshB;
  on = !on;
  bufferMesh.vertices = vertices; // modification to mesh
  meshFilter.sharedMesh = bufferMesh;
}

Shader optimizations

Checking if you are fillrate-bound is easy: does the game run faster if you decrease the display resolution? If yes, you are limited by fillrate.

Try reducing shader complexity by the following methods:

  • Avoid alpha-testing shaders; instead use alpha-blended versions.
  • Use simple, optimized shader code (such as the “Mobile” shaders that ship with Unity).
  • Avoid expensive math functions in shader code (pow, exp, log, cos, sin, tan, etc). Consider using pre-calculated lookup textures instead.
  • Pick lowest possible number precision format (float, half, fixedin Cg) for best performance.

Focus on CPUs

It is often the case that games are limited by the GPU on pixel processing. So they end up having unused CPU power, especially on multicore mobile CPUs. So it is often sensible to pull some work off the GPU and put it onto the CPU instead (Unity does all of these): meshThe main graphics primitive of Unity. Meshes make up a large part of your 3D worlds. Unity supports triangulated or Quadrangulated polygon meshes. Nurbs, Nurms, Subdiv surfaces must be converted to polygons. More info
See in Glossary
skinningThe process of binding bone joints to the vertices of a character’s mesh or ‘skin’. Performed with an external tool, such as Blender or Autodesk Maya. More info
See in Glossary
, batching of small objects, particle geometry updates.

These should be used with care, not blindly. If you are not bound by draw calls, then batching is actually worse for performance, as it makes culling less efficient and makes more objects affected by lights!

Good practice

  • FindObjectsOfType (and Unity getter properties in general) are very slow, so use them sensibly.
  • Set the Static property on non-moving objects to allow internal optimizations like static batchingA technique Unity uses to draw GameObjects on the screen that combines static (non-moving) GameObjects into big Meshes, and renders them in a faster way. More info
    See in Glossary
    .
  • Spend lots of CPU cycles to do occlusion culling and better sorting (to take advantage of Early Z-cull).

Physics

Physics can be CPU heavy. It can be profiled via the Editor profilerA window that helps you to optimize your game. It shows how much time is spent in the various areas of your game. For example, it can report the percentage of time spent rendering, animating or in your game logic. More info
See in Glossary
. If Physics appears to take too much time on CPU:

  • Tweak Time.fixedDeltaTime (in Project settings -> Time) to be as high as you can get away with. If your game is slow moving, you probably need less fixed updates than games with fast action. Fast paced games will need more frequent calculations, and thus fixedDeltaTime will need to be lower or a collision may fail.
  • Physics.solverIterationCount (Physics settings).
  • Use as little Cloth objects as possible.
  • Use RigidbodiesA component that allows a GameObject to be affected by simulated gravity and other forces. More info
    See in Glossary
    only where necessary.
  • Use primitive collidersAn invisible shape that is used to handle physical collisions for an object. A collider doesn’t need to be exactly the same shape as the object’s mesh - a rough approximation is often more efficient and indistinguishable in gameplay. More info
    See in Glossary
    in preference mesh collidersA free-form collider component which accepts a mesh reference to define its collision surface shape. More info
    See in Glossary
    .
  • Never ever move a static collider (ie a collider without a Rigidbody) as it causes a big performance hit. It shows up in Profiler as "Static Collider.Move’ but actual processing is in Physics.Simulate. If necessary, add a RigidBody and set isKinematic to true.
  • On Windows you can use NVidia’s AgPerfMon profiling tool set to get more details if needed.

Android

GPU

These are the popular mobile architectures. This is both different hardware vendors than in PC/console space, and very different GPU architectures than the “usual” GPUs.

  • ImgTec PowerVR SGX - Tile based, deferred: render everything in small tiles (as 16x16), shade only visible pixels
  • NVIDIA Tegra - Classic: Render everything
  • Qualcomm Adreno - Tiled: Render everything in tile, engineered in large tiles (as 256k). Adreno 3xx can switch to traditional.
  • ARM Mali Tiled: Render everything in tile, engineered in small tiles (as 16x16)

Spend some time looking into different rendering approaches and design your game accordingly. Pay especial attention to sorting. Define the lowest end supported devices early in the dev cycle. Test on them with the profiler on as you design your game.

Use platform specific texture compressionA method of storing data that reduces the amount of storage space it requires. See Texture Compression3D Graphics hardware requires Textures to be compressed in specialised formats which are optimized for fast Texture sampling. More info
See in Glossary
, Animation CompressionThe method of compressing animation data to significantly reduce file sizes without causing a noticable reduction in motion quality. Animation compression is a trade off between saving on memory and image quality. More info
See in Glossary
, Audio Compression, Build Compression.
See in Glossary
.

Further reading

Screen resolution

Android version

iOS

GPU

Only PowerVR architecture (tile based deferred) to be concerned about.

  • ImgTec PowerVR SGX. Tile based, deferred: render everything in tiles, shade only visible pixels.

This means:

  • Mipmaps are not so necessary.
  • AntialiasingA technique for decreasing artifacts, like jagged lines (jaggies), in images to make them appear smoother. More info
    See in Glossary
    and aniso are cheap enough, not needed on iPad 3 in some cases.

And cons:

  • If vertex data per frame (number of vertices * storage required after vertex shader) exceeds the internal buffers allocated by the driver, the sceneA Scene contains the environments and menus of your game. Think of each unique Scene file as a unique level. In each Scene, you place your environments, obstacles, and decorations, essentially designing and building your game in pieces. More info
    See in Glossary
    has to be “split” which costs performance. The driver might allocate a larger buffer after this point, or you might need to reduce your vertex count. This becomes apparent on iPad2 (iOS 4.3) at around 100 thousand vertices with quite complex shaders.
  • TBDR needs more transistors allocated for the tiling and deferred parts, leaving conceptually less transistors for “raw performance”. It’s very hard (i.e. practically impossible) to get GPU timing for a draw call on TBDR, making profiling hard.

Further reading

Screen resolution

iOS version

Dynamic Objects

Asset Bundles

  • Asset Bundles are cached on a device to a certain limit
  • Create using the Editor API
  • Load using WWW API: WWW.LoadFromCacheOrDownload or as a resource: AssetBundle.CreateFromMemory or AssetBundle.CreateFromFile
  • Unload using AssetBundle.Unload. There is an option to unload the bundle, but keep the loaded asset from it. Also can kill all the loaded assets even if they’re referenced in the scene
  • Resources.UnloadUnusedAssets unloads all assets no longer referenced in the scene. So remember to kill references to the assets you don’t need. Public and static variables are never garbage collected.
  • Resources.UnloadAsset unloads a specific asset from memory. It can be reloaded from disk if needed.

Is there any limitation for download numbers of Assetbundle at the same time on iOS? (e.g Can we download over 10 assetbundles safely at the same time(or every frame)? )

Downloads are implemented via async API provided by OS, so OS decides how many threads need to be created for downloads. When launching multiple concurrent downloads you should keep in mind total device bandwidth it can support and amount of free memory. Each concurrent download allocates its own temporal buffer, so you should be careful there to not run out of memory.

Resources

  • Assets need to be recognized by Unity to be placed in a build.
  • Add .bytes file extension to any raw bytes you want Unity to recognize as a binary data.
  • Add .txt file extension to any text files you want Unity to recognize as a text asset
  • Resources are converted to a platform format at a build time.
  • Resources.Load()

Issue checklist

  • Textures without proper compression
  • Different solutions for different cases, but be sure to compress textures unless you’re sure you should not.
  • ETC/RGBA16 - default for android but can tweak depending on the GPU vendor. Best approach is to use ETC where possible. Alpha textures can use two ETC files with one channel being for alpha
  • PVRTCPowerVR Texture Compression (PVRTC) is a fixed-rate texture format that compresses textures to significantly reduce file sizes without causing a noticable reduction in image quality. More info
    See in Glossary
    - default for iOS, good for most cases
  • Textures having Get/Set pixels enabled - doubles the footprint, uncheck unless Get/Set is needed
  • Textures loaded from JPEG/PNGs on the runtime will be uncompressed
  • Big mp3 files marked as decompress on load
  • Additive scene loading
  • Unused Assets that remain uncleaned in memory.
  • If it randomly crashes, try on a devkit or a device with 2 GB memory (like Ipad 3).

Sometimes there’s nothing in the console, just a random crash

  • Fast script call and stripping may lead to random crashes on iOS. Try without them.

Did you find this page useful? Please give it a rating:

Profiling
Practical guide to optimization for mobiles