When discussing performance, it is vital to remember that all optimization attempts must start with a discovery process. Profiling an application to discover its hotspots is the necessary first step, followed by an analysis of profiling results against the project’s technical and Asset architecture.
Note: Where this section discusses method names found in native-code profiling traces, the method names have been drawn from Unity 5.3. Method names may change in future versions of Unity.
For profiling, there are many different tools available to Unity developers. Unity has a suite of built-in tools, such as the CPU Profiler, Memory Profiler and the new 5.3 Memory Analyzer.
However, the best data generally comes from platform-specific tools. These include:
For iOS: Instruments and the Xcode Frame Debugger
For Android: the Snapdragon Profiler
For platforms running Intel CPUs/GPUs: VTune and Intel GPA
For PS4: the Razor suite
For Xbox: the PIX tool
These tools generally have the most utility on platforms that can utilize IL2CPP to produce a C++ version of the project. These native-code versions provide transparent callstacks and high-resolution method timings that are unavailable when running under Mono.
Unity has already created a basic guide on using Instruments to profile iOS games.
When looking at a trace of start-up times, there are two key methods to inspect. These two methods are the primary places where the configuration, Assets and code of a project can impact start-up time.
Note that start-up time manifests differently on different platforms. On most platforms it is visible to the user as a static splash screen.
The above screenshot is from an Instruments trace of an example project running on an iOS device. Within the platform-specific startUnity method, note the UnityInitApplicationGraphics and UnityLoadApplication methods.
UnityInitApplicationGraphics performs a lot of internal work, such as setting up the graphics device and initializing many of Unity’s internal systems. In addition, it initializes the Resources system. To do this, it must load an index of all of the files contained by the Resources system.
Every Asset file within every folder named “Resources” is included in the Resources system’s data. Therefore, the time required to initialize the Resources system increases at least linearly with the number of files within “Resources” folders.
Note: This only applies to folders named “Resources” that are within the project’s “Assets” folder, together with all child folders of those “Resources” folders.
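Because this cost scales with the number of files, it can help to know roughly how many files a project keeps under “Resources” folders. The following is a minimal sketch in plain C#, not a Unity-provided tool. The class name ResourcesFolderAudit and the method CountResourceFiles are hypothetical, and the sketch assumes that “.meta” files should be ignored and that folder names are matched case-sensitively. The file count is only a rough proxy for the size of the Resources index.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: counts files that live below any folder named "Resources".
public static class ResourcesFolderAudit
{
    public static int CountResourceFiles(string assetsRoot)
    {
        // Normalize the root so relative paths can be computed by length.
        string root = Path.GetFullPath(assetsRoot)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);

        return Directory
            .EnumerateFiles(root, "*", SearchOption.AllDirectories)
            // Assumption: ".meta" files are editor bookkeeping, not Assets.
            .Where(path => !path.EndsWith(".meta", StringComparison.OrdinalIgnoreCase))
            // Count the file once if any folder between the root and the file is named "Resources".
            .Count(path => IsUnderResourcesFolder(root, path));
    }

    private static bool IsUnderResourcesFolder(string root, string filePath)
    {
        string directory = Path.GetDirectoryName(Path.GetFullPath(filePath));
        string relative = directory.Substring(root.Length);
        return relative
            .Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)
            .Contains("Resources");
    }
}
```

Calling ResourcesFolderAudit.CountResourceFiles with the path of the project’s “Assets” folder returns the count. Tracking this number over time gives an early hint when the Resources index is growing.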
UnityLoadApplication contains methods that load and initialize the first Scene in the project. This includes deserializing and instantiating all of the data necessary to display the first Scene, such as compiling Shaders, uploading Textures and instantiating GameObjects. Additionally, all MonoBehaviours in the first Scene have their Awake callbacks executed at this time.
These processes mean that if there is any long-running code in an Awake callback in the first Scene of a project, that code could be responsible for slowing down the project’s initial start-up time. Resolving this involves either eliminating the slow code, or executing it elsewhere in the application’s lifecycle.
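As an illustration of executing slow code elsewhere in the lifecycle, the sketch below moves expensive work out of Awake and into a coroutine that starts one frame later. The class name DeferredInitializer and the method BuildLookupTables are hypothetical placeholders. The approach assumes that nothing in the first frame depends on the deferred work.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component: names are illustrative, not part of any Unity API.
public class DeferredInitializer : MonoBehaviour
{
    private bool isReady;

    // Other scripts can check this before using the deferred data.
    public bool IsReady { get { return isReady; } }

    private void Awake()
    {
        // Keep Awake cheap: only do what the first frame strictly needs.
        isReady = false;
    }

    // Unity runs Start as a coroutine when it returns IEnumerator.
    private IEnumerator Start()
    {
        // Wait one frame so the first Scene can finish loading and be displayed.
        yield return null;

        BuildLookupTables();
        isReady = true;
    }

    private void BuildLookupTables()
    {
        // Placeholder for the expensive work that previously ran in Awake.
    }
}
```

This does not make the work cheaper. It only moves the cost out of start-up, so a very slow method may still cause a visible hitch on the frame where it runs and may need to be split across several frames.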
For profiling traces captured after initial start-up time, the primary place of interest is the PlayerLoop method. This is Unity’s main loop, and the code within it runs once per frame.
The above screenshot is from a profiling run of an example Unity 5.4 project, and illustrates several of the most interesting methods within PlayerLoop. Note that the names of methods within PlayerLoop may vary between Unity versions.
PlayerRender is the method that runs Unity’s rendering system. This includes culling objects, calculating dynamic batches, and submitting drawing instructions to the GPU. Any Image Effects or rendering-based script callbacks (OnWillRenderObject, for example) also run here. In general, this should be the top consumer of CPU time while the project is interactive.
BaseBehaviourManager calls three templated versions of CommonUpdate. These invoke certain callbacks within MonoBehaviours attached to active GameObjects in the current Scene: the Update callbacks, the LateUpdate callbacks, and FixedUpdate if the physics system has ticked.
BaseBehaviourManager::CommonUpdate<UpdateManager> is the most interesting method family to inspect, because it is the entry point for most of the script code running within a Unity project.
There are several other methods of interest:
UI::CanvasManager invokes several different callbacks if a project uses Unity UI. This includes Unity UI’s batch computation and layout updates, the two operations that most often cause CanvasManager to appear in the profiler.
DelayedCallManager::Update runs coroutines. These are covered in more detail in the “Coroutines” chapter of this document.
PhysicsManager::FixedUpdate runs the PhysX physics system. This primarily involves running PhysX’s internal code, and is influenced by the number of physics objects in the current Scene, such as Rigidbodies and Colliders. However, physics-based script callbacks, such as OnTriggerStay and OnCollisionStay, also appear here.
If the project is using 2D physics, that appears as an analogous set of calls under the 2D physics system’s own manager.
When scripts are invoked on platforms cross-compiled with IL2CPP, look for trace lines containing a ScriptingInvocation object. This is the point where Unity’s internal native code transitions into the script runtime in order to execute script code.
Note: Technically, after being run through IL2CPP, C#/JS script code also becomes native code. However, this cross-compiled code primarily executes methods via the IL2CPP runtime framework and does not closely resemble handwritten C++.
The above screenshot is from another trace of an example project running in Unity 5.4. All of the methods nested beneath the RuntimeInvoker_Void line are part of cross-compiled C# scripts that are being executed once per frame.
The trace lines are fairly easy to read: each one is the name of the original class followed by an underscore and the name of the original method. In this example trace, it’s possible to see PlayerShooting.Update and several other Update methods. These are the standard Unity Update callbacks found in most MonoBehaviours.
By expanding these methods, it is possible to discover exactly which methods within them are consuming CPU time. This includes other script methods within the project, Unity APIs and C# library code.
The above trace shows that the StandaloneInputModule.Process method is raycasting through the entire UI once per frame, in order to detect whether any touch events are hovering over or activating any UI elements. The primary cost is iterating over all the UI elements, and testing whether the mouse’s position is within their bounding rectangle.
Asset loading can also be identified in CPU traces. The main method indicating an Asset load is SerializedFile::ReadObject. This method connects a binary data stream (from a file) to Unity’s serialization system, which operates via a method named Transfer. The Transfer method can be found on all Asset types, such as Textures, MonoBehaviours and Particle Systems.
In the above screenshot, a Scene is being loaded. This requires Unity to read and deserialize all the Assets within the Scene, as denoted by the calls to various Transfer methods beneath SerializedFile::ReadObject.
In general, if a performance stutter is seen during runtime and a performance trace shows significant time being used by SerializedFile::ReadObject, the framerate is being reduced due to Asset loads. Note that, in most cases, SerializedFile::ReadObject can be found on the main thread only when synchronous Asset loads are requested via the Resources or AssetBundle APIs.
This sort of performance stutter can be remediated in the usual ways: you can make Asset loading asynchronous (which moves the heavy ReadObject call to a worker thread), or preload certain heavy Assets.
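As a rough illustration of the asynchronous option, the sketch below loads a Texture through Resources.LoadAsync inside a coroutine instead of calling the synchronous Resources.Load. The class name AsyncTextureLoader and the path “Textures/LargeBackground” are hypothetical, and the sketch assumes that an Asset exists at that path inside a “Resources” folder.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component: the class name and the Asset path are illustrative only.
public class AsyncTextureLoader : MonoBehaviour
{
    // Assumed path, relative to a "Resources" folder and without a file extension.
    [SerializeField] private string resourcePath = "Textures/LargeBackground";

    private Texture2D loadedTexture;

    // Null until the asynchronous load has completed successfully.
    public Texture2D LoadedTexture { get { return loadedTexture; } }

    private IEnumerator Start()
    {
        // Start the load without blocking the main thread on file reads.
        ResourceRequest request = Resources.LoadAsync<Texture2D>(resourcePath);

        // Yielding the request resumes this coroutine once the load has finished.
        yield return request;

        loadedTexture = request.asset as Texture2D;
        if (loadedTexture == null)
        {
            Debug.LogWarning("No Texture2D found at Resources path: " + resourcePath);
        }
    }
}
```

Code that consumes the Asset needs to tolerate it being unavailable for a few frames, which is the main design cost of switching from a synchronous to an asynchronous load.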
Transfer calls also appear when cloning objects (denoted by the CloneObject method in a trace). If a call to Transfer appears beneath a CloneObject call, then the Asset is not being loaded from storage. Instead, the old Object’s data is being transferred to the new Object. To do this, Unity serializes the old Object and deserializes the resulting data as the new Object.