Version: 2017.4

Coroutines

Coroutines execute differently from other script code. Most script code simply appears within a performance trace in a single location, beneath a specific Unity callback invocation. However, the CPU code of coroutines always appears in two places in a trace.

All of the initial code in a coroutine, from the start of the coroutine method until it yields for the first time, appears in the trace wherever the coroutine is started. Usually, it appears wherever the StartCoroutine method is called. Coroutines generated from Unity callbacks (such as Start callbacks that return an IEnumerator) first appear within their respective Unity callback.

All of the rest of a coroutine’s code, from the first time it resumes until it finishes executing, appears within the DelayedCallManager line inside Unity’s main loop.

To understand why this occurs, consider how a coroutine is actually executed.

Coroutines are backed by an instance of a class that is autogenerated by the C# compiler. This object is needed to track the state of the coroutine across multiple invocations of what is, to the programmer, a single method. Because local-scope variables within the coroutine must persist across yield calls, those local-scope variables are hoisted into the generated class and therefore remain allocated on the heap for the duration of the coroutine. This object also tracks the internal state of the coroutine: it remembers the point in the code at which the coroutine must resume after yielding.

Because of this, the memory pressure caused by starting a coroutine is equal to a fixed overhead cost plus the size of its local-scope variables.

The code which starts a coroutine constructs and invokes this object, and then Unity’s DelayedCallManager invokes it again whenever the coroutine’s yield condition has been satisfied. As coroutines usually start outside of other coroutines, this splits the cost of their execution into the two locations described above.
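The split between the start site and the resume site can be sketched in plain C#, without UnityEngine. This is only an illustration under stated assumptions: the class CoroutineStateSketch, the method CountTwice and the hand-written loop are hypothetical, and the loop merely stands in for StartCoroutine and the DelayedCallManager rather than reproducing how Unity schedules coroutines.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Plain C# sketch (no UnityEngine). All names are hypothetical.
public static class CoroutineStateSketch
{
    public static readonly List<string> Log = new List<string>();

    // The compiler rewrites this method into a generated class.
    // 'counter' becomes a field of that class, so it survives each yield.
    static IEnumerator CountTwice()
    {
        int counter = 0;
        counter++;
        Log.Add("first segment, counter=" + counter);
        yield return null;
        counter++;
        Log.Add("second segment, counter=" + counter);
        yield return null;
        counter++;
        Log.Add("third segment, counter=" + counter);
    }

    public static void Run()
    {
        Log.Clear();

        // Calling the method only constructs the state object;
        // none of the body has executed yet.
        IEnumerator routine = CountTwice();
        Log.Add("constructed");

        // "Start site": the first MoveNext runs the body up to the first yield.
        bool pending = routine.MoveNext();

        // "Resume site": a later caller invokes the same object again,
        // standing in for the DelayedCallManager.
        while (pending)
        {
            Log.Add("resumed by driver");
            pending = routine.MoveNext();
        }
    }
}
```

Calling CoroutineStateSketch.Run() and inspecting Log shows that the first segment runs at the start site, that the later segments run only when the driver resumes the object, and that counter keeps its value across yields.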

This can be observed in the above screenshot, where the DelayedCallManager is resuming several different coroutines: PopulateCharacters, AsyncLoad and LoadDatabase are the notable ones.

When possible, it is better to condense a series of operations into as few individual coroutines as possible. While nested coroutines are excellent for code clarity and maintenance, they impose a higher memory overhead because each coroutine needs its own tracking object.
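As a rough illustration of condensing nested coroutines, the sketch below shows the same two steps written both ways. The component LoadSequence and its methods are hypothetical names chosen for this example, and the two versions are not guaranteed to take exactly the same number of frames.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component; all method names are illustrative only.
public class LoadSequence : MonoBehaviour
{
    // Nested form: every StartCoroutine call creates another tracking object.
    IEnumerator LoadNested()
    {
        yield return StartCoroutine(StepA());
        yield return StartCoroutine(StepB());
    }

    IEnumerator StepA()
    {
        Debug.Log("Step A");
        yield return null;
    }

    IEnumerator StepB()
    {
        Debug.Log("Step B");
        yield return null;
    }

    // Condensed form: the same steps inlined into a single coroutine,
    // so only one tracking object is allocated.
    IEnumerator LoadFlat()
    {
        Debug.Log("Step A");
        yield return null;
        Debug.Log("Step B");
        yield return null;
    }

    void Start()
    {
        StartCoroutine(LoadFlat());
    }
}
```

The nested form remains reasonable where the steps are reused elsewhere. The condensed form trades some of that clarity for fewer allocations.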

If a coroutine runs nearly every frame and does not yield on long-running operations, it is generally more readable to replace it with an Update or LateUpdate callback. This is particularly true of long-running or infinitely-looping coroutines.
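A minimal sketch of that replacement, assuming a hypothetical Spinner component that rotates its object every frame. The coroutine form is kept only for comparison and is never started here.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component used only to compare the two forms.
public class Spinner : MonoBehaviour
{
    public float degreesPerSecond = 90f;

    // Coroutine form: loops forever and yields once per frame.
    // It would be started with StartCoroutine(SpinForever()).
    IEnumerator SpinForever()
    {
        while (true)
        {
            transform.Rotate(0f, degreesPerSecond * Time.deltaTime, 0f);
            yield return null; // resume on the next frame
        }
    }

    // Update form: the same per-frame work, with no coroutine object to allocate.
    void Update()
    {
        transform.Rotate(0f, degreesPerSecond * Time.deltaTime, 0f);
    }
}
```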

Coroutines are not stopped when the MonoBehaviour that started them is disabled, only when it is destroyed. This allows a coroutine to keep running and, for example, enable the component again if needed. Deactivating the whole GameObject with SetActive(false), by contrast, does stop its coroutines. Calling Destroy(this) triggers OnDisable immediately, and the coroutines are processed at that point, which effectively stops them. Finally, OnDestroy is invoked at the end of the frame.

It is important to remember that coroutines are not threads. Synchronous operations running within a coroutine still execute on the main thread. If the goal is to reduce CPU time spent on the main thread, it is just as important to avoid blocking operations in coroutines as in any other script code.
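The point can be checked with a small plain C# sketch that needs no UnityEngine. The names SameThreadSketch, Work and RunsOnCallerThread are hypothetical, and the direct MoveNext call stands in for whatever resumes the coroutine in Unity.

```csharp
using System.Collections;
using System.Threading;

// Plain C# sketch (no UnityEngine). All names are hypothetical.
public static class SameThreadSketch
{
    public static int BodyThreadId;

    static IEnumerator Work()
    {
        // Record which thread executes the coroutine body.
        BodyThreadId = Thread.CurrentThread.ManagedThreadId;

        // A blocking call here stalls whoever called MoveNext.
        Thread.Sleep(50);
        yield return null;
    }

    // Returns true when the body ran on the caller's own thread.
    public static bool RunsOnCallerThread()
    {
        int callerId = Thread.CurrentThread.ManagedThreadId;
        IEnumerator routine = Work();
        routine.MoveNext(); // runs the body, including the Sleep, synchronously
        return BodyThreadId == callerId;
    }
}
```

Because the iterator body executes inside MoveNext, the sleep delays the caller directly. In Unity that caller is the main thread.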

Coroutines are best employed when dealing with long asynchronous operations, such as waiting for HTTP transfers, Asset loads or file I/O to complete.
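For example, an asynchronous Asset load might be awaited as in the sketch below. The IconLoader component and the resource path are assumptions made for illustration. Resources.LoadAsync is used here simply as one asynchronous operation that a coroutine can yield on.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component; the resource path is an assumed example value.
public class IconLoader : MonoBehaviour
{
    public string resourcePath = "Icons/Player";

    // Start may return IEnumerator, in which case Unity runs it as a coroutine.
    IEnumerator Start()
    {
        ResourceRequest request = Resources.LoadAsync<Texture2D>(resourcePath);

        // The coroutine is suspended here and resumed once the load completes.
        yield return request;

        Texture2D icon = request.asset as Texture2D;
        if (icon == null)
        {
            Debug.LogWarning("No Texture2D found at " + resourcePath);
            yield break;
        }

        Debug.Log("Loaded " + icon.name);
    }
}
```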
