Coroutines execute differently from other script code. Most script code simply appears within a performance trace in a single location, beneath a specific Unity callback invocation. However, the CPU code of coroutines always appears in two places in a trace.
All of the initial code in a coroutine, from the start of the coroutine method until it yields for the first time, appears in the trace wherever the coroutine is started. Usually, it appears wherever the StartCoroutine method is called. Coroutines generated from Unity callbacks (such as Start callbacks that return an IEnumerator) first appear within their respective Unity callback.
All of the rest of a coroutine’s code, from the first time it resumes until it finishes executing, appears within the DelayedCallManager line that appears inside Unity’s main loop.
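As a rough illustration of the split described above, the following sketch marks which part of a coroutine would be attributed to which location in a trace. The component and method names (TraceSplitExample, LoadThenReport) are hypothetical and exist only for this example; the script assumes it is attached to a GameObject in a Unity project.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component, used only to illustrate where coroutine code is attributed.
public class TraceSplitExample : MonoBehaviour
{
    void Start()
    {
        // The first segment of the coroutine runs synchronously inside this
        // StartCoroutine call, so it is attributed to this Start callback.
        StartCoroutine(LoadThenReport());
    }

    IEnumerator LoadThenReport()
    {
        // Segment 1: from the top of the method to the first yield.
        Debug.Log("Runs where StartCoroutine was called");

        // Suspend until the next frame.
        yield return null;

        // Segment 2: everything after the first resume. Per the text above,
        // this is the part that shows up under DelayedCallManager.
        Debug.Log("Runs when Unity resumes the coroutine");
    }
}
```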
To understand why this occurs, consider how a coroutine is actually executed.
Coroutines are backed by an instance of a class that is autogenerated by the C# compiler. This object is needed to track the state of the coroutine across multiple invocations of what is, to the programmer, a single method. Because local-scope variables within the coroutine must persist across yield calls, those local-scope variables are hoisted into the generated class and therefore remain allocated on the heap for the duration of the coroutine. This object also tracks the internal state of the coroutine: it remembers at what point in the code the coroutine must resume after yielding.
Because of this, the memory pressure caused by starting a coroutine is equal to a fixed overhead cost plus the size of its local-scope variables.
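The behavior of the generated object can be observed without Unity at all, since a coroutine method is an ordinary C# iterator. The sketch below is plain C# with hypothetical names (StateMachineDemo, CountTwice); the explicit MoveNext calls stand in for whatever code drives the coroutine and are not how Unity's scheduler is actually written.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical demo class; plain C#, no Unity dependency.
public static class StateMachineDemo
{
    public static readonly List<string> Log = new List<string>();

    // The compiler rewrites this method into a hidden class.
    // The local variable 'counter' becomes a field on that class,
    // so its value survives across the yield.
    public static IEnumerator CountTwice()
    {
        int counter = 0;
        counter++;
        Log.Add("first segment, counter = " + counter);

        yield return null;

        counter++;
        Log.Add("second segment, counter = " + counter);
    }

    public static void Main()
    {
        // Calling the method only allocates the state object;
        // none of the method body has run yet, so Log is still empty.
        IEnumerator routine = CountTwice();
        Console.WriteLine("entries after construction: " + Log.Count);

        // First invocation: runs up to the first yield.
        routine.MoveNext();

        // Second invocation: resumes after the yield with 'counter' intact.
        routine.MoveNext();

        foreach (string line in Log)
        {
            Console.WriteLine(line);
        }
    }
}
```

The second log entry reports a counter of 2, which is only possible because the local was stored on the heap-allocated object between the two invocations.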
The code which starts a coroutine constructs and invokes this object, and then Unity’s DelayedCallManager invokes it again whenever the coroutine’s yield condition has been satisfied. As coroutines usually start outside of other coroutines, this splits the cost of their execution into the two locations described above.
This can be observed in the above screenshot, where the DelayedCallManager is resuming several different coroutines; LoadDatabase is among the notable ones.
Where practical, condense a series of operations into as few individual coroutines as possible. While nested coroutines are excellent for code clarity and maintenance, they impose a higher memory overhead, because each one allocates its own coroutine tracking object.
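The trade-off can be sketched as follows. Both versions wait, log, wait and log again, but the nested version starts three coroutines while the flattened version starts one. All names here (IntroSequence, RunNested, RunFlat, StepOne, StepTwo) are hypothetical, and the one-second delays are arbitrary.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component contrasting nested and flattened coroutines.
public class IntroSequence : MonoBehaviour
{
    // Nested version: each step is its own coroutine,
    // so each StartCoroutine call creates another tracking object.
    IEnumerator RunNested()
    {
        yield return StartCoroutine(StepOne());
        yield return StartCoroutine(StepTwo());
    }

    IEnumerator StepOne()
    {
        yield return new WaitForSeconds(1f);
        Debug.Log("Step one done");
    }

    IEnumerator StepTwo()
    {
        yield return new WaitForSeconds(1f);
        Debug.Log("Step two done");
    }

    // Flattened version: the same sequence in a single coroutine,
    // at some cost to how cleanly the steps are separated.
    IEnumerator RunFlat()
    {
        yield return new WaitForSeconds(1f);
        Debug.Log("Step one done");

        yield return new WaitForSeconds(1f);
        Debug.Log("Step two done");
    }
}
```

Whether the saving is worth the loss of structure depends on how often the sequence is started; for a one-off sequence the nested form may well be the better choice.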
If a coroutine runs nearly every frame and does not yield on long-running operations, it is generally more readable to replace it with an Update or LateUpdate callback. This is particularly true of long-running or infinitely-looping coroutines.
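As a minimal sketch of that replacement, the two hypothetical components below rotate an object every frame. The first uses an infinitely-looping coroutine; the second does the same work in an Update callback. The class names and the degreesPerSecond field are assumptions made for this example.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical coroutine-based version: loops forever, yielding once per frame.
public class SpinWithCoroutine : MonoBehaviour
{
    public float degreesPerSecond = 90f;

    void Start()
    {
        StartCoroutine(SpinForever());
    }

    IEnumerator SpinForever()
    {
        while (true)
        {
            transform.Rotate(0f, degreesPerSecond * Time.deltaTime, 0f);
            yield return null; // wait for the next frame
        }
    }
}

// Hypothetical callback-based version: the same per-frame work,
// with no coroutine object and no loop to reason about.
public class SpinWithUpdate : MonoBehaviour
{
    public float degreesPerSecond = 90f;

    void Update()
    {
        transform.Rotate(0f, degreesPerSecond * Time.deltaTime, 0f);
    }
}
```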
It is important to remember that coroutines are not threads. Synchronous operations running within a coroutine still execute on the main thread. If the goal is to reduce CPU time spent on the main thread, it is just as important to avoid blocking operations in coroutines as in any other script code.
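The point can be illustrated outside Unity, because a coroutine body is simply run by whichever thread advances its iterator. The plain C# sketch below uses hypothetical names (SameThreadDemo, Work), and its while loop is only a stand-in for the code that resumes coroutines in a real project.

```csharp
using System;
using System.Collections;

// Hypothetical demo class; plain C#, no Unity dependency.
public static class SameThreadDemo
{
    public static int CallerThreadId;
    public static int CoroutineThreadId;

    static IEnumerator Work()
    {
        yield return null;

        // Recorded after the coroutine has been resumed.
        CoroutineThreadId = Environment.CurrentManagedThreadId;
    }

    public static void Main()
    {
        CallerThreadId = Environment.CurrentManagedThreadId;

        // Stand-in for the loop that resumes coroutines each frame.
        IEnumerator routine = Work();
        while (routine.MoveNext())
        {
        }

        // Yielding did not move the work to another thread.
        Console.WriteLine(CallerThreadId == CoroutineThreadId); // prints "True"
    }
}
```

Any blocking call placed in Work would therefore stall the calling thread for its full duration, exactly as it would in a regular method.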
Coroutines are best employed when dealing with long asynchronous operations, such as waiting for HTTP transfers, Asset loads or file I/O to complete.
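For instance, waiting on an HTTP transfer might look roughly like the sketch below. The component name and URL are hypothetical, and the result check assumes a Unity version in which UnityWebRequest exposes the result property (2020.1 or later); older versions would need a different error check.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Networking;

// Hypothetical component that downloads a text file without blocking the main thread.
public class ConfigDownloader : MonoBehaviour
{
    // Placeholder address, for illustration only.
    const string Url = "https://example.com/config.json";

    // Start may return IEnumerator, in which case Unity runs it as a coroutine.
    IEnumerator Start()
    {
        using (UnityWebRequest request = UnityWebRequest.Get(Url))
        {
            // The coroutine is suspended here until the transfer completes;
            // other scripts keep running every frame in the meantime.
            yield return request.SendWebRequest();

            if (request.result != UnityWebRequest.Result.Success)
            {
                Debug.LogError("Download failed: " + request.error);
                yield break;
            }

            Debug.Log("Downloaded " + request.downloadHandler.text.Length + " characters");
        }
    }
}
```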