Coroutines execute differently from other script code. Most script code simply appears within a performance trace in a single location, beneath a specific Unity callback invocation. However, the CPU code of coroutines always appears in two places in a trace.
All of the initial code in a coroutine, from the start of the coroutine method until it yields for the first time, appears in the trace wherever the coroutine is started. Usually, that is wherever the StartCoroutine method is called. Coroutines generated from Unity callbacks (such as Start callbacks that return an IEnumerator) first appear within their respective Unity callback.
All of the rest of a coroutine’s code, from the first time it resumes until it finishes executing, appears within the DelayedCallManager line that appears inside Unity’s main loop.
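As a rough illustration of the two locations, the sketch below marks which part of a coroutine would be attributed where. The component and method names (LevelBootstrap, LoadLevelData) are hypothetical, and the comments describe where the work is expected to show up in a trace, not measured output.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component, used only to show where coroutine code is attributed.
public class LevelBootstrap : MonoBehaviour
{
    void Start()
    {
        // Everything up to the first yield inside LoadLevelData runs right here,
        // so it is attributed to this Start callback in the trace.
        StartCoroutine(LoadLevelData());
    }

    IEnumerator LoadLevelData()
    {
        // Part 1: executes synchronously inside the StartCoroutine call above.
        Debug.Log("Before the first yield");

        yield return null; // Suspend until the next frame.

        // Part 2: executes when Unity's main loop resumes the coroutine,
        // so it is attributed to the DelayedCallManager line instead.
        Debug.Log("After the first yield");
    }
}
```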
To understand why this occurs, consider how a coroutine is actually executed.
Coroutines are backed by an instance of a class that is autogenerated by the C# compiler. This object is needed to track the state of the coroutine across multiple invocations of what is, to the programmer, a single method. Because local-scope variables within the coroutine must persist across yield calls, those local-scope variables are hoisted into the generated class and therefore remain allocated on the heap for the duration of the coroutine. This object also tracks the internal state of the coroutine: it remembers at which point in the code the coroutine must resume after yielding.
Because of this, the memory pressure caused by starting a coroutine is equal to a fixed overhead cost plus the size of its local-scope variables.
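To make the generated object more concrete, here is a hand-written approximation of such a class for a small iterator method. It is only a sketch: the class name CountToStateMachine and its field names are invented for illustration, and the code the C# compiler actually emits differs in naming and detail.

```csharp
using System.Collections;

// Hand-written approximation of the class generated for:
//
//   IEnumerator CountTo(int limit)
//   {
//       int i = 0;
//       while (i < limit) { i++; yield return i; }
//   }
public sealed class CountToStateMachine : IEnumerator
{
    private int state;          // Resume point: 0 = not started, 1 = in loop, -1 = finished.
    private readonly int limit; // Hoisted parameter.
    private int i;              // Hoisted local; lives on the heap with this object.

    public CountToStateMachine(int limit) { this.limit = limit; }

    public object Current { get; private set; }

    public bool MoveNext()
    {
        switch (state)
        {
            case 0: // First invocation: run the code before the loop.
                i = 0;
                state = 1;
                goto case 1;
            case 1: // Later invocations resume at the loop condition.
                if (i < limit)
                {
                    i++;
                    Current = i;
                    return true; // Corresponds to "yield return i".
                }
                state = -1;
                return false; // The method body has run to completion.
            default:
                return false;
        }
    }

    public void Reset() { throw new System.NotSupportedException(); }
}
```

Each call to MoveNext runs the method body up to the next yield and then returns, which is why the hoisted fields must outlive any single call.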
The code which starts a coroutine constructs and invokes this object, and then Unity’s DelayedCallManager invokes it again whenever the coroutine’s yield condition has been satisfied. As coroutines usually start outside of other coroutines, this splits the cost of their execution into the two locations described above.
This can be observed in the above screenshot, where the DelayedCallManager is resuming several different coroutines: PopulateCharacters, AsyncLoad and LoadDatabase are the notable ones.
When possible, it is better to condense a series of operations down to the fewest number of individual coroutines possible. While nested coroutines are excellent for code clarity and maintenance, they impose a higher memory overhead due to the coroutine tracking objects.
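The trade-off can be sketched by writing the same sequence both ways. The component below is hypothetical (IntroSequence and its methods are invented names), and the timings are arbitrary placeholders.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical example: two ways to run the same two-step sequence.
public class IntroSequence : MonoBehaviour
{
    void Start()
    {
        StartCoroutine(PlayCondensed());
    }

    // Nested version: every StartCoroutine call creates another iterator
    // object and another coroutine tracking object.
    IEnumerator PlayNested()
    {
        yield return StartCoroutine(FadeIn());
        yield return StartCoroutine(ShowTitle());
    }

    IEnumerator FadeIn()
    {
        Debug.Log("Fade in");
        yield return new WaitForSeconds(1f);
    }

    IEnumerator ShowTitle()
    {
        Debug.Log("Show title");
        yield return new WaitForSeconds(2f);
    }

    // Condensed version: the same steps inside a single coroutine.
    IEnumerator PlayCondensed()
    {
        Debug.Log("Fade in");
        yield return new WaitForSeconds(1f);

        Debug.Log("Show title");
        yield return new WaitForSeconds(2f);
    }
}
```

The nested form is easier to reuse and read in larger sequences, so condensing is mainly worth considering where the sequence is started often.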
If a coroutine runs nearly every frame and does not yield on long-running operations, it is generally more readable to replace it with an Update or LateUpdate callback. This is particularly true of long-running or infinitely-looping coroutines.
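A minimal sketch of such a replacement follows, using a hypothetical Spinner component. The coroutine is kept only for comparison and is never started.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component that rotates its object every frame.
public class Spinner : MonoBehaviour
{
    public float degreesPerSecond = 90f;

    // Coroutine version, shown for comparison only: it loops forever and
    // yields once per frame, so it behaves like a per-frame callback.
    IEnumerator SpinForever()
    {
        while (true)
        {
            transform.Rotate(0f, degreesPerSecond * Time.deltaTime, 0f);
            yield return null; // Resume on the next frame.
        }
    }

    // Update version: the same per-frame work without the loop and yield.
    void Update()
    {
        transform.Rotate(0f, degreesPerSecond * Time.deltaTime, 0f);
    }
}
```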
Coroutines are not stopped when a MonoBehaviour is disabled, but only when it is destroyed (or when its GameObject is deactivated). This allows a coroutine to keep running and, for example, enable the component again if needed. Calling Destroy(this) triggers OnDisable immediately and the coroutines are processed, which stops them. Finally, OnDestroy is invoked at the end of the frame.
It is important to remember that coroutines are not threads. Synchronous operations running within a coroutine still execute on the main thread. If the goal is to reduce CPU time spent on the main thread, it is just as important to avoid blocking operations in coroutines as in any other script code.
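One common way to keep synchronous work inside a coroutine from blocking a frame is to process it in small batches and yield between them. The sketch below assumes a hypothetical BatchProcessor component with placeholder work in ProcessItem; the batch size is arbitrary.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component that spreads synchronous work over several frames.
public class BatchProcessor : MonoBehaviour
{
    const int TotalItems = 10000;
    const int ItemsPerFrame = 100; // Arbitrary batch size; tune by profiling.

    IEnumerator Start()
    {
        for (int i = 0; i < TotalItems; i++)
        {
            ProcessItem(i); // Still runs on the main thread.

            // After each full batch, hand control back until the next frame.
            if ((i + 1) % ItemsPerFrame == 0)
            {
                yield return null;
            }
        }
    }

    void ProcessItem(int index)
    {
        // Placeholder for real per-item work.
    }
}
```

This spreads the cost across frames but does not reduce the total main-thread time; each batch still executes synchronously.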
Coroutines are best employed when dealing with long asynchronous operations, such as waiting for HTTP transfers, Asset loads or file I/O to complete.
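As an example of this kind of use, the sketch below waits for an asynchronous Asset load with Resources.LoadAsync. The component name PortraitLoader and the resource path are assumptions made for illustration; the path must point at a real texture inside a Resources folder for the load to succeed.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical component that loads a texture without blocking the frame.
public class PortraitLoader : MonoBehaviour
{
    // Assumed path, relative to a Resources folder and without an extension.
    public string resourcePath = "Portraits/Hero";

    IEnumerator Start()
    {
        // Kick off the load; the work proceeds in the background.
        ResourceRequest request = Resources.LoadAsync<Texture2D>(resourcePath);

        // Suspend this coroutine until the request has completed.
        yield return request;

        Texture2D texture = request.asset as Texture2D;
        if (texture == null)
        {
            Debug.LogWarning("Could not load texture at " + resourcePath);
            yield break;
        }

        Debug.Log("Loaded " + texture.name);
    }
}
```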