Profile a model
The performance of a model depends on the following factors:
The complexity of the model
Whether the model uses performance-heavy operators such as
Conv
orMatMul
The features of the platform you run the model on, for example central processing unit (CPU) memory, graphics processing unit (GPU) memory, and number of cores
Whether Inference Engine downloads data to CPU memory when you access a tensor
For more information, refer to Get output from a model.
Profile a model in the Profiler window
To get performance information when you run a model, can use the following:
- The Profiler window
- RenderDoc, a third-party graphics debugger
The Profiler window displays each Inference Engine layer as a dropdown item in the Module Details panel. Open a layer to get a detailed timeline of the implementation of the layer.
When a layer implements methods that include Download or Upload, Inference Engine transfers data to or from the CPU or the GPU. This might slow down the model.
If your model runs slower than you expect, refer to the following links:
- For information about how the complexity of a model might affect performance, refer to Understand models in Inference Engine.
- For information about different types of worker, refer to Create an engine to run a model.