    Understand the Inference Engine workflow

    To use Inference Engine to run a neural network in Unity, follow these steps:

    1. Use the Unity.InferenceEngine namespace.
    2. Load a neural network model file.
    3. Create input for the model.
    4. Create an inference engine (a worker).
    5. Run the model with the input to compute a result (inference).
    6. Get the result.

    Tip

    Refer to the Workflow example to see these steps applied to a simple model.

    Use the Unity.InferenceEngine namespace

    At the top of your script, include the Unity.InferenceEngine namespace as follows:

    using Unity.InferenceEngine;
    

    Load a model

    Inference Engine can import model files in Open Neural Network Exchange (ONNX) format. To load a model, follow these steps:

    1. Export a model to ONNX format from a machine learning framework or download an ONNX model from the Internet.
    2. Add the model file to a Resources folder inside the Assets folder of the Project window. The Resources.Load call in the next step finds only assets inside a folder named Resources.
    3. Create a runtime model in your script as follows:
    ModelAsset modelAsset = Resources.Load<ModelAsset>("model-file-in-resources-folder");
    var runtimeModel = ModelLoader.Load(modelAsset);
    

    You can also declare public ModelAsset modelAsset as a public field in a MonoBehaviour script, then assign the model asset manually in the Inspector, as shown in the sketch below.
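
    For example, the following MonoBehaviour is a minimal sketch of this approach (the ModelRunner class name is a placeholder):

    using Unity.InferenceEngine;
    using UnityEngine;

    public class ModelRunner : MonoBehaviour
    {
        // Assign the .onnx asset in the Inspector.
        public ModelAsset modelAsset;

        Model runtimeModel;

        void Start()
        {
            // Create the runtime model from the assigned asset.
            runtimeModel = ModelLoader.Load(modelAsset);
        }
    }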

    For more information, refer to Import a model file.

    Create input for the model

    Use the Tensor API to create a tensor with data for the model. You can convert an array or a texture to a tensor. For example:

    // Convert a texture to a tensor
    Texture2D inputTexture = Resources.Load<Texture2D>("image-file");
    Tensor<float> textureTensor = new Tensor<float>(new TensorShape(1, 4, inputTexture.height, inputTexture.width));
    TextureConverter.ToTensor(inputTexture, textureTensor);

    // Convert an array to a tensor
    int[] array = new int[] { 1, 2, 3, 4 };
    Tensor<int> arrayTensor = new Tensor<int>(new TensorShape(4), array);
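
    Tensors hold native memory, so call Dispose on any tensor you create when you no longer need it.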
    

    For more information, refer to Create input for a model.

    Create an inference engine (a worker)

    In Inference Engine, a worker is the inference engine. You create a worker to break down the model into executable tasks, run the tasks on the central processing unit (CPU) or graphics processing unit (GPU), and retrieve the result.

    For example, the following creates a worker that runs on the GPU using Inference Engine compute shaders:

    Worker worker = new Worker(runtimeModel, BackendType.GPUCompute);
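
    To run the model on the CPU instead, create the worker with BackendType.CPU. A worker also holds resources, so call Dispose on it when you no longer need it.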
    

    For more information, refer to Create an engine.

    Schedule the model

    To run the model, use the Schedule method of the worker object with the input tensor.

    worker.Schedule(inputTensor);
    

    Inference Engine schedules the model layers on the given backend. Because processing is asynchronous, some tensor operations might still be running after this call.
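
    If scheduling a large model takes too long for a single frame, recent releases also provide a ScheduleIterable method that schedules the model one layer at a time. The following coroutine is a minimal sketch, assuming that method; the layersPerFrame budget is a placeholder you tune yourself:

    using System.Collections;

    IEnumerator RunModelOverFrames(Worker worker, Tensor<float> inputTensor)
    {
        // ScheduleIterable yields after scheduling each layer.
        IEnumerator schedule = worker.ScheduleIterable(inputTensor);
        int layersPerFrame = 20;
        int count = 0;
        while (schedule.MoveNext())
        {
            if (++count % layersPerFrame == 0)
                yield return null; // Continue scheduling on the next frame.
        }
    }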

    For more information, refer to Run a model.

    Get the output

    You can use methods such as PeekOutput to get the output from the model. For example:

    Tensor<float> outputTensor = worker.PeekOutput() as Tensor<float>;
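
    PeekOutput returns a reference to the output tensor without copying it, and the data can still be on the GPU. To read the values on the CPU, download them first. The following is a minimal sketch, assuming the ReadbackAndClone and DownloadToArray methods available in recent releases:

    // Block until inference finishes, then copy the output to CPU memory.
    using Tensor<float> cpuTensor = outputTensor.ReadbackAndClone();

    // Read the values as a flat array.
    float[] results = cpuTensor.DownloadToArray();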
    

    For more information, refer to Get output from a model.

    Additional resources

    • Workflow example
    • Samples
    • Unity Discussions group for Inference Engine
    • Understand models in Inference Engine
    • Tensor fundamentals in Inference Engine