Namespace Unity.MLAgents
The Academy singleton manages agent training and decision making.
An agent is an actor that can observe its environment, decide on the best course of action using those observations, and execute those actions within the environment.
Factory class for an ICommunicator instance. This is used by the Academy at startup. By default, on desktop platforms, an ICommunicator will be created and attempt to connect to a trainer. This behavior can be prevented by setting Enabled to false before the Academy is initialized.
The DecisionRequester component automatically requests decisions for an Agent instance at regular intervals.
A container for the Environment Parameters that may be modified during training. The keys for those parameters are defined in the trainer configurations, and the values are generated from the training process in features such as Curriculum Learning and Environment Parameter Randomization.
One current assumption for all the environment parameters is that they are of type float.
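The container behavior described above can be sketched in a few lines. This is a toy Python model, not the C# class: the names `EnvironmentParametersSketch`, `get_with_default`, `register_callback`, `receive`, and the key `"gravity"` are all hypothetical and chosen only for illustration. It shows the two ideas from the description: values are looked up by key with a fallback, and every value is treated as a float.

```python
from typing import Callable, Dict, List


class EnvironmentParametersSketch:
    """Toy model of a float-only parameter container (not the C# class)."""

    def __init__(self) -> None:
        self._values: Dict[str, float] = {}
        self._callbacks: Dict[str, Callable[[float], None]] = {}

    def get_with_default(self, key: str, default: float) -> float:
        """Return the value received for `key`, or `default` if none was sent."""
        return self._values.get(key, default)

    def register_callback(self, key: str, action: Callable[[float], None]) -> None:
        """Run `action` whenever a new value for `key` arrives."""
        self._callbacks[key] = action

    def receive(self, key: str, value: float) -> None:
        """Hypothetical hook standing in for a value pushed by the training process."""
        as_float = float(value)  # every parameter is assumed to be a float
        self._values[key] = as_float
        callback = self._callbacks.get(key)
        if callback is not None:
            callback(as_float)


params = EnvironmentParametersSketch()
seen: List[float] = []
params.register_callback("gravity", seen.append)

before = params.get_with_default("gravity", 9.81)  # nothing sent yet: default is used
params.receive("gravity", 4)                       # e.g. a new curriculum lesson
after = params.get_with_default("gravity", 9.81)   # now the received value is used
```

Reading with a default keeps the environment usable when no trainer is connected, since the key may never receive a value in that case.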
A basic class implementation of MultiAgentGroup.
Add stats (key-value pairs) for reporting. These values will be sent to a StatsReporter
instance, which means the values will appear in the TensorBoard summary, as well as trainer
gauges. You can nest stats in TensorBoard by adding "/" in the name (e.g. "Agent/Health"
and "Agent/Wallet"). Note that stats are only written to TensorBoard each summary_frequency
steps (a trainer configuration). If a stat is received multiple times within that period,
then the values will be aggregated using the Stat
Contains exceptions specific to ML-Agents.
A class holding the capabilities flags for Reinforcement Learning across C# and the Trainer codebase.
Struct that contains all the information for an Agent, including its observations, actions and current status.
Communicator initialization parameters.
Information about the current Academy step, used to decide whether to request a decision.
An array-like object that stores up to four elements. This is a value type that does not allocate any additional memory.
Initialization parameters for the Unity environment.
This is the interface of the Communicators. This does not need to be modified nor implemented to create a Unity environment.
When the Unity Communicator is initialized, it will wait for the External Communicator to be initialized as well. The two communicators will then exchange their first messages that will usually contain information for initialization (information that does not need to be resent at each new exchange).
By convention, a Unity input is from External to Unity and a Unity output is from Unity to External. Inputs and outputs are relative to Unity.
By convention, when the Unity Communicator and External Communicator call exchange, the exchange is NOT simultaneous but sequential. This means that when a side of the communication calls exchange, the other will receive the result of its previous exchange call. This is what happens when A calls exchange a single time: A sends data_1 to B -> B receives data_1 -> B generates and sends data_2 -> A receives data_2. When A calls exchange, it sends data_1 and receives data_2.
Since the messages are sent back and forth with exchange and simultaneously when calling initialize, External sends two messages at initialization.
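The sequential exchange described above can be illustrated with a small simulation. This is a minimal sketch that assumes two in-memory queues in place of a real transport; the `Endpoint` class, its methods, and `run_demo` are hypothetical names, not part of the ICommunicator interface.

```python
import queue
import threading
from typing import List


class Endpoint:
    """One side of a sequential channel backed by two in-memory queues."""

    def __init__(self, inbox: queue.Queue, outbox: queue.Queue) -> None:
        self._inbox = inbox
        self._outbox = outbox

    def send(self, data: str) -> None:
        self._outbox.put(data)

    def receive(self) -> str:
        # Block until the other side has sent something.
        return self._inbox.get(timeout=5)

    def exchange(self, data: str) -> str:
        """Send `data`, then block until the other side's reply arrives."""
        self.send(data)
        return self.receive()


def run_demo() -> List[str]:
    a_to_b: queue.Queue = queue.Queue()
    b_to_a: queue.Queue = queue.Queue()
    side_a = Endpoint(inbox=b_to_a, outbox=a_to_b)
    side_b = Endpoint(inbox=a_to_b, outbox=b_to_a)
    log: List[str] = []

    def side_b_turn() -> None:
        incoming = side_b.receive()          # B receives data_1
        log.append(f"B received {incoming}")
        side_b.send("data_2")                # B generates and sends data_2

    worker = threading.Thread(target=side_b_turn)
    worker.start()
    reply = side_a.exchange("data_1")        # A sends data_1 and waits
    log.append(f"A received {reply}")        # A receives data_2
    worker.join()
    return log


events = run_demo()
```

Because A blocks inside `exchange` until B has replied, the two sides never process a step at the same time, which is the sequential behavior the convention describes.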
The structure of the messages is as follows:
UnityMessage
...Header
...UnityOutput
......UnityRLOutput
......UnityRLInitializationOutput
...UnityInput
......UnityRLInput
......UnityRLInitializationInput
UnityOutput and UnityInput can be extended to provide functionalities beyond RL. UnityRLOutput and UnityRLInput can be extended to provide new RL functionalities.
MultiAgentGroup interface for grouping agents to support multi-agent training.
Determines how multiple stats within the same summary period are combined.
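To make the idea of combining stats within one summary period concrete, here is a toy Python sketch. It is not the C# enum or the trainer's implementation: the `Aggregation` modes (average, most recent, sum), the `StatsBuffer` class, and the rule that the most recently supplied mode wins are all assumptions made for illustration.

```python
from collections import defaultdict
from enum import Enum
from typing import Dict, List, Tuple


class Aggregation(Enum):
    """Hypothetical ways to combine values that share a key."""
    AVERAGE = "average"
    MOST_RECENT = "most_recent"
    SUM = "sum"


class StatsBuffer:
    """Collects stats during a summary period and combines them on flush."""

    def __init__(self) -> None:
        self._pending: Dict[str, List[Tuple[float, Aggregation]]] = defaultdict(list)

    def add(self, key: str, value: float,
            method: Aggregation = Aggregation.AVERAGE) -> None:
        self._pending[key].append((float(value), method))

    def flush(self) -> Dict[str, float]:
        """Combine everything received this period, then start a new period."""
        summary: Dict[str, float] = {}
        for key, entries in self._pending.items():
            values = [value for value, _ in entries]
            method = entries[-1][1]  # assumption: the latest mode for a key wins
            if method is Aggregation.AVERAGE:
                summary[key] = sum(values) / len(values)
            elif method is Aggregation.MOST_RECENT:
                summary[key] = values[-1]
            else:
                summary[key] = sum(values)
        self._pending.clear()
        return summary


buffer = StatsBuffer()
buffer.add("Agent/Health", 1.0)
buffer.add("Agent/Health", 3.0)
buffer.add("Agent/Wallet", 5.0, Aggregation.SUM)
buffer.add("Agent/Wallet", 2.0, Aggregation.SUM)
period_summary = buffer.flush()
```

In this sketch the two "Agent/Health" values are averaged and the two "Agent/Wallet" values are summed, so each key contributes a single number per period.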
Delegate for handling quit events sent back from the communicator.
Delegate for handling reset parameter updates sent from the communicator.