docs.unity3d.com

    Python Optimizer

    mlagents.trainers.optimizer.torch_optimizer

    TorchOptimizer Objects

    class TorchOptimizer(Optimizer)
    

    create_reward_signals

     | create_reward_signals(reward_signal_configs: Dict[RewardSignalType, RewardSignalSettings]) -> None
    

    Creates the reward signals described by reward_signal_configs.

    Arguments:

    • reward_signal_configs: Reward signal configuration, a Dict that maps each RewardSignalType to its RewardSignalSettings.
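
    As a rough illustration of the argument's shape, the sketch below builds a Dict keyed by a signal type and turns it into a name-keyed Dict. Everything in it is a hypothetical stand-in: SketchRewardSignalType, SketchRewardSignalSettings, their fields (gamma, strength), and create_reward_signals_sketch are not imported from mlagents and may not match the real classes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class SketchRewardSignalType(Enum):
    """Hypothetical stand-in for RewardSignalType."""
    EXTRINSIC = "extrinsic"
    CURIOSITY = "curiosity"


@dataclass
class SketchRewardSignalSettings:
    """Hypothetical stand-in for RewardSignalSettings; field names are assumptions."""
    gamma: float = 0.99
    strength: float = 1.0


def create_reward_signals_sketch(
    reward_signal_configs: Dict[SketchRewardSignalType, SketchRewardSignalSettings],
) -> Dict[str, SketchRewardSignalSettings]:
    # Key each configured signal by its string name so later code can
    # look signals up by name.
    return {
        signal_type.value: settings
        for signal_type, settings in reward_signal_configs.items()
    }


configs = {
    SketchRewardSignalType.EXTRINSIC: SketchRewardSignalSettings(),
    SketchRewardSignalType.CURIOSITY: SketchRewardSignalSettings(gamma=0.9, strength=0.02),
}
signals = create_reward_signals_sketch(configs)
```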

    get_trajectory_value_estimates

     | get_trajectory_value_estimates(batch: AgentBuffer, next_obs: List[np.ndarray], done: bool, agent_id: str = "") -> Tuple[Dict[str, np.ndarray], Dict[str, float], Optional[AgentBufferField]]
    

    Get value estimates and memories for a trajectory, in batch form.

    Arguments:

    • batch: An AgentBuffer that consists of a trajectory.
    • next_obs: The next observation (after the trajectory). Used for bootstrapping if this is not a terminal trajectory.
    • done: Set to True if this is a terminal trajectory.
    • agent_id: Agent ID of the agent that this trajectory belongs to.

    Returns:

    A Tuple of three items: the value estimates as a Dict of [name, np.ndarray(trajectory_len)]; the final value estimate as a Dict of [name, float]; and, only if memories are used, an AgentBufferField of initial critic memories to be used during the update.
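
    To make the role of the final value estimate concrete, here is a minimal sketch of bootstrapped discounted returns. It is not the library's implementation: the function name bootstrapped_returns, the gamma parameter, and the use of plain Python lists in place of np.ndarray are assumptions made for illustration. It also assumes that a terminal trajectory bootstraps from zero.

```python
from typing import List


def bootstrapped_returns(
    rewards: List[float],
    value_next: float,
    done: bool,
    gamma: float = 0.99,
) -> List[float]:
    """Discounted returns for one trajectory (hypothetical helper).

    value_next plays the role of the final value estimate for one
    reward signal. It is ignored when the trajectory is terminal.
    """
    # A terminal trajectory has no future reward, so start from zero.
    running = 0.0 if done else value_next
    returns = [0.0] * len(rewards)
    # Walk backwards so each step adds its reward to the discounted tail.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


truncated = bootstrapped_returns([1.0, 1.0], value_next=10.0, done=False, gamma=0.5)
terminal = bootstrapped_returns([1.0, 1.0], value_next=10.0, done=True, gamma=0.5)
```

    With done set to False the estimate of the state after the trajectory flows back into every step; with done set to True the same rewards yield smaller returns because nothing is bootstrapped.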

    mlagents.trainers.optimizer.optimizer

    Optimizer Objects

    class Optimizer(abc.ABC)
    

    Creates loss functions and auxiliary networks (e.g. Q or Value) needed for training. Provides methods to update the Policy.

    update

     | @abc.abstractmethod
     | update(batch: AgentBuffer, num_sequences: int) -> Dict[str, float]
    

    Update the Policy based on the batch that was passed in.

    Arguments:

    • batch: AgentBuffer that contains the minibatch of data used for this update.
    • num_sequences: Number of recurrent sequences found in the minibatch.

    Returns:

    A Dict containing statistics (name, value) from the update (e.g. loss).

    Copyright © 2025 Unity Technologies — Trademarks and terms of use