    Python Gym API Documentation

    mlagents_envs.envs.unity_gym_env

    UnityGymException Objects

    class UnityGymException(error.Error)
    

    Any error related to the gym wrapper of ml-agents.

    UnityToGymWrapper Objects

    class UnityToGymWrapper(gym.Env)
    

    Provides Gym wrapper for Unity Learning Environments.

    __init__

     | __init__(unity_env: BaseEnv, uint8_visual: bool = False, flatten_branched: bool = False, allow_multiple_obs: bool = False, action_space_seed: Optional[int] = None)
    

    Environment initialization

    Arguments:

    • unity_env: The Unity BaseEnv to be wrapped in the gym. Will be closed when the UnityToGymWrapper closes.
    • uint8_visual: Return visual observations as uint8 (0-255) matrices instead of float (0.0-1.0).
    • flatten_branched: If True, turn branched discrete action spaces into a Discrete space rather than MultiDiscrete.
    • allow_multiple_obs: If True, return a list of np.ndarrays as observations with the first elements containing the visual observations and the last element containing the array of vector observations. If False, returns a single np.ndarray containing either only a single visual observation or the array of vector observations.
    • action_space_seed: If non-None, will be used to set the random seed on created gym.Space instances.
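A minimal sketch of how these arguments fit together when constructing the wrapper. The build path `"UnityBuild/3DBall"` and the helper name `make_gym_env` are hypothetical; the imports assume the `mlagents_envs` package is installed, and a real Unity build is needed to actually connect:

```python
from typing import Optional

def make_gym_env(file_name: str, seed: Optional[int] = None):
    """Connect to a local Unity build and wrap it in the Gym API (sketch)."""
    # Imports deferred so the sketch can be read without mlagents installed.
    from mlagents_envs.environment import UnityEnvironment
    from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper

    unity_env = UnityEnvironment(file_name=file_name)
    return UnityToGymWrapper(
        unity_env,
        uint8_visual=True,         # visual obs as uint8 (0-255) arrays
        flatten_branched=True,     # single Discrete instead of MultiDiscrete
        allow_multiple_obs=False,  # one np.ndarray, not a list of obs
        action_space_seed=seed,    # seed any created gym.Space
    )

# Usage (requires a local Unity build, so not run here):
# env = make_gym_env("UnityBuild/3DBall", seed=42)
```

Closing the wrapper also closes the wrapped UnityEnvironment, so only `env.close()` is needed at the end of a run.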

    reset

     | reset() -> Union[List[np.ndarray], np.ndarray]
    

Resets the state of the environment and returns an initial observation.

    Returns:

    • observation object/list - the initial observation of the environment.

    step

     | step(action: List[Any]) -> GymStepResult
    

    Run one timestep of the environment's dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment's state. Accepts an action and returns a tuple (observation, reward, done, info).

    Arguments:

• action object/list - an action provided by the agent

    Returns:

• observation object/list - agent's observation of the current environment.
    • reward float/list - amount of reward returned after the previous action.
    • done boolean/list - whether the episode has ended.
    • info dict - contains auxiliary diagnostic information.
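The reset()/step() contract above implies the standard Gym interaction loop. The sketch below runs it against `_EchoEnv`, a hypothetical stand-in that obeys the same (observation, reward, done, info) 4-tuple contract; with a real build you would pass a `UnityToGymWrapper` instance instead:

```python
import random

class _EchoEnv:
    """Hypothetical stand-in env with the same reset/step/close contract."""

    def reset(self):
        self._t = 0
        return [0.0]  # initial observation

    def step(self, action):
        self._t += 1
        done = self._t >= 5  # episode ends after 5 steps
        return [float(self._t)], 1.0, done, {}  # (obs, reward, done, info)

    def close(self):
        pass

def run_episode(env, policy):
    """Run one episode; the caller must call reset(), per the docs above."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total += reward
    return total

env = _EchoEnv()
total = run_episode(env, policy=lambda obs: random.randrange(2))
env.close()
print(total)  # 5.0 with the stand-in env: 5 steps of reward 1.0
```

Note that the wrapper does not reset automatically at episode end; the `done` flag is the caller's cue to call `reset()` again.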

    render

     | render(mode="rgb_array")
    

    Return the latest visual observations. Note that it will not render a new frame of the environment.

    close

     | close() -> None
    

    Override _close in your subclass to perform any necessary cleanup. Environments will automatically close() themselves when garbage collected or when the program exits.

    seed

     | seed(seed: Any = None) -> None
    

    Sets the seed for this env's random number generator(s). Currently not implemented.

    ActionFlattener Objects

    class ActionFlattener()
    

    Flattens branched discrete action spaces into single-branch discrete action spaces.

    __init__

     | __init__(branched_action_space)
    

    Initialize the flattener.

    Arguments:

    • branched_action_space: A List containing the sizes of each branch of the action space, e.g. [2,3,3] for three branches with size 2, 3, and 3 respectively.

    lookup_action

     | lookup_action(action)
    

    Convert a scalar discrete action into a unique set of branched actions.

    Arguments:

    • action: A scalar value representing one of the discrete actions.

    Returns:

    The List containing the branched actions.
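The mapping ActionFlattener performs can be sketched as follows. This is an illustrative reimplementation, not the library's exact code: enumerate the Cartesian product of each branch's choices and index into it with the scalar action. `build_lookup` is a hypothetical helper name:

```python
import itertools

def build_lookup(branched_action_space):
    """Map each scalar action id to one combination of branch actions."""
    combos = itertools.product(*(range(n) for n in branched_action_space))
    return {i: list(c) for i, c in enumerate(combos)}

lookup = build_lookup([2, 3, 3])  # three branches of sizes 2, 3, and 3
print(len(lookup))                # 18 flattened discrete actions (2*3*3)
print(lookup[0])                  # [0, 0, 0]
print(lookup[17])                 # [1, 2, 2]
```

With `flatten_branched=True`, the wrapper exposes a `Discrete(18)` action space for this example and translates each sampled scalar back into the branched form before passing it to Unity.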

    Copyright © 2025 Unity Technologies — Trademarks and terms of use