irl_benchmark.utils package

Submodules

irl_benchmark.utils.general module

Utils module containing general helper functions.

irl_benchmark.utils.general.to_one_hot(hot_vals: Union[int, List[int], np.ndarray], max_val: int, zeros_function: Callable = np.zeros) → Union[np.ndarray, torch.Tensor]

Convert an integer or a list of integers to a one-hot array.

Parameters:
hot_vals: Union[int, List[int], np.ndarray]
A single integer, or a list / vector of integers, corresponding to the hot values which will equal one in the returned array.
max_val: int
The maximum possible value in hot_vals. All elements of hot_vals have to be smaller than max_val (since counting starts at 0).
zeros_function: Callable
Controls which function is used to create the array. It should be either numpy.zeros or torch.zeros.

Returns:
Union[np.ndarray, torch.Tensor]
Either a numpy array or a torch tensor with the one-hot encoded values. The type of the returned data structure depends on the passed zeros_function; the default is a numpy array. The returned data structure has shape (1, max_val) if hot_vals is a single integer, and (len(hot_vals), max_val) otherwise.
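The documented behavior can be sketched as follows. This is a minimal reimplementation consistent with the signature above, not the library's actual source; it assumes numpy is available and that zeros_function accepts a shape tuple (as both numpy.zeros and torch.zeros do):

```python
import numpy as np
from typing import Callable, List, Union


def to_one_hot(hot_vals: Union[int, List[int], np.ndarray],
               max_val: int,
               zeros_function: Callable = np.zeros):
    """One-hot encode a single integer or a sequence of integers."""
    # A single integer yields shape (1, max_val); a sequence yields
    # (len(hot_vals), max_val), matching the documented return shapes.
    if isinstance(hot_vals, (int, np.integer)):
        hot_vals = [hot_vals]
    out = zeros_function((len(hot_vals), max_val))
    for row, val in enumerate(hot_vals):
        assert val < max_val  # values are 0-indexed, so strictly below max_val
        out[row, val] = 1.0
    return out


print(to_one_hot(2, 4))       # [[0. 0. 1. 0.]]
print(to_one_hot([0, 3], 4))  # rows one-hot at indices 0 and 3
```

Passing torch.zeros as zeros_function works with the same indexing, since torch tensors also support `out[row, val] = 1.0`.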

irl_benchmark.utils.irl module

irl_benchmark.utils.rl module

Utils related to reinforcement learning.

irl_benchmark.utils.rl.true_reward_per_traj(trajs: List[Dict[str, list]]) → float

Return (undiscounted) average sum of true rewards per trajectory.

Parameters:
trajs: List[Dict[str, list]]
A list of trajectories. Each trajectory is a dictionary with keys [‘states’, ‘actions’, ‘rewards’, ‘true_rewards’, ‘features’]. The values of each dictionary are lists. See irl_benchmark.irl.collect.collect_trajs().

Returns:
float
The undiscounted average sum of true rewards per trajectory.
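A minimal sketch of this computation, using toy trajectory dictionaries with only the ‘true_rewards’ key filled in (the other documented keys are omitted for brevity):

```python
from typing import Dict, List


def true_reward_per_traj(trajs: List[Dict[str, list]]) -> float:
    """Average undiscounted sum of true rewards over trajectories."""
    # Each trajectory stores its per-step true rewards under 'true_rewards';
    # sum within each trajectory, then average across trajectories.
    return sum(sum(traj['true_rewards']) for traj in trajs) / len(trajs)


# Two toy trajectories with true-reward sums 3.0 and 1.0 -> average 2.0.
trajs = [{'true_rewards': [1.0, 2.0]}, {'true_rewards': [0.5, 0.5]}]
print(true_reward_per_traj(trajs))  # 2.0
```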

irl_benchmark.utils.wrapper module

Utils module containing wrapper specific helper functions.

irl_benchmark.utils.wrapper.is_unwrappable_to(env: gym.Env, to_wrapper: Type[gym.Wrapper]) → bool

Check if env can be unwrapped to to_wrapper.

Parameters:
env: gym.Env
A gym environment (potentially wrapped).
to_wrapper: Type[gym.Wrapper]
A wrapper class extending gym.Wrapper.

Returns:
bool
True if env can be unwrapped to the desired wrapper, False otherwise.

irl_benchmark.utils.wrapper.unwrap_env(env: gym.Env, until_class: Union[None, gym.Env] = None) → gym.Env

Unwrap wrapped env until we get an instance that is a until_class.

If until_class is None, env will be unwrapped until the lowest layer.

Module contents

A module with useful functions for the irl_benchmark framework.