ray.rllib.utils.exploration.epsilon_greedy.EpsilonGreedy
class ray.rllib.utils.exploration.epsilon_greedy.EpsilonGreedy(action_space: gym.spaces.Space, *, framework: str, initial_epsilon: float = 1.0, final_epsilon: float = 0.05, warmup_timesteps: int = 0, epsilon_timesteps: int = 100000, epsilon_schedule: Optional[ray.rllib.utils.schedules.schedule.Schedule] = None, **kwargs)
Bases: ray.rllib.utils.exploration.exploration.Exploration
Epsilon-greedy Exploration class that produces exploration actions.
Given a Model's output and the current epsilon value (taken from a Schedule), it returns a random action if rand(1) < eps and the model-computed action if rand(1) >= eps.
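The selection rule described above can be sketched in plain Python. This is a minimal illustration, not RLlib's implementation: the helper names `linear_epsilon` and `epsilon_greedy_action` are hypothetical, and the schedule shown (hold `initial_epsilon` during warmup, then decay linearly to `final_epsilon` over `epsilon_timesteps`) is an assumption about what happens when no `epsilon_schedule` is passed.

```python
import random


def linear_epsilon(timestep, initial_epsilon=1.0, final_epsilon=0.05,
                   warmup_timesteps=0, epsilon_timesteps=100000):
    """Hypothetical schedule: constant during warmup, then linear decay."""
    if timestep <= warmup_timesteps:
        return initial_epsilon
    if epsilon_timesteps <= 0:
        return final_epsilon
    # Fraction of the decay phase completed, capped at 1.0.
    fraction = min((timestep - warmup_timesteps) / epsilon_timesteps, 1.0)
    return initial_epsilon + fraction * (final_epsilon - initial_epsilon)


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Return a random action index if rng.random() < epsilon, else the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy branch: index of the largest model output.
    return max(range(len(q_values)), key=lambda i: q_values[i])


rng = random.Random(0)            # seeded generator for reproducibility
eps = linear_epsilon(50000)       # halfway through the default decay
action = epsilon_greedy_action([0.1, 0.7, 0.2], eps, rng)
```

With `epsilon` at 0.0 the function always returns the greedy action, and with `epsilon` at 1.0 it always samples uniformly, which matches the two branches of the rule.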
Methods
__init__(action_space, *, framework[, ...]): Create an EpsilonGreedy exploration class.
before_compute_actions(*[, timestep, ...]): Hook for preparations before policy.compute_actions() is called.
get_exploration_optimizer(optimizers): May add optimizer(s) to the Policy's own optimizers.
on_episode_end(policy, *[, environment, ...]): Handles necessary exploration logic at the end of an episode.
on_episode_start(policy, *[, environment, ...]): Handles necessary exploration logic at the beginning of an episode.
postprocess_trajectory(policy, sample_batch): Handles post-processing of done episode trajectories.
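In practice the constructor arguments from the class signature are usually supplied through an `exploration_config` dictionary rather than by calling `__init__` directly. The sketch below assumes the common RLlib convention in which the `type` key names the exploration class; the variable name and the specific values are illustrative only.

```python
# Illustrative exploration settings; keys other than "type" mirror the
# keyword arguments of EpsilonGreedy.__init__ shown in the signature.
exploration_config = {
    "type": "EpsilonGreedy",      # assumed convention: class selected by name
    "initial_epsilon": 1.0,       # epsilon at timestep 0
    "final_epsilon": 0.05,        # epsilon once the decay has finished
    "warmup_timesteps": 0,        # timesteps before the decay starts
    "epsilon_timesteps": 100000,  # timesteps over which epsilon decays
}
```

The action space and framework are not set here, since they are typically filled in by the policy that owns the exploration object.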