ray.rllib.utils.exploration.curiosity.Curiosity.__init__
- Curiosity.__init__(action_space: gym.Space, *, framework: str, model: ray.rllib.models.modelv2.ModelV2, feature_dim: int = 288, feature_net_config: Optional[dict] = None, inverse_net_hiddens: Tuple[int] = (256,), inverse_net_activation: str = 'relu', forward_net_hiddens: Tuple[int] = (256,), forward_net_activation: str = 'relu', beta: float = 0.2, eta: float = 1.0, lr: float = 0.001, sub_exploration: Optional[Union[Dict[str, Any], type, str]] = None, **kwargs)
Initializes a Curiosity object.
Uses as defaults the hyperparameters described in [1].
- Parameters
feature_dim – The dimensionality of the feature (phi) vectors.
feature_net_config – Optional model configuration for the feature network, which produces feature vectors (phi) from observations. Use it to configure fcnet or conv_net setups so that the feature network can process the given observation space.
inverse_net_hiddens – Tuple of the layer sizes of the inverse (action predicting) NN head (on top of the feature outputs for phi and phi’).
inverse_net_activation – Activation specifier for the inverse net.
forward_net_hiddens – Tuple of the layer sizes of the forward (phi’ predicting) NN head.
forward_net_activation – Activation specifier for the forward net.
beta – Weight for the forward loss in the combined loss term. The inverse loss gets the complementary weight, 1.0 - beta.
eta – Weight for intrinsic rewards before being added to extrinsic ones.
lr – The learning rate for the curiosity-specific optimizer, optimizing feature-, inverse-, and forward nets.
sub_exploration – The config dict for the underlying Exploration to use (e.g. epsilon-greedy for DQN). If None, uses the FromSpecDict provided in the Policy’s default config.
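As a rough illustration of the parameters above, the following sketch builds an exploration config dict that repeats the defaults from the signature, and shows how beta and eta enter the arithmetic as the parameter descriptions state it. The helper names combined_loss and total_reward are hypothetical and are not part of RLlib. The feature_net_config contents and the choice of "StochasticSampling" as the sub-exploration are assumptions picked for the example, not values prescribed by this page.

```python
# Illustrative exploration config; keys mirror the __init__ parameters above.
exploration_config = {
    "type": "Curiosity",
    "feature_dim": 288,
    # Assumption: an fcnet-style feature network with no extra hidden layers.
    "feature_net_config": {
        "fcnet_hiddens": [],
        "fcnet_activation": "relu",
    },
    "inverse_net_hiddens": (256,),
    "inverse_net_activation": "relu",
    "forward_net_hiddens": (256,),
    "forward_net_activation": "relu",
    "beta": 0.2,
    "eta": 1.0,
    "lr": 0.001,
    # Assumption: example sub-exploration; None would fall back to the
    # exploration given in the Policy's default config.
    "sub_exploration": {"type": "StochasticSampling"},
}


def combined_loss(forward_loss: float, inverse_loss: float, beta: float = 0.2) -> float:
    """Hypothetical helper: forward loss weighted by beta, inverse loss by 1 - beta."""
    return beta * forward_loss + (1.0 - beta) * inverse_loss


def total_reward(extrinsic: float, intrinsic: float, eta: float = 1.0) -> float:
    """Hypothetical helper: intrinsic reward scaled by eta, then added to the extrinsic one."""
    return extrinsic + eta * intrinsic


if __name__ == "__main__":
    print(combined_loss(forward_loss=2.0, inverse_loss=4.0, beta=exploration_config["beta"]))
    print(total_reward(extrinsic=1.0, intrinsic=0.5, eta=exploration_config["eta"]))
```

In a typical setup, a dict like this would be assigned to the "exploration_config" key of the algorithm config. The two helpers only restate the weighting described for beta and eta; they do not reproduce RLlib's internal loss or reward computation.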