ray.rllib.core.learner.learner.Learner.compute_loss_for_module
ray.rllib.core.learner.learner.Learner.compute_loss_for_module#
- abstract Learner.compute_loss_for_module(*, module_id: str, hps: ray.rllib.core.learner.learner.LearnerHyperparameters, batch: ray.rllib.utils.nested_dict.NestedDict, fwd_out: Mapping[str, Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]]) Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor][source]#
Computes the loss for a single module.
Think of this as computing loss for a single agent. For multi-agent use-cases that require more complicated computation for loss, consider overriding the
compute_lossmethod instead.- Parameters
module_id – The id of the module.
hps – The LearnerHyperparameters specific to the given
module_id.batch – The sample batch for this particular module.
fwd_out – The output of the forward pass for this particular module.
- Returns
A single total loss tensor. If you have more than one optimizer on the provided
module_idand would like to compute gradients separately using these different optimizers, simply add up the individual loss terms for each optimizer and return the sum. Also, for tracking the individual loss terms, you can use theLearner.register_metric(s)APIs.