ray.rllib.core.learner.learner.Learner.compile_results
- Learner.compile_results(*, batch: ray.rllib.policy.sample_batch.MultiAgentBatch, fwd_out: Mapping[str, Any], loss_per_module: Mapping[str, Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]], metrics_per_module: DefaultDict[str, Dict[str, Any]]) -> Mapping[str, Any]
Compile results from the update in a numpy-friendly format.
- Parameters
batch – The batch that was used for the update.
fwd_out – The output of the forward train pass.
loss_per_module – A dict mapping module IDs (including ALL_MODULES) to the individual loss tensors as returned by calls to compute_loss_for_module(module_id=...).
metrics_per_module – The collected metrics defaultdict mapping ModuleIDs to metrics dicts. These metrics are collected during loss and gradient computation, gradient postprocessing, and gradient application.
- Returns
A dictionary of results sub-dicts per module (including ALL_MODULES).
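
To make the shape of the inputs and the return value concrete, here is a minimal, self-contained sketch of how per-module losses and metrics could be merged into numpy-friendly results. It is not RLlib's implementation: the function name compile_results_sketch, the "total_loss" key, and the value used for ALL_MODULES are assumptions chosen for illustration, and the batch and fwd_out arguments are left out.

```python
from collections import defaultdict
from typing import Any, Dict, Mapping

import numpy as np

# Placeholder for the special "all modules" key (assumed value for this sketch).
ALL_MODULES = "__all__"


def compile_results_sketch(
    loss_per_module: Mapping[str, Any],
    metrics_per_module: Mapping[str, Dict[str, Any]],
) -> Dict[str, Dict[str, np.ndarray]]:
    """Merge losses and metrics into one results sub-dict per module ID."""
    results: Dict[str, Dict[str, np.ndarray]] = {}
    for module_id, loss in loss_per_module.items():
        # Use .get() so that reading does not add empty entries to a defaultdict.
        metrics = metrics_per_module.get(module_id, {})
        # Convert every value to a numpy array (scalars become 0-d arrays).
        module_results = {key: np.asarray(value) for key, value in metrics.items()}
        # "total_loss" is a hypothetical key name for the module's loss.
        module_results["total_loss"] = np.asarray(loss)
        results[module_id] = module_results
    return results


# Example inputs: one module plus the aggregate ALL_MODULES entry.
example_losses = {"policy_a": 0.5, ALL_MODULES: 0.5}
example_metrics = defaultdict(dict)
example_metrics["policy_a"]["grad_norm"] = 1.25
example_results = compile_results_sketch(example_losses, example_metrics)
```

In this sketch, example_results holds one sub-dict for "policy_a" (with "grad_norm" and "total_loss") and one for ALL_MODULES (with "total_loss" only), which mirrors the "results sub-dicts per module" structure described above.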