ray.rllib.core.learner.learner.Learner.compile_results

Learner.compile_results(*, batch: ray.rllib.policy.sample_batch.MultiAgentBatch, fwd_out: Mapping[str, Any], loss_per_module: Mapping[str, Union[numpy.array, jnp.ndarray, tf.Tensor, torch.Tensor]], metrics_per_module: DefaultDict[str, Dict[str, Any]]) → Mapping[str, Any]

Compile results from the update in a numpy-friendly format.

Parameters
  • batch – The batch that was used for the update.

  • fwd_out – The output of the forward train pass.

  • loss_per_module – A dict mapping module IDs (including ALL_MODULES) to the individual loss tensors as returned by calls to compute_loss_for_module(module_id=...).

  • metrics_per_module – A defaultdict mapping module IDs to metrics dicts. The metrics are collected during loss and gradient computation, gradient postprocessing, and gradient application.

Returns

A dictionary of results sub-dicts per module (including ALL_MODULES).
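
The shape of the inputs and of the returned dictionary can be illustrated with a minimal sketch. This is not RLlib's implementation: `compile_results_sketch`, the `"total_loss"` key, the module IDs, and the literal value of the `ALL_MODULES` stand-in are assumptions made for illustration, and the `batch` and `fwd_out` arguments are left out for brevity.

```python
from collections import defaultdict
from typing import Any, DefaultDict, Dict, Mapping

# Stand-in for RLlib's ALL_MODULES constant; the literal value is an assumption.
ALL_MODULES = "__all__"


def compile_results_sketch(
    *,
    loss_per_module: Mapping[str, Any],
    metrics_per_module: DefaultDict[str, Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Merge per-module losses and metrics into one results sub-dict per module."""
    results: Dict[str, Dict[str, Any]] = {}
    for module_id, loss in loss_per_module.items():
        # float() accepts Python numbers and 0-d numpy scalars.
        module_results = {"total_loss": float(loss)}  # hypothetical key name
        # .get() avoids creating empty entries in the caller's defaultdict.
        module_results.update(metrics_per_module.get(module_id, {}))
        results[module_id] = module_results
    return results


# Hypothetical module IDs and values.
loss_per_module = {"policy_a": 0.25, "policy_b": 0.75, ALL_MODULES: 1.0}
metrics_per_module = defaultdict(dict)
metrics_per_module["policy_a"]["grad_norm"] = 1.5

results = compile_results_sketch(
    loss_per_module=loss_per_module,
    metrics_per_module=metrics_per_module,
)
```

In this sketch every key of `loss_per_module`, including the `ALL_MODULES` entry, ends up with its own results sub-dict, which mirrors the return value described above.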