Ray
Builds the Learner.
Call this method before using the Learner. It sets up the RLModule, the optimizers, and (optionally) their learning-rate schedulers.
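The build-before-use contract described above can be illustrated with a minimal, library-free sketch. Everything in it is hypothetical and written only for illustration: `ToyLearner`, its attributes, the dictionary stand-ins for the module and optimizer, and the halving schedule are assumptions, not the library's actual classes or behavior. The early return on a second `build()` call is likewise a design choice of this sketch.

```python
class ToyLearner:
    """Hypothetical stand-in for a learner with a deferred build step."""

    def __init__(self, lr=0.01, use_lr_schedule=False):
        # Only configuration is stored here; nothing heavy is created yet.
        self.lr = lr
        self.use_lr_schedule = use_lr_schedule
        self.module = None
        self.optimizers = {}
        self.lr_schedulers = {}
        self._is_built = False

    def build(self):
        """Create the module, its optimizer, and (optionally) an lr schedule."""
        if self._is_built:
            return  # assumption of this sketch: a second call is a no-op
        self.module = {"weight": 0.0}  # toy replacement for a real module
        self.optimizers["default"] = {"lr": self.lr}
        if self.use_lr_schedule:
            # Toy schedule: halve the learning rate at every step.
            self.lr_schedulers["default"] = lambda step: self.lr * (0.5 ** step)
        self._is_built = True

    def update(self, grad):
        """Apply one plain gradient step; fails if build() was skipped."""
        if not self._is_built:
            raise RuntimeError("Call build() before using the learner.")
        self.module["weight"] -= self.optimizers["default"]["lr"] * grad
        return self.module["weight"]


learner = ToyLearner(lr=0.1)

# Using the learner before build() is rejected.
try:
    learner.update(1.0)
    premature_error = None
except RuntimeError as err:
    premature_error = str(err)

# After build(), the module and optimizer exist and updates work.
learner.build()
new_weight = learner.update(1.0)
```

Deferring construction to `build()` keeps the constructor cheap and lets the caller decide when the module and optimizers are actually allocated, which is the reason the call must precede any other use.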