trinity.trainer.verl.config module

veRL 0.8 configuration builder for Trinity-RFT.

This module provides build_verl_config(), the single entry point that converts Trinity’s Config (global_config) into the DictConfig required by veRL 0.8’s ActorRolloutRefWorker and TrainingWorker.

Design principle (P1):
  • VERLTrainer uses global_config for all Trinity-level logic.

  • build_verl_config() is called once to produce the minimal DictConfig needed at the Worker/Engine boundary.

  • Only fields that veRL workers/engines actually consume are included.

The DictConfig structure must match what ActorRolloutRefWorker.__init__ and ActorRolloutRefWorker.init_model() expect. Every nested section that corresponds to a BaseConfig subclass must contain a _target_ field holding the fully qualified Python class path. This is because omega_conf_to_dataclass(), when called without a dataclass_type argument (Mode 1), uses hydra.utils.instantiate(), which requires _target_ for recursive instantiation.
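To illustrate why each nested section needs its own _target_, here is a minimal stdlib-only sketch of target-driven recursive instantiation. It is not hydra's implementation: the instantiate() helper is hypothetical, types.SimpleNamespace stands in for the veRL config dataclasses, and the field names and path are made up for the example.

```python
import importlib
from typing import Any


def instantiate(node: Any) -> Any:
    """Recursively build objects from dicts that carry a ``_target_`` key.

    Hypothetical helper that mimics the idea only; the real code path
    goes through hydra.utils.instantiate().
    """
    if not isinstance(node, dict):
        return node
    # Build children first so nested sections become objects as well.
    kwargs = {k: instantiate(v) for k, v in node.items() if k != "_target_"}
    target = node.get("_target_")
    if target is None:
        # Without _target_ nothing says which class to build, so the
        # section stays a plain dict.
        return kwargs
    module_path, _, class_name = target.rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(**kwargs)


# Stand-in section: SimpleNamespace replaces the real config classes.
section = {
    "_target_": "types.SimpleNamespace",
    "path": "/hypothetical/model",
    "optim": {"_target_": "types.SimpleNamespace", "lr": 1e-6},
    "extra": {"note": "no _target_ here"},
}

built = instantiate(section)
```

In this sketch, built.optim is an object because its section names a target, while built.extra stays a plain dict because it does not.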

Config sections and their target types:

    config.model → HFModelConfig
    config.actor → FSDPActorConfig | McoreActorConfig
    config.ref → FSDPActorConfig | McoreActorConfig (subset)
    config.rollout → RolloutConfig
    config.critic → FSDPCriticConfig | McoreCriticConfig
    config.global_profiler → dict (plain dict, not a dataclass)
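The mapping above can be written as a small lookup. This is a sketch under assumptions: the section_targets() helper and the backend names "fsdp" and "megatron" are hypothetical, and it assumes the FSDP and Mcore variants are chosen by training backend, which this page does not state. Class names are given bare; the real _target_ values must be fully qualified paths.

```python
from typing import Dict


def section_targets(backend: str) -> Dict[str, str]:
    """Return the target type name for each config section.

    Hypothetical helper; ``backend`` values are assumed labels.
    """
    if backend not in ("fsdp", "megatron"):
        raise ValueError(f"unknown backend: {backend!r}")
    prefix = "FSDP" if backend == "fsdp" else "Mcore"
    return {
        "model": "HFModelConfig",
        "actor": f"{prefix}ActorConfig",
        # The reference model reuses the actor config type (a subset of fields).
        "ref": f"{prefix}ActorConfig",
        "rollout": "RolloutConfig",
        "critic": f"{prefix}CriticConfig",
        # Plain dict, not a dataclass, so it needs no _target_.
        "global_profiler": "dict",
    }


fsdp_targets = section_targets("fsdp")
mcore_targets = section_targets("megatron")
```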

trinity.trainer.verl.config.build_verl_config(global_config: Config) → DictConfig

Build the veRL 0.8 DictConfig from Trinity’s global Config.

This produces the minimal DictConfig that ActorRolloutRefWorker and TrainingWorker need. All Trinity-level logic (algorithm, advantage, KL penalty, etc.) stays in VERLTrainer using global_config.

The resulting DictConfig has the following top-level keys:
  • model: HFModelConfig fields

  • actor: FSDPActorConfig or McoreActorConfig fields

  • ref: actor-style config for reference model

  • rollout: RolloutConfig fields

  • critic: CriticConfig fields (for VERLTrainer._init_workers)

  • global_profiler: plain dict
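A quick sanity check on the result could look like the following sketch. The missing_top_level_keys() helper is hypothetical and not part of this module; it works on any mapping, and a plain dict stands in here for the DictConfig returned by build_verl_config().

```python
from typing import Any, List, Mapping

# Top-level keys listed in the description above.
EXPECTED_TOP_LEVEL_KEYS = (
    "model",
    "actor",
    "ref",
    "rollout",
    "critic",
    "global_profiler",
)


def missing_top_level_keys(cfg: Mapping[str, Any]) -> List[str]:
    """Return the expected top-level keys that are absent from ``cfg``."""
    return [key for key in EXPECTED_TOP_LEVEL_KEYS if key not in cfg]


# Stand-in for a built config with some sections left out.
partial = {"model": {}, "actor": {}, "rollout": {}}
missing = missing_top_level_keys(partial)
```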