trinity.trainer.verl.monkey_patch module

trinity.trainer.verl.monkey_patch.apply_monkey_patch(model: PreTrainedModel, ulysses_sp_size: int = 1, use_remove_padding: bool = True, use_fused_kernels: bool = False, fused_kernels_backend: str = None, use_tiled_mlp: bool = False, tiled_mlp_shards: int = 4)

Apply monkey patches to the model for Ulysses sequence parallelism, fused kernels, and tiled MLP.

The last step of this function patches the model's forward function for fused kernels. For a model that does not support fused kernels, return right after its own patch so that this last step is not reached.
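The control flow described above (a model-specific patch, an early return for models without fused-kernel support, then a final forward patch) can be illustrated with a minimal sketch. Everything in it is hypothetical: `ToyModel`, `ToyModelNoFused`, `supports_fused`, and `toy_apply_patch` are illustration-only names, and the code shows only the general monkey-patching pattern, not the actual body of `apply_monkey_patch`.

```python
class ToyModel:
    """Hypothetical stand-in for a model class that supports fused kernels."""
    supports_fused = True

    def forward(self, x):
        return [v * 2 for v in x]


class ToyModelNoFused:
    """Hypothetical stand-in for a model class without fused-kernel support."""
    supports_fused = False

    def forward(self, x):
        return [v * 2 for v in x]


def toy_apply_patch(model, use_fused_kernels=False):
    """Toy sketch of the control flow only; not the real apply_monkey_patch."""
    cls = type(model)

    # Step 1: model-specific patch (here we only record that it happened).
    cls.patched = True

    # Models without fused-kernel support return before the final step.
    if not cls.supports_fused:
        return

    # Final step: replace the class's forward with a wrapper.
    if use_fused_kernels:
        original_forward = cls.forward

        def fused_forward(self, x):
            # A real replacement would call a fused kernel; this one only
            # marks the instance and delegates to the original forward.
            self.used_fused = True
            return original_forward(self, x)

        cls.forward = fused_forward


model = ToyModel()
toy_apply_patch(model, use_fused_kernels=True)
result = model.forward([1, 2])

other = ToyModelNoFused()
toy_apply_patch(other, use_fused_kernels=True)
other_result = other.forward([3])
```

Because the patch is applied to the class rather than to the instance, every instance of `ToyModel` sees the replaced `forward`, while `ToyModelNoFused` keeps its original one.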

Parameters:
  • model – The model to apply the monkey patch to.

  • ulysses_sp_size – The Ulysses sequence-parallel size.

  • use_remove_padding – Whether to enable padding removal.

  • use_fused_kernels – Whether to use fused kernels.

  • fused_kernels_backend – The backend to use for fused kernels.

  • use_tiled_mlp – Whether to use TiledMLP for memory-efficient MLP computation.

  • tiled_mlp_shards – Number of shards for TiledMLP (more shards lower memory use and run slightly slower).
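The trade-off behind tiled_mlp_shards can be illustrated with a small, library-free sketch. It assumes that tiling splits the input along the sequence axis and that the MLP acts on each token independently, so processing shard by shard gives the same output while only one shard's intermediate values exist at a time. `toy_mlp` and `tiled_apply` are hypothetical names for illustration; this is not the TiledMLP implementation used by the patch, which would also need to handle the backward pass.

```python
def toy_mlp(tokens):
    """Hypothetical per-token MLP: each token is a list of floats."""
    return [[max(0.0, 2.0 * v + 1.0) for v in tok] for tok in tokens]


def tiled_apply(fn, tokens, num_shards):
    """Apply fn shard by shard along the sequence axis and concatenate."""
    n = len(tokens)
    # Ceiling division, with a floor of 1 so the range step is never zero.
    shard_len = max(1, -(-n // num_shards))
    out = []
    for start in range(0, n, shard_len):
        # Only this shard's intermediates are alive during the call.
        out.extend(fn(tokens[start:start + shard_len]))
    return out


tokens = [[float(i), -float(i)] for i in range(10)]
full = toy_mlp(tokens)
tiled = tiled_apply(toy_mlp, tokens, num_shards=4)
```

In this sketch `tiled` equals `full`; a larger `num_shards` means smaller shards (lower peak memory) at the cost of more, smaller calls.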