trinity.common.config_validator module
- class trinity.common.config_validator.ConfigValidator
Bases: ABC
Abstract base class for configuration validators.
Each validator is responsible for checking and potentially modifying specific aspects of the global configuration to ensure validity, set defaults, or handle deprecated settings.
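The pattern described above can be sketched as follows. This is a minimal illustration, assuming each validator exposes a `validate(config)` method as the subclasses on this page do. `DemoConfig`, `ModeValidator`, `run_validators`, and the list of supported modes are hypothetical stand-ins, not Trinity's actual code.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class DemoConfig:
    """Hypothetical stand-in for the real global Config object."""
    mode: str = ""


class ConfigValidator(ABC):
    """Illustrative base class: each validator checks one aspect of the config."""

    @abstractmethod
    def validate(self, config: DemoConfig) -> None:
        """Check, and possibly modify, `config` in place."""


class ModeValidator(ConfigValidator):
    """Hypothetical validator that sets a default and rejects bad values."""

    SUPPORTED = ("explore", "train", "both")  # assumed values, for illustration

    def validate(self, config: DemoConfig) -> None:
        if not config.mode:
            config.mode = "both"  # fill in a default
        if config.mode not in self.SUPPORTED:
            raise ValueError(f"Invalid mode: {config.mode!r}")


def run_validators(config, validators):
    """Apply each validator in order; later ones see earlier modifications."""
    for validator in validators:
        validator.validate(config)
    return config
```

Because validators mutate the configuration in place, the order in which they run matters: a validator that sets a default must run before any validator that reads it.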
- class trinity.common.config_validator.DeprecatedConfigValidator
Bases: ConfigValidator
Validator for handling deprecated configuration options.
Issues warnings when deprecated configuration parameters are used and suggests their replacements.
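A minimal sketch of this kind of deprecation handling, using the standard `warnings` module. The option names in `DEPRECATED_OPTIONS` and the helper `warn_deprecated` are invented for illustration; the options Trinity actually deprecates are not listed on this page.

```python
import warnings

# Hypothetical mapping of deprecated option names to their replacements.
DEPRECATED_OPTIONS = {"old_batch_size": "train_batch_size"}


def warn_deprecated(options: dict) -> list:
    """Warn for each deprecated key in `options`; return the messages issued."""
    messages = []
    for old_name, new_name in DEPRECATED_OPTIONS.items():
        if old_name in options:
            message = f"`{old_name}` is deprecated; use `{new_name}` instead."
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            messages.append(message)
    return messages
```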
- class trinity.common.config_validator.GlobalConfigValidator
Bases: ConfigValidator
Validator for global configuration settings.
Handles validation of the main operating mode, sets up checkpoint directories, and configures logging paths. Manages experiment naming conflicts by appending timestamps to avoid overwriting existing experiments.
- validate(config: Config) -> None
Validate global configuration settings and set up directory structure.
Validates that the mode is one of the supported values
Creates absolute checkpoint paths and handles experiment naming conflicts
Sets up the log directory path
- Parameters:
config – The global configuration object to validate.
- Raises:
ValueError – If an invalid mode is specified.
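The mode check and the timestamp-based conflict handling could look roughly like the sketch below. The function name `resolve_experiment_dir`, the set of supported modes, and the timestamp format are assumptions made for illustration, not the validator's actual behavior.

```python
import os
from datetime import datetime

SUPPORTED_MODES = {"explore", "train", "both"}  # assumed set, for illustration


def resolve_experiment_dir(root: str, name: str, mode: str) -> str:
    """Return an absolute experiment directory that does not already exist."""
    if mode not in SUPPORTED_MODES:
        raise ValueError(f"Invalid mode: {mode!r}")
    path = os.path.abspath(os.path.join(root, name))
    if os.path.exists(path):
        # Name conflict: append a timestamp instead of overwriting.
        stamp = datetime.now().strftime("%Y%m%d%H%M%S")
        path = f"{path}_{stamp}"
    return path
```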
- class trinity.common.config_validator.RayClusterConfigValidator
Bases: ConfigValidator
Validator for Ray cluster configuration.
Handles Ray cluster setup including namespace configuration, automatic detection of cluster resources (node count and GPUs per node), and GPU allocation validation based on the current operating mode and model requirements.
- validate(config: Config) -> None
Validate and configure Ray cluster settings.
Sets the Ray namespace if not provided
Skips validation if Tinker is enabled
Automatically detects cluster information if not provided
Validates GPU allocation based on mode and model requirements
- Parameters:
config – The global configuration object to validate.
- Raises:
RuntimeError – If no alive nodes are found in the Ray cluster.
ValueError – If GPU allocation requirements cannot be satisfied.
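To make the GPU allocation check concrete, here is a simplified sketch. The allocation rule is an assumption for illustration: rollout GPUs are reserved first and, in "both" mode, at least one GPU must remain for the trainer. The function `trainer_gpu_count` and the "explore" mode name are hypothetical; the real validator's rules may differ and also depend on model requirements.

```python
def trainer_gpu_count(node_num: int, gpu_per_node: int,
                      rollout_gpus: int, mode: str) -> int:
    """Return the number of GPUs left for the trainer (hypothetical rule)."""
    if node_num <= 0:
        raise RuntimeError("No alive nodes found in the cluster.")
    total_gpus = node_num * gpu_per_node
    if mode == "train":
        return total_gpus  # no rollout engines to reserve GPUs for
    if rollout_gpus > total_gpus:
        raise ValueError(
            f"Rollout needs {rollout_gpus} GPUs but only {total_gpus} exist."
        )
    if mode == "both" and rollout_gpus == total_gpus:
        raise ValueError("No GPUs left for the trainer.")
    return total_gpus - rollout_gpus
```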
- class trinity.common.config_validator.AlgorithmConfigValidator
Bases: ConfigValidator
Validator for algorithm-specific configuration.
Handles algorithm type validation, sets default configuration parameters, validates function registry entries, and manages deprecated optimizer settings.
- validate(config: Config) -> None
Validate and configure algorithm-specific settings.
Validates the algorithm type and runs algorithm-specific validation
Sets default configuration values for various algorithm components
Validates and configures function registry entries (loss functions, etc.)
Handles deprecated optimizer configuration parameters
- Parameters:
config – The global configuration object to validate.
- Raises:
ValueError – If invalid algorithm types or function names are specified.
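Validating a function name against a registry can be sketched as below. `LOSS_FN_REGISTRY`, its entries, and `resolve_loss_fn` are hypothetical; Trinity's own registries and their contents are not documented on this page.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry; names and functions are illustrative only.
LOSS_FN_REGISTRY: Dict[str, Callable[[float], float]] = {
    "squared": lambda x: x * x,
    "absolute": abs,
}


def resolve_loss_fn(name: Optional[str], default: str = "squared"):
    """Look up a registered function, falling back to a default name."""
    name = name or default
    if name not in LOSS_FN_REGISTRY:
        available = sorted(LOSS_FN_REGISTRY)
        raise ValueError(f"Unknown loss function {name!r}; available: {available}")
    return LOSS_FN_REGISTRY[name]
```

Listing the available names in the error message lets a user fix a typo in the configuration without reading the source.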
- class trinity.common.config_validator.ModelConfigValidator
Bases: ConfigValidator
Validator for model configuration settings.
Handles model path validation, chat template loading, Tinker-specific validation, and model length parameter validation including prompt/response token limits.
- validate(config: Config) -> None
Validate and configure model-specific settings.
Sets critic model path to actor model path if not specified
Loads chat templates from file if path is provided
Validates Tinker-specific configuration if enabled
Validates and sets model length parameters (max_model_len, max_prompt_tokens, etc.)
- Parameters:
config – The global configuration object to validate.
- Raises:
ValueError – If chat template file cannot be read, model length constraints are violated, or Tinker configuration is invalid.
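One plausible form of the model length check is sketched below. The constraint used here (prompt plus response tokens must fit within `max_model_len`), the even split when neither limit is given, and the helper name `resolve_token_limits` are all assumptions for illustration; `max_response_tokens` is also an assumed parameter name.

```python
from typing import Optional, Tuple


def resolve_token_limits(
    max_model_len: int,
    max_prompt_tokens: Optional[int] = None,
    max_response_tokens: Optional[int] = None,
) -> Tuple[int, int]:
    """Fill in missing limits, then check that they fit in max_model_len."""
    if max_prompt_tokens is None and max_response_tokens is None:
        max_prompt_tokens = max_model_len // 2  # arbitrary illustrative default
    if max_prompt_tokens is None:
        max_prompt_tokens = max_model_len - max_response_tokens
    if max_response_tokens is None:
        max_response_tokens = max_model_len - max_prompt_tokens
    if max_prompt_tokens <= 0 or max_response_tokens <= 0:
        raise ValueError("Token limits must be positive.")
    if max_prompt_tokens + max_response_tokens > max_model_len:
        raise ValueError("Prompt and response limits exceed max_model_len.")
    return max_prompt_tokens, max_response_tokens
```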
- class trinity.common.config_validator.ExplorerConfigValidator
Bases: ConfigValidator
Validator for explorer configuration settings.
Handles rollout model configuration inheritance, auxiliary model validation, over-rollout ratio validation, and LoRA configuration processing.
- validate(config: Config) -> None
Validate and configure explorer-specific settings.
Inherits model configuration from the global model config to rollout models
Validates auxiliary model configurations
Validates over-rollout ratio settings and compatibility with sync style
Processes LoRA configurations including dummy LoRA creation
- Parameters:
config – The global configuration object to validate.
- Raises:
ValueError – If auxiliary models lack model paths, over-rollout ratio is invalid, or multiple LoRA adapters are configured.
- class trinity.common.config_validator.SynchronizerConfigValidator
Bases: ConfigValidator
Validator for synchronizer configuration settings.
Handles synchronizer namespace configuration and validates NCCL synchronization compatibility with different modes and features.
- validate(config: Config) -> None
Validate and configure synchronizer settings.
Sets the Ray namespace for the synchronizer
Sets the explorer world size based on rollout GPU count
Disables NCCL synchronization for incompatible modes and features
- Parameters:
config – The global configuration object to validate.
- class trinity.common.config_validator.IntervalConfigValidator
Bases: ConfigValidator
Validator for interval configuration settings.
Validates synchronization and evaluation intervals, ensuring that evaluation intervals are multiples of synchronization intervals when applicable.
- validate(config: Config) -> None
Validate interval configuration settings.
Ensures synchronization interval is positive
Adjusts evaluation interval to be a multiple of sync interval when needed
- Parameters:
config – The global configuration object to validate.
- Raises:
AssertionError – If synchronization interval is not positive.
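The interval adjustment can be illustrated with a small helper. The rounding direction is an assumption: this sketch rounds the evaluation interval down to a multiple of the sync interval, with a minimum of one sync interval. The name `align_eval_interval` is hypothetical.

```python
def align_eval_interval(sync_interval: int, eval_interval: int) -> int:
    """Return eval_interval adjusted to a multiple of sync_interval."""
    assert sync_interval > 0, "sync_interval must be positive"
    if eval_interval % sync_interval == 0:
        return eval_interval  # already aligned
    # Round down to a multiple, but never below one sync interval (assumed rule).
    return max(eval_interval // sync_interval, 1) * sync_interval
```

For example, with a sync interval of 10, an evaluation interval of 25 becomes 20 and an evaluation interval of 5 becomes 10.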
- class trinity.common.config_validator.MonitorConfigValidator
Bases: ConfigValidator
Validator for monitor configuration settings.
Validates monitor type, sets default arguments, and configures monitor cache directory.
- validate(config: Config) -> None
Validate and configure monitor settings.
Validates that the monitor type is supported
Sets default monitor arguments if not provided
Creates the monitor cache directory
- Parameters:
config – The global configuration object to validate.
- Raises:
ValueError – If an invalid monitor type is specified.
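The three steps above can be sketched as follows. The set of supported monitor types, the cache subdirectory name, and the helper `prepare_monitor` are hypothetical placeholders chosen for illustration.

```python
import os
from typing import Optional, Tuple

SUPPORTED_MONITORS = {"tensorboard", "wandb"}  # hypothetical set


def prepare_monitor(monitor_type: str, monitor_args: Optional[dict],
                    cache_root: str) -> Tuple[dict, str]:
    """Validate the monitor type, default its args, and create a cache dir."""
    if monitor_type not in SUPPORTED_MONITORS:
        raise ValueError(f"Invalid monitor type: {monitor_type!r}")
    args = dict(monitor_args) if monitor_args else {}  # default: no extra args
    cache_dir = os.path.join(cache_root, "monitor")  # assumed directory name
    os.makedirs(cache_dir, exist_ok=True)  # safe to call repeatedly
    return args, cache_dir
```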
- class trinity.common.config_validator.BufferConfigValidator
Bases: ConfigValidator
Validator for buffer configuration settings.
Handles train batch size validation, buffer directory setup, tokenizer configuration, and comprehensive validation of explorer/trainer input configurations including tasksets, experience buffers, and data pipelines.
- validate(config: Config) -> None
Validate and configure buffer settings.
Sets train batch size based on mode and algorithm configuration
Creates buffer cache directory
Configures pad token ID using tokenizer
Validates explorer input configurations (tasksets, selectors)
Validates trainer input configurations (experience buffers, auxiliary buffers)
Validates data processor pipeline configurations
- Parameters:
config – The global configuration object to validate.
- Raises:
ValueError – If required buffer configurations are missing or invalid.
RuntimeError – If buffer directory creation fails.
- class trinity.common.config_validator.TrainerConfigValidator
Bases: ConfigValidator
Validator for trainer configuration settings.
Handles trainer type validation, configuration merging, and parameter validation for different trainer implementations (veRL, Tinker, etc.).
- validate(config: Config) -> None
Validate and configure trainer settings.
Validates trainer type and handles configuration for different trainer types
Merges trainer configuration with schema defaults
Validates save checkpoint strategy options
Synchronizes trainer configuration with global config
- Parameters:
config – The global configuration object to validate.
- Raises:
ValueError – If trainer type is invalid, deprecated config path is used, or save checkpoint strategy is invalid.
- class trinity.common.config_validator.GPUMemoryValidator
Bases: ConfigValidator
Validator for GPU memory settings.
Checks GPU memory usage and suggests changes to configuration settings.
Note
This validator is disabled when ignore_validator_suggestions is set to True.
The coefficients in this validator's memory-estimation formulas were roughly estimated with the PyTorch profiler (torch.profiler) and may not be accurate.
- validate(config: Config) -> None
Validate GPU memory usage based on the provided configuration.
Skips validation if suggestions are disabled or if the model's Tinker mode is enabled. Only runs memory validation for "train" or "both" modes.
- Parameters:
config (Config) – The global configuration object.
- validate_trainer_memory_usage(config: Config) -> None
Perform GPU memory validation for trainer components.
Detects CUDA availability and delegates to FSDP-specific checks if applicable.
- Parameters:
config (Config) – The global configuration object.
- fsdp_memory_check(config: Config) -> None
Perform comprehensive FSDP memory validation for actor and critic models.
Estimates total GPU memory usage including parameters, optimizer states, and activations. Issues warnings and suggestions if usage exceeds safe thresholds.
- Parameters:
config (Config) – The global configuration object.
- Raises:
ValueError – If estimated memory usage exceeds safe limits and suggestions are not bypassed.
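To give a sense of what such an estimate involves, the sketch below computes a rough per-GPU figure under full sharding. The byte counts per parameter (fp32 weights, fp32 gradients, and two Adam moments), the 0.9 safety ratio, and both function names are assumptions for illustration; they are not the coefficients or thresholds this validator uses.

```python
def estimate_fsdp_memory_gib(num_params: float, world_size: int,
                             activation_gib: float = 0.0) -> float:
    """Rough per-GPU memory estimate in GiB (illustrative coefficients)."""
    # Assumed cost per parameter: 4 B weights + 4 B gradients + 8 B Adam moments.
    bytes_per_param = 4 + 4 + 8
    # Full sharding splits parameter-related state evenly across GPUs.
    sharded_bytes = num_params * bytes_per_param / world_size
    # Activations are not sharded in this simplified model.
    return sharded_bytes / 1024**3 + activation_gib


def check_memory(estimated_gib: float, capacity_gib: float,
                 safety_ratio: float = 0.9) -> None:
    """Raise ValueError when the estimate exceeds the assumed safe threshold."""
    limit = capacity_gib * safety_ratio
    if estimated_gib > limit:
        raise ValueError(
            f"Estimated {estimated_gib:.1f} GiB exceeds safe limit {limit:.1f} GiB."
        )
```

A real check would also account for factors such as sequence length, micro-batch size, and offloading, all of which this sketch ignores.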