trinity.utils.metrics module
Unified metrics aggregation utilities for Trinity-RFT.
Metric keys may carry an aggregation-type suffix in the form name:agg.
Supported suffixes: :mean, :sum, :max, :min, :last.
Keys without a suffix default to mean aggregation.
- class trinity.utils.metrics.AggType(*values)
Bases: str, Enum
- MEAN = 'mean'
- SUM = 'sum'
- MAX = 'max'
- MIN = 'min'
- LAST = 'last'
- trinity.utils.metrics.group_numeric_metrics(metric_dicts: List[Dict[str, float]]) -> Dict[Tuple[str, AggType], List[float]]
- trinity.utils.metrics.group_metrics_by_canonical_key(metric_dicts: List[Dict[str, float]]) -> Dict[str, Tuple[AggType, List[float]]]
- trinity.utils.metrics.parse_metric_key(key: str) -> Tuple[str, AggType]
Parse a metric key into (name, aggregation_type).
Examples
"reward" -> ("reward", AggType.MEAN)
"experience_count:sum" -> ("experience_count", AggType.SUM)
"model_version:last" -> ("model_version", AggType.LAST)
"some:unknown_suffix" -> ("some:unknown_suffix", AggType.MEAN)
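The parsing rule in the examples above can be sketched as follows. This is a minimal, self-contained illustration and not the module's actual source: parse_metric_key_sketch is a hypothetical name, the AggType enum is re-declared locally so the snippet runs without Trinity-RFT installed, and splitting on the last colon is an assumption.

```python
from enum import Enum
from typing import Tuple


class AggType(str, Enum):
    """Local copy of the documented enum, so the sketch runs standalone."""
    MEAN = "mean"
    SUM = "sum"
    MAX = "max"
    MIN = "min"
    LAST = "last"


def parse_metric_key_sketch(key: str) -> Tuple[str, AggType]:
    """Hypothetical re-implementation: split 'name:agg', fall back to MEAN."""
    # Assumption: the suffix is whatever follows the last colon.
    name, sep, suffix = key.rpartition(":")
    if sep and name:
        try:
            return name, AggType(suffix)
        except ValueError:
            pass  # unrecognized suffix: keep the whole key as the name
    return key, AggType.MEAN


print(parse_metric_key_sketch("experience_count:sum"))
print(parse_metric_key_sketch("some:unknown_suffix"))
```

Note that an unrecognized suffix is not stripped: the full key, colon included, becomes the metric name.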
- trinity.utils.metrics.aggregate_metrics(metric_dicts: List[Dict[str, float]], prefix: str = '', default_output_stats: List[str] | None = None) -> Dict[str, float]
Aggregate a list of metric dictionaries respecting per-key aggregation types.
For AggType.MEAN, outputs {prefix}/{name}/mean, /max, and /min (controlled by default_output_stats).
For AggType.SUM, outputs {prefix}/{name}/sum.
For AggType.MAX, outputs {prefix}/{name}/max.
For AggType.MIN, outputs {prefix}/{name}/min.
For AggType.LAST, outputs {prefix}/{name}/last.
- Parameters:
metric_dicts: List of flat metric dictionaries (values must be numeric).
prefix: Optional prefix, prepended as {prefix}/{name}/....
default_output_stats: Stats to output for MEAN metrics. Defaults to ["mean", "max", "min"].
- Returns:
Flat dictionary of aggregated metrics ready for monitor logging.
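To make the output key layout concrete, here is a rough sketch of the documented behavior. It does not call Trinity-RFT: aggregate_sketch is a hypothetical stand-in. Dropping the leading slash when prefix is empty is an assumption, since the documentation does not say how an empty prefix is rendered.

```python
from statistics import fmean
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# One reducer per documented aggregation type.
REDUCERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": fmean,
    "sum": sum,
    "max": max,
    "min": min,
    "last": lambda values: values[-1],
}


def aggregate_sketch(
    metric_dicts: List[Dict[str, float]],
    prefix: str = "",
    default_output_stats: Optional[List[str]] = None,
) -> Dict[str, float]:
    """Hypothetical stand-in illustrating the documented output keys."""
    stats = default_output_stats or ["mean", "max", "min"]
    grouped: Dict[Tuple[str, str], List[float]] = {}
    for metrics in metric_dicts:
        for key, value in metrics.items():
            name, sep, agg = key.rpartition(":")
            if not (sep and name and agg in REDUCERS):
                name, agg = key, "mean"  # no or unknown suffix: mean
            grouped.setdefault((name, agg), []).append(float(value))

    out: Dict[str, float] = {}
    for (name, agg), values in grouped.items():
        # Assumption: with an empty prefix the key starts at the name.
        base = f"{prefix}/{name}" if prefix else name
        # MEAN metrics fan out into several stats; others emit one key.
        for stat in (stats if agg == "mean" else [agg]):
            out[f"{base}/{stat}"] = REDUCERS[stat](values)
    return out


batch = [
    {"reward": 1.0, "experience_count:sum": 2},
    {"reward": 3.0, "experience_count:sum": 4},
]
print(aggregate_sketch(batch, prefix="rollout"))
```

A plain key such as reward produces three output entries by default, while a suffixed key such as experience_count:sum produces exactly one.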
- trinity.utils.metrics.aggregate_eval_metrics(metric_dicts: List[Dict[str, float]], prefix: str = '', output_stats: List[str] | None = None, detailed_stats: bool = False) -> Dict[str, float]
Aggregate eval metrics with optional detailed statistics.
- For MEAN metrics:
If detailed_stats=True: output mean/max/min/std per the output_stats list.
If detailed_stats=False: output only the mean value as {prefix}/{name}.
For non-MEAN metrics: same behavior as aggregate_metrics.
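The two modes for MEAN metrics can be sketched for a single, already grouped metric. This is an illustration under stated assumptions, not the library's code: eval_mean_outputs_sketch is a hypothetical helper, the default stats list and the {prefix}/{name}/{stat} key form in detailed mode are guesses modeled on aggregate_metrics, and the population standard deviation is used because the documentation does not say which variant applies.

```python
from statistics import fmean, pstdev
from typing import Callable, Dict, List, Optional, Sequence

REDUCERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": fmean,
    "max": max,
    "min": min,
    "std": pstdev,  # assumption: population standard deviation
}


def eval_mean_outputs_sketch(
    name: str,
    values: List[float],
    prefix: str = "",
    output_stats: Optional[List[str]] = None,
    detailed_stats: bool = False,
) -> Dict[str, float]:
    """Hypothetical helper: outputs for one MEAN-typed eval metric."""
    base = f"{prefix}/{name}" if prefix else name
    if not detailed_stats:
        # Compact mode: a single value with no stat suffix on the key.
        return {base: fmean(values)}
    stats = output_stats or ["mean", "max", "min", "std"]  # assumed default
    return {f"{base}/{stat}": REDUCERS[stat](values) for stat in stats}


print(eval_mean_outputs_sketch("accuracy", [0.5, 1.0], prefix="eval"))
print(
    eval_mean_outputs_sketch(
        "accuracy", [0.5, 1.0], prefix="eval",
        output_stats=["mean", "max"], detailed_stats=True,
    )
)
```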
- trinity.utils.metrics.aggregate_run_level_metrics(metric_dicts: List[Dict[str, float]]) -> Dict[str, float]
Aggregate experience-level metrics into a single run-level metric dict.
Unlike batch-level aggregation, this preserves the original key format (with the :agg suffix if present) so that downstream task/batch aggregation can still see the aggregation type annotation.
- Aggregation rules:
MEAN keys: averaged across experiences
SUM keys: summed across experiences
MAX keys: max across experiences
MIN keys: min across experiences
LAST keys: last value
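The rules above, together with the key-preserving behavior, might look like the following sketch. It is not taken from the Trinity-RFT source: run_level_sketch is a hypothetical name, and treating unknown suffixes as MEAN follows the parse_metric_key examples rather than anything stated for this function.

```python
from statistics import fmean
from typing import Callable, Dict, List, Sequence

REDUCERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": fmean,
    "sum": sum,
    "max": max,
    "min": min,
    "last": lambda values: values[-1],
}


def run_level_sketch(metric_dicts: List[Dict[str, float]]) -> Dict[str, float]:
    """Hypothetical stand-in: reduce per-experience metrics, keep keys as-is."""
    grouped: Dict[str, List[float]] = {}
    for metrics in metric_dicts:
        for key, value in metrics.items():
            grouped.setdefault(key, []).append(float(value))

    out: Dict[str, float] = {}
    for key, values in grouped.items():
        name, sep, suffix = key.rpartition(":")
        known = bool(sep and name and suffix in REDUCERS)
        agg = suffix if known else "mean"
        # The output key is the original key, suffix included, so a later
        # aggregation stage can still read the annotation.
        out[key] = REDUCERS[agg](values)
    return out


experiences = [
    {"reward": 1.0, "experience_count:sum": 1, "model_version:last": 3},
    {"reward": 2.0, "experience_count:sum": 1, "model_version:last": 4},
]
print(run_level_sketch(experiences))
```

Because the suffix survives, the result can be passed on to a batch-level aggregator, which would then sum experience_count again rather than average it.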