trinity.explorer package
Subpackages
Submodules
- trinity.explorer.explorer module
  - Explorer
    - Explorer.__init__()
    - Explorer.setup_weight_sync_group()
    - Explorer.setup_model_level_weight_sync_group()
    - Explorer.prepare()
    - Explorer.get_weight()
    - Explorer.explore()
    - Explorer.explore_step()
    - Explorer.finish_current_steps()
    - Explorer.need_sync()
    - Explorer.need_eval()
    - Explorer.eval()
    - Explorer.benchmark()
    - Explorer.save_checkpoint()
    - Explorer.sync_weight()
    - Explorer.shutdown()
    - Explorer.is_alive()
    - Explorer.serve()
    - Explorer.get_actor()
- trinity.explorer.rollout_coordinator module
  - BatchLifecycleState
  - BatchState
  - RolloutCoordinator
    - RolloutCoordinator.__init__()
    - RolloutCoordinator.prepare()
    - RolloutCoordinator.shutdown()
    - RolloutCoordinator.submit_batch()
    - RolloutCoordinator.finalize_train_batch()
    - RolloutCoordinator.finalize_eval_batch()
    - RolloutCoordinator.abort_batch()
    - RolloutCoordinator.process_experiences()
    - RolloutCoordinator.get_actor()
- trinity.explorer.scheduler module
  - TaskWrapper
  - CompletedTaskResult
  - RunningTaskState
  - bootstrap_metric()
  - calculate_task_level_metrics()
  - RunnerWrapper
  - sort_batch_id()
  - Scheduler
    - Scheduler.__init__()
    - Scheduler.task_done_callback()
    - Scheduler.discard_completed_results()
    - Scheduler.start()
    - Scheduler.stop()
    - Scheduler.schedule()
    - Scheduler.dynamic_timeout()
    - Scheduler.drain_batch_payload_results()
    - Scheduler.get_payload_results()
    - Scheduler.get_statuses()
    - Scheduler.abort_batch()
    - Scheduler.has_step()
    - Scheduler.wait_all()
    - Scheduler.get_key_state()
    - Scheduler.get_runner_state()
    - Scheduler.get_all_state()
    - Scheduler.print_all_state()
- trinity.explorer.workflow_runner module
Module contents
Explorer package exports.
- class trinity.explorer.Explorer(config: Config) [source]
Bases: object
Responsible for exploring the taskset.
- async explore() -> str [source]
The timeline of the exploration process:

             | <--------------------------------- one period -------------------------------------> |
    explorer | <---------------- step_1 --------------> |                                           |
             |   | <---------------- step_2 --------------> |                                       |
             |      ...                                                                             |
             |          | <---------------- step_n ---------------> |                               |
             |                  | <---------------------- eval --------------------> | <-- sync --> |
             |--------------------------------------------------------------------------------------|
    trainer  | <-- idle --> | <-- step_1 --> | <-- step_2 --> | ... | <-- step_n --> | <-- sync --> |
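The ordering in the timeline above can be illustrated with a toy asyncio producer/consumer. This is only a sketch, not Trinity's implementation: `explorer`, `trainer`, `run_period`, and the queue hand-off are hypothetical names chosen for illustration, and the explorer steps run one after another here rather than overlapping as they do in the diagram.

```python
import asyncio


async def explorer(queue: asyncio.Queue, n_steps: int, log: list) -> None:
    # Hypothetical explorer: emits one batch per step, then evaluates.
    for step in range(1, n_steps + 1):
        await asyncio.sleep(0)  # stand-in for rollout work
        log.append(f"explorer: step_{step}")
        await queue.put(step)  # hand the finished batch to the trainer
    log.append("explorer: eval")
    await queue.put(None)  # sentinel: no more batches in this period


async def trainer(queue: asyncio.Queue, log: list) -> None:
    # Hypothetical trainer: idle until the first batch arrives,
    # then trains on each batch in arrival order.
    while True:
        step = await queue.get()
        if step is None:
            break
        log.append(f"trainer: step_{step}")


async def run_period(n_steps: int = 3) -> list:
    log: list = []
    queue: asyncio.Queue = asyncio.Queue()
    await asyncio.gather(explorer(queue, n_steps, log), trainer(queue, log))
    log.append("sync")  # both sides meet at the sync point that ends the period
    return log


if __name__ == "__main__":
    for event in asyncio.run(run_period(3)):
        print(event)
```

In this sketch the trainer can never start step k before the explorer has produced it, and the sync event always closes the period.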
- async get_weight(name: str) -> Tensor [source]
Get the weight of the loaded model (used for checkpoint weight updates).
- async serve() -> None [source]
Run the explorer in serving mode.
In serving mode, the explorer starts an OpenAI compatible server to handle requests. Agent applications can be deployed separately and interact with the explorer via the API.
    import openai

    client = openai.OpenAI(
        base_url=f"{explorer_server_url}/v1",
        api_key="EMPTY",
    )
    response = client.chat.completions.create(
        model=config.model.model_path,
        messages=[{"role": "user", "content": "Hello!"}],
    )
- async setup_model_level_weight_sync_group() [source]
Set up a process group for each model. Only used in serve mode.
- class trinity.explorer.RolloutCoordinator(config: Config, rollout_model: List[InferenceModel], auxiliary_models: List[List[InferenceModel]] | None = None) [source]
Bases: object
Owns scheduler-side batch state and exposes batch-level finalize APIs.
- __init__(config: Config, rollout_model: List[InferenceModel], auxiliary_models: List[List[InferenceModel]] | None = None) [source]
Create a coordinator with internally managed scheduler and pipeline.
- async abort_batch(batch_id: int | str, *, reason: str, keep_partial_results: bool = False) -> None [source]
Abort one batch and clean up its running and staged state.
- async finalize_eval_batch(batch_id: str, *, timeout: float | None = None) -> dict [source]
Finalize one eval batch and return aggregated eval metrics.
- async finalize_train_batch(batch_id: int, *, timeout: float | None = None) -> dict [source]
Finalize one train batch and return aggregated metrics.
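One way a caller might combine `finalize_train_batch` and `abort_batch` is sketched below. The real coordinator needs a full `Config` and inference models, so `FakeCoordinator` is a hypothetical stand-in that mirrors only the two documented signatures. `finalize_or_abort` is likewise a hypothetical helper, and the assumption that a timeout surfaces as `asyncio.TimeoutError` is not confirmed by this page.

```python
import asyncio


class FakeCoordinator:
    """Hypothetical stand-in mirroring two documented method signatures."""

    def __init__(self, delay: float) -> None:
        self.delay = delay  # simulated time needed to finish the batch
        self.aborted = []

    async def finalize_train_batch(self, batch_id, *, timeout=None):
        # Assumption: exceeding the timeout raises asyncio.TimeoutError.
        await asyncio.wait_for(asyncio.sleep(self.delay), timeout)
        return {"batch_id": batch_id}

    async def abort_batch(self, batch_id, *, reason, keep_partial_results=False):
        # Record the abort so the caller can inspect what was cleaned up.
        self.aborted.append((batch_id, reason, keep_partial_results))


async def finalize_or_abort(coordinator, batch_id, timeout):
    # Try to finalize; if the batch does not finish in time, abort it
    # so its running and staged state is released.
    try:
        return await coordinator.finalize_train_batch(batch_id, timeout=timeout)
    except asyncio.TimeoutError:
        await coordinator.abort_batch(batch_id, reason="finalize timed out")
        return None


if __name__ == "__main__":
    slow = FakeCoordinator(delay=0.2)
    print(asyncio.run(finalize_or_abort(slow, 2, timeout=0.01)))
    print(slow.aborted)
```

Whether partial results should be kept on abort depends on the training setup. The sketch leaves `keep_partial_results` at its documented default of `False`.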
- classmethod get_actor(config: Config, models: List, auxiliary_models: List) -> ActorHandle [source]
Init rollout coordinator for the task-event-completion path.