trinity.buffer.schema.formatter module
- class trinity.buffer.schema.formatter.ExperienceFormatter
Bases: ABC
- abstractmethod format(sample: Dict) -> Experience
Format a raw sample dict into an experience.
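The abstract interface above can be illustrated with a minimal, self-contained sketch. It does not import trinity: `Experience` here is a hypothetical two-field stand-in (the real class has its own fields), the base class is a local mirror of the documented signature, and `EchoFormatter` is an invented subclass used only for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict


@dataclass
class Experience:
    """Hypothetical stand-in; the real Experience class defines its own fields."""

    prompt: str
    response: str


class ExperienceFormatter(ABC):
    """Local mirror of the documented interface, for illustration only."""

    @abstractmethod
    def format(self, sample: Dict) -> Experience:
        """Format a raw sample dict into an experience."""


class EchoFormatter(ExperienceFormatter):
    """Hypothetical subclass that reads fixed 'input' and 'output' keys."""

    def format(self, sample: Dict) -> Experience:
        return Experience(prompt=sample["input"], response=sample["output"])


exp = EchoFormatter().format({"input": "Hello", "output": "Hi"})
```

Because `format` is declared with `abstractmethod`, instantiating the base class directly raises `TypeError`; only subclasses that implement `format` can be constructed.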
- class trinity.buffer.schema.formatter.TaskFormatter(config: StorageConfig)
Bases: object
Formatter for task data.
Example Input:
{ "input": "Hello", "output": "Hi" }
- __init__(config: StorageConfig)
- class trinity.buffer.schema.formatter.SFTFormatter(tokenizer_path: str, format_config: FormatConfig)
Formatter for SFT data, supporting both message list and plaintext formats.
Uses format_config.prompt_type to distinguish between 'messages' and 'plaintext'.
Example input of MESSAGES:
{ "messages": [ {"role": "user", "content": "Hello, how are you?"}, {"role": "assistant", "content": "I'm fine, thank you!"} ] }
Example input of PLAINTEXT:
{ "system_prompt_key": "system", "prompt_key": "prompt", "response_key": "response" }
- __init__(tokenizer_path: str, format_config: FormatConfig)
- format(sample: Dict) -> Experience
Format a raw sample dict into an experience.
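To make the relationship between the two SFT input styles concrete, here is a rough sketch of how a plaintext sample could be mapped onto a message list using the key names from the PLAINTEXT example. This is not the library's implementation: `plaintext_to_messages` and `KEYS` are hypothetical names, and treating the system prompt as optional is an assumption.

```python
from typing import Dict, List

# Hypothetical key mapping, mirroring the PLAINTEXT example above.
KEYS = {
    "system_prompt_key": "system",
    "prompt_key": "prompt",
    "response_key": "response",
}


def plaintext_to_messages(
    sample: Dict[str, str], keys: Dict[str, str]
) -> List[Dict[str, str]]:
    """Build a chat-style message list from a plaintext sample."""
    messages: List[Dict[str, str]] = []
    # Assumption: the system prompt field may be absent from a sample.
    system = sample.get(keys["system_prompt_key"])
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": sample[keys["prompt_key"]]})
    messages.append({"role": "assistant", "content": sample[keys["response_key"]]})
    return messages


sample = {"prompt": "Hello, how are you?", "response": "I'm fine, thank you!"}
messages = plaintext_to_messages(sample, KEYS)
```

With this sample, `messages` matches the MESSAGES example shown for this class, which suggests why a single formatter can plausibly serve both input styles.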
- class trinity.buffer.schema.formatter.DPOFormatter(tokenizer_path: str, format_config: FormatConfig)
Formatter for DPO data, supporting both plaintext and message list formats.
Example Input for PLAINTEXT:
{ "prompt": "What is your name?", "chosen": "My name is Assistant.", "rejected": "I don't have a name." }
Example Input for MESSAGES:
{ "messages": [ {"role": "user", "content": "What is your name?"} ], "chosen": [ {"role": "assistant", "content": "My name is Assistant."} ], "rejected": [ {"role": "assistant", "content": "I don't have a favorite color."} ] }
- __init__(tokenizer_path: str, format_config: FormatConfig)
- format(sample: Dict) -> Experience
Format a raw sample dict into an experience.
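The two DPO input shapes carry the same information, which a small sketch can show by normalizing a PLAINTEXT sample into the MESSAGES layout. The helper name `dpo_plaintext_to_messages` is hypothetical, and the role assignments (prompt as `user`, responses as `assistant`) are an assumption based on the examples above, not a description of what `DPOFormatter` does internally.

```python
from typing import Dict, List

Message = Dict[str, str]


def dpo_plaintext_to_messages(sample: Dict[str, str]) -> Dict[str, List[Message]]:
    """Wrap prompt, chosen, and rejected strings as single-turn message lists."""
    return {
        "messages": [{"role": "user", "content": sample["prompt"]}],
        "chosen": [{"role": "assistant", "content": sample["chosen"]}],
        "rejected": [{"role": "assistant", "content": sample["rejected"]}],
    }


plain = {
    "prompt": "What is your name?",
    "chosen": "My name is Assistant.",
    "rejected": "I don't have a name.",
}
normalized = dpo_plaintext_to_messages(plain)
```

The chosen and rejected responses share one prompt, so each preference pair stays aligned regardless of which input shape is used.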