ADRA-RL/tulu2-7b_aime_controlled_contamination_original
ADRA-RL/tulu2-7b_aime_controlled_contamination_original is a 7 billion parameter language model fine-tuned from allenai/tulu-2-7b. This model was trained using the TRL library with Supervised Fine-Tuning (SFT) methods. It is designed for general text generation tasks, leveraging its base model's capabilities for conversational AI and instruction following.
Model Overview
ADRA-RL/tulu2-7b_aime_controlled_contamination_original is a 7 billion parameter language model derived from the allenai/tulu-2-7b base model. It has been specifically fine-tuned using the Transformer Reinforcement Learning (TRL) library, employing Supervised Fine-Tuning (SFT) techniques.
Key Capabilities
- Instruction Following: Inherits and refines the instruction-following capabilities of its tulu-2-7b base.
- Text Generation: Proficient in generating coherent and contextually relevant text based on given prompts.
- Fine-tuned Performance: Benefits from targeted SFT training to enhance its performance on specific tasks, though the exact nature of the 'controlled contamination' is not detailed in the provided README.
Training Details
This model was trained using the following framework versions:
- TRL: 0.19.1
- Transformers: 4.51.1
- PyTorch: 2.6.0
- Datasets: 4.2.0
- Tokenizers: 0.21.4
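To approximate the training environment, the versions listed above can be pinned at install time. This is a sketch assuming the standard PyPI package names for each library; the original card does not specify an installation command.

```shell
# Pin the framework versions reported in the model card.
pip install "trl==0.19.1" "transformers==4.51.1" "torch==2.6.0" "datasets==4.2.0" "tokenizers==0.21.4"
```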
Usage
Developers can integrate this model with the transformers library's text-generation pipeline, as demonstrated in the quick start example from the original model card.
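A minimal sketch of pipeline-based inference is shown below. The prompt wrapper uses the Tulu-2 chat format documented for the base model; the generation parameters and the example question are illustrative assumptions, not values from the original card.

```python
MODEL_ID = "ADRA-RL/tulu2-7b_aime_controlled_contamination_original"


def format_prompt(message: str) -> str:
    """Wrap a user message in the Tulu-2 chat format expected by the base model."""
    return f"<|user|>\n{message}\n<|assistant|>\n"


if __name__ == "__main__":
    from transformers import pipeline

    # Loading a 7B model in half precision needs roughly 15 GB of GPU memory.
    generator = pipeline(
        "text-generation",
        model=MODEL_ID,
        torch_dtype="auto",
        device_map="auto",
    )
    prompt = format_prompt("What is the sum of the first 10 positive integers?")
    outputs = generator(prompt, max_new_tokens=256, do_sample=False)
    print(outputs[0]["generated_text"])
```

Keeping the heavy pipeline load under the `__main__` guard lets the prompt-formatting helper be imported and reused without pulling model weights.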