Overview
agi-css/better-base is a 7-billion-parameter instruction-tuned language model built on the LLaMA architecture. It is part of the Stable Alignment project, which trains socially aligned language models directly on data from simulated social games rather than relying on a separate reward model. This approach is presented as a more efficient and stable alternative to traditional Reinforcement Learning from Human Feedback (RLHF).
Key Capabilities & Training
- Stable Alignment: Trains directly on interactions from simulated social games to embed social norms and improve alignment, without a separate reward model.
- Enhanced Instruction Tuning: Improves on standard instruction tuning by using higher-quality data from AlpacaDataCleaned, which corrects errors in the original Alpaca dataset.
- Code Pretraining: Includes pretraining with the codealpaca dataset, enhancing its capabilities in code-related tasks.
- LLaMA-based: Leverages the foundational strengths of the LLaMA model family.
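Because the model's tuning data follows the Alpaca lineage (AlpacaDataCleaned, codealpaca), prompts are most likely expected in the standard Alpaca instruction format. The helper below is a minimal sketch under that assumption; the function name and exact template wording are illustrative, not taken from the model card, so check the Stable Alignment repository for the authoritative format.

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Format a request in the Alpaca instruction style.

    NOTE: this template is an assumption based on the model's
    Alpaca-derived training data; verify against the Stable
    Alignment repository before relying on it.
    """
    header = (
        "Below is an instruction that describes a task"
        + (", paired with an input that provides further context"
           if input_text else "")
        + ". Write a response that appropriately completes the request.\n\n"
    )
    if input_text:
        # Variant with an optional context block.
        return (
            f"{header}### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n### Response:\n"
        )
    # Instruction-only variant.
    return f"{header}### Instruction:\n{instruction}\n\n### Response:\n"
```

The resulting string can be passed to any standard tokenizer-and-generate loop (e.g., via the Hugging Face transformers library) to query the model.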
Limitations and Considerations
While designed for better social alignment, the model may still exhibit biases or generate inappropriate content, reflecting biases inherent in its training data. Users should conduct a thorough safety and fairness assessment before deploying the model in any application.