agi-css/better-base

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · License: apache-2.0 · Architecture: Transformer · Open weights

agi-css/better-base is a 7 billion parameter instruction-tuned language model based on LLaMA, developed by agi-css. It is distinguished by its novel training approach, Stable Alignment, which directly trains the model on social games to achieve alignment, bypassing traditional reward models. This method aims to provide an efficient and stable alternative to RLHF, making it suitable for applications requiring socially aligned language generation.


Overview

agi-css/better-base is a 7 billion parameter instruction-tuned language model built upon the LLaMA architecture. Its core innovation lies in the Stable Alignment project, which aims to create socially aligned language models by directly training them on simulated social games, rather than relying on an additional reward model. This approach is presented as an efficient and stable alternative to traditional Reinforcement Learning from Human Feedback (RLHF).
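The card does not include training code, but the core idea, folding social feedback directly into the supervised objective instead of training a separate reward model, can be sketched in a toy form. Everything below (the function name, the 1–7 rating scale, the linear weighting scheme) is a hypothetical illustration, not the actual Stable Alignment implementation:

```python
# Toy illustration: shape the imitation loss with social ratings,
# with no separate reward model. Hypothetical sketch only; the real
# Stable Alignment training procedure differs in its details.

def rating_weighted_loss(token_losses, rating, min_rating=1, max_rating=7):
    """Scale a response's average token loss by its social rating.

    Highly rated (socially aligned) responses keep full weight, so the
    model imitates them; poorly rated responses are down-weighted.
    """
    avg = sum(token_losses) / len(token_losses)
    weight = (rating - min_rating) / (max_rating - min_rating)
    return weight * avg

# A well-rated response contributes fully to the imitation objective,
# while a poorly rated one is suppressed:
high = rating_weighted_loss([2.0, 1.0, 3.0], rating=7)  # -> 2.0
low = rating_weighted_loss([2.0, 1.0, 3.0], rating=1)   # -> 0.0
```

The point of the sketch is only that alignment signal enters as a per-example weight on the standard language-modeling loss, which is what lets the approach skip the reward-model stage of RLHF.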

Key Capabilities & Training

  • Stable Alignment: Utilizes a unique training procedure that involves direct training on social games to embed social norms and improve alignment.
  • Enhanced Instruction Tuning: Improves upon standard instruction tuning by incorporating higher quality data from AlpacaDataCleaned, which corrects errors found in the original Alpaca dataset.
  • Code Pretraining: Includes pretraining with the codealpaca dataset, enhancing its capabilities in code-related tasks.
  • LLaMA-based: Leverages the foundational strengths of the LLaMA model family.
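Because the model is instruction-tuned on Alpaca-derived data (AlpacaDataCleaned, codealpaca), prompts are presumably formatted in the Alpaca style. The template below is an assumption based on that lineage, not an official specification for this model; verify it against the project's repository before relying on it:

```python
# Alpaca-style prompt builder. The template is assumed from the model's
# AlpacaDataCleaned lineage and is not confirmed by the model card.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the assumed Alpaca-style template."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())

prompt = build_prompt("Explain what instruction tuning is.")
# The resulting string can then be passed to any standard text-generation
# stack, e.g. a transformers pipeline loaded with the model's weights.
```

Keeping the prompt format consistent with the fine-tuning data generally matters more for instruction-tuned models than decoding settings do, which is why the template is worth pinning down first.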

Limitations and Considerations

While designed for better social alignment, the model may still exhibit biases or generate inappropriate content due to inherent biases in its training data. It is recommended that users conduct a thorough assessment of safety and fairness before deploying the model in any application.