anonymousML123/Llama-3.1-8B-Tulu10pct-SFT-MAHALS
anonymousML123/Llama-3.1-8B-Tulu10pct-SFT-MAHALS is an 8-billion-parameter Llama 3.1 model fine-tuned with supervised fine-tuning (SFT) on 10% of the Tulu-3 SFT mixture. Developed by anonymousML123 for the MAHALS research project, it is intended for research on multi-agent alignment and instruction following, and offers a base for exploring instruction-tuned capabilities within a constrained dataset environment.
Model Overview
This model, anonymousML123/Llama-3.1-8B-Tulu10pct-SFT-MAHALS, is an 8 billion parameter Llama 3.1 variant that has undergone Supervised Fine-Tuning (SFT). It was trained using a 10% random sample (approximately 94,000 examples) from the allenai/tulu-3-sft-mixture dataset, which includes diverse instruction-following data such as FLAN v2, Open Assistant, ShareGPT, code instructions, and math instructions.
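The card states only that a 10% random sample (roughly 94,000 examples) was drawn from the mixture; the exact procedure and seed are not published. A minimal sketch of seeded fractional sampling, using a generic helper (in practice one would likely use `datasets.load_dataset("allenai/tulu-3-sft-mixture", split="train").shuffle(seed=...).select(...)`; the seed below is an assumption):

```python
import random


def sample_fraction(items, fraction, seed=42):
    """Return a reproducible random sample containing `fraction` of `items`.

    `seed` is an illustrative assumption; the model authors have not
    published the seed or sampling code they used.
    """
    rng = random.Random(seed)
    k = int(len(items) * fraction)  # e.g. 10% of ~940k examples -> ~94k
    return rng.sample(items, k)


# With the full Tulu-3 SFT mixture this would yield the ~94,000-example
# subset described above; here we demonstrate on a toy list.
subset = sample_fraction(list(range(1000)), 0.10)
print(len(subset))  # 100
```

Fixing the seed makes the subset reproducible, which matters for research comparisons against models trained on other fractions of the same mixture.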
Key Characteristics
- Base Model: Meta's Llama 3.1 8B.
- Training: Supervised Fine-Tuning (SFT) using the allenai/open-instruct framework.
- Dataset: A subset of the Tulu-3 SFT mixture, focusing on instruction-following tasks.
- Context Length: Trained with a maximum sequence length of 4096 tokens.
- Intended Use: Primarily for research purposes, specifically within the MAHALS (Multi-Agent Hierarchical Alignment) project, to study multi-agent alignment and instruction following.
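Since this is a standard Llama 3.1 SFT checkpoint, it should load with Hugging Face `transformers` like any other causal LM. A hedged sketch (the repo id comes from this card; the generation settings and chat usage are illustrative assumptions, and `device_map="auto"` additionally requires `accelerate`):

```python
MODEL_ID = "anonymousML123/Llama-3.1-8B-Tulu10pct-SFT-MAHALS"


def build_chat(user_message):
    """Wrap a single user turn in the message format expected by
    tokenizer.apply_chat_template."""
    return [{"role": "user", "content": user_message}]


def generate(prompt, max_new_tokens=256):
    # Heavyweight imports are kept inside the function so the formatting
    # helper above is usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_chat(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        outputs[0][inputs.shape[-1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    print(generate("List three uses of instruction tuning."))
```

Using the tokenizer's chat template rather than raw strings matters for SFT checkpoints, since the model was trained on templated conversations.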
Limitations and Considerations
- Reduced Capability: Trained on only 10% of the full Tulu-3 SFT mixture, so it may underperform models trained on the complete dataset.
- Language Support: English only.
- Bias: May exhibit biases present in its training data.
- Production Readiness: Not recommended for production environments without further comprehensive evaluation.
Inference Requirements
- BF16/FP16: Requires approximately 20 GB VRAM.
- INT8: Requires approximately 10 GB VRAM.
- INT4: Requires approximately 6 GB VRAM, making it accessible on consumer GPUs.
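The VRAM figures above are consistent with a back-of-envelope weight-memory estimate. A minimal sketch, assuming Llama 3.1 8B's roughly 8.03 billion parameters and an illustrative ~25% overhead factor for the KV cache and activations (both figures are assumptions, not published by the authors):

```python
def weight_vram_gb(n_params, bits_per_param, overhead=1.25):
    """Estimate inference VRAM in GB: raw weight bytes at the given
    precision, inflated by an assumed overhead factor for KV cache
    and activations."""
    weight_bytes = n_params * bits_per_param / 8
    return weight_bytes * overhead / 1e9


for name, bits in [("BF16/FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weight_vram_gb(8.03e9, bits):.0f} GB")
# BF16/FP16: ~20 GB
# INT8: ~10 GB
# INT4: ~5 GB
```

The naive INT4 estimate lands slightly below the card's ~6 GB figure; 4-bit schemes carry extra quantization metadata (scales and zero points) and typically keep some modules in higher precision, which plausibly accounts for the gap.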