miulab/llama2-7b-alpaca-sft-10k
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

miulab/llama2-7b-alpaca-sft-10k is a 7-billion-parameter language model based on Llama 2, developed by miulab and fine-tuned with supervised fine-tuning (SFT) on 10,000 Alpaca-style instruction examples. It serves as the backbone SFT model in the DogeRM research, which equips reward models with domain knowledge through model merging, and is intended for research into reward model development and domain-specific knowledge integration.
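
Below is a minimal sketch of loading the model for text generation with the Hugging Face transformers library, assuming the weights are hosted on the Hub under this identifier. The instruction prompt follows the standard Alpaca template, which is an assumption based on the model's Alpaca-style SFT data rather than documented usage.

```python
# Minimal sketch: load the model and run an instruction-following generation.
# Assumes the checkpoint is available on the Hugging Face Hub under this ID
# and that the model expects the standard Alpaca prompt template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "miulab/llama2-7b-alpaca-sft-10k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Standard Alpaca-style instruction prompt (assumed format).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain what supervised fine-tuning is.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```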
