jerrycheng233/model1_sft_16bit
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Mar 12, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

jerrycheng233/model1_sft_16bit is an 8 billion parameter Llama-based model developed by jerrycheng233, fine-tuned from unsloth/DeepSeek-R1-Distill-Llama-8B. This model was trained using Unsloth and Huggingface's TRL library, enabling faster training. It is designed for general language understanding and generation tasks, leveraging its Llama architecture for broad applicability.

Loading preview...