kmseong/Llama3.2-3B-gsm8k-fullft-atfter-ssft
TEXT GENERATIONConcurrency Cost:1Model Size:3.2BQuant:BF16Ctx Length:32kPublished:Mar 3, 2026License:llama3.2Architecture:Transformer Warm

The kmseong/Llama3.2-3B-gsm8k-fullft-atfter-ssft model is a 3.2 billion parameter Llama 3.2 Instruct variant developed by kmseong, specifically fine-tuned using Full Parameter Fine-tuning on the GSM8K dataset. This model is optimized for mathematical reasoning, particularly grade school math problems, and offers improved performance over LoRA-based fine-tunes for this specific domain. It is designed for direct use without adapter libraries, making it suitable for applications requiring dedicated math problem-solving capabilities.

Loading preview...