kmseong/Llama3.2-3B-gsm8k-fullft-atfter-ssft
The kmseong/Llama3.2-3B-gsm8k-fullft-atfter-ssft model is a 3-billion-parameter Llama 3.2 Instruct variant developed by kmseong, fine-tuned with full-parameter fine-tuning on the GSM8K dataset. The model is optimized for mathematical reasoning, particularly grade school math problems, and may offer better performance than LoRA-based fine-tunes in this specific domain. Because all weights were updated, it can be used directly without adapter libraries, making it suitable for applications that need dedicated math problem-solving capabilities.
Overview
This model, developed by kmseong, is a Llama 3.2 3B Instruct variant that has undergone full-parameter fine-tuning on the GSM8K dataset. Unlike with LoRA-based fine-tunes, all 3 billion parameters were updated during training, which can yield better performance on the target task. It is a causal language model trained with Transformers and TRL (SFTTrainer).
Key Capabilities
- Mathematical Reasoning: Specifically fine-tuned for solving grade school math problems (GSM8K).
- Full Parameter Fine-tuning: All model weights were updated, offering a complete, standalone model without the need for PEFT libraries.
- Direct Use: Can be used directly for inference without additional adapters.
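Because this is a full fine-tune rather than an adapter, loading it looks like loading any standalone causal LM. The sketch below shows this with the Transformers library; the prompt template is an assumption (check the model's chat template in practice), and `device_map="auto"` requires the `accelerate` package.

```python
# Minimal inference sketch: load the full fine-tuned model directly,
# with no PEFT/adapter libraries involved.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "kmseong/Llama3.2-3B-gsm8k-fullft-atfter-ssft"


def build_prompt(question: str) -> str:
    # Hypothetical prompt format; the card does not document one,
    # so verify against the tokenizer's chat template before relying on it.
    return f"Question: {question}\nAnswer:"


def solve(question: str, max_new_tokens: int = 256) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map="auto",   # place weights on GPU(s) automatically
        torch_dtype="auto",  # use the checkpoint's native precision
    )
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, return only the generated continuation.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    print(solve("A store sells pencils in packs of 12. How many pencils are in 5 packs?"))
```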
Performance & Limitations
On the GSM8K test set, the model achieved 40% accuracy (20 of 50 evaluated samples). While the model is optimized for GSM8K, it may not generalize well to other mathematical domains or to non-math tasks. It requires a GPU for inference, with 16 GB+ VRAM recommended given its full-model file size (~6 GB) compared to a LoRA adapter. Training used 7,473 samples over 3 epochs with a maximum context length of 512 tokens.
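The card does not specify how accuracy was scored, but GSM8K references end with a `#### <number>` line, and exact-match on that final number is the common convention. A minimal sketch of such a check, under that assumption:

```python
# Sketch of GSM8K-style exact-match scoring: compare the final numeric
# answer (after "####") in each generation against the reference.
# This mirrors the common convention; the card's actual harness may differ.
import re


def extract_answer(text: str):
    # GSM8K solutions end with "#### <number>"; take the last such match
    # and normalize away thousands separators.
    matches = re.findall(r"####\s*(-?[\d,]+(?:\.\d+)?)", text)
    return matches[-1].replace(",", "") if matches else None


def accuracy(predictions, references):
    correct = sum(
        extract_answer(p) is not None and extract_answer(p) == extract_answer(r)
        for p, r in zip(predictions, references)
    )
    return correct / len(references)
```

With 50 samples scored this way, 20 exact matches gives the reported 40%.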