swadeshb/Llama-3.2-3B-Instruct-AMPO-V1
Text generation | Concurrency cost: 1 | Model size: 3.2B | Quantization: BF16 | Context length: 32k | Published: Dec 18, 2025 | Architecture: Transformer | Status: Warm

swadeshb/Llama-3.2-3B-Instruct-AMPO-V1 is a 3.2-billion-parameter instruction-tuned causal language model, fine-tuned by swadeshb from Meta's Llama-3.2-3B-Instruct base model. It was trained with GRPO (Group Relative Policy Optimization) to strengthen mathematical reasoning, and is intended for tasks that demand precise logical and mathematical problem-solving.
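
Assuming the repository follows the standard Hugging Face layout for Llama-3.2 instruct checkpoints (BF16 weights and the usual Llama chat template), a minimal loading-and-inference sketch with the transformers library might look like this; the prompt is purely illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swadeshb/Llama-3.2-3B-Instruct-AMPO-V1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

# Llama-3.2 instruct models expect chat-formatted input; apply the template.
messages = [
    {"role": "user", "content": "A train travels 120 km in 1.5 hours. "
                                "What is its average speed in km/h?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding keeps math answers deterministic; adjust as needed.
outputs = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

With the 32k context length listed above, longer multi-step reasoning prompts should fit comfortably in a single call.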
