koutch/short_paper_llama_0.json_train_grpo_v3_dev
Text Generation
- Concurrency cost: 1
- Model size: 8B
- Quantization: FP8
- Context length: 32k
- Published: Jan 5, 2026
- License: apache-2.0
- Architecture: Transformer (open weights)
koutch/short_paper_llama_0.json_train_grpo_v3_dev is an 8-billion-parameter Llama 3.1 model fine-tuned by koutch. It was trained with Unsloth and Hugging Face's TRL library, which enabled roughly 2x faster fine-tuning. The model is suited to text-generation tasks that benefit from efficient training and the Llama 3.1 architecture.