burtenshaw/GemmaCoder3-12B

Hugging Face
Vision · Concurrency Cost: 1 · Model Size: 12B · Quant: FP8 · Ctx Length: 32k · Published: Mar 18, 2025 · Architecture: Transformer

burtenshaw/GemmaCoder3-12B is a 12 billion parameter language model fine-tuned from google/gemma-3-12b-it. This model specializes in code generation, demonstrating improved performance on the LiveCodeBench benchmark. It is optimized for tasks requiring code understanding and generation, making it suitable for competitive programming and similar applications.


burtenshaw/GemmaCoder3-12B Overview

This model is a 12 billion parameter variant of the Gemma-3 family, specifically fine-tuned from google/gemma-3-12b-it. Its training utilized the open-r1/codeforces-cots dataset and the TRL framework, indicating a focus on code-related tasks.

Key Capabilities & Performance

The primary differentiator of GemmaCoder3-12B is its code generation performance: on the LiveCodeBench benchmark it scores 32.9%, up from 21.9% for the base gemma-3-12b-it. The trade-off is a notable drop on MMLU (61.0% vs. 69.5%), while the model maintains competitive scores on Winogrande (63.9%) and HellaSwag (54.0%).

Use Cases

  • Code Generation: Ideal for applications requiring the generation of programming code.
  • Competitive Programming: Its fine-tuning on a Codeforces dataset suggests suitability for solving algorithmic problems.
  • Code Understanding: Can be leveraged for tasks involving the interpretation and analysis of code snippets.
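For the use cases above, a minimal sketch of prompting the model is shown below. It assumes the standard Gemma turn markers (`<start_of_turn>` / `<end_of_turn>`); in practice you would normally let `tokenizer.apply_chat_template` build this string for you. The example prompt and generation settings are illustrative, not recommendations from this card.

```python
# Hypothetical sketch: formatting a prompt in Gemma's chat format.
# The generation call is commented out because it downloads a ~12B model.

MODEL_ID = "burtenshaw/GemmaCoder3-12B"

def build_gemma_prompt(user_message: str) -> str:
    """Wrap a single user message in Gemma-style turn markers."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_gemma_prompt("Write a Python function that reverses a string.")

# To actually generate (requires the transformers library and a GPU):
# from transformers import pipeline
# generator = pipeline("text-generation", model=MODEL_ID, device_map="auto")
# print(generator(prompt, max_new_tokens=256)[0]["generated_text"])
```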

Popular Sampler Settings

These are the three parameter combinations most used by Featherless users for this model. Each configuration sets the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
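As a rough sketch, the sampler parameters listed above map directly onto fields of an OpenAI-compatible chat completions request. The endpoint URL, API key, and all parameter values below are placeholder assumptions, not settings published by this card.

```python
# Hypothetical sketch: packaging sampler settings for an
# OpenAI-compatible chat completions endpoint.
import json

payload = {
    "model": "burtenshaw/GemmaCoder3-12B",
    "messages": [
        {"role": "user", "content": "Write a function that checks if a string is a palindrome."}
    ],
    # Example values only; pick settings appropriate for your task:
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.05,
    "min_p": 0.05,
}

body = json.dumps(payload)

# To send the request (requires the `requests` library and a valid API key):
# import requests
# resp = requests.post(
#     "https://api.featherless.ai/v1/chat/completions",  # assumed endpoint
#     headers={"Authorization": "Bearer YOUR_API_KEY"},
#     data=body,
# )
```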