qgallouedec/gemma-3-27b-it-codeforces-SFT

Overview

This model, qgallouedec/gemma-3-27b-it-codeforces-SFT, is a specialized instruction-tuned large language model based on Google's 27-billion-parameter Gemma-3-27b-it. It has been fine-tuned on the open-r1/codeforces-cots dataset, which consists of competitive programming problems paired with worked solutions.

Key Capabilities

  • Competitive Programming: Optimized for understanding and generating solutions for algorithmic challenges, similar to those found on platforms like Codeforces.
  • Code-related Reasoning: Enhanced ability to process and reason about code snippets and problem descriptions.
  • Instruction Following: Inherits the base Gemma-3-27b-it model's instruction-following capabilities, further refined for technical problem-solving.
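To make the instruction format concrete, Gemma-3 chat models delimit turns with `<start_of_turn>`/`<end_of_turn>` markers. The sketch below builds a single-turn prompt by hand; the exact markers are an assumption taken from Gemma's published chat format, and in practice `tokenizer.apply_chat_template` should be used instead.

```python
def build_gemma_prompt(user_message: str) -> str:
    """Format a single-turn prompt with Gemma-style chat markers.

    Assumption: Gemma-3 uses <start_of_turn>/<end_of_turn> delimiters,
    as in the published Gemma chat template. Prefer
    tokenizer.apply_chat_template in real code, which applies the
    model's exact template for you.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_gemma_prompt(
    "Given an array of n integers, output the maximum subarray sum."
)
print(prompt)
```

The trailing `<start_of_turn>model\n` cues the model to begin its answer, which is why generation is stopped at the next `<end_of_turn>`.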

Training Details

The model was trained with the TRL (Transformer Reinforcement Learning) library, specifically using Supervised Fine-Tuning (SFT). The training run used TRL 0.16.0.dev0, Transformers 4.50.0.dev0, PyTorch 2.6.0, Datasets 3.0.0, and Tokenizers 0.21.0.
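For prompt-completion data like codeforces-cots, SFT typically computes the cross-entropy loss only on completion tokens, masking out the prompt. Below is a minimal stdlib sketch of that label construction (the token IDs are made up for illustration; TRL's SFTTrainer performs the real version of this internally):

```python
IGNORE_INDEX = -100  # conventional "ignore" label for cross-entropy loss

def make_sft_labels(prompt_ids, completion_ids):
    """Concatenate prompt and completion token IDs and build labels
    that mask the prompt, so loss is computed only on the completion."""
    input_ids = prompt_ids + completion_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + completion_ids
    return input_ids, labels

# Made-up token IDs purely for illustration.
input_ids, labels = make_sft_labels([101, 7, 42], [9, 13, 2])
print(input_ids)  # [101, 7, 42, 9, 13, 2]
print(labels)     # [-100, -100, -100, 9, 13, 2]
```

Masking the prompt keeps the model from being penalized for tokens it was given as input, so gradient signal comes only from the solution it is asked to produce.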

Good For

  • Assisting with competitive programming tasks.
  • Generating code solutions or explanations for algorithmic problems.
  • Developers and researchers working on code-centric LLM applications.
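To make the first bullet concrete, a typical Codeforces-style task the model is tuned toward is the maximum subarray sum. The reference solution below (Kadane's algorithm) is an illustrative example of the kind of target the training data contains, not actual model output:

```python
def max_subarray_sum(a):
    """Kadane's algorithm: O(n) maximum sum over non-empty subarrays."""
    best = cur = a[0]
    for x in a[1:]:
        cur = max(x, cur + x)   # extend the current run or restart at x
        best = max(best, cur)
    return best

print(max_subarray_sum([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
```

Problems at this difficulty stress exactly the capabilities the fine-tuning targets: reading a precise problem statement, choosing an efficient algorithm, and emitting a correct, compact implementation.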