kth8/gemma-3-1b-it-SuperGPQA-Classifier
This model is a fine-tuned variant of Gemma 3 1B-IT, developed by kth8 and optimized for classification tasks. With nearly 1 billion parameters (999,885,952), it categorizes problems into predefined disciplines, fields, and subfields. It was fine-tuned on the m-a-p/SuperGPQA dataset, making it well suited to structured content classification.
Model Overview
kth8/gemma-3-1b-it-SuperGPQA-Classifier is a specialized language model fine-tuned from unsloth/gemma-3-1b-it for precise problem classification. It uses the Gemma 3 1B-IT architecture, with 999,885,952 parameters, and operates in torch.bfloat16 precision.
Key Capabilities
- Problem Classification: Expertly categorizes given problems into specific disciplines, fields, and subfields.
- Structured Output: Provides classification results in a JSON format, making it easy for programmatic integration.
- Optimized for Accuracy: Fine-tuned on the m-a-p/SuperGPQA dataset, enhancing its ability to accurately map diverse problem statements to a comprehensive set of categories.
Training Details
The model underwent 2 epochs of supervised fine-tuning (SFT) with a batch size of 32 and a learning rate of 4e-4, using the adamw_torch_fused optimizer. Training used PEFT (Parameter-Efficient Fine-Tuning) with a LoRA rank of 32 and alpha of 64, targeting the q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, and down_proj modules. It reached an average training loss of 0.0905 and a final validation loss of 0.0563.
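The LoRA settings above map onto PEFT's standard LoraConfig; a sketch under the assumption that the adapter was built with the peft library's usual API (task_type and dropout are assumptions, not stated in this card):

```python
from peft import LoraConfig

# LoRA adapter configuration mirroring the stated training details.
# r, lora_alpha, and target_modules come from the card; task_type is
# an assumption based on the base model being a causal LM.
lora_config = LoraConfig(
    r=32,             # LoRA rank
    lora_alpha=64,    # scaling alpha (2x rank)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```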
Good For
- Automated content categorization.
- Structuring large datasets of questions or problems.
- Applications requiring precise, JSON-formatted classification of textual input.