Name: Markshot/gemma-3-1b-it-Math-SFT-RS-DPO API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Markshot

Model Overview

Markshot/gemma-3-1b-it-Math-SFT-RS-DPO is a 1 billion parameter instruction-tuned model built upon the Gemma architecture. Developed by Markshot, this model has undergone specialized fine-tuning using Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) with Direct Preference Optimization (DPO).

Key Characteristics

Architecture: Based on the Gemma family of models.
Parameter Count: Features 1 billion parameters, offering a balance between performance and computational efficiency.
Context Length: Supports a substantial context window of 32768 tokens.
Fine-tuning: Utilizes SFT, RS (Reinforcement Learning from Human Feedback), and DPO for enhanced instruction following and performance.

Primary Focus

This model is specifically optimized for mathematical tasks and reasoning. Its training methodology aims to improve its ability to understand and solve complex mathematical problems, making it a suitable choice for applications requiring strong numerical and logical processing capabilities.

Limitations

The model card indicates that more information is needed regarding its specific biases, risks, and detailed performance metrics. Users should exercise caution and conduct thorough evaluations for their specific use cases, especially given the lack of detailed training data and evaluation results in the provided documentation.

Overview

Model Overview

Key Characteristics

Primary Focus

Limitations

Full Model Card (README)