SantiagoC/palindrome-curriculum-v1
SantiagoC/palindrome-curriculum-v1 is a 0.8 billion parameter causal language model fine-tuned from SantiagoC/palindrome-sft-qwen3. Developed by SantiagoC, it was trained with the TRL framework using the GRPO method, targeting tasks that require mathematical problem-solving.
Model Overview
SantiagoC/palindrome-curriculum-v1 is a 0.8 billion parameter causal language model, fine-tuned by SantiagoC from the SantiagoC/palindrome-sft-qwen3 base model. Training was carried out with the TRL framework.
Key Training Methodology
A significant aspect of this model's development is the integration of GRPO (Group Relative Policy Optimization). This method, introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models", aims to enhance the model's mathematical reasoning abilities. The training was facilitated by ML Intern, an agent for machine learning research and development.
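To illustrate the idea behind GRPO: instead of training a separate value network, each sampled completion's reward is normalized against the mean and standard deviation of its sampling group to form an advantage. The sketch below shows only this group-relative normalization step in plain Python; it is an illustration of the method from the DeepSeekMath paper, not this model's actual training code, and the choice of the population standard deviation is an assumption.

```python
import statistics


def group_relative_advantages(rewards):
    """Compute GRPO-style advantages for one group of sampled completions.

    Each reward is normalized by the group's mean and standard deviation,
    so relative quality within the group drives the policy update.
    """
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)  # population std; estimator choice is an assumption
    if std == 0:
        # All completions scored equally: no learning signal from this group.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]
```

For example, a group scored `[0.0, 2.0]` yields advantages `[-1.0, 1.0]`: the better completion is reinforced and the worse one is penalized, relative to its peers.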
Intended Use Cases
Given its fine-tuning with the GRPO method, this model is particularly suited for:
- Mathematical reasoning tasks: Excelling in problems that require logical and mathematical deduction.
- Complex problem-solving: Applications where structured, step-by-step reasoning is beneficial.
Technical Details
- Base Model: SantiagoC/palindrome-sft-qwen3
- Training Framework: TRL (Transformer Reinforcement Learning)
- Parameter Count: 0.8 billion
- Context Length: 32768 tokens
This model provides a foundation for applications demanding robust mathematical and logical processing.
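If the model is published on the Hugging Face Hub under this id, it should load with the standard transformers causal-LM API. The sketch below is a minimal, unverified usage example: the prompt template is an assumption (check the tokenizer's chat template if one is defined), and the generation settings are illustrative defaults.

```python
MODEL_ID = "SantiagoC/palindrome-curriculum-v1"


def build_prompt(question: str) -> str:
    # Simple instruction-style prompt; the exact format the model expects
    # is an assumption -- prefer the tokenizer's chat template if present.
    return f"Question: {question}\nAnswer: Let's think step by step."


if __name__ == "__main__":
    # Imports kept inside the entry point so the helper above can be used
    # without transformers installed; requires network access to the Hub.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt("What is 17 * 24?"), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True))
```

Greedy decoding (`do_sample=False`) is a reasonable default for mathematical reasoning, where reproducible step-by-step outputs are usually preferred over sampled diversity.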