Name: tech27/Qwen2.5-1.5B-Instruct-Gensyn-Swarm-amphibious_spotted_kingfisher API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: tech27

Model Overview

This model, tech27/Qwen2.5-1.5B-Instruct-Gensyn-Swarm-amphibious_spotted_kingfisher, is a 1.5 billion parameter instruction-tuned language model. It is a fine-tuned iteration of the Gensyn/Qwen2.5-1.5B-Instruct base model.

Key Training Details

Fine-tuning Framework: The model was trained using the TRL library, a popular framework for Transformer Reinforcement Learning.
Training Method: A notable aspect of its training is the application of GRPO (Gradient Regularized Policy Optimization). This method, introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300), is specifically designed to enhance mathematical reasoning abilities in language models.

Intended Use

Given its fine-tuning with the GRPO method, this model is particularly well-suited for applications that require strong mathematical reasoning and problem-solving capabilities. Developers can leverage its enhanced mathematical understanding for tasks where precise numerical or logical deductions are critical.

Overview

Model Overview

Key Training Details

Intended Use

Full Model Card (README)