cjiao/goldengoose-method-v2-bm25-100

Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 28, 2026 · Architecture: Transformer

The cjiao/goldengoose-method-v2-bm25-100 model is a 1.5 billion parameter instruction-tuned language model, fine-tuned from Qwen/Qwen2.5-1.5B-Instruct using the GRPO method, which is designed to enhance mathematical reasoning in language models. With a context length of 32,768 tokens, it is suited to tasks that demand robust mathematical problem-solving and logical deduction over long inputs.


Model Overview

The cjiao/goldengoose-method-v2-bm25-100 is a 1.5 billion parameter language model, fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. Its development used the TRL framework and the GRPO (Group Relative Policy Optimization) training method. GRPO is a technique introduced specifically to improve mathematical reasoning in large language models, as detailed in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models".
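The core idea of GRPO is to replace a learned value baseline with group statistics: several completions are sampled per prompt, and each completion's reward is normalized against the mean and standard deviation of its own group. A minimal sketch of that normalization step (binary rewards and the `eps` stabilizer are illustrative assumptions, not details from this model card):

```python
import statistics

def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each completion's reward against its group's mean and std.

    This is the group-relative baseline at the heart of GRPO: no learned
    value network, just statistics over a group of sampled completions.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Example: four sampled answers to one math problem, scored 1.0 if correct.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
# Correct answers get positive advantage, incorrect ones negative.
```

Because the baseline is computed per group, a completion is reinforced only insofar as it beats its siblings for the same prompt, which is what makes the method cheap enough to run without a separate critic model.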

Key Capabilities

  • Enhanced Mathematical Reasoning: Specialized training with GRPO aims to improve the model's ability to handle mathematical problems and logical deductions.
  • Instruction Following: As a fine-tuned instruction model, it is designed to respond effectively to user prompts and instructions.
  • Context Handling: Supports a substantial context length of 32768 tokens, allowing for processing and reasoning over longer inputs.
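Since the model follows the Qwen2.5 chat format, it can be queried through the standard Hugging Face Transformers chat-template flow. A minimal usage sketch (the system prompt, sample question, and generation settings are illustrative assumptions; the download and generation run only when executed directly):

```python
def build_messages(question: str) -> list[dict]:
    # The system prompt is an illustrative assumption, not part of the
    # model card; adjust it to your task.
    return [
        {"role": "system", "content": "You are a careful mathematical problem solver."},
        {"role": "user", "content": question},
    ]

if __name__ == "__main__":
    # Heavy imports and the model weight download happen only when run directly.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "cjiao/goldengoose-method-v2-bm25-100"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

    inputs = tokenizer.apply_chat_template(
        build_messages("A train travels 120 km in 1.5 hours. What is its average speed?"),
        add_generation_prompt=True,
        return_tensors="pt",
    )
    output = model.generate(inputs, max_new_tokens=256)
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Loading in BF16 matches the quantization listed above; the 32k context window leaves ample room for long multi-step problems in the prompt.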

Training Details

The model was trained with the TRL library (version 0.19.1), using Transformers 4.57.6, PyTorch 2.5.1, Datasets 4.8.4, and Tokenizers 0.22.2. The GRPO method, central to its training, is documented in the DeepSeekMath paper.
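GRPO training in TRL is driven by reward functions that score sampled completions. The card does not state which rewards were used for this model, so the following is purely an illustrative assumption: a rule-based reward that checks whether the completion's final `\boxed{...}` expression matches a reference answer, a common pattern for math fine-tuning. (TRL's `GRPOTrainer` passes dataset columns to reward functions via keyword arguments; the signature here is simplified.)

```python
import re

def boxed_answer_reward(completions: list[str], answers: list[str]) -> list[float]:
    """Score 1.0 when the completion's last \\boxed{...} matches the reference.

    Illustrative rule-based reward; the actual reward used to train this
    model is not documented on the card.
    """
    rewards = []
    for completion, answer in zip(completions, answers):
        matches = re.findall(r"\\boxed\{([^}]*)\}", completion)
        correct = bool(matches) and matches[-1].strip() == answer.strip()
        rewards.append(1.0 if correct else 0.0)
    return rewards

# Example: two sampled completions for a problem whose answer is 42.
scores = boxed_answer_reward(
    ["So the result is \\boxed{42}.", "I believe \\boxed{41}."],
    ["42", "42"],
)
# → [1.0, 0.0]
```

Taking only the last `\boxed{...}` match rewards the final answer rather than intermediate scratch work, which keeps the signal aligned with answer correctness.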

Use Cases

This model is particularly well-suited for applications requiring strong mathematical problem-solving, logical reasoning, and accurate instruction following within a substantial context window. Its specialized training makes it a candidate for tasks where numerical and logical precision are critical.