Model Overview
anujjamwal/hcot-qwen2.5-math-1.5b is a 1.5-billion-parameter language model fine-tuned by anujjamwal. It builds on Qwen/Qwen2.5-Math-1.5B, a base architecture designed specifically for mathematical tasks, and this iteration has undergone further fine-tuning aimed at improving mathematical reasoning and problem solving.
Key Characteristics
- Base Model: Qwen2.5-Math-1.5B, known for its mathematical capabilities.
- Parameter Count: 1.5 billion parameters, offering a balance between performance and computational efficiency.
- Context Length: Supports a context window of 32,768 tokens, enabling it to process longer mathematical problems and multi-step derivations.
- Fine-tuning Focus: Optimized for mathematical reasoning, suggesting improved accuracy and understanding in this domain.
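Assuming the checkpoint is published on the Hugging Face Hub under the repo id above, it can be loaded with the standard `transformers` API. The sketch below is illustrative, not an official usage snippet from this card; the repo id comes from the card, everything else (function names, generation settings) is an assumption:

```python
# Sketch: loading and querying the model with Hugging Face Transformers.
# Requires `transformers` and `torch`; only MODEL_ID comes from the card.
MODEL_ID = "anujjamwal/hcot-qwen2.5-math-1.5b"

def load(model_id: str = MODEL_ID):
    # Imports are deferred so this module can be read without heavy deps.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    return tokenizer, model

def solve(problem: str, tokenizer, model, max_new_tokens: int = 512) -> str:
    inputs = tokenizer(problem, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, keep only the generated continuation.
    generated = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(generated, skip_special_tokens=True)

# Example (downloads the weights on first run):
# tokenizer, model = load()
# print(solve("Solve for x: 2x + 3 = 11.", tokenizer, model))
```

At 1.5B parameters, the model fits comfortably on a single consumer GPU, and `device_map="auto"` will fall back to CPU if none is available.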
Training Details
The model was trained with a learning rate of 2e-05, an effective batch size of 8 (train_batch_size of 2 with 4 gradient_accumulation_steps), and 10 epochs. It used the adamw_torch_fused optimizer and a linear learning-rate scheduler with a warmup ratio of 0.1. Training was conducted with Transformers 5.0.0, PyTorch 2.10.0+cu128, Datasets 4.0.0, and Tokenizers 0.22.2.
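The schedule these numbers imply is easy to check with a little arithmetic. The sketch below recomputes the effective batch size and warmup length; the dataset size is a made-up placeholder, since the card does not state it:

```python
# Recompute derived training quantities from the card's hyperparameters.
TRAIN_BATCH_SIZE = 2       # per-device batch size, from the card
GRAD_ACCUM_STEPS = 4       # from the card
EPOCHS = 10                # from the card
WARMUP_RATIO = 0.1         # the card's "0.1 warmup" read as a ratio
DATASET_SIZE = 10_000      # hypothetical placeholder, NOT from the card

effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS   # optimizer-step batch
steps_per_epoch = -(-DATASET_SIZE // effective_batch)   # ceiling division
total_steps = steps_per_epoch * EPOCHS
warmup_steps = int(total_steps * WARMUP_RATIO)

print(effective_batch, total_steps, warmup_steps)  # -> 8 12500 1250
```

With these placeholder numbers, the learning rate would ramp linearly over the first 1,250 optimizer steps and decay linearly over the rest.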
Intended Use Cases
This model is well suited to applications that require strong mathematical understanding and problem solving. The card does not detail specific use cases, but its mathematical specialization makes it a natural fit for tasks such as:
- Solving mathematical equations and problems.
- Assisting in scientific computations.
- Generating mathematical explanations or proofs.
- Educational tools for mathematics.
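For the problem-solving use cases above, Qwen2.5-Math-family models are commonly prompted in chat format with a system message requesting step-by-step reasoning. The sketch below builds such a message list in the shape expected by `tokenizer.apply_chat_template`; the system-prompt wording is an assumption, not taken from this card:

```python
# Build a chat-style prompt for a math problem. The message structure
# matches the common `apply_chat_template` input format; the system
# prompt text is illustrative, not from the model card.
def math_messages(problem: str) -> list:
    return [
        {"role": "system",
         "content": "Please reason step by step, and put your final "
                    "answer within \\boxed{}."},
        {"role": "user", "content": problem},
    ]

messages = math_messages("What is the sum of the first 100 positive integers?")
print(messages[1]["content"])
```

Passing this list to the tokenizer's chat template (with `add_generation_prompt=True`) yields the final prompt string for `generate`.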