winglian/qwen3-14b-math

TEXT GENERATIONConcurrency Cost:1Model Size:14BQuant:FP8Ctx Length:32kPublished:May 26, 2025License:apache-2.0Architecture:Transformer Open Weights Cold

The winglian/qwen3-14b-math model is a 14 billion parameter language model developed by Qwen, fine-tuned from Qwen/Qwen3-14B-Base. It is specifically optimized for mathematical reasoning and problem-solving tasks, leveraging the winglian/OpenThoughts-114k-math-correct dataset. This model is designed to enhance performance in complex mathematical contexts, making it suitable for applications requiring precise numerical and logical operations.

Loading preview...

Model Overview

The winglian/qwen3-14b-math is a 14 billion parameter language model, fine-tuned from the base model Qwen/Qwen3-14B-Base. This model has been specialized for mathematical tasks through training on the winglian/OpenThoughts-114k-math-correct dataset, which focuses on mathematical reasoning and problem-solving.

Key Training Details

  • Base Model: Qwen/Qwen3-14B-Base
  • Dataset: winglian/OpenThoughts-114k-math-correct (chat template, split thinking enabled)
  • Training Framework: Built with Axolotl (version 0.10.0.dev0)
  • Sequence Length: 8192 tokens, utilizing sample packing and padding.
  • Optimization: Trained with adamw_torch_fused optimizer, a learning rate of 1e-5, and rex scheduler over 2 epochs.
  • Hardware: Distributed training across 8 GPUs with a total batch size of 32.
  • Performance: Achieved a validation loss of 0.3439, indicating effective learning on the mathematical dataset.

Intended Use Cases

This model is particularly well-suited for applications requiring strong mathematical capabilities, such as:

  • Solving complex mathematical problems.
  • Generating explanations for mathematical concepts.
  • Assisting in educational tools for math.
  • Developing agents that require numerical reasoning.