namezz/lvm-math-0408-a-qwen3-30b-a3b-instruct-b-qwen3-1.7b-base

Hugging Face
Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Published: Apr 8, 2026 | License: other | Architecture: Transformer

The namezz/lvm-math-0408-a-qwen3-30b-a3b-instruct-b-qwen3-1.7b-base model is a 1.7 billion parameter Qwen3-based language model fine-tuned by namezz for mathematical tasks. It is evaluated with token-level error metrics: mean absolute error (Token Mean MAE), root mean squared error (Token Mean RMSE), and relative error (Token Mean Relerr), which are reported to improve during training. It is intended for applications that need numerical reasoning and mathematical problem solving.


Model Overview

This model, developed by namezz, is a fine-tuned variant of Qwen/Qwen3-1.7B-Base with 1.7 billion parameters and a 32,768-token context length. It was trained on the 30b_a3b_math_95k_16_train dataset, which points to a focus on mathematical reasoning and problem solving.
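A minimal loading sketch follows, assuming the checkpoint loads as a standard causal language model through the `transformers` library, which this card does not confirm. The prompt format and the helper names `build_prompt` and `generate_answer` are hypothetical; as a fine-tune of a base model, the checkpoint may expect a different input format.

```python
MODEL_ID = "namezz/lvm-math-0408-a-qwen3-30b-a3b-instruct-b-qwen3-1.7b-base"


def build_prompt(question: str) -> str:
    # Hypothetical completion-style prompt; the card does not document a format.
    return f"Question: {question}\nAnswer:"


def generate_answer(question: str, max_new_tokens: int = 128) -> str:
    # Imports are local so the module can be inspected without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the quantization listed for this model.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    model.eval()

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    with torch.no_grad():
        # Greedy decoding keeps numerical answers deterministic.
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Decode only the newly generated tokens, not the prompt.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0, prompt_length:], skip_special_tokens=True)
```

Calling `generate_answer("What is 12 * 7?")` downloads the weights on first use, so it needs network access and enough memory for a 1.7B-parameter model in BF16.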

Key Capabilities

  • Mathematical Optimization: The model is fine-tuned for mathematical tasks. Reported evaluation metrics are Token Mean MAE (25192073741.5337), Token Mean RMSE (7814890.5454), and Token Mean Relerr (0.6543).
  • Performance Metrics: Validation loss and the mathematical error metrics improved consistently over 2 epochs of training, ending at a final loss of 0.0101.
  • Training Configuration: Trained on multiple GPUs with the AdamW optimizer, a cosine learning rate scheduler, and a total batch size of 1024.
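The card does not define how the three error metrics are computed, so the sketch below uses the textbook definitions of mean absolute error, root mean squared error, and mean relative error over paired predictions and targets. The function names and the `eps` guard are assumptions for illustration. Under these definitions, RMSE is never smaller than MAE on the same set of errors, so the reported MAE exceeding the reported RMSE suggests the published figures were aggregated or scaled differently from this sketch.

```python
import math


def mean_absolute_error(preds, targets):
    # Average of |prediction - target|.
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(targets)


def root_mean_squared_error(preds, targets):
    # Square root of the average squared error.
    squared = sum((p - t) ** 2 for p, t in zip(preds, targets))
    return math.sqrt(squared / len(targets))


def mean_relative_error(preds, targets, eps=1e-12):
    # Average of |prediction - target| / |target|; eps avoids division by zero.
    ratios = (abs(p - t) / max(abs(t), eps) for p, t in zip(preds, targets))
    return sum(ratios) / len(targets)


# Toy values, not taken from the model's evaluation.
preds = [2.0, 4.0, 10.0]
targets = [1.0, 4.0, 8.0]

mae = mean_absolute_error(preds, targets)       # (1 + 0 + 2) / 3 = 1.0
rmse = root_mean_squared_error(preds, targets)  # sqrt((1 + 0 + 4) / 3)
relerr = mean_relative_error(preds, targets)    # (1/1 + 0 + 2/8) / 3
```

Because squaring weights large errors more heavily, RMSE reacts more strongly than MAE to a few badly wrong predictions, while relative error normalizes by the size of the target.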

Good for

  • Mathematical Problem Solving: Suited to applications that require numerical computation and mathematical reasoning.
  • Research in Mathematical LLMs: Useful for researchers exploring the fine-tuning of base models for domain-specific mathematical challenges.
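For readers studying this kind of fine-tuning, the cosine learning rate schedule and the total batch size of 1024 named in the training configuration can be sketched as below. The card does not state the peak learning rate, the step count, any warmup, or how the batch is split across devices, so every number here except 1024 is a hypothetical placeholder, and the schedule is the standard cosine decay rather than the author's confirmed setup.

```python
import math


def cosine_lr(step, total_steps, peak_lr, min_lr=0.0):
    # Standard cosine decay from peak_lr at step 0 to min_lr at total_steps.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def total_batch_size(per_device, num_devices, grad_accum_steps):
    # Effective batch size seen by the optimizer per update.
    return per_device * num_devices * grad_accum_steps


PEAK_LR = 1e-4      # hypothetical; not stated in the card
TOTAL_STEPS = 1000  # hypothetical; not stated in the card

start_lr = cosine_lr(0, TOTAL_STEPS, PEAK_LR)
mid_lr = cosine_lr(TOTAL_STEPS // 2, TOTAL_STEPS, PEAK_LR)
end_lr = cosine_lr(TOTAL_STEPS, TOTAL_STEPS, PEAK_LR)

# One hypothetical split that reaches the reported total of 1024.
batch = total_batch_size(per_device=16, num_devices=8, grad_accum_steps=8)
```

Many splits of per-device batch, device count, and gradient accumulation give the same total, so the reported 1024 alone does not determine the hardware used.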