NaruseShiroha/capybara-math-smol-WebCoder

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Sep 14, 2025License:apache-2.0Architecture:Transformer Open Weights Cold

NaruseShiroha/capybara-math-smol-WebCoder is a 4 billion parameter Qwen3-based instruction-tuned causal language model developed by NaruseShiroha. This model was finetuned using Unsloth and Huggingface's TRL library, offering a specialized focus. With a 32768 token context length, it is optimized for specific tasks, leveraging its efficient training methodology.

Loading preview...

Model Overview

NaruseShiroha/capybara-math-smol-WebCoder is a 4 billion parameter language model, finetuned by NaruseShiroha. It is based on the Qwen3 architecture and was developed using efficient training techniques from Unsloth and Huggingface's TRL library, enabling 2x faster training.

Key Characteristics

  • Base Model: Finetuned from unsloth/Qwen3-4B-Instruct-2507.
  • Parameter Count: 4 billion parameters.
  • Context Length: Supports a substantial context window of 32768 tokens.
  • Training Efficiency: Utilizes Unsloth for accelerated training.

Potential Use Cases

This model is suitable for applications requiring a compact yet capable Qwen3-based model, especially where efficient training and deployment are priorities. Its specific finetuning suggests potential strengths in areas related to its training data, though further evaluation would be needed to pinpoint exact optimal use cases.