amityco/amax-sigma-scratch-sft

4B parameters · BF16 · 40960-token context · Jan 8, 2026
License: apache-2.0
Overview

amityco/amax-sigma-scratch-sft is a 4-billion-parameter language model based on the Qwen3 architecture, developed by amityco. It was finetuned with the Unsloth library together with Hugging Face's TRL library, a combination the card credits with roughly 2x faster training. It supports a context length of 40960 tokens, allowing it to process and generate long sequences of text.
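Assuming the checkpoint is published on the Hugging Face Hub under the id above, inference would follow the standard Transformers pattern; the prompt wrapper and generation settings below are illustrative, not recommendations from the card:

```python
# Sketch: load amityco/amax-sigma-scratch-sft with transformers and generate.
# Assumes the checkpoint is downloadable from the Hugging Face Hub; prompt
# format and generation settings are illustrative, not from the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "amityco/amax-sigma-scratch-sft"

def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-message format used by Qwen3-style models."""
    return [{"role": "user", "content": user_prompt}]

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")
    text = tokenizer.apply_chat_template(
        build_messages(prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

# generate("Summarize the Qwen3 architecture in two sentences.")
# (left commented out: calling it downloads the full 4B checkpoint)
```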

Key Capabilities

  • Qwen3 Architecture: Leverages the robust Qwen3 base model for strong language understanding and generation.
  • Efficient Finetuning: Benefits from finetuning with Unsloth, resulting in faster training times.
  • Extended Context Window: Features a 40960-token context length, ideal for tasks requiring deep contextual awareness.
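The 40960-token window still has to be enforced on the application side when assembling long prompts. A small helper like this (hypothetical, not part of the model's tooling) keeps the most recent tokens within the budget:

```python
# Hypothetical helper: keep a token sequence inside the model's 40960-token
# context window by dropping the oldest tokens first. Not part of the model's
# own tooling; shown only to illustrate budgeting the extended context.
MAX_CONTEXT = 40960

def fit_to_context(token_ids: list[int], reserve_for_output: int = 1024) -> list[int]:
    """Trim the oldest tokens so prompt + generated tokens fit in the window."""
    budget = MAX_CONTEXT - reserve_for_output
    if budget <= 0:
        raise ValueError("reserve_for_output exceeds the context window")
    return token_ids[-budget:] if len(token_ids) > budget else token_ids

# Example: a 50000-token prompt is trimmed to the most recent 39936 tokens
# (40960 minus the 1024 reserved for generation).
trimmed = fit_to_context(list(range(50000)))
```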

Good For

  • Applications needing a 4B parameter model with an extended context window.
  • Projects where efficient finetuning methods are a priority.
  • Tasks that can benefit from the Qwen3 model's capabilities.
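The card states the model was produced with Unsloth and TRL but does not publish the training configuration; a sketch of what such an SFT run typically looks like, with every checkpoint id, dataset, and hyperparameter below a hypothetical stand-in, might be:

```python
# Sketch of an Unsloth + TRL supervised finetuning run of the kind the card
# describes. The base model id, dataset, and all hyperparameters here are
# hypothetical stand-ins, not amityco's actual configuration.
MAX_SEQ_LENGTH = 40960  # matches the context length reported on the card

def finetune(base_model: str, train_dataset):
    # Heavy dependencies are imported lazily so the sketch can be inspected
    # without a GPU environment.
    from unsloth import FastLanguageModel
    from trl import SFTConfig, SFTTrainer

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=base_model,          # e.g. a Qwen3 4B base checkpoint
        max_seq_length=MAX_SEQ_LENGTH,
        load_in_4bit=True,              # hypothetical; released weights are BF16
    )
    # Attach LoRA adapters; rank/alpha values are illustrative.
    model = FastLanguageModel.get_peft_model(model, r=16, lora_alpha=16)

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=train_dataset,
        args=SFTConfig(per_device_train_batch_size=2, num_train_epochs=1),
    )
    trainer.train()
    return model, tokenizer
```

Running this requires a CUDA machine with `unsloth` and `trl` installed; the function is shown only to make the training recipe on the card concrete.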