StavanKhobare/SST-MetaxPyTorch-Hackathon-Merged16bit
Overview
StavanKhobare/SST-MetaxPyTorch-Hackathon-Merged16bit is a causal language model based on Qwen3-1.7B (roughly 2B total parameters), developed by StavanKhobare. It was fine-tuned from the unsloth/qwen3-1.7b-unsloth-bnb-4bit base model using the Unsloth library together with Hugging Face's TRL, an approach Unsloth reports as training up to 2x faster than conventional fine-tuning. The published weights are merged to 16-bit precision.
Key Capabilities
- Efficient Training: Fine-tuned with Unsloth, which reports up to 2x faster training than standard approaches.
- Qwen3 Architecture: Benefits from the robust Qwen3 base model.
- Extended Context: Supports a context length of 32768 tokens, suitable for processing longer inputs.
Good for
- Developers seeking a Qwen3-based model fine-tuned with Unsloth's optimized training pipeline.
- Applications that need a compact (~2B parameter) model able to handle long inputs, up to the 32768-token context window.
- Experimentation with models fine-tuned using Unsloth's accelerated training techniques.
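Since this is a standard merged 16-bit checkpoint on the Hugging Face Hub, it should load with the usual `transformers` causal-LM API. The sketch below is illustrative, not from the model card: it assumes `transformers` and `torch` are installed, that the repository is public, and that the tokenizer ships a chat template (typical for Qwen3-based models). The `generate` helper and its parameters are hypothetical names chosen for this example.

```python
MODEL_ID = "StavanKhobare/SST-MetaxPyTorch-Hackathon-Merged16bit"


def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a single user prompt in the chat-messages format used by
    tokenizer.apply_chat_template()."""
    return [{"role": "user", "content": user_prompt}]


def generate(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model lazily and generate a completion.

    Heavy operation: downloads the merged 16-bit weights on first call.
    Imports are deferred so the module can be inspected without
    transformers/torch installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Apply the chat template and move token ids to the model's device.
    input_ids = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    print(generate("Summarize what a causal language model is."))
```

Deferring the model load into `generate` keeps the module importable on machines without a GPU; `device_map="auto"` lets `transformers` place the weights on whatever accelerator is available.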