ipswy/senti-shujaa
ipswy/senti-shujaa is a 1.5 billion parameter causal language model, fine-tuned from unsloth/Qwen2.5-1.5B-Instruct-bnb-4bit. Trained using the TRL library with SFT, it offers a 32768-token context length. This model is designed for general text generation tasks, leveraging its instruction-tuned base for diverse conversational and creative applications.
Loading preview...
Model Overview
ipswy/senti-shujaa is a 1.5 billion parameter language model, fine-tuned from the unsloth/Qwen2.5-1.5B-Instruct-bnb-4bit base model. It was developed using the TRL library with Supervised Fine-Tuning (SFT) to enhance its instruction-following capabilities. The model supports a substantial context length of 32768 tokens, making it suitable for processing longer inputs and generating more coherent, extended responses.
Key Capabilities
- Instruction Following: Benefits from its instruction-tuned base, enabling it to respond effectively to a variety of prompts and commands.
- Text Generation: Capable of generating human-like text for diverse applications, from creative writing to question answering.
- Efficient Deployment: Built upon a
bnb-4bitquantized base, suggesting potential for more memory-efficient inference.
Training Details
The model was trained using the SFT method, leveraging specific versions of popular machine learning frameworks:
- PEFT: 0.18.1
- TRL: 0.24.0
- Transformers: 5.5.0
- Pytorch: 2.10.0+cu128
- Datasets: 4.3.0
- Tokenizers: 0.22.2
Usage
Developers can quickly integrate senti_shujaa_v4 into their projects using the Hugging Face transformers library, as demonstrated in the quick start example provided in the original model card.