Atikarahmanda/Qwen3-4B-SFT-Multitask
Atikarahmanda/Qwen3-4B-SFT-Multitask is a 4-billion-parameter Qwen3-based language model fine-tuned by Atikarahmanda for text classification in Bahasa Indonesia. It covers three tasks: sentiment analysis, justifier classification, and issue categorization. The model was fine-tuned with LoRA, its adapter has been merged into the base weights, and it supports a maximum sequence length of 2048 tokens.
Model Overview
Atikarahmanda/Qwen3-4B-SFT-Multitask is a 4-billion-parameter Qwen3-based language model developed by Atikarahmanda. It was fine-tuned from aitf-kpm-ugm/Qwen3-4B-CPT-Base using LoRA (r=64, alpha=128), and the adapter weights have been merged into the base model, so no separate adapter is needed at load time. Its focus is NLP classification tasks in Bahasa Indonesia.
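As a rough illustration of what merging the adapter means, the update below assumes the standard LoRA scaling of alpha/r. The card does not say whether a variant such as rank-stabilized scaling was used, so treat this as a sketch. Here W_0 is a frozen base weight matrix, and B and A are the trained low-rank factors with inner dimension r = 64:

```latex
W_{\text{merged}} = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},
\qquad \frac{\alpha}{r} = \frac{128}{64} = 2
```

Under that assumption, each adapted layer stores a single dense matrix after merging, and inference cost matches that of the base architecture.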
Key Capabilities
- Sentiment Analysis: Classifies text into positive, neutral, or negative sentiment categories.
- Justifier Classification: Determines if content is relevant (true/false).
- Issue Categorization: Classifies text into specific sub-category labels.
Technical Details
- Base Model: aitf-kpm-ugm/Qwen3-4B-CPT-Base
- Output Language: Bahasa Indonesia
- Chat Format: Alpaca
- Precision: bfloat16
- Max Sequence Length: 2048 tokens
- Training: 3 epochs; best validation loss of 0.3566, reached at step 2400.
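The details above suggest how a prompt might be assembled for inference. The following is a minimal sketch, not the author's documented usage: it assumes the standard Stanford Alpaca template with an input field, and the instruction strings in `TASK_INSTRUCTIONS` are hypothetical, since the card does not document the wording used during fine-tuning. The `classify` helper uses Hugging Face `transformers` and is only defined here, not executed.

```python
# Standard Alpaca template (with input). Assumed; the card only says "Alpaca".
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

# Hypothetical instruction strings: the real training instructions are not
# documented in this card and may differ.
TASK_INSTRUCTIONS = {
    "sentiment": "Klasifikasikan sentimen teks berikut sebagai positif, netral, atau negatif.",
    "justifier": "Tentukan apakah teks berikut relevan. Jawab true atau false.",
    "issue": "Tentukan sub-kategori isu dari teks berikut.",
}


def build_prompt(task: str, text: str) -> str:
    """Fill the Alpaca template for one of the three tasks."""
    return ALPACA_TEMPLATE.format(instruction=TASK_INSTRUCTIONS[task], input=text)


def classify(task: str, text: str, max_new_tokens: int = 16) -> str:
    """Generate a label with transformers (imported lazily; downloads the model)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Atikarahmanda/Qwen3-4B-SFT-Multitask"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    inputs = tokenizer(build_prompt(task, text), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the tokens generated after the prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

A call such as `classify("sentiment", "Pelayanan sangat cepat.")` would be expected to return one of the sentiment labels, provided the prompt plus the generated tokens stay within the 2048-token limit.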
Training Framework
The model was trained using Unsloth with TRL's SFTTrainer, an 8-bit AdamW optimizer, and a cosine learning-rate scheduler. It used an effective batch size of 48 and was trained on responses only, so prompt tokens did not contribute to the loss.
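Two of these settings can be illustrated in plain Python. The sketch below is not the training code: it shows the usual way response-only training is expressed, by setting prompt-token labels to -100 (the default ignore index of PyTorch's cross-entropy loss), and how an effective batch size is typically computed. The split of 48 into 8 × 6 is a hypothetical example; the card does not state the per-device batch size or the number of accumulation steps.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def mask_prompt_labels(input_ids, response_start):
    """Return labels in which only tokens from response_start onward are targets."""
    return [IGNORE_INDEX] * response_start + list(input_ids[response_start:])


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of examples contributing to each optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


# Toy token ids: the first three belong to the prompt, the last two to the response.
labels = mask_prompt_labels([11, 12, 13, 14, 15], response_start=3)
print(labels)  # [-100, -100, -100, 14, 15]

# One hypothetical configuration that yields the reported value of 48.
print(effective_batch_size(per_device_batch=8, grad_accum_steps=6))  # 48
```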
Licensing
This model follows the Apache 2.0 license of its base model.