radm/prophet-qwen3-4b-sft
The radm/prophet-qwen3-4b-sft is a 4 billion parameter causal language model, fine-tuned by radm from Qwen/Qwen3-4B, which is based on the Llama3 architecture. This multilingual model specializes in philosophical and esoteric topics, utilizing Supervised Fine-Tuning (SFT) on a custom reasoning and non-reasoning dataset. It is designed for generating content related to these specific domains, with a unique feature to toggle its reasoning mode.
Loading preview...
Model Overview
The radm/prophet-qwen3-4b-sft is a 4 billion parameter causal language model developed by radm. It is a fine-tuned version of Qwen/Qwen3-4B, inheriting its Llama3-based architecture and multilingual capabilities. The model was trained using Supervised Fine-Tuning (SFT) with the Unsloth library on a custom dataset focusing on reasoning and non-reasoning tasks.
Key Capabilities
- Specialized Content Generation: Excels in generating text related to philosophical and esoteric topics.
- Multilingual Support: Inherits multilingual capabilities from its base model.
- Toggleable Reasoning Mode: Users can switch the model's thinking mode by adding
\n/no_thinkto prompts or system messages.
Training Details
The model was fine-tuned for 1 epoch using Unsloth and trl's SFTTrainer. It utilized a 4-bit quantized base model, Paged AdamW 8-bit optimizer, and a LoRA configuration with r=768 and lora_alpha=768 targeting key attention and feed-forward modules. The maximum sequence length during training was 4096 tokens.
Intended Use Cases
This model is primarily intended for applications requiring generation or analysis of philosophical and esoteric content. Its unique reasoning toggle allows for flexible interaction styles. Users should be aware of potential biases inherited from the base model and training data, and critically evaluate outputs, especially for factual accuracy.