alibidaran/Zigroo-Mental_consultant2-merged
alibidaran/Zigroo-Mental_consultant2-merged is an 8 billion parameter causal language model, built on Qwen3-8B, specifically fine-tuned for empathetic and therapeutically-informed conversational support. It underwent Supervised Fine-Tuning (SFT) on five mental health datasets and Direct Preference Optimization (DPO) to align responses for reliability and compassion. This model excels at generating dialogues for mental health research, prototyping support chatbots, and studying multi-stage fine-tuning in sensitive domains.
Loading preview...
Zigroo Mental Consultant 2 (Merged)
This model, developed by alibidaran, is an 8 billion parameter causal language model based on unsloth/Qwen3-8B-unsloth-bnb-4bit. It is specifically designed to provide empathetic and therapeutically-informed conversational support, making it distinct from general-purpose LLMs.
Key Capabilities & Training
The model's unique capabilities stem from its two-stage training pipeline:
- Supervised Fine-Tuning (SFT): Utilized a LoRA adapter (rank 32) across five curated mental health and therapy datasets, including conversational therapy dialogues, Acceptance and Commitment Therapy (ACT) scripts, motivational interviewing, general mental health therapy, and student counseling conversations.
- Direct Preference Optimization (DPO): Further aligned the model's outputs using a psychology-grounded preference dataset to enhance reliability, empathy, and therapeutic grounding in its responses.
Intended Use Cases
This model is particularly suited for:
- Research into LLM-based therapeutic conversational agents.
- Prototyping mental health support chatbots.
- Studying multi-stage fine-tuning (SFT + DPO) pipelines for sensitive domains.
- Educational exploration of therapeutic dialogue generation.
Disclaimer: This model is for research and educational purposes only and is not a substitute for professional mental health care. It is not equipped for clinical diagnosis, treatment, or crisis intervention without human oversight.