Model Overview
This model, developed by vibhorag101, is a fine-tuned version of Llama-2-7b-chat-hf adapted for mental-health therapy applications. It builds on the 7-billion-parameter base model and was trained on a specialized therapy dataset to provide supportive, cheerful responses.
Key Capabilities
- Basic Mental Therapy Support: Designed to offer initial therapeutic interactions.
- Positive and Cheerful Tone: The model's system prompt encourages helpful, cheerful, and safe responses, avoiding harmful or negative content.
- Conversational AI: Optimized for engaging in dialogue to support mental well-being.
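Since the model derives from Llama-2-7b-chat-hf, it expects the standard Llama-2 chat prompt format. The sketch below builds such a prompt in plain Python; the system-prompt wording is an assumption that paraphrases the tone described above (helpful, cheerful, safe), not the exact prompt used in training.

```python
# Sketch of the Llama-2 chat prompt template this model family expects.
# SYSTEM_PROMPT below is an assumed paraphrase of the card's description,
# not the verbatim prompt the model was trained with.

SYSTEM_PROMPT = (
    "You are a helpful and cheerful assistant. Answer as supportively as "
    "possible while staying safe and avoiding harmful or negative content."
)

def build_prompt(user_message: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Wrap a single user turn in the Llama-2 [INST]/<<SYS>> template."""
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_prompt("I've been feeling anxious lately.")
```

The resulting string can be passed to any text-generation pipeline loaded with this model; multi-turn conversations repeat the `[INST] ... [/INST]` pattern for each exchange.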
Training Details
The model was fine-tuned using an RTX A5000 GPU with 24GB VRAM. Key hyperparameters included 3 training epochs, a batch size of 2, and a maximum sequence length of 2048. LoRA (Low-Rank Adaptation) was applied with lora_r=64 and lora_alpha=16 for efficient fine-tuning. The training data consisted of 1000 samples, split 80:20 for training and evaluation.
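To give a sense of why LoRA makes this fine-tune fit on a single 24GB GPU, the arithmetic below estimates the trainable parameter count for lora_r=64. The target modules (q_proj, v_proj) and the Llama-2-7b dimensions are assumptions for illustration; the card does not state which projections were adapted.

```python
# Back-of-the-envelope LoRA trainable-parameter count.
# ASSUMPTIONS: LoRA applied to q_proj and v_proj only; Llama-2-7b
# hidden size 4096 and 32 transformer layers.

hidden_size = 4096           # Llama-2-7b hidden dimension (assumed)
num_layers = 32              # Llama-2-7b layer count (assumed)
lora_r = 64                  # low-rank dimension from the training details
num_target_modules = 2       # q_proj and v_proj (assumed targets)

# Each adapted d x d projection gains two low-rank factors:
# A (r x d) and B (d x r), i.e. 2 * r * d extra parameters.
params_per_module = 2 * lora_r * hidden_size
trainable = params_per_module * num_target_modules * num_layers

print(f"{trainable:,} trainable parameters")  # ~33.6M, vs ~6.7B frozen
```

Under these assumptions only about 0.5% of the model's weights are updated, which is what allows 3 epochs at sequence length 2048 to run within 24GB of VRAM.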
Performance Metrics
Evaluations on the Open LLM Leaderboard show an average score of 42.84. Specific scores include:
- ARC (25-shot): 52.39
- HellaSwag (10-shot): 75.39
- MMLU (5-shot): 39.77
While this is not a general-purpose LLM, these metrics provide a baseline for its foundational language understanding.
Good for
- Applications requiring a supportive and positive conversational agent.
- Initial mental-health support systems or chatbots (not a replacement for professional care).
- Use cases where a cheerful and empathetic tone is crucial.