Overview
WebraftAI/synapsellm-7b-mistral-v0.4-preview2 is a 7-billion-parameter decoder-only transformer model developed by WebraftAI. It is a finetuned version of Mistral-7B-v0.1, adapted for chat Q/A and code instructions. The finetuning dataset is a custom mix of approximately 770k rows: 361k Maths Instruct Q/A, 143k GPT-3.5 Q/A, 140k General Code, 63k Python code, and 54k General Q/A (generated with GPT-4).
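As a quick sanity check on the mix above, the five listed subsets can be totalled and turned into rough percentage shares (counts are in thousands of rows, taken directly from the figures stated here):

```python
# Dataset mix from the card, in thousands of rows.
subsets = {
    "Maths Instruct Q/A": 361,
    "GPT-3.5 Q/A": 143,
    "General Code": 140,
    "Python code": 63,
    "General Q/A (GPT-4)": 54,
}

total_k = sum(subsets.values())  # 761k, in line with the ~770k stated above
shares = {name: round(100 * k / total_k, 1) for name, k in subsets.items()}
print(total_k, shares)
```

Mathematical instruction data dominates the mix at nearly half of all rows, with code (General Code plus Python) making up roughly another quarter.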
Key Capabilities
- Chat Q/A: Optimized for general question-answering scenarios.
- Code Instructions: Proficient in handling code-related queries and instructions, including Python.
- Mathematical Reasoning: Includes specific training data for mathematical instruction Q/A.
Training Details
The model was trained with a QLoRA adapter at a learning rate of 2e-4, in float16 precision, with a batch size of 32, for 150 steps (1 epoch). The released checkpoint is fully merged, ready to be loaded directly with the transformers library.
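Since the checkpoint is fully merged, it loads like any other causal LM. A minimal usage sketch follows; the model id is the one from this card, but the instruction/response prompt template and the generation settings are illustrative assumptions, not a documented format (the heavy imports and the weight download are kept inside `generate` so the helper can be used independently):

```python
MODEL_ID = "WebraftAI/synapsellm-7b-mistral-v0.4-preview2"


def build_prompt(question: str) -> str:
    """Format a single-turn question. The exact template is an assumption."""
    return f"### Instruction:\n{question}\n\n### Response:\n"


def generate(question: str, max_new_tokens: int = 256) -> str:
    """Download the merged weights (several GB on first call) and answer."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

For example, `generate("Write a Python function that reverses a string.")` would return the model's completion for that instruction.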
Performance Metrics
Evaluations on the Open LLM Leaderboard show an average score of 55.93. Notable scores include 74.54 on HellaSwag (10-shot) and 73.95 on Winogrande (5-shot), indicating strong performance in common-sense reasoning. MMLU (5-shot) scored 54.60, and GSM8K (5-shot) scored 25.70.
Limitations
- May produce factually incorrect information.
- Does not follow system prompts.
- Lacks memory capabilities.
- May exhibit bias inherited from its training data, including occasionally identifying itself as a GPT model.