uukuguy/speechless-thoughts-mistral-7b
uukuguy/speechless-thoughts-mistral-7b is a 7 billion parameter Mistral-based causal language model fine-tuned by uukuguy. It serves as a baseline for the speechless-sparsetral-16x7b-MoE model, focusing on coding, reasoning, and planning tasks. The model was trained on a diverse dataset including filtered categories from jondurbin/airoboros-2.2, Open-Orca, Open-Platypus, WizardLM, and Python-specific datasets. It is optimized for tasks requiring strong logical and programming capabilities.
Overview
uukuguy/speechless-thoughts-mistral-7b is a 7 billion parameter language model built on the Mistral architecture, developed by uukuguy. It functions as a foundational model for the larger speechless-sparsetral-16x7b-MoE. This model is specifically fine-tuned on a curated dataset totaling 252,000 samples, emphasizing coding, reasoning, and planning.
Key Capabilities & Training
The model's training data includes:
- Coding and Reasoning: Filtered samples from jondurbin/airoboros-2.2 and WizardLM_evol_instruct_V2_196k.
- Instruction Following: Open-Orca's 'cot' category and Open-Platypus.
- Python Specifics: TokenBender/python_eval_instruct_51k and Spider dataset for SQL.
- General Instruction: codefuse-ai/Evol-Instruction-66k.
Performance Highlights
Evaluations on the Open LLM Leaderboard show an average score of 59.72. Notable scores include:
- HellaSwag (10-shot): 80.71
- MMLU (5-shot): 60.11
- Winogrande (5-shot): 77.82
Usage
The model uses the Alpaca prompt format for instruction-response interactions and supports a context length of 8192 tokens, accommodating moderately long inputs such as multi-step reasoning problems or medium-sized code files.
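As a sketch of the Alpaca format mentioned above, the helper below builds the standard Alpaca prompt template; the exact wording of the template is an assumption based on the common Alpaca layout, and the function name is illustrative, not part of the model's tooling.

```python
# Hypothetical helper illustrating the standard Alpaca prompt layout;
# the template text is the widely used Alpaca default, assumed here.
def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Format an instruction (and optional input) into an Alpaca-style prompt."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

# Example: a coding instruction with no extra input context.
prompt = build_alpaca_prompt("Write a Python function that reverses a string.")
```

The resulting string can be passed directly to a text-generation pipeline as the prompt; the model's completion follows the trailing `### Response:` marker.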