uukuguy/speechless-thoughts-mistral-7b

Text Generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 8k · Published: Feb 13, 2024 · License: llama2 · Architecture: Transformer · Open weights

uukuguy/speechless-thoughts-mistral-7b is a 7 billion parameter Mistral-based causal language model fine-tuned by uukuguy. It serves as a baseline for the speechless-sparsetral-16x7b-MoE model and focuses on coding, reasoning, and planning tasks. The training mix draws on filtered categories from jondurbin/airoboros-2.2, Open-Orca, Open-Platypus, WizardLM, and several Python-specific datasets. It is optimized for tasks requiring strong logical and programming capabilities.


Overview

uukuguy/speechless-thoughts-mistral-7b is a 7 billion parameter language model built on the Mistral architecture, developed by uukuguy. It functions as a foundational model for the larger speechless-sparsetral-16x7b-MoE. This model is specifically fine-tuned on a curated dataset totaling 252,000 samples, emphasizing coding, reasoning, and planning.

Key Capabilities & Training

The model's training data includes:

  • Coding and Reasoning: Filtered samples from jondurbin/airoboros-2.2 and WizardLM_evol_instruct_V2_196k.
  • Instruction Following: Open-Orca's 'cot' category and Open-Platypus.
  • Python Specifics: TokenBender/python_eval_instruct_51k and the Spider dataset for text-to-SQL.
  • General Instruction: codefuse-ai/Evol-Instruction-66k.

Performance Highlights

Evaluations on the Open LLM Leaderboard show an average score of 59.72. Notable scores include (a reproduction sketch follows this list):

  • HellaSwag (10-shot): 80.71
  • MMLU (5-shot): 60.11
  • Winogrande (5-shot): 77.82
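
Open LLM Leaderboard scores of this kind are typically produced with EleutherAI's lm-evaluation-harness. The sketch below re-runs one of the benchmarks locally as a rough check; the harness and its simple_evaluate API are real, but the exact few-shot counts, normalization, and harness version used by the leaderboard are assumptions here, so local numbers may differ slightly.

```python
# Sketch: re-run HellaSwag locally with EleutherAI's lm-evaluation-harness
# (pip install lm-eval). num_fewshot=10 mirrors the 10-shot setting quoted
# above; the other benchmarks use their own shot counts, so run them in
# separate calls with the appropriate num_fewshot.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=uukuguy/speechless-thoughts-mistral-7b",
    tasks=["hellaswag"],
    num_fewshot=10,
)
print(results["results"]["hellaswag"])
```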

Usage

The model uses the Alpaca prompt format for instruction-response interactions and supports a context length of 8,192 tokens, making it suitable for tasks with moderately long inputs.
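
As a minimal sketch of what this looks like in practice, the snippet below loads the model with the Hugging Face transformers API and queries it using the standard Alpaca instruction template. The example instruction is illustrative; exact generation settings and any provider-specific serving details are assumptions.

```python
# Minimal sketch: load the model with Hugging Face transformers and prompt it
# in the standard Alpaca format. Assumes a GPU with enough memory for a 7B
# model; device_map="auto" requires the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "uukuguy/speechless-thoughts-mistral-7b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Standard Alpaca prompt template (instruction-only variant).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWrite a Python function that reverses a string.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```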