WebraftAI/synapsellm-7b-mistral-v0.4-preview2

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 8k | Published: Nov 30, 2023 | License: apache-2.0 | Architecture: Transformer

WebraftAI/synapsellm-7b-mistral-v0.4-preview2 is a 7-billion-parameter decoder-only transformer, finetuned from Mistral-7B-v0.1 by WebraftAI. It is adapted for chat Q/A and code instructions using a custom dataset covering mathematical Q/A, general Q/A, and several types of code. The model targets robust, general-purpose question-answering and code-related tasks.


Overview

WebraftAI/synapsellm-7b-mistral-v0.4-preview2 is a 7-billion-parameter decoder-only transformer developed by WebraftAI as a finetuned version of Mistral-7B-v0.1, adapted for chat Q/A and code instructions. Finetuning used a custom dataset of roughly 770k rows: 361k Maths Instruct Q/A, 143k GPT-3.5 Q/A, 140k General Code, 63k Python code, and 54k General Q/A (generated via GPT-4).

Key Capabilities

  • Chat Q/A: Optimized for general question-answering scenarios.
  • Code Instructions: Proficient in handling code-related queries and instructions, including Python.
  • Mathematical Reasoning: Includes specific training data for mathematical instruction Q/A.

Training Details

The model was finetuned with a QLoRA adapter using a learning rate of 2e-4, float16 precision, and a batch size of 32, for 150 steps over 1 epoch. The adapter has since been fully merged into the base weights, so the checkpoint loads directly via the transformers library.
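Because the adapter is already merged, loading requires nothing beyond standard transformers calls. A minimal sketch follows; the repo id comes from this card, while the float16 dtype and automatic device placement are my assumptions to fit a 7B model on common hardware, not settings from the card:

```python
# Sketch: loading the merged checkpoint with Hugging Face transformers.
# dtype/device choices below are assumptions -- adjust for your hardware.
MODEL_ID = "WebraftAI/synapsellm-7b-mistral-v0.4-preview2"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion from the merged checkpoint."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # halves memory vs. fp32
        device_map="auto",          # place layers on available GPUs/CPU
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Write a Python function that reverses a string."))
```

Since the card notes the model does not follow system prompts, the sketch passes the instruction as a plain prompt rather than a system/user chat structure.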

Performance Metrics

Evaluations on the Open LLM Leaderboard show an average score of 55.93. Notable scores include 74.54 on HellaSwag (10-Shot) and 73.95 on Winogrande (5-shot), indicating strong performance in common sense reasoning. MMLU (5-Shot) scored 54.60, and GSM8k (5-shot) scored 25.70.

Limitations

  • May produce factually incorrect information.
  • Does not follow system prompts.
  • Lacks memory capabilities.
  • Potential for bias inherited from the training data; for example, the model may self-identify as a GPT model because portions of its dataset were GPT-generated.