Einstein-v6-7B: A Fine-Tuned Mistral Model
Einstein-v6-7B is a 7-billion-parameter language model developed by Weyaxi, built on the alpindale/Mistral-7B-v0.2-hf base model. It was fully fine-tuned for two epochs on a diverse mix of instruction and chat datasets, including merged_all.json, gpteacher-instruct-special-alpaca.json, wizardlm_evol_instruct_70k_random_half.json, and ShareGPT-format datasets such as capybara_sharegpt.json and slimorca_dedup_filtered_95k_sharegpt.json.
Key Capabilities & Performance
The model is designed for general-purpose conversational AI and achieves an average score of 67.12 on the Open LLM Leaderboard. Notable benchmark results include:
- AI2 Reasoning Challenge (25-shot): 63.57
- HellaSwag (10-shot): 82.76
- MMLU (5-shot): 62.23
- GSM8k (5-shot): 63.53
The model supports a context length of 4096 tokens and uses the ChatML prompt template, which can be applied to a conversation with tokenizer.apply_chat_template().
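ChatML wraps each conversation turn in `<|im_start|>role` and `<|im_end|>` markers. In practice you would call `tokenizer.apply_chat_template()` from Hugging Face `transformers` on the model's own tokenizer; the helper below is only an illustrative sketch of the string that templating step produces, so the formatting is visible without downloading the model:

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into ChatML text.

    Illustrative only -- mirrors what tokenizer.apply_chat_template()
    emits for a ChatML-configured tokenizer.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain general relativity briefly."},
])
print(prompt)
```

With the real tokenizer, the equivalent call is `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`.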
Unique Aspects & Usage
Einstein-v6-7B was trained with the axolotl framework on a multi-GPU setup, with compute sponsored by sablo.ai. Community contributors such as @bartowski and @solidrust provide quantized versions, including GGUF, ExLlamaV2, and AWQ builds. Its training on a broad spectrum of conversational data makes it suitable for applications that require robust instruction following and general dialogue.