Weyaxi/Einstein-v6-7B

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Apr 3, 2024 · License: other · Architecture: Transformer

Weyaxi/Einstein-v6-7B is a 7 billion parameter causal language model, fine-tuned from alpindale/Mistral-7B-v0.2-hf on a diverse collection of instruction and chat datasets. This model is optimized for general conversational AI tasks, demonstrating strong performance across various reasoning and language understanding benchmarks. It features a 4096-token context length and utilizes a ChatML prompt template for structured interactions.


Einstein-v6-7B: A Fine-Tuned Mistral Model

Einstein-v6-7B is a 7 billion parameter language model developed by Weyaxi, built upon the robust alpindale/Mistral-7B-v0.2-hf architecture. This model underwent a full fine-tuning process over two epochs, utilizing a diverse array of instruction and chat-based datasets, including merged_all.json, gpteacher-instruct-special-alpaca.json, wizardlm_evol_instruct_70k_random_half.json, and various ShareGPT datasets like capybara_sharegpt.json and slimorca_dedup_filtered_95k_sharegpt.json.

Key Capabilities & Performance

This model is designed for general-purpose conversational AI, demonstrating solid performance on the Open LLM Leaderboard with an average score of 67.12. Notable benchmark results include:

  • AI2 Reasoning Challenge (25-shot): 63.57
  • HellaSwag (10-shot): 82.76
  • MMLU (5-shot): 62.23
  • GSM8k (5-shot): 63.53

It supports a context length of 4096 tokens and is configured to use a ChatML prompt template, facilitating structured interactions via tokenizer.apply_chat_template().
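To make the ChatML structure concrete, here is a minimal sketch of the prompt layout that a standard ChatML template produces. The `format_chatml` helper below is our own illustration, not part of any library; the authoritative template ships with the model's tokenizer, so in practice you would call `tokenizer.apply_chat_template()` instead.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {role, content} dicts as a standard ChatML string.

    Each turn is wrapped in <|im_start|>{role} ... <|im_end|> markers.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model knows to start replying.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain relativity in one sentence."},
])
print(prompt)
```

Passing the same message list to `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` should yield an equivalent string, with the tokenizer's own template as the source of truth.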

Unique Aspects & Usage

Einstein-v6-7B was trained using axolotl on a powerful GPU setup, with sponsorship from sablo.ai. The model is available in various quantized versions, including GGUF, ExLlamaV2, and AWQ, provided by community contributors like @bartowski and @solidrust. Its training on a broad spectrum of conversational data makes it suitable for applications requiring robust instruction following and general dialogue capabilities.