Overview
Weyaxi/Einstein-v5-v0.2-7B is a 7-billion-parameter language model fine-tuned from the alpindale/Mistral-7B-v0.2-hf base model. It was trained with axolotl on a diverse collection of datasets, including various ShareGPT- and Alpaca-style conversational data, to strengthen its instruction-following and chat capabilities. The base model configuration used for this version contained a sliding-window error, which has since been fixed in the upstream base model.
Key Capabilities
- General-purpose conversational AI: Fine-tuned on a wide array of chat and instruction datasets, making it suitable for diverse dialogue tasks.
- Instruction Following: Designed to respond effectively to user prompts and instructions.
- Reasoning and Language Understanding: Achieves competitive scores on benchmarks such as AI2 Reasoning Challenge (60.92), MMLU (61.02), and HellaSwag (80.99).
Performance Metrics
On the Open LLM Leaderboard, Einstein-v5-v0.2-7B achieved an average score of 65.65. Specific benchmark results include:
- Avg.: 65.65
- AI2 Reasoning Challenge (25-Shot): 60.92
- HellaSwag (10-Shot): 80.99
- MMLU (5-Shot): 61.02
- TruthfulQA (0-Shot): 52.59
- Winogrande (5-Shot): 78.69
- GSM8k (5-Shot): 59.67
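The leaderboard average is simply the unweighted mean of the six benchmark scores, which can be verified directly:

```python
# Open LLM Leaderboard scores for Einstein-v5-v0.2-7B (from the table above).
scores = {
    "ARC (25-Shot)": 60.92,
    "HellaSwag (10-Shot)": 80.99,
    "MMLU (5-Shot)": 61.02,
    "TruthfulQA (0-Shot)": 52.59,
    "Winogrande (5-Shot)": 78.69,
    "GSM8k (5-Shot)": 59.67,
}

# Unweighted mean across all six benchmarks.
avg = sum(scores.values()) / len(scores)
print(round(avg, 2))  # 65.65
```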
Usage and Prompting
The model uses the ChatML prompt template, which can be applied with tokenizer.apply_chat_template() for structured conversations. Quantized versions (GGUF and ExLlamaV2) are also available for more efficient inference.
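To illustrate what tokenizer.apply_chat_template() produces for a ChatML model, here is a minimal sketch that renders the format by hand; the `format_chatml` helper and the example messages are illustrative, not part of the model's API:

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} messages into a ChatML prompt string.

    This mirrors the ChatML layout (<|im_start|>role ... <|im_end|>) that the
    model's bundled chat template applies; in practice you would call
    tokenizer.apply_chat_template(messages) instead of this helper.
    """
    prompt = ""
    for msg in messages:
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open the assistant turn so the model generates its reply next.
        prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize sliding-window attention in one sentence."},
]
print(format_chatml(messages))
```

With a loaded tokenizer, the equivalent call would be `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`.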