Einstein-v6.1-Llama3-8B Overview
Weyaxi/Einstein-v6.1-Llama3-8B is an 8-billion-parameter language model fine-tuned from Meta's Llama-3-8B. It was developed by Weyaxi, with training sponsored by sablo.ai, on an 8xRTX3090 + 1xRTXA6000 GPU setup using the axolotl framework. The model has an 8192-token context length and was trained for 2 epochs (2026 steps).
Key Capabilities
- Diverse Training Data: Fine-tuned on a wide array of datasets including alpaca, gpteacher, wizardlm, capybara, synthia, cot_alpaca_gpt4, slimorca, airoboros, allenai_wild_chat, pippa_bagel_repo, gpt4_data_lmys, sharegpt_gpt4_english, no_robots, oasst_top1, and everythinglm-data-v3.
- ChatML Support: Optimized for the ChatML prompt template, enabling structured conversational interactions.
- Performance Benchmarks: Achieves an average score of 68.60 on the Open LLM Leaderboard (v1) and 19.99 on v2, with notable scores in HellaSwag (82.41) and Winogrande (79.32).
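Since the model is tuned for ChatML, prompts should wrap each turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of building such a prompt (the helper name and the example messages are illustrative, not from the model card):

```python
def to_chatml(messages):
    """Render a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain attention in one sentence."},
])
print(prompt)
```

In practice, a tokenizer's built-in chat template (e.g. `tokenizer.apply_chat_template` in transformers) can produce the same structure without hand-rolling the string.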
Good For
- General Conversational AI: Its diverse training makes it suitable for a broad range of dialogue-based applications.
- Instruction Following: Benefits from instruction-tuned datasets for improved response generation.
- Research and Development: Provides a strong base for further fine-tuning or integration into larger systems, as demonstrated by its use in models like Octopus-V4-3B.