Weyaxi/Einstein-v6.1-Llama3-8B

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 8k · Published: Apr 19, 2024 · License: other · Architecture: Transformer

Weyaxi/Einstein-v6.1-Llama3-8B is an 8 billion parameter language model, a full fine-tune of Meta's Llama-3-8B architecture. Developed by Weyaxi and sponsored by sablo.ai, it was trained on diverse datasets using the axolotl framework, featuring an 8192-token context length. This model is designed for general conversational AI tasks, leveraging a broad mix of instruction and sharegpt-style data for enhanced performance.


Einstein-v6.1-Llama3-8B Overview

Weyaxi/Einstein-v6.1-Llama3-8B is an 8 billion parameter language model fully fine-tuned from Meta's Llama-3-8B. Developed by Weyaxi with training sponsored by sablo.ai, it was trained on a setup of 8×RTX 3090 + 1×RTX A6000 GPUs using the axolotl framework, running for 2 epochs over 2026 steps with an 8192-token context length.

Key Capabilities

  • Diverse Training Data: Fine-tuned on a wide array of datasets including alpaca, gpteacher, wizardlm, capybara, synthia, cot_alpaca_gpt4, slimorca, airoboros, allenai_wild_chat, pippa_bagel_repo, gpt4_data_lmys, sharegpt_gpt4_english, no_robots, oasst_top1, and everythinglm-data-v3.
  • ChatML Support: Optimized for the ChatML prompt template, enabling structured conversational interactions.
  • Performance Benchmarks: Achieves an average score of 68.60 on the Open LLM Leaderboard (v1) and 19.99 on v2, with notable scores in HellaSwag (82.41) and Winogrande (79.32).
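Because the model expects ChatML-formatted input, prompts should wrap each turn in `<|im_start|>` / `<|im_end|>` markers. The sketch below shows the general ChatML layout with a hypothetical helper function; in practice, the tokenizer's built-in chat template (e.g. `tokenizer.apply_chat_template` in `transformers`) should produce the equivalent string.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a minimal ChatML prompt (hypothetical helper for
    illustration; role markers follow the standard ChatML layout).
    Ends with an open assistant turn so the model generates the reply."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Explain special relativity in one sentence.",
)
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the assistant turn open, which cues the model to complete it; generation is typically stopped at the next `<|im_end|>` token.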

Good For

  • General Conversational AI: Its diverse training makes it suitable for a broad range of dialogue-based applications.
  • Instruction Following: Benefits from instruction-tuned datasets for improved response generation.
  • Research and Development: Provides a strong base for further fine-tuning or integration into larger systems, as demonstrated by its use in models like Octopus-V4-3B.