Model Overview
Neelectric/Llama-3.1-8B-Instruct_SFT_MoTv00.02 is an 8-billion-parameter instruction-tuned language model built on Meta's Llama-3.1-8B-Instruct. It was fine-tuned by Neelectric with the TRL framework on the Neelectric/MoT_all_Llama3_8192toks dataset, with the aim of improving instruction following and conversational performance.
Key Capabilities
- Instruction Following: Designed to accurately interpret and execute user instructions.
- Conversational AI: Optimized for engaging in natural and coherent dialogues.
- Extended Context: Inherits the base model's 128K-token (131,072) context window, allowing it to process long inputs and maintain context over extended interactions.
- Fine-tuned Performance: Leverages Supervised Fine-Tuning (SFT) on a curated dataset to improve response quality and relevance.
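The card does not include a usage snippet, so here is a minimal inference sketch. It assumes the model is available under the ID above on the Hugging Face Hub; the `build_messages` and `generate_reply` helpers are illustrative conveniences, not part of any published API:

```python
from typing import Dict, List

# Hub ID from the card above; the repository must exist and be accessible.
MODEL_ID = "Neelectric/Llama-3.1-8B-Instruct_SFT_MoTv00.02"


def build_messages(system_prompt: str, user_prompt: str) -> List[Dict[str, str]]:
    """Assemble a chat in the `messages` format used by chat templates."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def generate_reply(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Run one generation turn.

    Requires `transformers` and `torch` installed, and downloads the model
    weights (~16 GB) on first use.
    """
    from transformers import pipeline  # deferred: heavy dependency

    generator = pipeline("text-generation", model=MODEL_ID, device_map="auto")
    messages = build_messages(
        "You are a concise, helpful assistant.", user_prompt
    )
    result = generator(messages, max_new_tokens=max_new_tokens)
    # The chat pipeline returns the full conversation; the last entry is the
    # newly generated assistant reply.
    return result[0]["generated_text"][-1]["content"]
```

Calling `generate_reply("Explain SFT in one sentence.")` would then return a single assistant turn as a string.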
Training Details
The model was trained with the TRL (Transformer Reinforcement Learning) library, using the following framework versions: TRL 0.28.0.dev0, Transformers 4.57.6, PyTorch 2.9.0, Datasets 4.5.0, and Tokenizers 0.22.2. Training runs are logged publicly on Weights & Biases.
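The training recipe itself is not reproduced on the card. The sketch below shows what a TRL SFT setup along these lines could look like; only the dataset ID and base model come from the card, while the hyperparameters (batch size, sequence length, logging target) are illustrative assumptions:

```python
# Hypothetical reconstruction of the SFT setup; the actual run's
# hyperparameters are not published, so all values here are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer


def train() -> None:
    # Dataset ID taken from the model card.
    dataset = load_dataset("Neelectric/MoT_all_Llama3_8192toks", split="train")

    config = SFTConfig(
        output_dir="Llama-3.1-8B-Instruct_SFT_MoTv00.02",
        max_length=8192,                # matches the token limit in the dataset name
        per_device_train_batch_size=1,  # illustrative, not the actual value
        gradient_accumulation_steps=8,  # illustrative, not the actual value
        report_to="wandb",              # the card says runs are logged to W&B
    )

    trainer = SFTTrainer(
        model="meta-llama/Llama-3.1-8B-Instruct",  # base model from the card
        train_dataset=dataset,
        args=config,
    )
    trainer.train()
```

Invoking `train()` would require access to the gated Llama 3.1 base weights and substantial GPU memory; it is shown here only to make the training pipeline concrete.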
Good For
- Applications requiring robust instruction adherence.
- Building chatbots or virtual assistants that need to maintain long-term context.
- Generating detailed and contextually relevant text based on user prompts.
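For the long-running chatbot use case above, the conversation history eventually has to fit inside the context window. A small, illustrative helper (not part of the model or any library API) that keeps a rolling history within a token budget:

```python
from typing import List


def trim_history(messages: List[str], token_counts: List[int], budget: int) -> List[str]:
    """Drop the oldest non-system turns until the total fits in `budget`.

    `messages` and `token_counts` are parallel lists; the first entry is
    assumed to be the system prompt and is always kept. Token counts would
    come from the model's tokenizer in real use.
    """
    kept = list(messages)
    counts = list(token_counts)
    while sum(counts) > budget and len(kept) > 1:
        # Index 1 is the oldest user/assistant turn; index 0 (the system
        # prompt) is preserved.
        kept.pop(1)
        counts.pop(1)
    return kept
```

With a 128K-token window this trimming rarely triggers, but it keeps an assistant well-behaved on very long sessions instead of failing once the context overflows.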