Model Overview
Neelectric/Llama-3.1-8B-Instruct_SFT_sciencefisher_v00.02 is an 8-billion-parameter instruction-tuned model based on Meta's Llama-3.1-8B-Instruct. It was produced by Supervised Fine-Tuning (SFT) on the Neelectric/MoT_science_Llama3_4096toks dataset, which was curated specifically for scientific content.
Key Capabilities
- Scientific Domain Specialization: Fine-tuned on a dedicated scientific dataset, enhancing its ability to understand and generate content related to scientific topics.
- Instruction Following: Inherits strong instruction-following capabilities from its base Llama-3.1-8B-Instruct model.
- Extended Context Window: Supports a context length of 32,768 tokens, allowing it to process and generate longer, more complex scientific texts.
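Because the model inherits Llama-3.1-8B-Instruct's chat format, prompts follow the standard Llama 3.1 header/turn template. In practice you should let `tokenizer.apply_chat_template` build this string for you; the sketch below just makes the expected raw format explicit (the helper name and example messages are illustrative, not part of the model card):

```python
def format_llama3_prompt(system: str, user: str) -> str:
    """Build a raw Llama 3.1 chat prompt string.

    This mirrors the standard Llama 3.1 template; normally the
    tokenizer's apply_chat_template produces it for you.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # Generation continues from the open assistant turn below.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt(
    "You are a helpful scientific assistant.",
    "Explain why the sky is blue in two sentences.",
)
```

The trailing assistant header leaves the turn open so the model's completion fills in the assistant response.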
Training Details
The model was trained with Hugging Face's TRL (Transformer Reinforcement Learning) library. The training run used the following versions of key libraries:
- TRL: 0.28.0.dev0
- Transformers: 4.57.6
- PyTorch: 2.9.0
- Datasets: 4.5.0
- Tokenizers: 0.22.2
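A run like this can be sketched with TRL's `SFTTrainer`. Note this is only an illustrative outline: the hyperparameters below are assumptions, not the values used for this checkpoint, and the exact `SFTConfig` argument names (e.g. `max_length`) vary somewhat across TRL versions.

```python
# Illustrative SFT sketch with TRL's SFTTrainer. Hyperparameters are
# placeholders, not the settings used to train this checkpoint.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("Neelectric/MoT_science_Llama3_4096toks", split="train")

config = SFTConfig(
    output_dir="Llama-3.1-8B-Instruct_SFT_sciencefisher",
    max_length=4096,                  # matches the dataset's 4096-token sequences
    per_device_train_batch_size=1,    # assumed values; tune for your hardware
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    num_train_epochs=1,
    bf16=True,
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B-Instruct",  # base model per the overview
    args=config,
    train_dataset=dataset,
)
trainer.train()
```

Running this requires a GPU with enough memory for an 8B model (or additional memory-saving options such as gradient checkpointing or PEFT adapters).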
Use Cases
This model is particularly well-suited for applications requiring a strong understanding and generation of scientific text, such as:
- Answering scientific questions.
- Summarizing research papers.
- Generating scientific explanations or discussions.
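For any of these use cases, the model can be loaded through the standard `transformers` text-generation pipeline; the snippet below is a minimal sketch (the prompt content is just an example):

```python
# Minimal inference sketch; an 8B model needs a suitably large GPU,
# or adjust torch_dtype/device_map for your hardware.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Neelectric/Llama-3.1-8B-Instruct_SFT_sciencefisher_v00.02",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful scientific assistant."},
    {"role": "user", "content": "Summarize the greenhouse effect in three sentences."},
]
outputs = generator(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1]["content"])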