Model Overview
Neelectric/Llama-3.1-8B-Instruct_SFT_sciencefisher_v00.11 is an 8-billion-parameter instruction-tuned model, fine-tuned by Neelectric from the base model meta-llama/Llama-3.1-8B-Instruct. It has been adapted for scientific applications through supervised fine-tuning (SFT).
Key Capabilities
- Scientific Domain Specialization: The model is fine-tuned on the Neelectric/MoT_science_Llama3_4096toks dataset, indicating a focus on scientific text and knowledge.
- Instruction Following: As an instruction-tuned model, it is designed to follow user prompts and generate relevant responses.
- Extended Context Window: It supports a context length of 32768 tokens, allowing for processing and generating longer, more complex scientific discussions or documents.
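Even with a long context window, prompt length and generation length share the same budget. The sketch below is a minimal illustration of that bookkeeping; it assumes a hypothetical `count_tokens` helper (whitespace splitting) as a stand-in for the model's real tokenizer, which should be used in practice for accurate counts.

```python
# Minimal sketch of budgeting a 32768-token context window.
# `count_tokens` is a crude stand-in (whitespace split); use the
# model's actual tokenizer for real token counts.

CONTEXT_WINDOW = 32768

def count_tokens(text: str) -> int:
    """Rough stand-in for a real tokenizer's token count."""
    return len(text.split())

def fits_in_context(prompt: str, max_new_tokens: int) -> bool:
    """Check that the prompt plus requested generation fit in the window."""
    return count_tokens(prompt) + max_new_tokens <= CONTEXT_WINDOW

print(fits_in_context("Summarize this paper.", 1024))  # short prompt fits
```

A check like this is useful when feeding entire research papers to the model, since an over-long prompt would otherwise be silently truncated or rejected.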
Training Details
The model was trained using the TRL library for supervised fine-tuning. The training run used TRL 1.0.0.dev0, Transformers 4.57.6, PyTorch 2.9.0, Datasets 4.8.3, and Tokenizers 0.22.2. Further details on the training run are available on Weights & Biases.
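As a rough illustration, an SFT run like this one can be launched through TRL's command-line interface. The command below is a hedged sketch, not the authors' actual invocation: the model and dataset ids come from this card, while the output directory and any omitted hyperparameters are assumptions and may differ from the settings actually used.

```shell
# Hypothetical TRL SFT invocation; the actual hyperparameters are not
# documented on this card.
trl sft \
  --model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
  --dataset_name Neelectric/MoT_science_Llama3_4096toks \
  --output_dir Llama-3.1-8B-Instruct_SFT_sciencefisher_v00.11
```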
Recommended Use Cases
This model is suitable for applications requiring understanding and generation of scientific content, such as:
- Answering scientific questions.
- Summarizing scientific articles or research papers.
- Generating text for scientific discussions or educational materials.
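For use cases like the above, instruction-tuned Llama models expect the standard role/content chat-message format that Transformers consumes via `tokenizer.apply_chat_template`. The snippet below only builds that message list; the `system_prompt` wording is illustrative, and actually generating text would additionally require loading the model, for example with `transformers.pipeline("text-generation", ...)`.

```python
# Build a chat turn in the role/content format accepted by
# tokenizer.apply_chat_template. The system prompt text is illustrative.

def build_messages(question: str) -> list[dict]:
    """Return a system + user message list for a science Q&A turn."""
    system_prompt = "You are a helpful assistant for scientific questions."
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]

messages = build_messages("Why is the sky blue?")
print(messages[1]["content"])  # -> Why is the sky blue?
```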