artificialguybr/Meta-Llama-3.1-8B-openhermes-2.5
artificialguybr/Meta-Llama-3.1-8B-openhermes-2.5 is a fine-tuned, 8-billion-parameter causal language model developed by artificialguybr. It is based on Meta-Llama-3.1-8B and trained on the OpenHermes-2.5 dataset. The model is tuned for instruction following and general language tasks, which makes it suitable for text generation and question answering. It uses the LlamaForCausalLM architecture with a hidden size of 4,096 and a vocabulary of 128,256 tokens.
Overview
artificialguybr/Meta-Llama-3.1-8B-openhermes-2.5 is an 8-billion-parameter causal language model, fine-tuned by artificialguybr from the Meta-Llama-3.1-8B base model. Its instruction-following behavior comes from fine-tuning on the OpenHermes-2.5 dataset. The model was trained with BF16 mixed precision and the AdamW optimizer for 13,368 steps on a single NVIDIA A100-SXM4-80GB GPU.
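To give a sense of scale for the architecture figures quoted for this model (hidden size 4,096, vocabulary 128,256), the short sketch below works out the size of the input embedding table alone. It assumes 2 bytes per value, as in BF16 storage; the constant names are illustrative and do not come from the model's configuration file.

```python
# Rough size of the token embedding table, using the figures from the model card.
VOCAB_SIZE = 128_256    # vocabulary size stated on the card
HIDDEN_SIZE = 4_096     # hidden size stated on the card
BYTES_PER_BF16 = 2      # assumption: weights stored as 16-bit bfloat16 values

# One row of HIDDEN_SIZE values per vocabulary entry.
embedding_params = VOCAB_SIZE * HIDDEN_SIZE
embedding_bytes = embedding_params * BYTES_PER_BF16

print(f"{embedding_params:,} embedding parameters")
print(f"{embedding_bytes / 2**30:.2f} GiB when stored in BF16")
```

The embedding table therefore holds 525,336,576 parameters, a small but noticeable share of the roughly 8 billion total.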
Key Capabilities
- Instruction Following: Designed to accurately follow given instructions for various language tasks.
- General Language Understanding: Proficient in understanding and generating human-like text.
- Text Generation: Capable of producing coherent and contextually relevant text.
- Question Answering: Suitable for answering questions from a provided context or from general knowledge.
Training Details
- Base Model: Meta-Llama-3.1-8B
- Fine-tuning Dataset: teknium/OpenHermes-2.5
- Optimizer: AdamW with a decaying learning rate starting at 2.49e-6 (0.00000249)
- Hardware: NVIDIA A100-SXM4-80GB GPU
- Evaluation Loss: 0.6727
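An evaluation loss is often easier to interpret as a perplexity. The sketch below converts the reported 0.6727, assuming it is a mean per-token cross-entropy measured in nats (the usual convention for causal language model training, though the card does not state it explicitly).

```python
import math

# Reported evaluation loss from the training details.
EVAL_LOSS = 0.6727

# Assumption: the loss is mean token-level cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(EVAL_LOSS)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the perplexity on the evaluation split is about 1.96. The figure is only comparable to other models evaluated on the same data with the same tokenizer.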
Good For
- Applications requiring robust instruction following.
- General-purpose text generation tasks.
- Question answering systems.
- Developers looking for a Llama-3.1-8B variant optimized with a high-quality instruction dataset.
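For readers who want to try the model locally, the following is a minimal sketch using the Hugging Face `transformers` library. It assumes the weights are published on the Hugging Face Hub under the same identifier as this page, that `torch` and `transformers` are installed, and that enough memory is available for an 8B model in BF16. The `generate` helper and its parameters are hypothetical names introduced for illustration, and the prompt is passed as plain text because the expected prompt template is not stated here.

```python
# Assumption: the model is hosted on the Hugging Face Hub under this identifier.
MODEL_ID = "artificialguybr/Meta-Llama-3.1-8B-openhermes-2.5"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return a completion for `prompt` (illustrative helper)."""
    # Heavy imports live inside the function so that defining it stays cheap.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the precision mentioned in the training details.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Explain what a causal language model is in two sentences.")` downloads the weights on first use and returns the generated text. If the model expects a specific chat template, applying it to the prompt before tokenizing will likely improve results.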