jcmei/SELM-Llama-3-8B-Instruct-iter-1
jcmei/SELM-Llama-3-8B-Instruct-iter-1 is an 8-billion-parameter instruction-tuned causal language model, fine-tuned by jcmei from Meta's Meta-Llama-3-8B-Instruct. It inherits the base model's 8192-token context length and was produced by a single iteration of fine-tuning on both updated and original datasets. It is designed for general instruction-following tasks, building on the strong base capabilities of the Llama 3 series.
SELM-Llama-3-8B-Instruct-iter-1: Overview
This model, developed by jcmei, is an instruction-tuned variant of the powerful Meta-Llama-3-8B-Instruct base model. It features 8 billion parameters and supports an 8192-token context window, making it suitable for a wide range of natural language processing tasks requiring understanding and generation.
Key Characteristics
- Base Model: Built upon meta-llama/Meta-Llama-3-8B-Instruct, inheriting its robust architecture and pre-training.
- Fine-tuning: Underwent a single iteration of fine-tuning (iter-1) using both updated and original datasets to enhance instruction-following capabilities.
- Training Configuration: Trained with a learning rate of 5e-07, a total batch size of 256 (across 16 devices), and a cosine learning rate scheduler with a 0.1 warmup ratio over 1 epoch.
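The training configuration above can be sketched as a plain Python dictionary. The field names mirror common Hugging Face TrainingArguments parameters, but the actual trainer, dataset pipeline, and gradient-accumulation setup used by jcmei are not published here, so the per-device batch size below is an assumption derived from the reported totals.

```python
# Hypothetical sketch of the reported fine-tuning configuration.
# Only the learning rate, scheduler, warmup ratio, epochs, devices,
# and total batch size come from the model card; everything else is assumed.

NUM_DEVICES = 16        # reported device count
TOTAL_BATCH_SIZE = 256  # reported effective batch size

training_config = {
    "learning_rate": 5e-07,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 1,
    # effective batch = per-device batch * devices (no gradient accumulation assumed)
    "per_device_train_batch_size": TOTAL_BATCH_SIZE // NUM_DEVICES,
}

print(training_config["per_device_train_batch_size"])  # 16 under this assumption
```

Under these assumptions, each of the 16 devices processes 16 examples per optimizer step to reach the reported total batch size of 256.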
Intended Use Cases
Given its instruction-tuned nature and Llama 3 foundation, this model is generally well-suited for:
- General-purpose conversational AI: Engaging in dialogue and answering questions.
- Text generation: Creating coherent and contextually relevant text based on prompts.
- Instruction following: Executing commands and fulfilling requests specified in natural language.
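Because the model inherits its chat format from Meta-Llama-3-8B-Instruct, prompts for the use cases above follow the standard Llama 3 Instruct template. A minimal sketch of assembling a single-turn prompt by hand, assuming the standard Llama 3 special tokens (in practice, a tokenizer's chat template would handle this):

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the standard Llama 3 Instruct
    chat format (inherited from the Meta-Llama-3-8B-Instruct base)."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful assistant.",
    "Summarize the Llama 3 series in one sentence.",
)
print(prompt)
```

Generation is typically stopped when the model emits the <|eot_id|> token.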
Further details on specific intended uses and limitations would require more information from the developer.