jcmei/SELM-Llama-3-8B-Instruct-iter-1

Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 8K · License: llama3 · Architecture: Transformer · Status: Warm

jcmei/SELM-Llama-3-8B-Instruct-iter-1 is an 8-billion-parameter instruction-tuned causal language model, fine-tuned by jcmei from Meta's Meta-Llama-3-8B-Instruct. It supports an 8192-token context window and was produced by a single iteration of fine-tuning on updated and original datasets. The model is designed for general instruction-following tasks, building on the strong base capabilities of the Llama 3 series.


SELM-Llama-3-8B-Instruct-iter-1: Overview

This model, developed by jcmei, is an instruction-tuned variant of the powerful Meta-Llama-3-8B-Instruct base model. It features 8 billion parameters and supports an 8192-token context window, making it suitable for a wide range of natural language processing tasks requiring understanding and generation.

Key Characteristics

  • Base Model: Built upon meta-llama/Meta-Llama-3-8B-Instruct, inheriting its robust architecture and pre-training.
  • Fine-tuning: Underwent a single iteration of fine-tuning (iter-1) using both updated and original datasets to enhance instruction-following capabilities.
  • Training Configuration: Trained with a learning rate of 5e-07, a total batch size of 256 (across 16 devices), and a cosine learning rate scheduler with a 0.1 warmup ratio over 1 epoch.
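The reported hyperparameters can be sketched as a small configuration. This is a hypothetical reconstruction, not the author's actual training script: the key names follow Hugging Face `TrainingArguments` conventions, and the per-device batch size of 16 is an assumption derived from the reported totals (256 total across 16 devices, with no gradient accumulation assumed).

```python
# Hypothetical sketch of the reported fine-tuning setup.
# Assumption: total batch 256 = 16 devices x 16 per-device samples,
# with no gradient accumulation.
NUM_DEVICES = 16
PER_DEVICE_BATCH = 16

training_config = {
    "learning_rate": 5e-07,
    "per_device_train_batch_size": PER_DEVICE_BATCH,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 1,
}

# Effective global batch size, matching the reported value of 256.
total_batch = NUM_DEVICES * PER_DEVICE_BATCH
```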

Intended Use Cases

Given its instruction-tuned nature and Llama 3 foundation, this model is generally well-suited for:

  • General-purpose conversational AI: Engaging in dialogue and answering questions.
  • Text generation: Creating coherent and contextually relevant text based on prompts.
  • Instruction following: Executing commands and fulfilling requests specified in natural language.

Further details on specific intended uses and limitations would require more information from the developer.

Popular Sampler Settings

Featherless users most commonly tune the following sampler parameters for this model: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
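Of the parameters above, min_p is the least standard, so a brief sketch may help: it discards candidate tokens whose probability falls below a fraction (min_p) of the most likely token's probability, then renormalizes. The function below is an illustrative pure-Python implementation, not the sampler code any particular backend uses.

```python
def min_p_filter(probs: list[float], min_p: float = 0.05) -> list[float]:
    """Illustrative min_p filtering over a probability distribution.

    Tokens with probability below min_p * max(probs) are zeroed out,
    and the survivors are renormalized to sum to 1.
    """
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]
```

With min_p = 0.1 and a top-token probability of 0.7, only tokens with probability >= 0.07 survive, which prunes the long tail while adapting to how peaked the distribution is.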