ZhangShenao/SELM-Llama-3-8B-Instruct-iter-3

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kLicense:mitArchitecture:Transformer0.0K Open Weights Cold

SELM-Llama-3-8B-Instruct-iter-3 is an 8 billion parameter Llama3-instruct-based Self-Exploring Language Model (SELM) developed by ZhangShenao. This model is the third iteration, fine-tuned using synthetic data derived from the HuggingFaceH4/ultrafeedback_binarized dataset. It demonstrates improved performance on benchmarks like MT-Bench compared to its predecessors and the base Llama-3-8B-Instruct model, making it suitable for instruction-following tasks.

Loading preview...

SELM-Llama-3-8B-Instruct-iter-3 Overview

This model, developed by ZhangShenao, is the third iteration of a Self-Exploring Language Model (SELM) based on the 8 billion parameter Llama-3-Instruct architecture. It is fine-tuned using synthetic data generated from the HuggingFaceH4/ultrafeedback_binarized dataset, building upon its predecessor, SELM-Llama-3-8B-Instruct-iter-2.

Key Capabilities & Performance

SELM-Llama-3-8B-Instruct-iter-3 shows enhanced performance in instruction-following and general conversational abilities. Notable results include:

  • MT-Bench (Average): Achieves 8.29, surpassing SELM-Llama-3-8B-Instruct-iter-2 (8.09) and the base Meta-Llama-3-8B-Instruct (7.93).
  • AlpacaEval 2.0 (LC WR): Scores 33.47.
  • Ranks highly on the WildBench leaderboard.

This model is part of a research effort on "Self-Exploring Language Models: Active Preference Elicitation for Online Alignment," indicating its development focuses on advanced alignment techniques. It is licensed under MIT.

When to Use This Model

  • Instruction Following: Ideal for applications requiring robust responses to user instructions.
  • General Conversational AI: Suitable for chatbots and interactive agents where strong dialogue capabilities are needed.
  • Research in Alignment: Useful for exploring models developed with active preference elicitation and online alignment strategies.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p