EdgerunnersArchive/Llama-3-8B-Instruct-ortho-baukit-toxic-v2

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:May 6, 2024License:cc-by-nc-4.0Architecture:Transformer0.0K Open Weights Warm

EdgerunnersArchive/Llama-3-8B-Instruct-ortho-baukit-toxic-v2 is an 8 billion parameter Llama 3 instruction-tuned model developed by EdgerunnersArchive. It incorporates a Baukit implementation of a refusal mediation theory, specifically designed for alignment research. This model's primary differentiator is its focus on exploring refusal mechanisms in large language models, making it suitable for academic and theoretical investigations into AI alignment rather than general-purpose applications.

Loading preview...

Overview

EdgerunnersArchive/Llama-3-8B-Instruct-ortho-baukit-toxic-v2 is an 8 billion parameter instruction-tuned model based on the Llama 3 architecture. Developed by EdgerunnersArchive, this model integrates a Baukit implementation of a specific theory regarding refusal mediation in large language models. Its core purpose is to facilitate alignment research and the exploration of theories discussed on platforms like Alignment Forum.

Key Capabilities

  • Alignment Research: Specifically designed for investigating refusal mechanisms in LLMs.
  • Theoretical Exploration: Enables testing and exploration of alignment theories, particularly those related to how models generate refusals.
  • Experimental Platform: Provides a modified Llama 3 base for controlled experiments in AI safety and ethics.

Good for

  • Academic Research: Ideal for researchers and academics studying AI alignment, safety, and ethical AI.
  • Theory Testing: Useful for those looking to validate or explore hypotheses about LLM behavior, especially concerning content refusal.
  • Controlled Experiments: Suitable for environments where the goal is to understand and manipulate specific internal mechanisms of LLMs related to undesirable outputs or refusals. Early testing indicates that refusals are still present, suggesting ongoing research potential.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p