SicariusSicariiStuff/Impish_LLAMA_3B
Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:3.2BQuant:BF16Ctx Length:32kPublished:Oct 1, 2024License:llama3.2Architecture:Transformer0.0K Warm

Impish_LLAMA_3B by SicariusSicariiStuff is a 3.2 billion parameter Llama-3-Instruct based model with a 32768 token context length, specifically fine-tuned for role-play and general tasks. It features a unique multi-phase training approach designed to reduce "GPTisms" and enhance coherence. This model is noted for its medium-low censorship level and its ability to sometimes perform beyond its size class.

Loading preview...

Impish_LLAMA_3B: A Role-Play Focused Llama-3 Model

Impish_LLAMA_3B, developed by SicariusSicariiStuff, is a 3.2 billion parameter model built on the Llama-3-Instruct architecture, featuring a substantial 32768 token context length. This model was specifically designed with a primary focus on role-play and general conversational tasks.

Key Capabilities & Training

What sets Impish_LLAMA_3B apart is its distinctive three-phase training methodology:

  • Phase 1 (FFT): An initial, extensive phase aimed at introducing new knowledge and deliberately "confusing" the model to reduce common "GPTisms" in its output.
  • Phase 2 (Deep QLORA R=512): A subsequent phase using a deep QLORA with R=512 on a new dataset to "unconfuse" the model and restore coherence, avoiding overfitting.
  • Phase 3 (QLORA R=128): A final QLORA phase with R=128, again on a different dataset, to refine and connect learned concepts into a coherent whole.

This unique training process has resulted in a model that, despite its relatively small size, can sometimes produce surprisingly sophisticated outputs, occasionally being mistaken for much larger models. It offers a medium-low censorship level (rated 5.5/10, where 10 is completely uncensored).

Recommended Usage

Impish_LLAMA_3B is intended for role-play and general tasks. It is highly recommended to use the SICAtxt format for role-play and adventure scenarios, which is a modified CAI-style format designed for efficient character card creation with minimal tokens. The model utilizes the standard Llama-3-Instruct instruction template. Various quantizations are available, including GGUF, EXL2, GPTQ, FP8, and ARM-optimized versions.