Impish_LLAMA_3B by SicariusSicariiStuff is a 3.2 billion parameter, Llama-3-Instruct-based model with a 32,768-token context length, fine-tuned specifically for role-play and general tasks. It features a distinctive multi-phase training approach designed to reduce "GPTisms" and enhance coherence. The model is noted for its medium-low censorship level and for sometimes performing beyond its size class.
Impish_LLAMA_3B: A Role-Play Focused Llama-3 Model
Impish_LLAMA_3B, developed by SicariusSicariiStuff, is a 3.2 billion parameter model built on the Llama-3-Instruct architecture with a 32,768-token context window. It was designed with a primary focus on role-play and general conversational tasks.
Key Capabilities & Training
What sets Impish_LLAMA_3B apart is its distinctive three-phase training methodology:
- Phase 1 (FFT, full fine-tuning): An initial, extensive phase aimed at introducing new knowledge and deliberately "confusing" the model in order to reduce common "GPTisms" in its output.
- Phase 2 (Deep QLORA, R=512): A subsequent phase using a deep QLORA at rank 512 on a new dataset to "unconfuse" the model and restore coherence, while avoiding overfitting.
- Phase 3 (QLORA, R=128): A final QLORA phase at rank 128, again on a different dataset, to refine and connect the learned concepts into a coherent whole.
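The rank (R) of a QLORA adapter caps how much trainable capacity each pass adds. A minimal sketch of the arithmetic behind the two ranks above, assuming a single square 3072x3072 projection (the hidden size of Llama-3.2-3B) -- this is illustrative, not the author's training code:

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """LoRA factorizes the weight update as B @ A, where A is
    (rank x d_in) and B is (d_out x rank), so each adapted matrix
    gains rank * (d_in + d_out) trainable parameters."""
    return rank * (d_in + d_out)

HIDDEN = 3072  # assumed hidden size for a 3B-class Llama model

# Phase 2: deep QLORA at R=512 -> 512 * (3072 + 3072) parameters
phase2 = lora_param_count(HIDDEN, HIDDEN, rank=512)

# Phase 3: lighter refinement at R=128 -> one quarter of phase 2's capacity
phase3 = lora_param_count(HIDDEN, HIDDEN, rank=128)

print(phase2, phase3, phase2 // phase3)
```

The 4x drop in rank between phases 2 and 3 matches the stated intent: a broad corrective pass first, then a smaller pass that refines rather than relearns.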
This unique training process has resulted in a model that, despite its relatively small size, can sometimes produce surprisingly sophisticated outputs, occasionally being mistaken for much larger models. It offers a medium-low censorship level (rated 5.5/10, where 10 is completely uncensored).
Recommended Usage
Impish_LLAMA_3B is intended for role-play and general tasks. For role-play and adventure scenarios, the SICAtxt format is highly recommended: a modified CAI-style format designed to build efficient character cards with minimal tokens. The model uses the standard Llama-3-Instruct instruction template. Various quantizations are available, including GGUF, EXL2, GPTQ, FP8, and ARM-optimized versions.
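When prompting without a tokenizer's built-in chat template, the Llama-3-Instruct format mentioned above can be assembled by hand. A minimal sketch using the standard Llama-3 special tokens (the system and user strings here are made-up placeholders, not part of the model card):

```python
def format_llama3_prompt(system: str, user: str) -> str:
    """Build a single-turn prompt in the standard Llama-3-Instruct
    format: each message is wrapped in header tokens naming its role
    and terminated with <|eot_id|>; the prompt ends with an open
    assistant header so the model continues from there."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Hypothetical usage: a role-play system prompt plus a first user turn.
prompt = format_llama3_prompt(
    system="You are a mischievous tavern keeper in a fantasy village.",
    user="I push open the tavern door and look around.",
)
print(prompt)
```

In practice, passing a message list to `tokenizer.apply_chat_template` in Hugging Face Transformers produces the same structure from the model's bundled template.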