simlamkr1/Llama2-simtestmodel14
Text generation · Concurrency cost: 1 · Model size: 7B · Serving quantization: FP8 · Context length: 4k · Architecture: Transformer
Llama2-simtestmodel14 is a 7 billion parameter language model developed by simlamkr1, based on the Llama 2 architecture. It was fine-tuned with 4-bit quantization (nf4) via bitsandbytes, using PEFT for parameter-efficient adaptation. This training setup keeps memory and compute requirements low, making the model suitable for environments with constrained computational resources. Its primary differentiation is this quantization-aware fine-tuning approach, which enables deployment in scenarios where memory and processing power are critical factors.