centroIA/llama2-agile-ia
Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Architecture: Transformer | Cold

The centroIA/llama2-agile-ia model is a Llama 2-based language model developed by centroIA. It was fine-tuned with PEFT on top of 4-bit quantized weights, using the NF4 quantization type and a float16 compute dtype. This training configuration keeps memory use low, which makes the model suitable for applications that run under tight resource budgets.
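To give a rough sense of why the 4-bit setting matters, the sketch below estimates weight storage at float16 versus NF4. It is illustrative only: the parameter count is taken from the approximate "7B" label, the estimate ignores quantization constants, adapter weights, and activations, and the `QUANT_KWARGS` dictionary is an assumed mapping of the settings named above onto the keyword names used by `BitsAndBytesConfig` in the transformers library, not the author's published training script.

```python
# Assumed parameter count, taken from the "Model Size: 7B" label (approximate).
PARAMS = 7_000_000_000

# Settings stated in the description, written with the keyword names used by
# transformers' BitsAndBytesConfig (assumed mapping; not executed here).
QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_compute_dtype": "float16",
}


def weight_gib(params: int, bits_per_param: float) -> float:
    """Approximate weight storage in GiB, ignoring quantization constants."""
    return params * bits_per_param / 8 / 2**30


fp16_gib = weight_gib(PARAMS, 16)
nf4_gib = weight_gib(PARAMS, 4)

print(f"float16 weights: {fp16_gib:.1f} GiB")
print(f"NF4 weights:     {nf4_gib:.1f} GiB")
```

Under these assumptions the 4-bit weights take a quarter of the float16 footprint; actual usage will be somewhat higher once the per-block scaling constants and the PEFT adapter parameters are included.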
