WesleySantos/mh_qa
Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Architecture: Transformer

WesleySantos/mh_qa is a language model fine-tuned with PEFT 0.6.0.dev0 using bitsandbytes 4-bit quantization (fp4 quantization type, float32 compute dtype). Because the fine-tuning was performed under this quantization scheme, the model is suited to environments where memory-efficient deployment is critical.
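A minimal loading sketch matching the quantization settings described above (4-bit fp4, float32 compute dtype). The base model name is an assumption not stated in this card; the adapter repo id is taken from the card itself:

```python
# Sketch: load a base model with the card's bitsandbytes settings,
# then apply the WesleySantos/mh_qa PEFT adapter on top.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 4-bit fp4 quantization with float32 compute dtype, as stated in the card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=torch.float32,
)

base = AutoModelForCausalLM.from_pretrained(
    "base-model-name",  # assumption: substitute the actual 7B base model
    quantization_config=bnb_config,
)
model = PeftModel.from_pretrained(base, "WesleySantos/mh_qa")
```

This mirrors the usual PEFT workflow: the quantized base is loaded first, and the fine-tuned adapter weights are layered on with `PeftModel.from_pretrained`.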
