RedHatAI/Qwen2-7B-Instruct-FP8
Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Jun 14, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights

RedHatAI/Qwen2-7B-Instruct-FP8 is a 7.6-billion-parameter, instruction-tuned causal language model based on Qwen2 and developed by Neural Magic. It is an FP8-quantized version of Qwen2-7B-Instruct, which reduces its disk size and GPU memory requirements. It is intended for commercial and research use in English, specifically for assistant-like chat applications, and it maintains nearly identical performance to its unquantized counterpart.
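To give a rough sense of why FP8 quantization lowers disk and GPU memory requirements, the sketch below estimates the storage needed for the weights alone. It assumes the unquantized model stores each weight in 16 bits and that every weight is stored in 8 bits after quantization. Both are simplifying assumptions: some layers may remain in higher precision, and activations and the KV cache are not counted. The helper `weight_memory_gb` is a hypothetical name used only for illustration.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate storage for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 7.6e9  # parameter count stated on the model card

# Assumption: the unquantized baseline uses 2 bytes (16 bits) per weight.
bf16_gb = weight_memory_gb(NUM_PARAMS, 2)

# Assumption: every weight is stored in 1 byte (8 bits) after FP8 quantization.
fp8_gb = weight_memory_gb(NUM_PARAMS, 1)

print(f"16-bit weights: ~{bf16_gb:.1f} GB")
print(f"FP8 weights:    ~{fp8_gb:.1f} GB")
```

Under these assumptions the weights shrink by about half. Actual savings depend on which layers are quantized and on runtime overhead.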
