mlfoundations-dev/deepspeed_no_offload_liger_packing
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kLicense:apache-2.0Architecture:Transformer Open Weights Cold

The mlfoundations-dev/deepspeed_no_offload_liger_packing model is a 7.6 billion parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen2.5-7B-Instruct. It was trained on the mlfoundations-dev/wikipedia_seed_science dataset. This model is designed for general language generation tasks, leveraging its Qwen2.5 base architecture and specialized fine-tuning.

Loading preview...