TarhanE/sft-count_loss-Qwen3-0.6B-mle0.5-ul0.5-tox0-e4
Text Generation · Concurrency Cost: 1 · Model Size: 0.8B · Quant: BF16 · Ctx Length: 32k · Published: Jun 9, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

TarhanE/sft-count_loss-Qwen3-0.6B-mle0.5-ul0.5-tox0-e4 is a 0.8-billion-parameter language model fine-tuned from Qwen/Qwen3-0.6B. Its training dataset is unspecified; the model reached a validation loss of 1.9505. Beyond the hyperparameters encoded in its name, its primary differentiator and intended use cases are not documented.
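The "mle0.5-ul0.5" suffix in the repository name suggests the fine-tuning objective mixed a standard maximum-likelihood (cross-entropy) term with an unlikelihood term at equal 0.5 weights; this is an inference from the name, not documented on the page. A minimal per-token sketch of such a combined loss (function and argument names are hypothetical):

```python
import math

def combined_loss(p_target, p_negative, mle_weight=0.5, ul_weight=0.5):
    """Weighted sum of an MLE term and an unlikelihood term.

    p_target:   model probability assigned to the ground-truth token
    p_negative: model probability assigned to a penalized (negative) token
    """
    mle = -math.log(p_target)         # standard negative log-likelihood
    ul = -math.log(1.0 - p_negative)  # penalize probability mass on the negative token
    return mle_weight * mle + ul_weight * ul

# A model that is confident in the target and avoids the negative token
# incurs a lower loss than one that spreads mass onto the negative token.
low = combined_loss(p_target=0.9, p_negative=0.05)
high = combined_loss(p_target=0.4, p_negative=0.5)
```

Under this reading, the "tox0" and "e4" fragments would plausibly denote a toxicity-term weight of 0 and 4 training epochs, though neither is confirmed by the page.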
