xiaolesu/OsmosisProofling-SFT-NT-GRPO-TK-V2

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Apr 10, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The xiaolesu/OsmosisProofling-SFT-NT-GRPO-TK-V2 is an 8 billion parameter causal language model developed by xiaolesu, fine-tuned from Qwen/Qwen3-8B. This model is specifically fine-tuned on the OsmosisProofling-SFT dataset, demonstrating a validation perplexity of 1.4252. It is built using the Axolotl framework and incorporates Liger plugin features like rope, rms_norm, and glu_activation. This model is optimized for tasks aligned with its fine-tuning dataset, showing strong performance in terms of loss and perplexity.

Loading preview...

Overview

The xiaolesu/OsmosisProofling-SFT-NT-GRPO-TK-V2 is an 8 billion parameter language model, fine-tuned by xiaolesu from the base model Qwen/Qwen3-8B. It was developed using the Axolotl framework (version 0.16.0.dev0) and leverages several Liger plugin features, including liger_rope, liger_rms_norm, and liger_glu_activation, which enhance its architectural components.

Key Capabilities

  • Fine-tuned Performance: Achieves a validation loss of 0.3543 and a perplexity (PPL) of 1.4252 on its evaluation set, indicating strong performance on the specific tasks it was trained for.
  • Efficient Training: Utilizes adamw_torch_fused optimizer, cosine learning rate scheduler, and bf16 precision, along with FSDP for distributed training across multiple GPUs.
  • Context Length: Supports a sequence length of 4096 tokens, suitable for processing moderately long inputs.

Good For

  • Specialized Tasks: Ideal for applications requiring high accuracy on tasks similar to those found in the xiaolesu/OsmosisProofling-SFT dataset.
  • Research and Development: Provides a solid base for further experimentation and fine-tuning, especially for those interested in the Qwen3 architecture combined with Liger optimizations.
  • Resource-Efficient Deployment: With 8 billion parameters, it offers a balance between performance and computational requirements, making it suitable for environments where larger models might be prohibitive.