xiaolesu/OsmosisProofling-SFT-NT-GRPO-TK-V2
xiaolesu/OsmosisProofling-SFT-NT-GRPO-TK-V2 is an 8-billion-parameter causal language model developed by xiaolesu and fine-tuned from Qwen/Qwen3-8B on the OsmosisProofling-SFT dataset, where it reaches a validation perplexity of 1.4252. It was trained with the Axolotl framework using the Liger plugin's rope, rms_norm, and glu_activation features. The model is intended for tasks that resemble its fine-tuning data.
Overview
xiaolesu/OsmosisProofling-SFT-NT-GRPO-TK-V2 is an 8-billion-parameter language model, fine-tuned by xiaolesu from the base model Qwen/Qwen3-8B. It was trained with the Axolotl framework (version 0.16.0.dev0) with several Liger plugin features enabled: liger_rope, liger_rms_norm, and liger_glu_activation. These options swap in Liger's optimized kernels for the rotary position embedding, RMS normalization, and gated (GLU) activation operations during training; they do not change the Qwen3 architecture itself.
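Because the model is a causal language model derived from Qwen/Qwen3-8B, it can presumably be loaded like other Hugging Face checkpoints. The following is a minimal sketch, not an official usage recipe: it assumes the weights are published on the Hugging Face Hub under this ID, that the transformers, torch, and accelerate packages are installed, and that a plain-text prompt is acceptable. The prompt format the model expects is not documented here, and the helper names load_model and generate are hypothetical.

```python
MODEL_ID = "xiaolesu/OsmosisProofling-SFT-NT-GRPO-TK-V2"
MAX_SEQ_LEN = 4096  # sequence length reported in this card


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and model (assumes the checkpoint is on the Hub)."""
    # Heavy imports are kept local so that importing this file stays cheap.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # mirrors the bf16 training precision
        device_map="auto",           # needs the accelerate package
    )
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a continuation for a raw text prompt (prompt format is an assumption)."""
    tokenizer, model = load_model()
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_SEQ_LEN,  # keep inputs within the trained sequence length
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling generate("...") downloads the checkpoint on first use, so it needs enough disk space and accelerator memory for an 8-billion-parameter model.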
Key Capabilities
- Fine-tuned Performance: Achieves a validation loss of 0.3543 and a perplexity (PPL) of 1.4252 on its evaluation set, indicating a close fit to the kind of data it was fine-tuned on.
- Efficient Training: Uses the adamw_torch_fused optimizer, a cosine learning rate scheduler, and bf16 precision, along with FSDP for distributed training across multiple GPUs.
- Context Length: Supports a sequence length of 4096 tokens, suitable for processing moderately long inputs.
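The two evaluation figures above are consistent with each other: perplexity is conventionally the exponential of the mean per-token cross-entropy loss. The short check below assumes the reported loss is a mean negative log-likelihood in nats, which is the usual convention but is not stated in this card; the function name is illustrative.

```python
import math


def perplexity_from_loss(mean_nll: float) -> float:
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(mean_nll)


reported_loss = 0.3543  # validation loss reported in this card
ppl = perplexity_from_loss(reported_loss)
print(f"{ppl:.4f}")  # prints 1.4252, matching the reported perplexity
```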
Good For
- Specialized Tasks: Ideal for applications requiring high accuracy on tasks similar to those found in the xiaolesu/OsmosisProofling-SFT dataset.
- Research and Development: Provides a solid base for further experimentation and fine-tuning, especially for those interested in the Qwen3 architecture combined with Liger optimizations.
- Resource-Efficient Deployment: With 8 billion parameters, it offers a balance between performance and computational requirements, making it suitable for environments where larger models might be prohibitive.
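To put the deployment point in rough numbers, the sketch below estimates the memory needed for the weights alone. It assumes a nominal count of exactly 8 billion parameters and ignores activations, the KV cache, and framework overhead, so the results are a lower bound rather than a measured footprint.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


NOMINAL_PARAMS = 8e9  # assumption: the nominal "8 billion" figure from this card

bf16_gib = weight_memory_gib(NOMINAL_PARAMS, 2)  # bf16 stores 2 bytes per parameter
fp32_gib = weight_memory_gib(NOMINAL_PARAMS, 4)  # fp32 stores 4 bytes per parameter
print(f"bf16: {bf16_gib:.1f} GiB, fp32: {fp32_gib:.1f} GiB")
```

Under these assumptions the bf16 weights need about 15 GiB, so actual requirements depend on batch size, sequence length, and the serving stack.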