Model Overview
xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT-No-Overlap is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B base model. It was trained with the Axolotl framework using several Liger kernel optimizations (liger_rope, liger_rms_norm, liger_glu_activation, liger_layer_norm, and liger_fused_linear_cross_entropy) to reduce memory usage and improve training throughput.
Training Details
This model was fine-tuned on the xiaolesu/OsmosisProofling-v3-SFT dataset, which uses an Alpaca-style prompt format. Key training hyperparameters included a learning rate of 1e-05, a total batch size of 14 across 7 GPUs, and 2 epochs. Training reached a final validation loss of 0.3543, corresponding to a perplexity (PPL) of 1.4252.
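The reported perplexity is simply the exponential of the validation cross-entropy loss, which can be checked directly:

```python
import math

# Perplexity is exp(cross-entropy loss); the card's numbers are consistent.
val_loss = 0.3543
ppl = math.exp(val_loss)
print(f"{ppl:.4f}")  # 1.4252
```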
Key Characteristics
- Base Model: Qwen/Qwen3-8B
- Parameter Count: 8 billion
- Context Length: 32768 tokens
- Fine-tuning Framework: Axolotl with Liger optimizations
- Training Dataset: xiaolesu/OsmosisProofling-v3-SFT
- Performance: validation loss of 0.3543 (PPL 1.4252) on the evaluation set
Potential Use Cases
Given its fine-tuning on the OsmosisProofling-v3-SFT dataset, this model is best suited to tasks in that dataset's domain; consult the dataset card for specifics. Developers looking for a Qwen3-8B variant with this fine-tuning recipe and Liger-optimized training may find it useful.
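A minimal inference sketch follows. The exact Alpaca prompt template and the use of the standard transformers API are assumptions based on the card's description, not verified against the dataset; the model download is gated behind a RUN_DEMO environment flag since the weights are large.

```python
import os

# Hypothetical helper: rebuilds the standard Alpaca prompt template,
# which this card says the SFT dataset follows (an assumption, not verified).
def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

# Querying the model downloads the full weights, so it is gated behind a flag.
if os.environ.get("RUN_DEMO"):
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT-No-Overlap"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = build_alpaca_prompt("Prove that the sum of two even integers is even.")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=512)
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:],
                           skip_special_tokens=True))
```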