xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT-V2
xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT-V2 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. Developed by xiaolesu, it was trained with the Axolotl framework on the xiaolesu/OsmosisProofling-SFT dataset. It achieves a validation perplexity of 1.4252 and is optimized for tasks related to its fine-tuning data.
Model Overview
The xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT-V2 is an 8 billion parameter language model, fine-tuned from the Qwen/Qwen3-8B base model. This iteration was developed by xiaolesu using the Axolotl training framework, specifically leveraging the xiaolesu/OsmosisProofling-SFT dataset for supervised fine-tuning.
Key Training Details
- Base Model: Qwen/Qwen3-8B
- Fine-tuning Dataset: xiaolesu/OsmosisProofling-SFT (Alpaca format)
- Training Framework: Axolotl (version 0.16.0.dev0)
- Context Length: 4096 tokens (training sequence length)
- Optimization: Uses the adamw_torch_fused optimizer with a learning rate of 1e-5 and a cosine learning rate scheduler.
- Validation Performance: Achieved a final validation loss of 0.3543 and a perplexity (PPL) of 1.4252 on the evaluation set.
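The reported loss and perplexity figures are consistent with the usual convention that perplexity is the exponential of the mean cross-entropy loss. Assuming that convention was used here, the relationship can be checked in a couple of lines:

```python
import math

# Final validation loss reported in the training details above
val_loss = 0.3543

# Perplexity is conventionally exp(mean cross-entropy loss);
# under that assumption the reported PPL follows directly.
ppl = math.exp(val_loss)
print(round(ppl, 4))  # 1.4252, matching the reported perplexity
```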
Intended Use Cases
This model is adapted to the OsmosisProofling-SFT dataset through supervised fine-tuning. Specific intended uses and limitations are not documented, so users should evaluate the model's suitability for their applications based on the characteristics and content of that dataset; its validation metrics suggest strong performance on tasks that resemble the training data.