xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT-V2

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Apr 10, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights · Concurrency cost: 1

xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT-V2 is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. Developed by xiaolesu, it was trained with the Axolotl framework on the xiaolesu/OsmosisProofling-SFT dataset. It reaches a validation perplexity of 1.4252 and is intended for tasks that match its fine-tuning data.


Model Overview

This model was developed by xiaolesu using the Axolotl training framework, applying supervised fine-tuning to the Qwen/Qwen3-8B base model on the xiaolesu/OsmosisProofling-SFT dataset.
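
If the released weights follow the standard Hugging Face layout, the model should load with the usual transformers API. The sketch below is an assumption rather than a confirmed loading path from this card: the repo id is taken from the title, and the bf16 dtype is an illustrative choice for local inference (the FP8 quantization listed above refers to the hosted serving setup).

```python
# Minimal loading sketch; assumes a transformers-compatible checkpoint
# published under the repo id from this card (not confirmed by the card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT-V2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # illustrative; the hosted endpoint serves FP8
    device_map="auto",           # requires the accelerate package
)
```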

Key Training Details

  • Base Model: Qwen/Qwen3-8B
  • Fine-tuning Dataset: xiaolesu/OsmosisProofling-SFT (Alpaca format)
  • Training Framework: Axolotl (version 0.16.0.dev0)
  • Context Length: Trained with a sequence length of 4096 tokens; the published model exposes a 32k context window.
  • Optimization: Uses the adamw_torch_fused optimizer with a learning rate of 1e-5 and a cosine learning-rate scheduler.
  • Validation Performance: Final validation loss of 0.3543, corresponding to a perplexity (PPL) of 1.4252 on the evaluation set (see the check below).
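
Perplexity is simply the exponential of the mean cross-entropy loss, so the two reported numbers can be cross-checked directly:

```python
import math

val_loss = 0.3543          # final validation loss reported above
ppl = math.exp(val_loss)   # perplexity = exp(mean cross-entropy loss)
print(f"{ppl:.4f}")        # 1.4252, matching the reported PPL
```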

Intended Use Cases

This model is specifically adapted to the OsmosisProofling-SFT dataset through its fine-tuning process. The card does not enumerate intended uses or limitations, so its performance metrics only suggest it is well-suited to tasks resembling its training data. Users should evaluate its suitability for their own applications based on the nature of the OsmosisProofling-SFT dataset; a prompting sketch in that dataset's Alpaca format follows below.
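
Because the fine-tuning data is in Alpaca format, prompting with the standard Alpaca instruction template is a reasonable starting point. This is a hedged sketch: the card does not confirm the exact prompt template, and the instruction text is a placeholder. It reuses `tokenizer` and `model` from the loading sketch above.

```python
# Assumes the model responds to the standard Alpaca instruction template,
# since the SFT dataset is listed as Alpaca format (not confirmed by the card).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(instruction="Summarize the key points of the input.")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```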