NicoHelemon/MNLP_M2_mcqa_model
Text generation · Model size: 0.8B · Quantization: BF16 · Context length: 32k · Published: May 24, 2025 · License: apache-2.0 · Architecture: Transformer (open weights)

NicoHelemon/MNLP_M2_mcqa_model is a 0.8-billion-parameter language model fine-tuned from unsloth/qwen3-0.6b-base-unsloth-bnb-4bit. It was trained with a learning rate of 3e-05 for 3 epochs using a cosine learning rate scheduler. Its specific capabilities and intended uses are not documented, as the fine-tuning dataset is unknown.


Model Overview

NicoHelemon/MNLP_M2_mcqa_model is a fine-tuned language model built on the unsloth/qwen3-0.6b-base-unsloth-bnb-4bit base model. With approximately 0.8 billion parameters, its training hyperparameters are documented below, but its exact capabilities and the dataset used for fine-tuning remain unspecified.
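
Assuming the checkpoint is published on the Hugging Face Hub under this identifier as a standard transformers-compatible repository (the card does not show the repo contents), it should load through the usual transformers API. The BF16 dtype below follows the quantization listed in the metadata.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NicoHelemon/MNLP_M2_mcqa_model"

# Load tokenizer and model weights from the Hub; BF16 matches the
# quantization reported in the model metadata above.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```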

Training Details

The model was trained with the following key hyperparameters (a sketch of how they might map onto Hugging Face TrainingArguments follows the list):

  • Learning rate: 3e-05
  • Batch size: 256 (train), 8 (eval)
  • Epochs: 3
  • Optimizer: AdamW (adamw_torch) with default betas and epsilon
  • LR scheduler: Cosine with a warmup ratio of 0.1
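
For reference, here is a minimal sketch of these settings expressed as Hugging Face TrainingArguments. The output directory is hypothetical, and whether the train batch size of 256 was per-device or reached via gradient accumulation is not stated in the card.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mnlp_m2_mcqa",        # hypothetical output path
    learning_rate=3e-05,              # reported learning rate
    num_train_epochs=3,               # reported epochs
    per_device_train_batch_size=256,  # reported train batch size (accumulation unknown)
    per_device_eval_batch_size=8,     # reported eval batch size
    optim="adamw_torch",              # AdamW with default betas and epsilon
    lr_scheduler_type="cosine",       # cosine schedule
    warmup_ratio=0.1,                 # reported warmup ratio
)
```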

Limitations and Use Cases

Because neither the fine-tuning dataset nor any evaluation results are documented, the intended uses and limitations of this model are currently unknown. Developers should exercise caution and test thoroughly before relying on it for any particular application.
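
The model name suggests multiple-choice question answering, but the card gives no prompt format. A quick smoke test such as the following, with a made-up question and an assumed letter-choice format, can help probe its behavior before committing to it.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NicoHelemon/MNLP_M2_mcqa_model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Hypothetical MCQA prompt; the actual training format is unknown.
prompt = (
    "Question: Which planet is closest to the Sun?\n"
    "A. Venus\nB. Mercury\nC. Mars\nD. Earth\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5, do_sample=False)

# Print only the newly generated tokens after the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```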