cjiao/OpenThinker3-1.5B-test
Text generation · Concurrency cost: 1 · Model size: 1.5B · Quantization: BF16 · Context length: 32k · Published: Apr 10, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

cjiao/OpenThinker3-1.5B-test is a 1.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. It was trained on the open-thoughts/OpenThoughts-114k dataset with a 32,768-token context length, and builds on the Qwen2.5 architecture for general language understanding and generation tasks.


Overview

cjiao/OpenThinker3-1.5B-test is a 1.5-billion-parameter language model fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. It supports a 32,768-token context window, making it suitable for processing longer inputs and generating coherent, extended responses. The fine-tuning dataset, open-thoughts/OpenThoughts-114k, suggests a focus on general conversational and reasoning-style tasks.
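As a Qwen2.5-Instruct derivative, the model is typically prompted in the ChatML conversation format. The sketch below illustrates that format only; in practice you would load the model with `transformers` and call `tokenizer.apply_chat_template`, which applies the authoritative template shipped with the tokenizer.

```python
# Illustrative only: hand-roll the ChatML-style prompt format used by
# Qwen2.5-Instruct models. Prefer tokenizer.apply_chat_template in real code.

def build_chatml_prompt(messages):
    """Format a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Open an assistant turn to cue the model to generate a response.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "user", "content": "Summarize the benefits of long context windows."},
])
print(prompt)
```

The long context window means such prompts can carry tens of thousands of tokens of conversation history or documents before truncation becomes necessary.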

Training Details

The fine-tuning process for OpenThinker3-1.5B-test involved specific hyperparameters:

  • Learning Rate: 0.00016
  • Batch Size: 8 (train), 8 (eval)
  • Gradient Accumulation Steps: 16
  • Optimizer: AdamW with default betas and epsilon
  • Scheduler: Cosine learning rate scheduler with 0.1 warmup ratio
  • Training Steps: 10

The very small step count (10) is consistent with the model's "-test" suffix and suggests a smoke-test run rather than a full fine-tune. Training used Transformers 4.46.1 and PyTorch 2.5.1+cu121.
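Two quantities implied by the hyperparameters above can be checked directly: the effective global batch size (per-device batch size times gradient accumulation steps, assuming a single device), and the learning-rate curve. The cosine-with-warmup formulation below is a common one and may differ slightly at the boundaries from the exact Transformers implementation.

```python
import math

# Effective global batch size implied by the card's hyperparameters
# (assumes a single training device).
per_device_batch_size = 8
grad_accum_steps = 16
effective_batch_size = per_device_batch_size * grad_accum_steps
print(effective_batch_size)  # 128

def lr_at_step(step, total_steps=10, base_lr=1.6e-4, warmup_ratio=0.1):
    """Cosine decay with linear warmup, matching the reported settings:
    base LR 0.00016, warmup ratio 0.1, 10 total steps (so 1 warmup step)."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps  # linear warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With only 10 steps, the schedule peaks at 0.00016 after the single warmup step and decays close to zero by the final step.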