Overview
cjiao/OpenThinker3-1.5B-test is a 1.5 billion parameter language model fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. It supports a context length of 32768 tokens, making it suitable for processing long inputs and generating coherent, extended responses. Fine-tuning used the open-thoughts/OpenThoughts-114k dataset, which suggests a focus on step-by-step reasoning and general thought-processing tasks.
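Because the model is fine-tuned from Qwen/Qwen2.5-1.5B-Instruct, it presumably inherits Qwen's ChatML-style chat template. As an illustrative sketch of that format (in practice, `tokenizer.apply_chat_template` is authoritative, and the exact special tokens are an assumption here), a prompt could be assembled like this:

```python
def build_chatml_prompt(messages: list[dict]) -> str:
    """Assemble a ChatML-style prompt of the kind Qwen2.5 models use.

    Illustrative only: prefer
    tokenizer.apply_chat_template(messages, tokenize=False,
    add_generation_prompt=True), which applies the model's own template.
    """
    parts = []
    for msg in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the benefits of long context windows."},
])
print(prompt)
```

The 32768-token context leaves ample room for long documents or multi-turn histories formatted this way.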
Training Details
The fine-tuning process for OpenThinker3-1.5B-test involved specific hyperparameters:
- Learning Rate: 0.00016
- Batch Size: 8 (train), 8 (eval)
- Gradient Accumulation Steps: 16
- Optimizer: AdamW with default betas and epsilon
- Scheduler: Cosine learning rate scheduler with 0.1 warmup ratio
- Training Steps: 10
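The hyperparameters above imply an effective batch size of 8 × 16 = 128 sequences per optimizer step. The schedule can be sketched as follows; this mirrors the common linear-warmup-plus-cosine-decay shape (as in transformers' `get_cosine_schedule_with_warmup`), though exact library behavior at the boundaries may differ slightly:

```python
import math

# Hyperparameters from the training configuration above.
LR_MAX = 0.00016        # peak learning rate
PER_DEVICE_BATCH = 8    # train batch size
GRAD_ACCUM_STEPS = 16   # gradient accumulation steps
TOTAL_STEPS = 10        # optimizer steps in this run
WARMUP_RATIO = 0.1      # fraction of steps spent warming up

# Effective (global) batch size per optimizer step: 8 * 16 = 128.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS

def cosine_lr(step: int) -> float:
    """Linear warmup to LR_MAX, then cosine decay to zero."""
    warmup_steps = int(TOTAL_STEPS * WARMUP_RATIO)  # 1 step here
    if warmup_steps > 0 and step < warmup_steps:
        return LR_MAX * step / warmup_steps
    progress = (step - warmup_steps) / max(1, TOTAL_STEPS - warmup_steps)
    return LR_MAX * 0.5 * (1.0 + math.cos(math.pi * progress))

print(effective_batch)   # 128
```

With only 10 steps, the warmup phase lasts a single step and the learning rate decays from its peak to zero over the remaining nine.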
The very small number of training steps (10), together with the "-test" suffix in the model name, suggests this run served to validate the training configuration rather than to complete a full fine-tune. The model was trained using Transformers 4.46.1 and PyTorch 2.5.1+cu121.
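To reproduce the reported environment, the library versions can be pinned as follows. The `+cu121` suffix implies a CUDA 12.1 PyTorch build; installing from the official cu121 wheel index is an assumption about how that build was obtained:

```shell
# Pin the versions reported in the model card.
pip install transformers==4.46.1
# +cu121 build of PyTorch, from the official CUDA 12.1 wheel index.
pip install torch==2.5.1 --index-url https://download.pytorch.org/whl/cu121
```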