cjiao/OpenThoughts3-greedy-groups-top-openthinker3-1.5B-checkpoint-375

Text Generation · Model Size: 1.5B · Quantization: BF16 · Context Length: 32k · Published: Apr 12, 2026 · Architecture: Transformer

The cjiao/OpenThoughts3-greedy-groups-top-openthinker3-1.5B-checkpoint-375 is a 1.5-billion-parameter language model, fine-tuned from cjiao/OpenThinker3-1.5B-checkpoint-375. It was trained on the cjiao/OpenThoughts3-greedy-groups-top-openthinker3-1.5B-checkpoint-375 dataset with a context length of 32,768 tokens. This model is a specialized iteration; its specific differentiators and primary use cases have not yet been documented by the developer.


Model Overview

This model is a fine-tuned version of the cjiao/OpenThinker3-1.5B-checkpoint-375 base model, adapted on the cjiao/OpenThoughts3-greedy-groups-top-openthinker3-1.5B-checkpoint-375 dataset. At 1.5 billion parameters with a 32k-token context window, it retains the base model's Transformer architecture.
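For reference, below is a minimal inference sketch using the Hugging Face Transformers library. The model ID comes from this card; the prompt, dtype, and generation settings are illustrative assumptions rather than documented recommendations.

```python
# Minimal inference sketch; prompt and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cjiao/OpenThoughts3-greedy-groups-top-openthinker3-1.5B-checkpoint-375"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

prompt = "Explain the difference between breadth-first and depth-first search."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```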

Training Details

The model was trained for 1 epoch with a learning rate of 1.6e-4, the adamw_torch optimizer with default betas and epsilon, and a cosine learning-rate scheduler. The effective batch size was 256, obtained from a per-device batch size of 8 with 16 gradient-accumulation steps across 2 GPUs (8 × 16 × 2 = 256). Training used Transformers 4.46.1, PyTorch 2.5.1+cu121, Datasets 3.1.0, and Tokenizers 0.20.3.
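For illustration, the reported hyperparameters can be expressed as Hugging Face TrainingArguments. The numeric values below come directly from this card; the output directory, and the assumption that the Trainer API was used at all, are hypothetical, since the developer has not published the actual training script.

```python
# Sketch of the reported hyperparameters as TrainingArguments.
# output_dir is a hypothetical placeholder; the numeric values are from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./openthinker3-1.5b-checkpoint-375",  # assumed path
    num_train_epochs=1,
    learning_rate=1.6e-4,               # 0.00016
    per_device_train_batch_size=8,      # train_batch_size
    gradient_accumulation_steps=16,     # 8 * 16 * 2 GPUs = 256 effective
    lr_scheduler_type="cosine",
    optim="adamw_torch",                # default betas and epsilon
    bf16=True,                          # matches the BF16 weights
)
```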

Current Status

As of this release, the developer has not yet documented the model's intended uses, limitations, or evaluation results. Users should watch for future updates covering its capabilities and recommended applications.