raca-workspace-v1/grpo-tool-sat-sft-qwen3-1p7b-sft-20260419-075623-96e9
raca-workspace-v1/grpo-tool-sat-sft-qwen3-1p7b-sft-20260419-075623-96e9 is a language model with approximately 2 billion parameters, fine-tuned from Qwen/Qwen3-1.7B-Base. It was trained on the grpo_tool_sat_sft dataset, which suggests specialization for tasks related to that dataset. With a context length of 32768 tokens, it is suited to applications that require processing long input sequences.
Overview
This model, raca-workspace-v1/grpo-tool-sat-sft-qwen3-1p7b-sft-20260419-075623-96e9, is a fine-tuned variant of the Qwen3-1.7B-Base architecture. It features approximately 2 billion parameters and supports a substantial context length of 32768 tokens, enabling it to handle long-form inputs and complex tasks.
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen3-1.7B-Base.
- Parameter Count: Approximately 2 billion parameters.
- Context Length: Supports a large context window of 32768 tokens.
- Training Data: Specifically fine-tuned on the grpo_tool_sat_sft dataset.
Training Details
The model was trained with a learning rate of 2e-05, a per-device batch size of 8 (an effective batch size of 16 with gradient accumulation), and the AdamW optimizer. Training ran for 2 epochs with a cosine learning rate scheduler and a warmup ratio of 0.03. This fine-tuning suggests the model is optimized for tasks relevant to the grpo_tool_sat_sft dataset, distinguishing it from the base Qwen3-1.7B model.
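The schedule described above can be sketched as follows. This is a minimal illustration of the stated hyperparameters (peak learning rate 2e-05, warmup ratio 0.03, cosine decay, effective batch size 8 × gradient accumulation); the total step count is hypothetical, since the card does not report the dataset size.

```python
import math

# Hyperparameters as reported in the model card.
PEAK_LR = 2e-5
WARMUP_RATIO = 0.03

def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup over the first 3% of steps, then cosine decay to 0.

    total_steps is a hypothetical value for illustration; the card does
    not state how many optimizer steps the 2 epochs correspond to.
    """
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective batch size: per-device batch of 8 with 2 gradient-accumulation
# steps, matching the "total effective batch size of 16" above.
effective_batch = 8 * 2
```

With 1000 total steps, the learning rate climbs linearly from 0 to 2e-05 over the first 30 steps, then follows a half-cosine back down to 0 at the final step.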