sagnikM/grpo_rmsprop_qwen3_1p7b_3k_seqlen_1e-6
Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Published: Jan 26, 2026 | Architecture: Transformer

The sagnikM/grpo_rmsprop_qwen3_1p7b_3k_seqlen_1e-6 model is a language model with roughly 2 billion parameters developed by sagnikM. As the name indicates, it is a fine-tuned variant of the Qwen3-1.7B base, trained with GRPO using the RMSprop optimizer, a maximum sequence length of 3,000 tokens, and a learning rate of 1e-6. The model card does not detail specific differentiators or primary use cases, but the architecture suggests a general-purpose language model suited to a range of NLP tasks.


Model Overview

The sagnikM/grpo_rmsprop_qwen3_1p7b_3k_seqlen_1e-6 model is a roughly 2 billion parameter language model developed by sagnikM. It is based on the Qwen3 architecture and was fine-tuned with GRPO (Group Relative Policy Optimization) using the RMSprop optimizer, with a maximum sequence length of 3,000 tokens and a learning rate of 1e-6.
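Assuming the checkpoint loads with the standard Hugging Face transformers API, as most Qwen3 fine-tunes do, a minimal text-generation sketch might look like the following. The model id comes from this page; the prompt and generation settings are illustrative, not values stated in the model card.

```python
# Minimal sketch: load the model and generate text with transformers.
# Only the model id and BF16 precision come from this page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sagnikM/grpo_rmsprop_qwen3_1p7b_3k_seqlen_1e-6"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # page lists BF16
    device_map="auto",
)

prompt = "Explain what a learning rate of 1e-6 means in plain terms."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```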

Key Characteristics

  • Architecture: Qwen3 base model.
  • Parameter Count: 2 billion parameters.
  • Training Optimization: Fine-tuned with GRPO (Group Relative Policy Optimization) using the RMSprop optimizer (see the configuration sketch after this list).
  • Sequence Length: Trained with a maximum sequence length of 3,000 tokens; the underlying architecture's context length is listed as 32k.
  • Learning Rate: A learning rate of 1e-6 was applied during training.
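The model card does not say which training framework was used. As one hypothetical way to reproduce this configuration, the sketch below uses TRL's GRPOTrainer with the RMSprop optimizer and a 1e-6 learning rate, and splits a roughly 3,000-token budget between prompt and completion. The base model id, dataset, reward function, batch size, and prompt/completion split are assumptions, not details from the card.

```python
# Hypothetical training sketch using TRL's GRPOTrainer.
# Only the optimizer (RMSprop), learning rate (1e-6), ~3k sequence length,
# and BF16 precision come from this page; everything else is a placeholder.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

def reward_len(completions, **kwargs):
    """Placeholder reward: prefer shorter completions (illustrative only)."""
    return [-float(len(c)) for c in completions]

config = GRPOConfig(
    output_dir="grpo_rmsprop_qwen3_1p7b_3k_seqlen_1e-6",
    learning_rate=1e-6,          # from the model name
    optim="rmsprop",             # RMSprop instead of the default AdamW
    max_prompt_length=1024,      # assumed split of the ~3k token budget
    max_completion_length=2048,  # prompt + completion ≈ 3k tokens
    per_device_train_batch_size=4,
    bf16=True,                   # page lists BF16
)

trainer = GRPOTrainer(
    model="Qwen/Qwen3-1.7B",     # base model implied by the name
    reward_funcs=reward_len,
    args=config,
    train_dataset=load_dataset("trl-lib/tldr", split="train"),  # placeholder dataset
)
trainer.train()
```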

Good for

The model card does not spell out specific strengths or intended use cases, but models of this size and architecture are generally suitable for:

  • General text generation and completion.
  • Experimentation with GRPO and RMSprop optimization techniques.
  • Research into the effects of specific training parameters on Qwen3-based models.