sagnikM/grpo_sgd_llama3p1_8b_3k-seqlen_momentum_0p9_1e-3

Text Generation · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Concurrency Cost: 1 · Architecture: Transformer · Published: Jan 15, 2026

The sagnikM/grpo_sgd_llama3p1_8b_3k-seqlen_momentum_0p9_1e-3 model is an 8-billion-parameter language model with a 32K context length. As its name suggests, it is a fine-tuned variant, likely based on the Llama 3.1 architecture, trained with GRPO (Group Relative Policy Optimization) using an SGD optimizer with momentum 0.9 and a learning rate of 1e-3. Its specific differentiators and primary use cases are not detailed in the model card, which marks most sections as "More Information Needed."


Model Overview

This model, sagnikM/grpo_sgd_llama3p1_8b_3k-seqlen_momentum_0p9_1e-3, is an 8-billion-parameter language model. As its name indicates, it is likely based on the Llama 3.1 architecture, and it supports a context length of 32,768 tokens. The name also encodes the training setup: GRPO (Group Relative Policy Optimization) with an SGD optimizer at momentum 0.9 and learning rate 1e-3, apparently with a 3k-token training sequence length.
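Since the card lists the model under text generation, it can presumably be loaded through the standard Hugging Face transformers API. The sketch below is a minimal, unverified example: it assumes the checkpoint is hosted on the Hugging Face Hub under the same repository id and follows the usual Llama 3.1 causal-LM layout, neither of which the model card confirms.

```python
# Hypothetical usage sketch. Assumes the repo id matches the model name
# and the checkpoint uses the standard Llama 3.1 causal-LM layout;
# the model card does not confirm either.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "sagnikM/grpo_sgd_llama3p1_8b_3k-seqlen_momentum_0p9_1e-3"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

# Simple text-generation round trip.
inputs = tokenizer("Explain momentum in SGD in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```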

Key Characteristics

  • Parameter Count: 8 billion parameters.
  • Context Length: 32,768-token context window.
  • Training Methodology: GRPO with an SGD optimizer (momentum 0.9, learning rate 1e-3); the name also suggests a 3k-token training sequence length. A sketch of this setup follows the list.
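To make the named hyperparameters concrete, the following sketch shows how the optimizer described in the model name would be configured in PyTorch, alongside the group-relative advantage normalization that GRPO is generally understood to perform. This is an illustration of the named techniques under stated assumptions, not the author's actual training code, which is not published.

```python
import torch

# Optimizer configured with the hyperparameters from the model name:
# SGD, momentum 0.9, learning rate 1e-3 (illustrative only).
def make_optimizer(model: torch.nn.Module) -> torch.optim.SGD:
    return torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

# GRPO-style advantages: for each prompt, a group of completions is
# sampled and scored, and each reward is normalized against its own
# group's mean and standard deviation. Assumed formulation based on
# the published GRPO method, not on this model's card.
def group_relative_advantages(rewards: torch.Tensor) -> torch.Tensor:
    """rewards: (num_prompts, group_size) scalar rewards per completion."""
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + 1e-8)
```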

Current Status

The model card marks the sections covering development details, language support, license, fine-tuning origins, intended uses, biases, risks, limitations, training data, and evaluation results as "More Information Needed." Users should weigh these gaps before adopting the model.