Name: tungpl/gsm8k-llama3-grpo API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: tungpl

Model Overview

The tungpl/gsm8k-llama3-grpo is a 3.2 billion parameter language model, fine-tuned from the unsloth/Llama-3.2-3B-Instruct base model. Developed by tungpl, this model leverages the Llama 3 architecture and was optimized for training speed using the Unsloth library in conjunction with Huggingface's TRL library.

Key Characteristics

Base Model: Fine-tuned from unsloth/Llama-3.2-3B-Instruct.
Training Efficiency: Utilizes Unsloth for 2x faster fine-tuning.
Parameter Count: 3.2 billion parameters, offering a balance between performance and computational efficiency.
Context Length: Supports a context length of 32768 tokens.
License: Released under the Apache-2.0 license.

Use Cases

This model is suitable for developers looking for a Llama 3-based instruction-tuned model that benefits from optimized training. Its compact size makes it a good candidate for applications where resource efficiency is important, while its Llama 3 foundation provides strong language understanding capabilities. It is particularly relevant for tasks that align with its fine-tuning objectives, though specific task performance would require further evaluation.

Overview

Model Overview

Key Characteristics

Use Cases

Full Model Card (README)