Model Overview
This model, Rakancorle1/qwen2.5-3b_Instruct_policy_traj_30k_full, is a specialized instruction-tuned language model based on the Qwen2.5-3B-Instruct architecture. It features approximately 3.1 billion parameters and a context length of 32,768 tokens, making it suitable for processing moderately long sequences.
Key Specialization
The primary differentiator for this model is its fine-tuning on the Policy_Traj_0826_30k_train dataset. This targeted training suggests the model is optimized for understanding, generating, or following policy-related trajectories, that is, sequences of actions constrained by a given policy.
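Since the base model is Qwen2.5-3B-Instruct, the fine-tuned model presumably inherits its ChatML-style prompt format. The sketch below makes that format explicit for a policy-following query; the system and user texts are illustrative placeholders (not examples from the training data), and in practice `tokenizer.apply_chat_template()` from the transformers library would handle this assembly.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt as used by Qwen2.5-Instruct models.

    This just makes the expected wire format visible; normally you would
    call tokenizer.apply_chat_template() instead of formatting by hand.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Hypothetical policy-following query (placeholder text, not from the dataset).
prompt = build_chatml_prompt(
    system="Follow the refund policy: refunds are allowed within 30 days of purchase.",
    user="A customer bought an item 45 days ago. Can they get a refund?",
)
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the prompt open for the model to generate its policy-conditioned response.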
Training Details
The fine-tuning process utilized the following hyperparameters:
- Learning Rate: 1e-05
- Batch Size: 2 (train), 8 (eval)
- Gradient Accumulation: 8 steps, giving 16 samples per device and a total effective batch size of 64 (implying training across 4 devices)
- Optimizer: AdamW, with a cosine learning-rate schedule and a warmup ratio of 0.1
- Epochs: 3.0
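The batch-size figures above can be cross-checked with a small calculation. The per-device batch size and accumulation steps come from the list; the device count of 4 is an assumption inferred from the reported total of 64.

```python
# Effective batch size = per-device batch * gradient accumulation * device count.
per_device_train_batch_size = 2   # from the model card
gradient_accumulation_steps = 8   # from the model card
num_devices = 4                   # assumption: inferred so the total matches 64

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
print(effective_batch_size)  # 64
```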
Potential Use Cases
Given its fine-tuning on a policy trajectory dataset, this model could be particularly useful for applications such as:
- Generating responses or actions that align with predefined policies.
- Simulating policy-driven behaviors.
- Analyzing sequences of events in the context of specific policies.
Limitations
As the original model card notes, information about the model's intended uses, limitations, and training/evaluation data is incomplete. Users should therefore conduct thorough testing before deploying it for their specific applications.