PARZ2344/web_llama_sft_random
PARZ2344/web_llama_sft_random is a 3.2 billion parameter language model fine-tuned from Meta Llama 3.2-3B-Instruct on the deep_research_25 dataset. It is designed for tasks aligned with its specific fine-tuning data, offering a compact yet capable option for targeted applications, and its 32768-token context length supports processing extensive inputs.
Model Overview
PARZ2344/web_llama_sft_random is a 3.2 billion parameter language model fine-tuned from the meta-llama/Llama-3.2-3B-Instruct base model. This instruction-tuned variant was trained on the deep_research_25 dataset, which specializes it for tasks reflected in that data. Training used a learning rate of 1e-05 over 3 epochs with a total batch size of 64 across 8 GPUs, and a cosine learning rate scheduler with a 0.1 warmup ratio.
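The training script is not published with this card, so the sketch below is only a minimal reconstruction of the reported hyperparameters using the Hugging Face `transformers` `TrainingArguments` class. The per-device batch size and gradient accumulation split are assumptions chosen so that 8 GPUs yield the reported total batch size of 64; bf16 precision is likewise an assumption.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported setup; values marked as
# assumptions are not stated in the model card.
training_args = TrainingArguments(
    output_dir="web_llama_sft_random",
    learning_rate=1e-5,                 # reported learning rate
    num_train_epochs=3,                 # reported number of epochs
    lr_scheduler_type="cosine",         # reported cosine schedule
    warmup_ratio=0.1,                   # reported warmup ratio
    per_device_train_batch_size=4,      # assumption: 4 x 8 GPUs x 2 accumulation steps = 64 total
    gradient_accumulation_steps=2,      # assumption (see above)
    optim="adamw_torch",                # AdamW optimizer
    bf16=True,                          # assumption: bfloat16 mixed precision
)
```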
Key Characteristics
- Base Model: Fine-tuned from Meta Llama 3.2-3B-Instruct.
- Parameter Count: 3.2 billion parameters, offering a balance between performance and computational efficiency.
- Context Length: Supports a substantial context window of 32768 tokens.
- Training Data: Specialized fine-tuning on the deep_research_25 dataset.
- Training Configuration: Utilized AdamW optimizer, multi-GPU distributed training, and gradient accumulation for stable and efficient learning.
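To illustrate these characteristics in practice, the following is a minimal loading and generation sketch using the `transformers` library. It assumes the model inherits the Llama 3.2 Instruct chat template and that bfloat16 weights are appropriate; the prompt is purely illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PARZ2344/web_llama_sft_random"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # assumption: bfloat16, matching the Llama 3.2 base
    device_map="auto",
)

# Assumption: the model keeps the Llama 3.2 Instruct chat format.
messages = [{"role": "user", "content": "Summarize the key ideas of transfer learning."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```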
Intended Use Cases
This model is particularly suited for applications that align with the nature of the deep_research_25 dataset it was fine-tuned on. Developers should consider its specific training data and instruction-following capabilities when evaluating its suitability for their tasks. Its compact size and substantial context length make it a viable option for scenarios requiring efficient processing of detailed instructions or long-form content within its specialized domain.