FlyPig23/Llama3.2-3B_Paper_Impact_award_SFT_1ep
FlyPig23/Llama3.2-3B_Paper_Impact_award_SFT_1ep is a 3-billion-parameter language model fine-tuned from Meta's Llama-3.2-3B-Instruct. It was trained on the paper_impact_award_train dataset, reaching an evaluation loss of 0.0734, and is optimized for tasks in that dataset's domain. The model supports a context length of 32,768 tokens.
Model Overview
FlyPig23/Llama3.2-3B_Paper_Impact_award_SFT_1ep is a specialized language model derived from Meta's Llama-3.2-3B-Instruct. This 3-billion-parameter model was fine-tuned for a single epoch on the paper_impact_award_train dataset, achieving an evaluation loss of 0.0734. Its 32,768-token context length makes it suitable for the longer inputs typical of its training data.
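As with other Llama-3.2 instruct checkpoints, the model can be queried through the `transformers` library using its chat template. The sketch below is illustrative, not an official usage snippet: the helper name `query_model`, the generation settings, and the prompt are assumptions.

```python
def query_model(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Sketch: run one chat turn against the fine-tuned checkpoint.

    Assumes `torch` and `transformers` are installed; imports are kept
    inside the function so the sketch can be read (and the function
    defined) without them.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "FlyPig23/Llama3.2-3B_Paper_Impact_award_SFT_1ep"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # Llama-3.2 instruct models ship a chat template; apply_chat_template
    # inserts the special header/turn tokens for us.
    messages = [{"role": "user", "content": user_prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens; decode only the newly generated ones.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)

# Usage (downloads the model weights on first call; prompt is hypothetical):
# print(query_model("Assess the likely impact of the following paper: ..."))
```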
Key Training Details
- Base Model: meta-llama/Llama-3.2-3B-Instruct
- Fine-tuning Dataset: paper_impact_award_train
- Epochs: 1.0
- Learning Rate: 2e-05
- Batch Size: 8 per device (train and eval), with an effective train batch size of 128 via gradient accumulation.
- Optimizer: AdamW with a cosine learning-rate scheduler and a warmup ratio of 0.1.
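The hyperparameters above can be sketched numerically. Note the gradient-accumulation step count (16) is inferred from 128 / 8, and the total step count in the example is illustrative, not taken from the training run:

```python
import math

per_device_batch = 8
effective_batch = 128
grad_accum_steps = effective_batch // per_device_batch  # = 16 (inferred, not stated)

def lr_at(step: int, total_steps: int, peak_lr: float = 2e-5,
          warmup_ratio: float = 0.1) -> float:
    """Linear warmup over the first 10% of steps, then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Illustrative 1,000-step run:
assert lr_at(0, 1000) == 0.0            # start of warmup
assert lr_at(100, 1000) == 2e-5         # peak at end of warmup
assert abs(lr_at(1000, 1000)) < 1e-12   # decayed to ~0
```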
Potential Use Cases
Given its narrow fine-tuning, this model is best suited to tasks closely aligned with the paper_impact_award_train dataset. Developers needing nuanced understanding or generation in that particular domain will benefit most from its specialized training.