FlyPig23/Llama3.2-3B_Paper_Impact_citation_SFT_1ep
FlyPig23/Llama3.2-3B_Paper_Impact_citation_SFT_1ep is a 3.2-billion-parameter instruction-tuned model, fine-tuned from meta-llama/Llama-3.2-3B-Instruct. It was trained for 1 epoch on the paper_impact_citations_train dataset and reached a loss of 0.0836 on the evaluation set. The model is specialized for tasks related to paper impact and citation analysis.
Overview
Llama3.2-3B_Paper_Impact_citation_SFT_1ep is an instruction-tuned variant of the meta-llama/Llama-3.2-3B-Instruct base model. It has 3.2 billion parameters, a context length of 32,768 tokens, and was fine-tuned on the paper_impact_citations_train dataset.
Key Capabilities
- Specialized Fine-tuning: The model has undergone supervised fine-tuning (SFT) for 1 epoch on a dataset focused on paper impact and citations.
- Performance: Achieved an evaluation loss of 0.0836 on the held-out split, suggesting effective learning on its target domain.
- Base Model: Built upon the robust Llama 3.2-3B-Instruct architecture, providing a strong foundation for language understanding and generation.
Training Details
The training process utilized specific hyperparameters:
- Learning Rate: 2e-05
- Batch Size: 8 (train and eval)
- Optimizer: AdamW with default betas and epsilon
- Scheduler: Cosine learning rate scheduler with 0.1 warmup ratio
- Epochs: 1.0
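The learning-rate schedule above (cosine decay with a 0.1 warmup ratio, peak LR 2e-05) can be sketched in plain Python. This mirrors the common Hugging Face-style warmup-then-cosine schedule; the actual total step count depends on the dataset size and batch size, which are not listed in this card, so `total_steps` below is an illustrative value.

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup followed by cosine decay, matching the settings above.

    A sketch of the usual warmup+cosine schedule; the real number of
    optimizer steps for this run is not documented in the card.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 1000  # illustrative; not the actual step count of this run
print(lr_at_step(0, total))     # 0.0 (start of warmup)
print(lr_at_step(100, total))   # 2e-05 (peak, at end of warmup)
print(lr_at_step(1000, total))  # 0.0 (end of cosine decay)
```

With a 0.1 warmup ratio, the first 10% of steps ramp the LR linearly to its peak before the cosine decay begins.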
Good For
- Research tasks involving analysis of academic paper impact.
- Applications requiring understanding or generation related to scientific citations.
- Experiments with domain-specific fine-tuning on Llama 3.2-based models.
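For the use cases above, prompts should follow the base model's chat format. The exact prompt template used for paper_impact_citations_train is not documented in this card, so the sketch below assembles the standard Llama 3.x Instruct chat format by hand for illustration; the system and user messages are hypothetical examples.

```python
def build_prompt(system_msg: str, user_msg: str) -> str:
    """Assemble a Llama 3.x Instruct chat prompt by hand.

    This is the base Llama-3.2-Instruct format; the template actually
    used during this model's fine-tuning is not documented here.
    """
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_msg}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_msg}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Hypothetical example messages for a citation-impact task.
prompt = build_prompt(
    "You assess the potential impact of academic papers.",
    "Estimate the citation impact of the following abstract: ...",
)
```

In practice, using the tokenizer's built-in chat template (`tokenizer.apply_chat_template`) is preferable to hand-assembling these tokens, since it stays in sync with the model's configuration.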