FlyPig23/Llama3.2-3B_Paper_Impact_media_SFT_1ep
FlyPig23/Llama3.2-3B_Paper_Impact_media_SFT_1ep is a 3.2 billion parameter language model fine-tuned from Meta's Llama-3.2-3B-Instruct. It was adapted on the paper_impact_media_train dataset for tasks concerning the impact of academic papers and media, and is intended for specialized applications that require understanding and generation in this domain. The model retains the Llama 3.2 architecture and a 32768 token context length.
Model Overview
FlyPig23/Llama3.2-3B_Paper_Impact_media_SFT_1ep is a 3.2 billion parameter language model fine-tuned from the meta-llama/Llama-3.2-3B-Instruct base model. It was trained for a single epoch on the paper_impact_media_train dataset and achieved a loss of 0.0766 on the evaluation set. This specialization makes the model best suited to analyzing or generating content about the impact of academic papers and media.
Key Training Details
The model was trained with the following hyperparameters:
- Learning Rate: 2e-05
- Batch Size: 8 (train), 8 (eval)
- Optimizer: AdamW with default betas and epsilon
- LR Scheduler: Cosine, with a 0.1 warmup ratio
- Epochs: 1.0
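To make the scheduler setting concrete, here is a minimal, stdlib-only sketch of a cosine learning-rate schedule with linear warmup using the values above (learning rate 2e-05, warmup ratio 0.1). It mirrors the behavior of Hugging Face's `get_cosine_schedule_with_warmup` but is not the library's exact code; the 1000-step total is a hypothetical example.

```python
import math

# Hyperparameters from the model card.
LEARNING_RATE = 2e-05
WARMUP_RATIO = 0.1

def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at a given optimizer step: linear warmup for the
    first WARMUP_RATIO fraction of steps, then cosine decay to zero."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return LEARNING_RATE * step / max(1, warmup_steps)
    # Cosine decay from the peak learning rate down to 0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))

# Hypothetical run of 1000 optimizer steps:
# lr_at(0, 1000)    -> 0.0 (start of warmup)
# lr_at(100, 1000)  -> 2e-05 (peak, end of the 10% warmup)
# lr_at(1000, 1000) -> ~0.0 (fully decayed)
```

In practice, the HF Trainer derives the warmup step count from `warmup_ratio` and the total number of training steps in exactly this proportional way.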
Intended Use Cases
This model is primarily intended for applications that benefit from its specific fine-tuning on the paper_impact_media_train dataset. Developers should consider this model for tasks involving:
- Analyzing the influence or reception of academic papers.
- Processing or generating content related to media impact.
- Specialized natural language understanding within the academic and media domains.
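Since the base model is an instruct-tuned Llama 3.2 checkpoint, prompts for these tasks would follow the Llama 3 chat format. Below is a minimal sketch of building a single-turn prompt by hand; the example user message is hypothetical, and in practice you would let the tokenizer's `apply_chat_template` produce this string for you.

```python
# Sketch of the Llama 3 family single-turn chat prompt layout
# (assumption based on the Llama-3.2-3B-Instruct base model).
def build_prompt(user_message: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_prompt(
    "Assess the likely media impact of the following paper abstract: ..."
)
```

The resulting string can then be passed to any text-generation backend serving FlyPig23/Llama3.2-3B_Paper_Impact_media_SFT_1ep, e.g. a `transformers` text-generation pipeline.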