Model Overview
FlyPig23/Llama3.2-3B_Paper_Impact_dataset_SFT_1ep is a 3.2-billion-parameter language model derived from meta-llama/Llama-3.2-3B-Instruct. It was fine-tuned for one epoch on the paper_impact_dataset_train dataset and achieves a loss of 0.0990 on the evaluation set.
Key Characteristics
- Base Model: Fine-tuned from meta-llama/Llama-3.2-3B-Instruct.
- Parameter Count: 3.2 billion parameters.
- Context Length: Supports a context length of 32768 tokens.
- Specialization: Adapted using the paper_impact_dataset_train dataset, suggesting a focus on tasks related to academic paper analysis, understanding, or impact assessment.
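Since prompts longer than the context window above will be truncated or rejected, inputs can be clipped on the client side before generation. A minimal sketch using the transformers library (the helper name and generation settings are illustrative, and downloading the checkpoint requires network access):

```python
MAX_CONTEXT = 32768  # context length stated above
MODEL_ID = "FlyPig23/Llama3.2-3B_Paper_Impact_dataset_SFT_1ep"

def clip_to_context(token_ids, max_len=MAX_CONTEXT):
    """Keep only the most recent tokens so the prompt fits the context window."""
    return token_ids[-max_len:]

if __name__ == "__main__":
    # Heavy dependencies are imported here so the helper above stays standalone.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    ids = clip_to_context(tokenizer.encode("Assess the likely impact of this paper: ..."))
    input_ids = torch.tensor([ids]).to(model.device)
    output = model.generate(input_ids, max_new_tokens=256)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```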
Training Details
The model was trained with a learning rate of 2e-05, a train_batch_size of 8 per device, and gradient_accumulation_steps of 4, for a total_train_batch_size of 128 (which implies training on 4 devices, since 8 × 4 × 4 = 128). The optimizer was adamw_torch with a cosine learning-rate scheduler and a warmup ratio of 0.1, over 1.0 epoch.
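The batch-size arithmetic can be checked directly; note that the device count of 4 is inferred from the reported numbers, not stated in the card:

```python
def total_train_batch_size(per_device_batch: int,
                           grad_accum_steps: int,
                           num_devices: int) -> int:
    """Effective batch size seen by the optimizer per update step."""
    return per_device_batch * grad_accum_steps * num_devices

# Reported: per-device batch 8, 4 accumulation steps, total 128.
# 128 / (8 * 4) = 4, consistent with training on 4 devices (inferred).
print(total_train_batch_size(8, 4, 4))  # → 128
```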
Potential Use Cases
Given its fine-tuning on a paper impact dataset, this model is likely suitable for applications such as:
- Analyzing the influence or citations of academic papers.
- Summarizing research findings from scholarly articles.
- Extracting key information or trends from scientific literature.
- Assisting in literature reviews or bibliometric analysis.
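For use cases like these, a chat-style prompt helper keeps requests consistent across papers. A sketch, assuming the model follows the Llama instruct chat format; the prompt wording and helper name are illustrative, not from the model card:

```python
def build_impact_prompt(title: str, abstract: str) -> list[dict]:
    """Build a chat-format message list asking the model to assess paper impact."""
    system = ("You are an assistant that analyzes academic papers "
              "and assesses their likely research impact.")
    user = (f"Title: {title}\n\n"
            f"Abstract: {abstract}\n\n"
            "Briefly assess the likely impact of this paper.")
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_impact_prompt(
    "Attention Is All You Need",
    "We propose the Transformer, an architecture based solely on attention...",
)
# `messages` can then be rendered with tokenizer.apply_chat_template(...)
# before generation with the fine-tuned model.
```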