FlyPig23/Llama3.2-3B_Paper_Impact_dataset_SFT_1ep

Text Generation · Model Size: 3.2B · Quant: BF16 · Context Length: 32k · Concurrency Cost: 1 · Published: Apr 7, 2026 · License: other · Architecture: Transformer

FlyPig23/Llama3.2-3B_Paper_Impact_dataset_SFT_1ep is a 3.2-billion-parameter language model fine-tuned from meta-llama/Llama-3.2-3B-Instruct. It was adapted on the paper_impact_dataset_train dataset, indicating a specialization in academic paper analysis and impact assessment, and is designed for applications that require nuanced understanding or generation over scholarly text.


Model Overview

FlyPig23/Llama3.2-3B_Paper_Impact_dataset_SFT_1ep is a 3.2-billion-parameter language model fine-tuned from meta-llama/Llama-3.2-3B-Instruct. It was trained for one epoch on the paper_impact_dataset_train dataset and achieves a loss of 0.0990 on the evaluation set.
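
As a concrete starting point, the sketch below shows one way to load the checkpoint and run a single instruction-style generation with Hugging Face transformers. It assumes the repository ships standard transformers-compatible files and that the fine-tune retained the Llama-3.2-Instruct chat template; the prompt text is purely illustrative and not from the model card.

```python
# Minimal sketch: load the checkpoint and run one generation.
# Assumes standard transformers-compatible files in the repo and that the
# fine-tune kept the Llama-3.2-Instruct chat template (both are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FlyPig23/Llama3.2-3B_Paper_Impact_dataset_SFT_1ep"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

# Hypothetical prompt -- not taken from the model card.
messages = [{"role": "user", "content": "Summarize the likely impact of a paper introducing X."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```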

Key Characteristics

  • Base Model: Fine-tuned from meta-llama/Llama-3.2-3B-Instruct.
  • Parameter Count: 3.2 billion.
  • Context Length: 32,768 tokens (see the token-budget sketch after this list).
  • Specialization: Adapted on paper_impact_dataset_train, suggesting a focus on academic paper analysis, understanding, and impact assessment.
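
Since the 32,768-token window is the main constraint when feeding whole papers to the model, a small guard like the one below can help. The helper name and the 512-token output reserve are arbitrary illustrative choices, not part of the model card.

```python
# Sketch: check that a long scholarly document fits the 32,768-token
# context window before sending it to the model.
from transformers import AutoTokenizer

MAX_CTX = 32768
tokenizer = AutoTokenizer.from_pretrained(
    "FlyPig23/Llama3.2-3B_Paper_Impact_dataset_SFT_1ep"
)

def fits_in_context(text: str, reserve_for_output: int = 512) -> bool:
    """Return True if `text` plus headroom for generated tokens fits the window."""
    n_tokens = len(tokenizer.encode(text))
    return n_tokens + reserve_for_output <= MAX_CTX
```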

Training Details

The model was trained with a learning rate of 2e-05, a per-device train_batch_size of 8, and gradient_accumulation_steps of 4. Since 8 × 4 = 32, the reported total_train_batch_size of 128 implies distributed training across 4 devices (8 × 4 × 4 = 128). The optimizer was adamw_torch with a cosine learning-rate scheduler and a warmup ratio of 0.1, run for 1.0 epoch.
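
For readers who want to reproduce or adapt the run, the reported hyperparameters map onto Hugging Face TrainingArguments roughly as follows. The output directory is arbitrary, the bf16 flag is an inference from the BF16 weights rather than a stated fact, and dataset preparation plus the Trainer call are omitted.

```python
# Hedged reconstruction of the reported hyperparameters as TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="Llama3.2-3B_Paper_Impact_dataset_SFT_1ep",  # arbitrary name
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,  # 8 * 4 * 4 devices = 128 effective
    num_train_epochs=1.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    bf16=True,  # assumption, inferred from the BF16 weights
)
```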

Potential Use Cases

Given its fine-tuning on a paper impact dataset, this model is likely suitable for applications such as the following (a batch-scoring sketch follows the list):

  • Analyzing the influence or citations of academic papers.
  • Summarizing research findings from scholarly articles.
  • Extracting key information or trends from scientific literature.
  • Assisting in literature reviews or bibliometric analysis.