FlyPig23/Llama3.2-3B_Paper_Impact_SFT
FlyPig23/Llama3.2-3B_Paper_Impact_SFT is a 3.2-billion-parameter language model fine-tuned from Meta's Llama-3.2-3B-Instruct. It was trained on the paper_impact_sft_train dataset, indicating it is optimized for analyzing or generating content about the impact of research papers. It supports a 32K-token context length, making it suitable for processing longer documents, such as full papers, in its specialized domain.
Overview
FlyPig23/Llama3.2-3B_Paper_Impact_SFT is a supervised fine-tune (SFT) of the meta-llama/Llama-3.2-3B-Instruct base model. Its training used the paper_impact_sft_train dataset, suggesting a specialization in tasks related to assessing the impact of research papers.
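The model should load like any other Llama-family checkpoint on the Hub. Below is a minimal sketch using the Hugging Face transformers library, assuming the repository ships standard AutoModel/AutoTokenizer configuration files:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FlyPig23/Llama3.2-3B_Paper_Impact_SFT"

# Load tokenizer and weights; device_map="auto" requires the accelerate
# package and places the model on whatever GPUs (or CPU) are available.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",
)
```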
Training Details
The model was trained with a learning rate of 2e-05 for 3 epochs, using a total batch size of 128 across 4 GPUs. The optimizer was adamw_torch with a cosine learning-rate scheduler and a warmup ratio of 0.1. During training, the validation loss rose from 0.0733 at 500 steps to 0.1443 by 2000 steps while the training loss fell to 0.005, a divergence that suggests some overfitting to the training set.
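For reference, these hyperparameters translate into a transformers TrainingArguments configuration roughly like the sketch below. The per-device batch size and gradient-accumulation split are assumptions; only the effective batch size of 128 across 4 GPUs is reported, and the output path is hypothetical.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3.2-3b-paper-impact-sft",  # hypothetical output path
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=8,   # assumed split: 8 * 4 accum * 4 GPUs = 128
    gradient_accumulation_steps=4,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    eval_strategy="steps",           # recent transformers; older versions use evaluation_strategy
    eval_steps=500,                  # matches the 500-step validation-loss intervals
)
```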
Key Characteristics
- Base Model: Meta Llama-3.2-3B-Instruct
- Parameter Count: 3.2 billion
- Context Length: 32768 tokens
- Specialization: Fine-tuned on the paper_impact_sft_train dataset, indicating a focus on tasks related to research paper impact
Potential Use Cases
This model is likely best suited for applications that involve understanding or generating text about the influence, significance, or reception of academic papers. This could include tasks such as summarizing a paper's impact, identifying key contributions, or analyzing citation contexts.
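As an illustration, here is a sketch of prompting the model for such a task through the transformers text-generation pipeline. The prompt wording is hypothetical, since the exact instruction format used in paper_impact_sft_train is not documented here:

```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="FlyPig23/Llama3.2-3B_Paper_Impact_SFT",
    device_map="auto",
)

# Hypothetical prompt; adjust to whatever format the SFT data actually used.
messages = [{
    "role": "user",
    "content": "Assess the likely impact of the following paper and "
               "summarize its key contributions:\n\n<paper abstract here>",
}]

result = generator(messages, max_new_tokens=256)
# The pipeline returns the full chat; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```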