Model Overview
Yaseal/llama3_1b_instruct_vallina_full_sft_30k is a 1-billion-parameter instruction-tuned model derived from the LLM-Research/Llama-3.2-1B-Instruct base model. It was fine-tuned on the deepmath_plain_30k_train dataset, specializing it for tasks in that dataset's domain.
Key Characteristics
- Base Model: LLM-Research/Llama-3.2-1B-Instruct
- Parameter Count: 1 billion parameters
- Context Length: 32768 tokens
- Training Data: Fine-tuned on deepmath_plain_30k_train
- Performance: Achieved a validation loss of 0.5760 during training
Training Details
The model was trained with a learning rate of 2e-05 using the AdamW optimizer and a cosine learning-rate scheduler with a 0.1 warmup ratio. Training ran for 2 epochs with an effective batch size of 16 across 2 GPUs. This fine-tuning process adapts the base Llama-3.2-1B-Instruct model to the instruction-following behavior defined by the deepmath_plain_30k_train dataset.
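The schedule described above (linear warmup over the first 10% of steps, then cosine decay from the peak learning rate) can be sketched as follows. This is a minimal illustration; the exact implementation in the training framework, and the derived step count, are assumptions, not taken from the training logs:

```python
import math

def lr_at_step(step, total_steps, peak_lr=2e-05, warmup_ratio=0.1):
    """Cosine learning-rate schedule with linear warmup (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

# Rough step count implied by the card: ~30k examples, batch size 16, 2 epochs.
total_steps = 30000 // 16 * 2  # 3750 optimizer steps
```

With these numbers the learning rate rises linearly over the first 375 steps, peaks at 2e-05, and decays to 0 by step 3750.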
Potential Use Cases
Given its fine-tuning on a specific dataset, this model is likely best suited for:
- Applications requiring instruction-following capabilities aligned with the domain of the deepmath_plain_30k_train dataset.
- Scenarios where a compact 1B-parameter model is preferred for efficiency while still offering specialized instruction-tuned performance.