Name: Jeesup/tofu_Llama-3.2-1B-Instruct_forget10_NPO_qat-int4 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Jeesup

Model Overview

Jeesup/tofu_Llama-3.2-1B-Instruct_forget10_NPO_qat-int4 is a 1-billion-parameter instruction-tuned language model, fine-tuned from open-unlearning/tofu_Llama-3.2-1B-Instruct_full. The model name suggests NPO-based unlearning on the TOFU forget10 split combined with int4 quantization-aware training, although the card itself does not confirm this. The model supports a context length of 32768 tokens.

Training Details

The model underwent training with the following key hyperparameters:

Learning Rate: 1e-05
Batch Sizes: train_batch_size of 4 and eval_batch_size of 16; with gradient_accumulation_steps of 4, the total_train_batch_size is 16.
Optimizer: Paged AdamW with default betas and epsilon.
Scheduler: Linear learning rate scheduler with 25 warmup steps.
Epochs: 10 training epochs.
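
To make the relationship between these values concrete, here is a minimal Python sketch. It shows how the total train batch size of 16 follows from the per-step batch size and gradient accumulation, and what a linear schedule with 25 warmup steps looks like. The helper names, the single-device assumption, and the TOTAL_STEPS value are hypothetical: the card states neither the dataset size nor the number of optimizer steps, and it does not name the exact scheduler implementation.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update (assumes a single device by default)."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the card.
BASE_LR = 1e-5
WARMUP_STEPS = 25
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4

# Hypothetical: the card does not report the total number of optimizer steps.
TOTAL_STEPS = 250

if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 12, 25, 100, 250):
        print(step, linear_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS))
```

Under these assumptions the learning rate peaks at 1e-05 once warmup ends at step 25, then falls linearly to zero by the final step.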

Current Status and Limitations

The model card does not yet document the fine-tuning dataset, the model's intended uses, or its limitations. The model may therefore be experimental or built for a specialized purpose. Users should evaluate it thoroughly on their own tasks before relying on it.
