Name: formalmathatepfl/apertus-cpt-sft-classic API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: formalmathatepfl

Model Overview

formalmathatepfl/apertus-cpt-sft-classic is an 8-billion-parameter language model derived from the swiss-ai/Apertus-8B-2509 base model. It was trained with supervised fine-tuning (SFT) on a dataset that is not named here, and reached a final validation loss of 0.0877. Training ran for 1 epoch with the AdamW optimizer and a cosine learning rate schedule with a warmup ratio of 0.05.
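To give the validation loss a more intuitive scale, it can be converted to a perplexity. The short sketch below assumes the reported 0.0877 is a mean per-token cross-entropy in nats, which is the usual convention for this kind of training run but is not confirmed by the listing.

```python
import math

# Final validation loss reported for the model.
# Assumption: mean per-token cross-entropy, measured in nats.
val_loss = 0.0877

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(val_loss)

print(f"validation perplexity ~ {perplexity:.4f}")
```

Under that assumption the perplexity comes out just above 1.09. Which tokens the loss was averaged over (for example, response tokens only) is not stated, so treat the figure as a rough indicator rather than a benchmark result.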

Key Training Details

Base Model: swiss-ai/Apertus-8B-2509
Fine-tuning Method: Supervised Fine-Tuning (SFT)
Parameters: 8 Billion
Context Length: 32768 tokens
Final Validation Loss: 0.0877
Optimizer: AdamW_TORCH with betas=(0.9, 0.999) and epsilon=1e-08
Learning Rate: 1e-05
Epochs: 1.0
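
The learning rate schedule implied by these settings can be sketched in a few lines. This is an illustrative sketch, not the authors' training code: it assumes linear warmup from zero to the peak learning rate, followed by a half-cosine decay to zero. The total step count `TOTAL_STEPS` is hypothetical, since the number of training steps is not reported; the warmup ratio of 0.05 comes from the model overview.

```python
import math

PEAK_LR = 1e-5        # learning rate from the training details
WARMUP_RATIO = 0.05   # warmup ratio from the model overview
TOTAL_STEPS = 1000    # hypothetical: the step count is not reported


def cosine_lr(step: int,
              total_steps: int = TOTAL_STEPS,
              peak_lr: float = PEAK_LR,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step` for linear warmup plus half-cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half cosine: equals peak_lr at progress 0 and 0 at progress 1.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 25, 50, 525, 1000):
        print(s, cosine_lr(s))
```

With the hypothetical 1000 steps, warmup covers the first 50 steps, the rate peaks at 1e-05 at step 50, and it decays to zero by the final step.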

Intended Uses & Limitations

As a supervised fine-tuned model, it is best suited to tasks that resemble its training data. The provided information does not detail specific intended uses or limitations, so evaluate the model on your own task before deploying it, and keep the fine-tuning objective in mind when interpreting results.
