The adpretko/train-riscv-O2_epoch1and2 model is a 1.5-billion-parameter language model, fine-tuned from the checkpoint saves/train-riscv-O2_epoch1and2/checkpoint-2800. It supports a context length of 131072 tokens and was trained for 2 epochs with a learning rate of 2e-05 and an effective batch size of 512. The checkpoint name suggests a RISC-V-related specialization, but the card does not detail the model's intended use case or what differentiates it from its base.
Overview
The adpretko/train-riscv-O2_epoch1and2 model is a 1.5-billion-parameter language model fine-tuned from an existing checkpoint, saves/train-riscv-O2_epoch1and2/checkpoint-2800. It was trained for 2 epochs with a context length of 131072 tokens, suited to processing very long sequences.
Training Details
The model underwent training with the following key hyperparameters:
- Learning Rate: 2e-05
- Batch Size: 8 per device (train and eval), with an effective total training batch size of 512 via gradient accumulation.
- Optimizer: ADAMW_TORCH_FUSED
- Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
- Epochs: 2.0
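The hyperparameters above can be sketched concretely. The snippet below is a minimal illustration, assuming the standard linear-warmup-then-cosine-decay shape the Hugging Face Trainer uses for its cosine scheduler, and assuming a single device (the card does not state the world size, so the 64-step accumulation figure is an inference from 512 / 8):

```python
import math

LEARNING_RATE = 2e-5   # from the card
WARMUP_RATIO = 0.1     # from the card

# Effective batch size 512 = per-device batch 8 x accumulation steps,
# assuming one device (not stated in the card).
GRAD_ACCUM_STEPS = 512 // 8  # 64

def lr_at_step(step: int, total_steps: int) -> float:
    """Linear warmup for the first 10% of steps, then cosine decay to 0."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, with 1000 optimizer steps the learning rate ramps from 0 to 2e-05 over the first 100 steps, then decays smoothly back to 0 by step 1000.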
Key Capabilities
- Large Context Window: Supports a context length of 131072 tokens, enabling it to process and generate very long texts.
- Fine-tuned Base: Built upon an existing checkpoint, suggesting a specialized application or domain.
Good for
- Use cases requiring processing of extremely long input sequences.
- Further fine-tuning or experimentation on tasks where its base checkpoint and training setup are a good fit.
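For either use, loading the model follows the usual Transformers pattern. This is a hedged sketch, not a recipe from the card: the card does not state the architecture, so `AutoModelForCausalLM` is an assumption (reasonable for a fine-tuned 1.5B language model), and the dtype/device settings are illustrative defaults:

```python
def load_model(model_id: str = "adpretko/train-riscv-O2_epoch1and2"):
    """Load the tokenizer and model from the Hugging Face Hub.

    Assumes the checkpoint is a causal language model; the card does not
    state the architecture, so AutoModelForCausalLM is a guess.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    import torch

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # halves memory vs. fp32 for a 1.5B model
        device_map="auto",           # place layers across available devices
    )
    return tokenizer, model
```

Note that actually exercising the full 131072-token context will require far more memory than the weights alone; long-context inference typically needs attention optimizations such as FlashAttention and a large KV cache budget.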