Name: adpretko/train-riscv-O2_epoch3_AMD API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: adpretko

Overview

adpretko/train-riscv-O2_epoch3_AMD is a 1.5 billion parameter language model developed by adpretko. It is a further fine-tuned iteration of the adpretko/train-riscv-O2_epoch1and2 model, specialized for RISC-V related tasks. The model was trained with a context length of 32768 tokens, which makes it suitable for processing long code segments.

Key Capabilities

RISC-V Code Specialization: Fine-tuned on 50 parts of the AnghaBench-risc-o2-full dataset, which points to a focus on RISC-V assembly and related code.
Extended Context Window: Supports a 32768 token context length, so longer code sequences or detailed technical documentation can be analyzed or generated in a single pass.
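
As a rough illustration of working within the 32768 token window, the sketch below trims a tokenized prompt so that the prompt plus the requested generation length stays inside the context. The helper name fit_to_context and the choice to keep the most recent tokens are assumptions made for this example; they are not part of the model card or of any library.

```python
CONTEXT_LENGTH = 32768  # context length stated in the model description


def fit_to_context(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Hypothetical helper: trim a prompt so prompt + generation fits the window.

    Keeps the most recent tokens (the tail of the prompt), which is one
    common choice for code completion; other strategies are possible.
    """
    if max_new_tokens < 0 or max_new_tokens >= context_length:
        raise ValueError("max_new_tokens must be in [0, context_length)")
    budget = context_length - max_new_tokens  # tokens available for the prompt
    if len(token_ids) <= budget:
        return list(token_ids)
    return list(token_ids[-budget:])
```

For example, a 40000 token prompt with max_new_tokens=1024 would be cut down to its last 31744 tokens, while a short prompt is returned unchanged.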

Training Details

The model was trained with a learning rate of 2e-05, a train_batch_size of 8, and gradient_accumulation_steps of 8, with a reported total_train_batch_size of 512. Since 8 × 8 = 64, the reported total implies that batches were aggregated across multiple devices (512 / 64 = 8), although the device count is not stated here. Training used the AdamW_Torch_Fused optimizer and a cosine learning rate scheduler with a 0.1 warmup ratio over 2 epochs. The training environment included Transformers 4.55.0, PyTorch 2.8.0+rocm6.3, Datasets 3.6.0, and Tokenizers 0.21.1.
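
The hyperparameters above can be made concrete with a small sketch. It shows how a total batch size of 512 could arise from a per-device batch of 8 and 8 accumulation steps, and how a cosine schedule with a 0.1 warmup ratio shapes the learning rate. The device count of 8 is an assumption inferred from the arithmetic, not a value stated in this description. The function names and the linear warmup followed by a half-cosine decay to zero are likewise illustrative assumptions rather than the exact training code.

```python
import math


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices):
    """Samples contributing to one optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


def cosine_lr_with_warmup(step, total_steps, peak_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup to peak_lr, then half-cosine decay to zero (a sketch)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# num_devices=8 is an ASSUMPTION: 512 / (8 * 8) = 8.
total = effective_batch_size(per_device_batch=8, grad_accum_steps=8, num_devices=8)

# total_steps=1000 is a hypothetical value chosen only for illustration.
schedule = [cosine_lr_with_warmup(s, total_steps=1000) for s in (0, 50, 100, 550, 1000)]
```

Under these assumptions the learning rate starts at zero, reaches the peak of 2e-05 at the end of warmup (step 100 of 1000), falls to about half the peak midway through the decay, and returns to zero at the final step.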
