Overview
The nl2bash-stack-bugsseq model, developed by DCAgent2, is an 8-billion-parameter language model with a 32768-token context length. It was trained from scratch; specific details about its architecture and training dataset are not provided in the current documentation.
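For back-of-the-envelope hardware planning, the stated 8-billion-parameter count translates directly into memory needed just to hold the weights. The precisions below are assumptions for illustration; the documentation does not state the checkpoint's storage dtype.

```python
def param_memory_gib(n_params, bytes_per_param):
    """Rough memory to hold the weights alone (excludes activations,
    KV cache, gradients, and optimizer state)."""
    return n_params * bytes_per_param / 1024**3

n_params = 8e9  # 8 billion parameters, as stated above

# Hypothetical precisions; the card does not specify the stored dtype.
for name, nbytes in [("fp32", 4), ("bf16", 2), ("int8", 1)]:
    print(f"{name}: ~{param_memory_gib(n_params, nbytes):.1f} GiB")
```

Note that a long context compounds this: the KV cache at the full 32768-token window adds a further memory cost on top of the weights.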
Training Details
The model was trained with a learning rate of 4e-05, a train_batch_size of 1, and gradient_accumulation_steps of 2, giving a total_train_batch_size of 16 across 8 GPUs. It used the AdamW_Torch_Fused optimizer with betas=(0.9, 0.98) and epsilon=1e-08. Training ran for 7 epochs with a cosine learning rate scheduler and a warmup ratio of 0.1. The training environment included Transformers 4.56.1, PyTorch 2.9.1+cu128, Datasets 4.4.1, and Tokenizers 0.22.1.
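The schedule above can be sketched in plain Python: linear warmup over the first 10% of steps to the peak learning rate of 4e-05, then cosine decay to zero. This is an illustrative approximation, not the exact Transformers scheduler implementation.

```python
import math

def lr_at_step(step, total_steps, peak_lr=4e-05, warmup_ratio=0.1):
    """Learning rate under linear warmup followed by cosine decay,
    mirroring the hyperparameters listed above (peak 4e-05, 10% warmup).
    Illustrative sketch only."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, with 1000 total optimizer steps the rate climbs linearly over the first 100 steps, peaks at 4e-05, and follows a half-cosine down to zero by the final step.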
Key Capabilities
- Large Context Window: Supports a 32768-token context, enabling processing of long documents, multi-file inputs, and extended conversations.
Good for
- Further fine-tuning or research where a base model with a large context window is required.
- Exploration of models trained from scratch with specific hyperparameters.