Name: mlfoundations-dev/qwen2-5_nemotron-sft_100000 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: mlfoundations-dev

Overview

This model, mlfoundations-dev/qwen2-5_nemotron-sft_100000, is a fine-tuned variant of Qwen/Qwen2.5-7B-Instruct. It has 7.6 billion parameters and supports a context length of 32,768 tokens, which makes it suitable for moderately long inputs and detailed responses. It was fine-tuned on the mlfoundations-dev/nemotron-sft_100000 dataset. The dataset name suggests supervised fine-tuning for instruction following or conversation, but no further details about the dataset are provided.
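As a rough illustration of what the 32,768-token context length means in practice, the sketch below computes how many tokens remain for generation once a prompt has been tokenized. It assumes the prompt and the generated tokens share one context window, as is usual for decoder-only models. The function name and the `reserve` parameter are hypothetical and not part of any Featherless.ai or model API.

```python
# Context length stated in the overview above.
CONTEXT_LENGTH = 32_768


def max_new_tokens(prompt_tokens: int,
                   context_length: int = CONTEXT_LENGTH,
                   reserve: int = 0) -> int:
    """Return how many tokens can still be generated for a given prompt.

    Hypothetical helper: assumes prompt and output share one window.
    `reserve` holds back tokens for things like stop sequences.
    """
    if prompt_tokens < 0 or reserve < 0:
        raise ValueError("token counts must be non-negative")
    # Clamp at zero: a prompt longer than the window leaves no room.
    return max(0, context_length - prompt_tokens - reserve)


if __name__ == "__main__":
    print(max_new_tokens(30_000))  # room left after a 30,000-token prompt
    print(max_new_tokens(40_000))  # prompt already exceeds the window
```

A prompt that returns 0 here would need to be truncated or split before being sent to the model.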

Training Details

The model was trained with a learning rate of 8e-05 and a total batch size of 512 (a train_batch_size of 1 per device, multiplied by gradient_accumulation_steps of 16 and 32 devices). It used a cosine learning rate scheduler with a warmup ratio of 0.1 over 3 epochs. The optimizer was AdamW with default betas and epsilon.
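The batch-size arithmetic and the shape of the learning rate schedule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the training code: it assumes linear warmup from zero followed by cosine decay to zero (a common convention, which the source does not confirm), and `total_steps=1000` is a made-up value because the actual number of optimizer steps is not given.

```python
import math


def effective_batch_size(per_device_batch: int,
                         grad_accum_steps: int,
                         num_devices: int) -> int:
    """Total examples contributing to one optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


def lr_at_step(step: int,
               total_steps: int,
               peak_lr: float = 8e-5,
               warmup_ratio: float = 0.1) -> float:
    """Learning rate at a given optimizer step.

    Assumed shape: linear warmup from 0 to peak_lr over the first
    warmup_ratio of training, then cosine decay from peak_lr to 0.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Values from the training details: 1 per device, 16 accumulation steps, 32 devices.
    print(effective_batch_size(1, 16, 32))

    total_steps = 1000  # hypothetical; the real step count is not stated
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at_step(step, total_steps))
```

With these assumptions the rate rises to its 8e-05 peak at the end of warmup and falls to half that value midway through the decay phase.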

Intended Uses

Because it builds on an instruction-tuned base model and was further fine-tuned, this model is suitable for general natural language processing tasks, including:

Instruction following
Text generation
Question answering
Summarization

Limitations

Specific limitations are not detailed in the provided information. Users should be aware that, like all large language models, it may exhibit biases present in its training data and can occasionally generate factually incorrect or nonsensical outputs.
