mlfoundations-dev/oh-dcft-v3.1-SN-405B-hacky-qwen

Hosted on Hugging Face · Text generation · Open weights

  • Model size: 7.6B parameters
  • Quantization: FP8
  • Context length (as served): 32k
  • License: apache-2.0
  • Architecture: Transformer

The mlfoundations-dev/oh-dcft-v3.1-SN-405B-hacky-qwen model is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B. It was trained on the mlfoundations-dev/oh-dcft-v3.1-SN-405B-hacky dataset, reaching a final validation loss of 0.5737. This model is a specialized adaptation of the Qwen2.5 architecture, intended for tasks aligned with its fine-tuning dataset. Its 131,072-token context length supports long input sequences.


Model Overview

This model, oh-dcft-v3.1-SN-405B-hacky-qwen, is a fine-tuned variant of the Qwen/Qwen2.5-7B base model, developed by mlfoundations-dev. It leverages the Qwen2.5 architecture, featuring 7.6 billion parameters and a substantial context length of 131072 tokens.

Key Characteristics

  • Base Model: Qwen/Qwen2.5-7B
  • Fine-tuning Dataset: mlfoundations-dev/oh-dcft-v3.1-SN-405B-hacky
  • Training Result: final validation loss of 0.5737 after 3 epochs of fine-tuning.
  • Hyperparameters: learning rate of 5e-06, total batch size of 128, and the AdamW optimizer.
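The reported hyperparameters can be written out as a `TrainingArguments`-style configuration. Note that the card only reports the *total* batch size of 128; the device count and per-device/accumulation split below are assumptions for illustration.

```python
# Sketch of a training configuration matching the reported hyperparameters.
# The device count and micro-batch/accumulation split are ASSUMPTIONS; the
# model card only states a total batch size of 128.
num_devices = 8                  # assumed GPU count
per_device_batch_size = 4        # assumed micro-batch size per device
gradient_accumulation_steps = 4  # assumed accumulation steps

training_config = {
    "learning_rate": 5e-06,      # reported
    "num_train_epochs": 3,       # reported
    "optim": "adamw_torch",      # reported: AdamW
    "per_device_train_batch_size": per_device_batch_size,
    "gradient_accumulation_steps": gradient_accumulation_steps,
}

# The effective (total) batch size is the product of the three factors.
effective_batch_size = (
    num_devices * per_device_batch_size * gradient_accumulation_steps
)
print(effective_batch_size)  # 128, matching the reported total batch size
```

Any other factorization whose product is 128 (e.g. 16 devices × 8 micro-batch × 1 accumulation) would reproduce the same effective batch size.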

Potential Use Cases

Given its fine-tuning on a specific dataset, this model is likely best suited for applications that align with the characteristics and domain of the mlfoundations-dev/oh-dcft-v3.1-SN-405B-hacky dataset. Developers should evaluate its performance on tasks similar to its training data to determine suitability.
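A minimal inference sketch, assuming the model loads through `transformers` like its Qwen2.5-7B base and follows the ChatML-style prompt format used by Qwen2.5 models (both are assumptions based on the base model, not confirmed by this card). In practice, prefer `tokenizer.apply_chat_template` over manual prompt assembly:

```python
from typing import Dict, List

MODEL_ID = "mlfoundations-dev/oh-dcft-v3.1-SN-405B-hacky-qwen"


def format_chatml(messages: List[Dict[str, str]]) -> str:
    """Render messages in the ChatML-style layout used by Qwen2.5 models.

    Manual illustration only; in practice use tokenizer.apply_chat_template,
    which applies the template shipped with the model.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    parts.append("<|im_start|>assistant\n")  # generation prompt
    return "".join(parts)


def generate(messages: List[Dict[str, str]]) -> str:
    # Heavy path (downloads ~7.6B parameters of weights); not run here.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tok(format_chatml(messages), return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    return tok.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize what fine-tuning does."},
]
prompt = format_chatml(messages)
```

Evaluating generations from prompts like this against held-out examples from the fine-tuning dataset is a reasonable way to gauge task fit.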