Overview
This model, qwen_lawma_deepseek-2k-5x-majority_verified, is a fine-tuned variant of the Qwen/Qwen2.5-7B-Instruct base model. It has 7.6 billion parameters and supports a context length of 131,072 tokens, making it suitable for processing long inputs.
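The checkpoint can be loaded like any other Hugging Face causal LM. A minimal sketch follows; note the repository path (`mlfoundations-dev/...`) is an assumption based on the dataset's organization, so adjust it to wherever the checkpoint is actually hosted.

```python
# Assumed Hugging Face repo path -- verify before use.
MODEL_ID = "mlfoundations-dev/qwen_lawma_deepseek-2k-5x-majority_verified"
MAX_CONTEXT = 131_072  # context length reported on this card


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model.

    Requires the `transformers` library and enough GPU memory for a
    7.6B-parameter model; imports are deferred so the constants above
    can be inspected without transformers installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # keep the checkpoint's native dtype
        device_map="auto",    # shard across available devices
    )
    return tokenizer, model
```

For inference, pair `load_model` with the tokenizer's chat template (`apply_chat_template`), since the base model is instruction-tuned.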
Training Details
The model was fine-tuned on the mlfoundations-dev/thoughts-lawma-annotations-deepseek-majority-verified-share-gpt dataset. Key training hyperparameters: a learning rate of 1e-05, a total batch size of 16 (2 per device with 2 gradient accumulation steps, implying 4 devices), and 5 epochs. The optimizer was adamw_torch with standard betas and epsilon, paired with a cosine learning-rate scheduler and a 0.1 warmup ratio.
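The schedule above can be sketched in a few lines of plain Python: a linear warmup over the first 10% of steps, then cosine decay from the peak learning rate to zero. The total step count here is a placeholder, since it depends on dataset size; the hyperparameter values are the ones reported above.

```python
import math

BASE_LR = 1e-5        # peak learning rate from the card
WARMUP_RATIO = 0.1    # warmup ratio from the card


def lr_at(step: int, total_steps: int,
          base_lr: float = BASE_LR,
          warmup_ratio: float = WARMUP_RATIO) -> float:
    """Cosine schedule with linear warmup, mirroring the card's settings."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr over the warmup window.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Effective batch size = per-device batch x grad accumulation x device count.
PER_DEVICE = 2
GRAD_ACCUM = 2
NUM_DEVICES = 4  # implied by the total of 16
EFFECTIVE_BATCH = PER_DEVICE * GRAD_ACCUM * NUM_DEVICES  # 16
```

This matches the behavior of the `cosine` scheduler in the Hugging Face Trainer (`lr_scheduler_type="cosine"`, `warmup_ratio=0.1`), which is presumably what was used here.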
Potential Use Cases
Given its fine-tuning on a specific annotation dataset, this model is likely best suited for applications that involve:
- Processing or generating text related to the domain covered by the thoughts-lawma-annotations-deepseek-majority-verified-share-gpt dataset.
- Tasks requiring a large context window for understanding long-form content or complex interactions.
Limitations
The model description and intended-uses sections of the original README indicate that more information is needed about its specific capabilities and limitations. Users should evaluate the model themselves before relying on it for a particular application.