Name: mlfoundations-dev/qwen_lawma_filtered_deepseek-2k-5x API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: mlfoundations-dev

Overview

This model, mlfoundations-dev/qwen_lawma_filtered_deepseek-2k-5x, is a 7.6 billion parameter language model. It is fine-tuned from Qwen/Qwen2.5-7B-Instruct, so it starts from that base model's general language understanding and instruction-following capabilities.

Key Characteristics

Base Model: Fine-tuned from Qwen/Qwen2.5-7B-Instruct.
Parameter Count: 7.6 billion parameters.
Context Length: 32,768 tokens.
Training Data: Fine-tuned on the mlfoundations-dev/lawma-annotations-deepseek-2k-5x-deepseek-verified-share-gpt dataset.

Training Details

The model was trained with the following hyperparameters:

Learning Rate: 1e-05
Batch Size: Total training batch size of 16 (2 per device × 4 devices × 2 gradient accumulation steps).
Optimizer: AdamW with a cosine learning rate scheduler and a warmup ratio of 0.1.
Epochs: 5.
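
The batch size arithmetic and the shape of the learning rate schedule can be sketched in a few lines of Python. This is an illustration only, not the training code used for this model. The function names are hypothetical, the total step count in the usage note is a made-up value (the listing does not state the dataset size), and the sketch assumes a linear warmup followed by cosine decay to zero, which is a common reading of "cosine scheduler with warmup ratio" but is not confirmed by the listing.

```python
import math

# Values stated in the training details above.
PER_DEVICE_BATCH = 2
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 2
PEAK_LR = 1e-05
WARMUP_RATIO = 0.1


def effective_batch_size(per_device: int, devices: int, accum_steps: int) -> int:
    """Number of samples that contribute to each optimizer update."""
    return per_device * devices * accum_steps


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = PEAK_LR,
               warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given optimizer step.

    Assumption: linear warmup from 0 to peak_lr over the first
    warmup_ratio * total_steps steps, then cosine decay to 0.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With a hypothetical run of 1,000 optimizer steps, `effective_batch_size(2, 4, 2)` gives the stated total of 16, and `lr_at_step(100, 1000)` is the first step at the peak rate of 1e-05, after which the rate decays along the cosine curve.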

Potential Use Cases

Because the model was fine-tuned on a single dataset, it is likely best suited to tasks that resemble the content of the lawma-annotations-deepseek-2k-5x-deepseek-verified-share-gpt dataset. Developers should evaluate it on their own task before relying on it, particularly for work that falls outside that training data.
