KasparZ/mtext-20251122_qwen3-14b-base_merged_modified_special

Text generation · Concurrency cost: 1 · Model size: 14B · Quantization: FP8 · Context length: 32k · Published: Jan 15, 2026 · Architecture: Transformer

KasparZ/mtext-20251122_qwen3-14b-base_merged_modified_special is a 14-billion-parameter causal language model, fine-tuned with LoRA and with targeted modifications to its token embeddings and training procedure. It was trained on the KasparZ/mtext-111025 dataset at a context length of 32768 tokens. The model is intended for general causal language modeling; its modified vocabulary and training configuration may make it better suited to data resembling its fine-tuning corpus.


Model Overview

This model, KasparZ/mtext-20251122_qwen3-14b-base_merged_modified_special, is a 14-billion-parameter causal language model. It was fine-tuned with a LoRA configuration targeting the q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, and down_proj modules, with embed_tokens and lm_head additionally saved as fully trained modules. The training process added two new special tokens (<|s|> and <|e|>) and resized the token embeddings accordingly.
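The card does not document the exact script used for the vocabulary change, but with Hugging Face transformers it is typically done as in the sketch below; the base checkpoint name (Qwen/Qwen3-14B-Base) is an assumption inferred from the model name:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Base checkpoint is an assumption; the card only states a 14B Qwen3 base model.
base_model = "Qwen/Qwen3-14B-Base"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Register the new special tokens mentioned on the card (<|s|>, <|e|>).
num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|s|>", "<|e|>"]}
)

# Resize the embedding matrix so the new token ids have embedding rows.
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))
```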

Training Details

  • Training Data: The model was trained on the KasparZ/mtext-111025 dataset.
  • LoRA Configuration: Key parameters include r=16, lora_alpha=32, lora_dropout=0.05, and use_rslora=True (see the configuration sketch after this list).
  • Hyperparameters: Training used a learning rate of 1e-4, num_train_epochs=2, gradient_accumulation_steps=8, and warmup_ratio=0.03.
  • Context Length: The model supports a context length of 32768 tokens.
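These values map directly onto a PEFT LoraConfig and transformers TrainingArguments. The sketch below reconstructs that configuration under the assumption that a standard PEFT fine-tuning loop was used; anything not listed on the card (batch size, output directory) is a placeholder:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings listed on the card; rsLoRA rescales the update by alpha/sqrt(r).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    use_rslora=True,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    # Embedding and output head are trained fully and saved with the adapter,
    # which is needed after resizing the vocabulary for the new special tokens.
    modules_to_save=["embed_tokens", "lm_head"],
    task_type="CAUSAL_LM",
)

# Hyperparameters listed on the card; batch size and output_dir are assumptions.
training_args = TrainingArguments(
    output_dir="mtext-20251122-qwen3-14b",
    learning_rate=1e-4,
    num_train_epochs=2,
    gradient_accumulation_steps=8,
    warmup_ratio=0.03,
    per_device_train_batch_size=1,  # not stated on the card
)
```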

Potential Use Cases

While specific direct uses are not detailed, the model's causal language modeling objective and fine-tuning approach suggest applicability in the areas below (a brief loading and generation sketch follows the list):

  • Text generation and completion.
  • Tasks requiring understanding and generation within a large context window.
  • Further fine-tuning for specialized downstream NLP applications.
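For text generation, the merged checkpoint can be loaded like any other causal language model with transformers. The sampling parameters below are illustrative defaults, not settings recommended by the model author:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KasparZ/mtext-20251122_qwen3-14b-base_merged_modified_special"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # dtype is an assumption; the hosted quant is FP8
    device_map="auto",
)

prompt = "The history of language modeling"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Plain sampling for text completion; tune these values for your use case.
output_ids = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```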