KasparZ/mtext-20251122_qwen3-14b-base_merged

Text generation · Model size: 14B · Quantization: FP8 · Context length: 32k · Published: Nov 22, 2025 · Architecture: Transformer

KasparZ/mtext-20251122_qwen3-14b-base_merged is a 14-billion-parameter causal language model developed by KasparZ. It was fine-tuned with LoRA on the custom dataset KasparZ/mtext-111025, with the adapter weights merged back into the base model. Its 32,768-token context length makes it suitable for applications that require robust long-context text generation and understanding.


Model Overview

This model, KasparZ/mtext-20251122_qwen3-14b-base_merged, is a 14-billion-parameter causal language model. It was fine-tuned with a LoRA configuration targeting the q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, and down_proj modules, with the embed_tokens and lm_head modules saved in full. Training used the custom dataset KasparZ/mtext-111025 and included preprocessing steps such as adding new special tokens (<|s|>, <|e|>) and adjusting tokenizer padding.
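The special-token preprocessing described above can be reproduced with standard transformers calls. The sketch below is an assumed reconstruction, not an excerpt from the actual training code: the base checkpoint name is inferred from the repository name, and the padding settings are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed base checkpoint, inferred from the repository name.
base_id = "Qwen/Qwen3-14B-Base"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Register the new special tokens mentioned in the model card.
num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|s|>", "<|e|>"]}
)

# Grow the embedding matrix so the new tokens get rows; this is why
# embed_tokens and lm_head had to be saved alongside the LoRA adapter.
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))

# Illustrative padding adjustment: reuse an existing token and left-pad,
# the usual choice for causal generation.
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"
```

Saving embed_tokens and lm_head in full (rather than as low-rank deltas) is the standard PEFT pattern whenever the vocabulary is extended, since the new token rows must be trained and stored outright.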

Key Training Details

  • LoRA Configuration: r=16, lora_alpha=32, lora_dropout=0.05, use_rslora=True (see the configuration sketch after this list).
  • Hyperparameters: per_device_train_batch_size=1, gradient_accumulation_steps=8, num_train_epochs=2, learning_rate=1e-4, weight_decay=0.01, max_grad_norm=0.5.
  • Context Length: The model supports a context length of 32,768 tokens.
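Taken together, the listed settings map directly onto peft's LoraConfig and transformers' TrainingArguments. The following is a plausible reconstruction under those assumptions; the task type, output directory, and trainer wiring are not stated in the model card.

```python
from peft import LoraConfig, get_peft_model
from transformers import TrainingArguments

# LoRA settings from the model card; task_type is an assumption.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    use_rslora=True,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    # Train and save these modules in full, as noted in the overview.
    modules_to_save=["embed_tokens", "lm_head"],
    task_type="CAUSAL_LM",
)

# Hyperparameters from the model card; output_dir is hypothetical.
training_args = TrainingArguments(
    output_dir="mtext-20251122_qwen3-14b-base",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=2,
    learning_rate=1e-4,
    weight_decay=0.01,
    max_grad_norm=0.5,
)

# model = get_peft_model(base_model, lora_config)  # wrap the prepared base model
```

Note that with a per-device batch size of 1 and 8 gradient-accumulation steps, the effective batch size is 8 sequences per optimizer step per device.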

Potential Use Cases

While the card does not detail specific direct use cases, the model's architecture and training suggest suitability for the following (a basic generation sketch appears after the list):

  • Causal language modeling tasks.
  • Applications benefiting from a 14B parameter model with a large context window.
  • Further fine-tuning for specialized text generation or understanding tasks.
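For basic inference, the merged checkpoint should load through the standard transformers API like any causal LM. The following is a minimal sketch: the dtype, device placement, prompt format, and sampling settings are all assumptions, including the use of <|s|> as a start marker.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KasparZ/mtext-20251122_qwen3-14b-base_merged"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed; the card lists FP8 only as a serving quant
    device_map="auto",
)

# Prompt format is a guess based on the <|s|>/<|e|> tokens described above.
inputs = tokenizer("<|s|>Once upon a time", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```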