adamo1139/LWM-7B-1M-1000000ctx-AEZAKMI-3_1-1702
The adamo1139/LWM-7B-1M-1000000ctx-AEZAKMI-3_1-1702 is a 7-billion-parameter LargeWorldModel (LWM) fine-tuned on the AEZAKMI v3.1 dataset using QLoRA at a maximum sequence length of 4000 tokens. The fine-tune aims to preserve the base model's long-context capabilities rather than extend them, making it suitable for tasks that require deep comprehension over long passages.
Model Overview
The adamo1139/LWM-7B-1M-1000000ctx-AEZAKMI-3_1-1702 builds on the 7-billion-parameter LargeWorldModel (LWM), whose base supports a context window of up to 1,000,000 tokens, and fine-tunes it on the AEZAKMI v3.1 dataset. Training used QLoRA with lora_r 32 and a cosine learning-rate schedule decaying from an initial 0.00015, run over several epochs.
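As a rough illustration of those hyperparameters, a peft/transformers configuration could look like the sketch below. Everything marked as an assumption (alpha, target modules, dropout, epoch count, batch settings, output path) is illustrative and not taken from the card:

```python
# Hypothetical QLoRA hyperparameter sketch mirroring the reported values
# (lora_r 32, cosine learning-rate decay from 0.00015).
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=32,                                   # lora_r 32, as reported on the card
    lora_alpha=32,                          # assumption; alpha not reported
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    lora_dropout=0.0,                       # assumption
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="lwm-aezakmi-qlora",         # hypothetical output path
    learning_rate=0.00015,                  # initial learning rate, as reported
    lr_scheduler_type="cosine",             # cosine decay, as reported
    num_train_epochs=3,                     # "several epochs"; exact count assumed
    per_device_train_batch_size=1,          # assumption for a single 24 GB GPU
    gradient_accumulation_steps=8,          # assumption
    bf16=True,
)
```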
Key Capabilities
- Extended Context Window: Built on an LWM base with a 1,000,000-token context window; the fine-tune itself ran at a maximum sequence length of 4000 tokens, with the intent of keeping the base model's long-input processing intact.
- Efficient Fine-tuning: Fine-tuned with unsloth and FlashAttention 2 (FA2) on a single RTX 3090 Ti in roughly 6 hours, a notably lightweight setup for a 7B model (see the loading sketch after this list).
- Long Context Retention: The fine-tuning process was specifically aimed at preserving the base model's ability to handle and reason over extensive contextual information.
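Per the card, the run used unsloth with FA2 at a 4000-token maximum sequence length. A minimal sketch of loading the base model that way follows; the base checkpoint name and every PEFT setting other than r=32 are assumptions:

```python
# Sketch: loading the base model with unsloth for QLoRA-style fine-tuning.
# The card reports unsloth + FA2 on an RTX 3090 Ti; checkpoint name and
# PEFT settings beyond r=32 are illustrative assumptions.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="LargeWorldModel/LWM-Text-1M",  # assumed base checkpoint
    max_seq_length=4000,       # training sequence length stated on the card
    load_in_4bit=True,         # 4-bit base weights, as in QLoRA
)

model = FastLanguageModel.get_peft_model(
    model,
    r=32,                      # lora_r 32, as stated on the card
    lora_alpha=32,             # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)
```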
Good For
- Applications requiring deep understanding of long documents or conversations.
- Tasks where maintaining context over many turns or paragraphs is crucial.
- Use cases benefiting from a model optimized for long-range dependencies.
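For inference, the checkpoint should load like any other causal LM in transformers. A minimal usage sketch, assuming no special prompt template is required (the AEZAKMI prompt format is not covered in this excerpt of the card):

```python
# Minimal inference sketch; assumes the checkpoint loads as a standard
# causal LM via transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "adamo1139/LWM-7B-1M-1000000ctx-AEZAKMI-3_1-1702"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Feed a long document followed by a question that depends on earlier context.
prompt = "<long document text>\n\nQuestion: summarize the key points above.\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```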