AdamGrzesik/Samantha-PL-AG-Mistral-7B-v0.2
Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Tool Calling: Supported | Published: Mar 29, 2024 | Architecture: Transformer
AdamGrzesik/Samantha-PL-AG-Mistral-7B-v0.2 is a 7 billion parameter Mistral-based causal language model fine-tuned by AdamGrzesik. It is tuned for Polish-language tasks and was trained on the Samantha-PL-AG-axolotl dataset. It supports a 4096-token context length and targets applications that need Polish text generation and understanding.
Model Overview
AdamGrzesik/Samantha-PL-AG-Mistral-7B-v0.2 is a 7 billion parameter language model built upon the Mistral-7B-v0.2 architecture. This model has been fine-tuned by AdamGrzesik using the Axolotl framework, specifically targeting Polish language capabilities.
Key Capabilities
- Polish Language Optimization: The model is fine-tuned on the Samantha-PL-AG-axolotl dataset, indicating a specialization in Polish text generation and comprehension.
- Mistral-7B-v0.2 Base: Benefits from the strong foundational capabilities of the Mistral architecture.
- Context Length: Supports a sequence length of 4096 tokens, allowing for processing moderately long inputs.
- Training Details: Trained with a learning rate of 5e-06 over 4 epochs and a total batch size of 48, using gradient accumulation and flash attention for efficiency.
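The total batch size of 48 is the product of the per-device batch size, the number of gradient accumulation steps, and the number of devices. The card does not say how the 48 is split, so the values in the sketch below are purely hypothetical and only illustrate the arithmetic; the function and parameter names are assumptions, not taken from the training config.

```python
def effective_batch_size(micro_batch_size: int,
                         grad_accum_steps: int,
                         num_devices: int) -> int:
    """Samples contributing to one optimizer step.

    Each device processes `micro_batch_size` samples per forward/backward
    pass, gradients are accumulated over `grad_accum_steps` passes, and the
    result is combined across `num_devices` devices.
    """
    return micro_batch_size * grad_accum_steps * num_devices


# Hypothetical split (NOT stated in the model card): 2 samples per device,
# 3 accumulation steps, 8 devices.
total = effective_batch_size(micro_batch_size=2, grad_accum_steps=3, num_devices=8)
print(total)  # 48
```

Other splits with the same product, for example 4 x 12 x 1 on a single device, give the same total batch size, which is why gradient accumulation is commonly used to reach a target batch size on limited hardware.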
Good For
- Applications requiring a robust language model for Polish text.
- Tasks involving Polish content generation, summarization, or question answering.
- Developers looking for a Mistral-based model with enhanced performance in the Polish language.
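A minimal sketch of how the model might be used for Polish text generation while staying inside the 4096-token context, assuming the checkpoint is loadable from the Hugging Face Hub under the model name above with the `transformers` library. The helper names (`max_new_tokens_budget`, `generate`) are hypothetical, and no chat or prompt template is applied because the card does not specify one.

```python
CONTEXT_LENGTH = 4096  # sequence length stated in the model card
MODEL_ID = "AdamGrzesik/Samantha-PL-AG-Mistral-7B-v0.2"  # assumed Hub identifier


def max_new_tokens_budget(prompt_tokens: int,
                          requested: int,
                          context_length: int = CONTEXT_LENGTH) -> int:
    """Clamp the requested output length so prompt + output fits the context."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt alone fills or exceeds the context window")
    return min(requested, context_length - prompt_tokens)


def generate(prompt: str, requested_new_tokens: int = 256) -> str:
    """Generate a continuation of `prompt` (downloads the model on first call)."""
    # Imported lazily so the budget helper works without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # torch_dtype="auto" uses the dtype stored in the checkpoint.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_tokens = inputs["input_ids"].shape[1]
    budget = max_new_tokens_budget(prompt_tokens, requested_new_tokens)

    output_ids = model.generate(**inputs, max_new_tokens=budget)
    # Drop the prompt tokens and decode only the newly generated part.
    return tokenizer.decode(output_ids[0][prompt_tokens:], skip_special_tokens=True)
```

Calling `generate("Napisz krótki wiersz o jesieni.")` would then return the model's continuation. Loading a 7B model this way needs substantial memory, so a hosted inference endpoint may be the more practical route.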