AdamGrzesik/Samantha-PL-AG-Mistral-7B-v0.2

Task: Text Generation; Concurrency Cost: 1; Model Size: 7B; Quant: FP8; Ctx Length: 4k; Tool Calling: Supported; Published: Mar 29, 2024; Architecture: Transformer

AdamGrzesik/Samantha-PL-AG-Mistral-7B-v0.2 is a 7 billion parameter, Mistral-based causal language model fine-tuned by AdamGrzesik. It is optimized for Polish language tasks through training on the Samantha-PL-AG-axolotl dataset. It supports a 4096-token context length and is intended for applications that need Polish text generation and understanding.


Model Overview

AdamGrzesik/Samantha-PL-AG-Mistral-7B-v0.2 is a 7 billion parameter language model built upon the Mistral-7B-v0.2 architecture. This model has been fine-tuned by AdamGrzesik using the Axolotl framework, specifically targeting Polish language capabilities.

Key Capabilities

  • Polish Language Optimization: The model is fine-tuned on the Samantha-PL-AG-axolotl dataset to specialize in Polish text generation and comprehension.
  • Mistral-7B-v0.2 Base: Built on Mistral-7B-v0.2, so it starts from that base model's architecture and pretrained capabilities.
  • Context Length: Supports a sequence length of 4096 tokens, so the prompt and the generated output together must fit within that limit.
  • Training Details: Trained with a learning rate of 5e-06 over 4 epochs with a total batch size of 48, using gradient accumulation and flash attention for efficiency.
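The total batch size of 48 in the training details is the product of the per-device micro-batch size, the gradient accumulation steps, and the number of devices. The card reports only the total, not how it was split, so the sketch below is illustrative: the function names are hypothetical, and the (2, 4, 6) split is an assumed example, not the configuration actually used.

```python
def total_batch_size(micro_batch_size: int, grad_accum_steps: int, num_devices: int) -> int:
    """Effective number of samples contributing to one optimizer step."""
    return micro_batch_size * grad_accum_steps * num_devices


def factorizations(total: int) -> list[tuple[int, int, int]]:
    """List every (micro_batch, grad_accum, devices) triple whose product is `total`."""
    triples = []
    for micro in range(1, total + 1):
        if total % micro:
            continue
        rest = total // micro
        for accum in range(1, rest + 1):
            if rest % accum:
                continue
            triples.append((micro, accum, rest // accum))
    return triples


# Assumed example split; the card does not state the real one.
assert total_batch_size(micro_batch_size=2, grad_accum_steps=4, num_devices=6) == 48

# Every candidate split that is consistent with the reported total of 48.
candidates = factorizations(48)
```

Gradient accumulation is what lets a small per-device micro-batch reach the reported total: gradients from several forward and backward passes are summed before each optimizer step.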

Good For

  • Applications requiring a robust language model for Polish text.
  • Tasks involving Polish content generation, summarization, or question answering.
  • Developers looking for a Mistral-based model with enhanced performance in the Polish language.
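For the use cases above, a minimal loading and generation sketch with the Hugging Face `transformers` library might look like the following. It assumes the weights are available on the Hugging Face Hub under the model's name and load through `AutoModelForCausalLM`; the card does not specify a prompt template, so the prompt is passed as plain text. The helper `max_new_tokens_for`, the default of 256 new tokens, and the bfloat16 setting are illustrative choices, not taken from the card.

```python
MODEL_ID = "AdamGrzesik/Samantha-PL-AG-Mistral-7B-v0.2"
CONTEXT_LENGTH = 4096  # context length stated on the model card


def max_new_tokens_for(prompt_tokens: int, requested: int = 256,
                       context_length: int = CONTEXT_LENGTH) -> int:
    """Clamp the requested output length so prompt + output fits the context window."""
    remaining = context_length - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the context window")
    return min(requested, remaining)


def generate(prompt: str, requested_new_tokens: int = 256) -> str:
    """Load the model and generate a continuation (hypothetical helper)."""
    # Imported lazily so the helper above can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumed precision; adjust to your hardware
        device_map="auto",           # requires the `accelerate` package
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_len = inputs["input_ids"].shape[1]
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens_for(prompt_len, requested_new_tokens),
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

A call such as `generate("Napisz krótkie podsumowanie poniższego tekstu: ...")` would then return the model's Polish continuation. Clamping `max_new_tokens` keeps the total sequence within the 4096-token limit instead of relying on the model to handle overflow.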