timpal0l/Mistral-7B-v0.1-flashback-v2

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 8k · Published: Dec 4, 2023 · License: MIT · Architecture: Transformer · Open Weights

timpal0l/Mistral-7B-v0.1-flashback-v2 is a 7 billion parameter causal language model based on Mistral-7B-v0.1. It underwent continued pretraining, for one epoch, on roughly 2.25 million forum threads (approximately 40GB of text) from the Swedish website flashback.org, making it particularly adept at generating text in the style and register of Swedish online forum discussions.


Model Overview

timpal0l/Mistral-7B-v0.1-flashback-v2 is a 7 billion parameter language model built upon the Mistral-7B-v0.1 base architecture. Its primary distinction is its continued pretraining on a substantial dataset of 2,251,233 forum threads from the Swedish website flashback.org, totaling approximately 40GB of text. This took the form of a full-parameter fine-tune for one epoch over the dataset.

Key Characteristics

  • Domain-Specific Training: The model is uniquely trained on Swedish forum data, making it highly specialized for generating text in that context and style.
  • Mistral-7B-v0.1 Base: Leverages the efficient and capable Mistral-7B-v0.1 architecture.
  • Context Length: Supports an 8192-token context window.
  • Data Format Mimicry: The training data followed a specific structure including thread titles, usernames, and quoted responses, which the model is designed to replicate.
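
Because the model was trained to replicate that thread structure, prompts that mirror it should steer generation best. The exact markers used in the flashback-v2 training data are not published, so the layout below (title line, then `username: text` posts) is a hypothetical stand-in for illustration:

```python
def build_thread_prompt(title, posts):
    """Assemble a forum-thread-style prompt.

    NOTE: the real training format (thread titles, usernames, quoted
    responses) is not documented in the model card; this layout is an
    assumed approximation, not the verified training template.
    """
    lines = [f"### {title}"]  # hypothetical title marker
    for username, text in posts:
        lines.append(f"{username}: {text}")
    return "\n".join(lines)

prompt = build_thread_prompt(
    "Bästa fisket i Norrland?",
    [("user123", "Har ni några tips på bra öringvatten?")],
)
```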

Performance

Evaluations on the Open LLM Leaderboard show an average score of 57.53. Specific metric scores include:

  • AI2 Reasoning Challenge (25-Shot): 57.17
  • HellaSwag (10-Shot): 80.74
  • MMLU (5-Shot): 59.98
  • TruthfulQA (0-shot): 40.66
  • Winogrande (5-shot): 77.19
  • GSM8k (5-shot): 29.42

Use Cases

This model is particularly well-suited for applications requiring generation or understanding of text in the style of Swedish online forums, such as:

  • Content generation for Swedish-language discussion platforms.
  • Research into online Swedish discourse.
  • Developing chatbots or virtual assistants with a specific Swedish forum-like persona.

Popular Sampler Settings

The sampling parameters most often tuned by Featherless users for this model are:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
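
To make the interaction between the first three parameters concrete, here is a minimal sketch of temperature scaling followed by top-k and top-p (nucleus) filtering over a token-to-logit mapping. This is an illustrative reimplementation in plain Python, not the sampler Featherless actually runs; the default values chosen below are assumptions:

```python
import math

def filter_distribution(logits, temperature=0.8, top_k=40, top_p=0.95):
    """Apply temperature, then top-k and top-p filtering; return
    a renormalized token -> probability mapping."""
    # Temperature: divide logits before the softmax (lower = sharper).
    scaled = {tok: lg / temperature for tok, lg in logits.items()}
    m = max(scaled.values())
    exps = {tok: math.exp(lg - m) for tok, lg in scaled.items()}
    z = sum(exps.values())
    probs = {tok: e / z for tok, e in exps.items()}
    # Top-k: keep only the k most probable tokens.
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    nucleus, cum = [], 0.0
    for tok, p in kept:
        nucleus.append((tok, p))
        cum += p
        if cum >= top_p:
            break
    z = sum(p for _, p in nucleus)
    return {tok: p / z for tok, p in nucleus}

dist = filter_distribution(
    {"och": 2.0, "men": 1.0, "eller": -1.0},
    temperature=1.0, top_k=2, top_p=1.0,
)
```

The penalty parameters (frequency, presence, repetition) act earlier, by adjusting logits of already-generated tokens before this filtering step.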