xxxxxccc/mediaDescr_2epoch_Mistral-Nemo-Base-2407_model

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 12B
  • Quant: FP8
  • Ctx Length: 32k
  • Published: Sep 3, 2024
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights
  • Warm

The xxxxxccc/mediaDescr_2epoch_Mistral-Nemo-Base-2407_model is a 12-billion-parameter language model on the Mistral architecture, developed by xxxxxccc and fine-tuned from unsloth/Mistral-Nemo-Base-2407-bnb-4bit. Training used Unsloth together with Hugging Face's TRL library, a combination reported to run 2x faster. The model supports a context length of 32,768 tokens.


Model Overview

The xxxxxccc/mediaDescr_2epoch_Mistral-Nemo-Base-2407_model is a 12-billion-parameter language model based on the Mistral architecture. It was developed by xxxxxccc and fine-tuned from unsloth/Mistral-Nemo-Base-2407-bnb-4bit.
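As a rough sizing aid, the weight footprint follows directly from the parameter count and the storage precision. The sketch below assumes the nominal 12 billion parameters quoted above and counts weights only; it ignores quantization metadata, activations, and the KV cache, so real memory use will be higher. The function name is hypothetical.

```python
NOMINAL_PARAMS = 12_000_000_000  # "12 billion" as quoted above; the exact count may differ


def weight_memory_gb(n_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    if n_params < 0 or bits_per_param <= 0:
        raise ValueError("n_params must be >= 0 and bits_per_param must be > 0")
    return n_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for label, bits in [("4-bit", 4), ("FP8", 8), ("16-bit", 16)]:
        print(f"{label}: ~{weight_memory_gb(NOMINAL_PARAMS, bits):.1f} GB of weights")
```

Under these assumptions the weights take about 6 GB at 4 bits, 12 GB at 8 bits, and 24 GB at 16 bits.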

Key Characteristics

  • Architecture: Mistral-based.
  • Parameter Count: 12 billion parameters.
  • Context Length: 32,768 tokens, which allows long inputs to be processed in a single pass.
  • Training Efficiency: Trained with the Unsloth library together with Hugging Face's TRL library, a combination reported to run 2x faster.
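The context length bounds the prompt and the generated tokens together. A minimal sketch of that budget check follows, assuming token counts are already known (for example from the model's tokenizer); the helper names are hypothetical.

```python
CTX_LEN = 32768  # context length stated for this model


def max_new_tokens_allowed(prompt_tokens: int, ctx_len: int = CTX_LEN) -> int:
    """Tokens left for generation after the prompt, never below zero."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(ctx_len - prompt_tokens, 0)


def fits_in_context(prompt_tokens: int, max_new_tokens: int, ctx_len: int = CTX_LEN) -> bool:
    """True if the prompt plus the requested generation fits in the window."""
    return prompt_tokens + max_new_tokens <= ctx_len


if __name__ == "__main__":
    print(max_new_tokens_allowed(30_000))
    print(fits_in_context(32_000, 1_000))
```

A 30,000-token prompt leaves 2,768 tokens for generation, and a 32,000-token prompt with 1,000 requested tokens does not fit.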

Use Cases

This model suits applications that need a Mistral-based LLM. Because it was produced with an efficient fine-tuning pipeline, it is a reasonable candidate for:

  • Further fine-tuning on specific downstream tasks.
  • Projects that need to iterate on and deploy Mistral-architecture models quickly.
  • General language understanding and generation tasks that benefit from a 12B-parameter model with a 32,768-token context window.
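For the generation use case, a minimal loading sketch with the Hugging Face Transformers library might look like the following. It assumes the repository holds full model weights loadable through AutoModelForCausalLM (not only an adapter), that torch, transformers, and accelerate are installed, and that enough GPU memory is available. The function name and the bfloat16 choice are illustrative, not taken from the model card.

```python
MODEL_ID = "xxxxxccc/mediaDescr_2epoch_Mistral-Nemo-Base-2407_model"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a continuation of `prompt` (plain completion, no chat template)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16 is supported on the target hardware
        device_map="auto",           # requires the accelerate package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("The following is a short description of the scene:"))
```

The model is fine-tuned from a base (non-instruct) checkpoint, so this sketch treats it as a plain text-completion model.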

Popular Sampler Settings

Featherless tracks the top 3 parameter combinations its users apply to this model. Each combination sets the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
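The sampler parameters listed above can be gathered into one request body. The sketch below builds an OpenAI-style completions payload. The numeric values are illustrative placeholders, not the settings Featherless users chose, and whether a given server accepts the non-OpenAI fields (top_k, repetition_penalty, min_p) is an assumption to check against that server's documentation.

```python
import json

MODEL_ID = "xxxxxccc/mediaDescr_2epoch_Mistral-Nemo-Base-2407_model"

# The seven sampler parameters named on this page.
SAMPLER_KEYS = {
    "temperature", "top_p", "top_k", "frequency_penalty",
    "presence_penalty", "repetition_penalty", "min_p",
}

# Placeholder values for illustration only.
EXAMPLE_SAMPLER = {
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}


def build_completion_request(prompt: str, sampler: dict, max_tokens: int = 256) -> dict:
    """Return a completions request body, rejecting unknown sampler keys."""
    unknown = set(sampler) - SAMPLER_KEYS
    if unknown:
        raise ValueError(f"unknown sampler parameters: {sorted(unknown)}")
    return {"model": MODEL_ID, "prompt": prompt, "max_tokens": max_tokens, **sampler}


if __name__ == "__main__":
    body = build_completion_request("Once upon a time", EXAMPLE_SAMPLER)
    print(json.dumps(body, indent=2))
```

Rejecting unknown keys catches misspelled parameter names before the request is sent.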