UsernameJustAnother/Nemo-12B-Marlin-v5

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 12B · Quant: FP8 · Ctx Length: 32k · Published: Aug 5, 2024 · License: apache-2.0 · Architecture: Transformer

UsernameJustAnother/Nemo-12B-Marlin-v5 is a 12 billion parameter language model fine-tuned from unsloth/Mistral-Nemo-Instruct-2407. Developed by UsernameJustAnother, this model is optimized for roleplay (RP) tasks, having been trained on a dataset of 10,801 human-generated conversations. It uses a LoRA scaling factor of 2, differing from similar models, and was trained efficiently with Unsloth, making it well suited to diverse conversational scenarios.


Model Overview

UsernameJustAnother/Nemo-12B-Marlin-v5 is a 12 billion parameter language model, fine-tuned by UsernameJustAnother from the unsloth/Mistral-Nemo-Instruct-2407 base model. This experimental model was developed to explore fine-tuning basics, with a particular focus on enhancing roleplay capabilities.

Key Capabilities

  • Roleplay Optimization: The model is specifically fine-tuned for roleplay (RP) using a curated dataset of 10,801 human-generated conversations in ChatML format.
  • Efficient Training: It was trained using Unsloth and Huggingface's TRL library, enabling faster training and reduced VRAM usage. Training took approximately 5 hours on a single Colab A100 GPU.
  • Custom LoRA Configuration: A key differentiator is its use of a LoRA scaling factor of 2, which was found to yield lower training loss compared to other configurations, such as the factor of 8 used in similar models like Celeste.
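To make the scaling factor above concrete: with rank-stabilized LoRA (rsLoRA), the adapter output is scaled by lora_alpha / sqrt(rank) instead of the standard lora_alpha / rank. The sketch below assumes a rank of 256, which is not stated in the model card but is the value that reproduces a scaling factor of 2 from the reported lora_alpha of 32.

```python
import math

def lora_scaling(lora_alpha: float, rank: int, use_rslora: bool) -> float:
    """Effective scaling applied to the LoRA adapter output.

    Standard LoRA scales by alpha / r; rank-stabilized LoRA (rsLoRA)
    scales by alpha / sqrt(r), which keeps update magnitudes stable
    as the rank grows.
    """
    if use_rslora:
        return lora_alpha / math.sqrt(rank)
    return lora_alpha / rank

# Assumed rank of 256: with lora_alpha = 32 and rsLoRA enabled,
# this reproduces the scaling factor of 2 described above.
print(lora_scaling(32, 256, use_rslora=True))   # → 2.0

# The same alpha under standard LoRA scaling gives a much smaller factor.
print(lora_scaling(32, 256, use_rslora=False))  # → 0.125
```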

Training Details

The model underwent 2 epochs of training for a total of 2,700 steps, using an effective batch size of 8 (per_device_train_batch_size = 2, gradient_accumulation_steps = 4). It employs rsLoRA with a lora_alpha of 32, resulting in a scaling factor of 2. Training leveraged Unsloth's gradient checkpointing for memory efficiency and used a cosine_with_min_lr scheduler with a learning rate of 8e-5.
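The reported step count is consistent with the dataset and batch sizes above, as this quick sanity check shows (a sketch; it assumes the trailing partial batch of each epoch is dropped, which is a common default):

```python
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
num_epochs = 2
dataset_size = 10_801  # human-generated conversations

# Effective batch size = per-device batch × gradient accumulation steps.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps

# Floor division: the final partial batch is assumed dropped each epoch.
steps_per_epoch = dataset_size // effective_batch
total_steps = steps_per_epoch * num_epochs

print(effective_batch)  # → 8
print(total_steps)      # → 2700, matching the reported step count
```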

Good For

  • Roleplay Applications: Ideal for developers and users looking for a model specialized in generating human-like, engaging conversational roleplay scenarios.
  • Experimental Fine-tuning: Serves as a practical example for those interested in understanding and implementing efficient fine-tuning techniques, particularly with Unsloth.