mlabonne/llama-2-7b-miniguanaco

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4K · License: apache-2.0 · Architecture: Transformer · Open weights

The mlabonne/llama-2-7b-miniguanaco model is a 7-billion-parameter variant of Llama-2-7b-chat-hf, fine-tuned by mlabonne with QLoRA (4-bit precision) on a subset of the OpenAssistant Guanaco dataset. It is designed primarily for educational purposes, demonstrating that a Llama 2 model can be fine-tuned on a single T4 GPU; its main use case is learning and experimentation with LLM fine-tuning rather than high-performance inference.

Miniguanaco-7b Overview

Miniguanaco-7b is a 7-billion-parameter language model developed by mlabonne, based on Llama-2-7b-chat-hf. It was fine-tuned using QLoRA (4-bit precision) on the mlabonne/guanaco-llama2-1k dataset, a 1,000-sample subset of the timdettmers/openassistant-guanaco dataset.

Key Characteristics

  • Base Model: Llama-2-7b-chat-hf
  • Fine-tuning Method: QLoRA (4-bit precision); see the configuration sketch after this list
  • Training Environment: Google Colab with a T4 GPU
  • Dataset: mlabonne/guanaco-llama2-1k (subset of OpenAssistant Guanaco)

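QLoRA trains only small low-rank adapter matrices on top of a 4-bit-quantized, frozen base model, which is what lets a 7B-parameter fine-tune fit in a single T4's memory. Below is a minimal sketch of that setup, assuming the Hugging Face transformers, peft, trl (older API with dataset_text_field), bitsandbytes, and datasets libraries; the hyperparameters are illustrative, not the author's exact values.

```python
# Minimal QLoRA fine-tuning sketch (not the author's exact script).
# Assumes transformers, peft, trl (pre-0.12 API), bitsandbytes, datasets.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from peft import LoraConfig
from trl import SFTTrainer

base_model = "meta-llama/Llama-2-7b-chat-hf"  # gated repo; requires HF access approval

# 1. Load the frozen base model in 4-bit NF4 precision -- the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# 2. Only the low-rank adapters are trained; the 4-bit weights stay frozen.
#    r and lora_alpha here are illustrative values.
peft_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.1,
    bias="none", task_type="CAUSAL_LM",
)

# 3. The 1k-sample dataset ships pre-formatted in the Llama 2 prompt style,
#    with the full prompt in a "text" column.
dataset = load_dataset("mlabonne/guanaco-llama2-1k", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    tokenizer=tokenizer,
    args=TrainingArguments(output_dir="./results",
                           per_device_train_batch_size=4,
                           num_train_epochs=1,
                           logging_steps=25),
)
trainer.train()
```
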
Good for

  • Educational Purposes: Primarily intended for learning and demonstrating the process of fine-tuning Llama 2 models.
  • Experimentation: Suitable for developers and researchers to experiment with QLoRA fine-tuning techniques on a smaller scale.
  • Resource-Constrained Environments: The QLoRA method allows for fine-tuning on hardware like a Google Colab T4 GPU, making it accessible for individual learning.

This model is explicitly noted as being designed for educational use rather than high-performance inference applications.
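
Example Usage

Because the training data follows the Llama 2 chat format, prompts should use the standard [INST] ... [/INST] template. A minimal inference sketch using the transformers pipeline API (generation parameters are illustrative):

```python
import torch
from transformers import pipeline

# Load the fine-tuned model; fp16 keeps memory usage modest on a single GPU.
generator = pipeline(
    "text-generation",
    model="mlabonne/llama-2-7b-miniguanaco",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Llama 2 chat-style prompt template, matching the Guanaco-formatted training data.
prompt = "<s>[INST] What is a large language model? [/INST]"
output = generator(prompt, max_new_tokens=200, do_sample=True, temperature=0.7)
print(output[0]["generated_text"])
```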