Mel-Iza0/Llama2-7B_ZeroShot-20K_classe_nenhuma_port

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Architecture: Transformer · Cold

Mel-Iza0/Llama2-7B_ZeroShot-20K_classe_nenhuma_port is a 7-billion-parameter Llama 2 model fine-tuned on a ZeroShot-20K dataset. It is configured for 4-bit quantization via bitsandbytes, with a bfloat16 compute dtype and double quantization, for efficient deployment and inference. Its main distinction is this fine-tuning approach, which targets zero-shot tasks within its trained domain.


Model Overview

Mel-Iza0/Llama2-7B_ZeroShot-20K_classe_nenhuma_port is a 7 billion parameter language model based on the Llama 2 architecture. It has been fine-tuned using a ZeroShot-20K dataset, indicating a focus on tasks that require generalization without explicit in-context examples. The model is configured for efficient deployment and inference through 4-bit quantization.

Key Technical Details

  • Base Model: Llama 2 (7B parameters)
  • Quantization: Utilizes bitsandbytes for 4-bit quantization (bnb_4bit_quant_type: nf4, bnb_4bit_use_double_quant: True).
  • Compute Dtype: bfloat16 for computation during quantization.
  • Context Length: Supports a context length of 4096 tokens.
  • Framework: Developed with PEFT version 0.4.0.
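The settings above map directly onto a `BitsAndBytesConfig`. The sketch below shows one way to load the model with those exact quantization options; it assumes the checkpoint is published as a PEFT adapter on top of `meta-llama/Llama-2-7b-hf` (the PEFT version listed suggests an adapter, but this is not confirmed by the card).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization with double quantization and bfloat16 compute,
# matching the settings listed in Key Technical Details.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Assumed base checkpoint; swap in the correct base model if it differs.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(
    base, "Mel-Iza0/Llama2-7B_ZeroShot-20K_classe_nenhuma_port"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
```

Loading requires GPU memory for the 4-bit weights (roughly 4 GB for a 7B model) plus activation overhead.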

Potential Use Cases

This model is particularly suited for applications where:

  • Resource Efficiency is Critical: The 4-bit quantization allows for reduced memory footprint and faster inference compared to full-precision models.
  • Zero-Shot Generalization is Required: Its fine-tuning on a ZeroShot-20K dataset suggests an ability to perform tasks without extensive task-specific examples.
  • Llama 2 Ecosystem Integration: Benefits from the broad compatibility and community support of the Llama 2 family.
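To make the zero-shot use case concrete, here is a minimal prompt-formatting sketch. The template and the Portuguese class labels (including "nenhuma", suggested by the model name) are hypothetical illustrations; the actual prompt format used during fine-tuning is not documented on this card.

```python
def build_zero_shot_prompt(text: str, labels: list[str]) -> str:
    """Format a zero-shot classification prompt.

    Hypothetical template: the real training prompt format for this
    model is not published, so adjust to match your fine-tuning data.
    """
    label_list = ", ".join(labels)
    return (
        f"Classify the following text into one of these classes: {label_list}.\n\n"
        f"Text: {text}\n"
        "Class:"
    )

# Example with assumed Portuguese sentiment-style labels.
prompt = build_zero_shot_prompt(
    "O filme foi ótimo!", ["positiva", "negativa", "nenhuma"]
)
```

The resulting string would then be tokenized and passed to `model.generate`, with the generated continuation read back as the predicted class.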