migtissera/Tess-2.0-Llama-3-70B

  • Parameters: 70B
  • Precision: FP8
  • Context length: 8192
  • License: llama3
  • Hosted on: Hugging Face

Tess-2.0-Llama-3-70B Overview

Tess-2.0-Llama-3-70B, whose name derives from "Tesoro" (Italian for "treasure"), is a 70-billion-parameter general-purpose large language model developed by migtissera. It is fine-tuned from the meta-llama/Meta-Llama-3-70B base model, building on its strong foundational capabilities.

Key Characteristics

  • Base Model: Fine-tuned from meta-llama/Meta-Llama-3-70B.
  • Training Methodology: Follows LIMA ("Less Is More for Alignment") principles, relying on a small, curated dataset rather than sheer data volume.
  • Dataset: Trained on the Tess-2.0 dataset, comprising approximately 100,000 high-quality code and general training samples.
  • Instruction Following: Designed to be uncensored and to follow instructions consistently, making it versatile across applications.
  • Training Depth: Fine-tuned for only two epochs with a low learning rate to preserve the base model's entropy.
  • Prompt Format: Uses the Llama-3 chat prompt format.
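
The Llama-3 prompt format mentioned above wraps each turn in header and end-of-turn tokens. As a minimal sketch (the helper function and example messages here are illustrative, not part of the model card), a single-turn prompt can be assembled like this:

```python
def build_llama3_prompt(system_message: str, user_message: str) -> str:
    """Assemble a single-turn prompt in the Llama-3 chat format.

    Each turn is delimited by <|start_header_id|>role<|end_header_id|>
    and terminated with <|eot_id|>; the prompt ends with an open
    assistant header so the model generates the reply.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_message}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Example (hypothetical system/user messages):
prompt = build_llama3_prompt(
    "You are Tess, a helpful AI assistant.",
    "Summarize the LIMA training approach in one sentence.",
)
print(prompt)
```

In practice, Hugging Face tokenizers that ship a chat template can produce the same string via `tokenizer.apply_chat_template`, which avoids hand-building the special tokens.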

Intended Use Cases

This model is well-suited for applications requiring a powerful, general-purpose language model that adheres closely to user instructions. Its uncensored nature means it will attempt to fulfill requests without refusal, making it a strong candidate for:

  • General conversational AI.
  • Content generation across diverse topics.
  • Code generation and assistance.
  • Applications where consistent instruction following is paramount.

Limitations

As an uncensored model, Tess-2.0-Llama-3-70B may generate inappropriate, biased, or offensive content. Users should exercise caution and verify information, as the model can occasionally produce inaccurate or misleading results.