turboderp/Cat-Llama-3-70B-instruct

Hugging Face
Text Generation · Model Size: 70B · Quantization: FP8 · Context Length: 8K · Concurrency Cost: 4 · Published: May 5, 2024 · License: llama3 · Architecture: Transformer

Cat-Llama-3-70B-instruct is a 70 billion parameter instruction-tuned Llama 3 model developed by turboderp, specifically fine-tuned for extreme system prompt fidelity, helpfulness, and character engagement. It excels at providing helpful information, maintaining character immersion in role-play scenarios, and adhering strictly to system instructions. The model is particularly strong in biosciences and general science contexts, offering detailed Chain of Thought responses.


Overview

Cat-Llama-3-70B-instruct is a 70 billion parameter instruction-tuned Llama 3 model developed by turboderp. It addresses the need for a general-purpose fine-tune of the Llama 3 70B base model, focusing on enhanced steerability and knowledge application. The model's core differentiators include its commitment to system prompt fidelity, helpfulness across various situations, and deep character immersion for role-playing.

Key Capabilities

  • Extreme System Instruction Fidelity: Designed to respect and adhere to system prompts to a high degree.
  • Enhanced Helpfulness: Provides helpful and informative responses, particularly in biosciences and general science domains.
  • Character Immersion (Role Play): Offers maximum character engagement and immersion in given scenarios.
  • Chain of Thought (CoT): Capable of generating detailed, multi-step thought processes to solve complex tasks and enrich answers.
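Since system prompt fidelity is the model's headline capability, it is intended to be driven through the standard Llama 3 instruct chat template, with character cards or strict instructions placed in the system block. A minimal sketch of assembling such a prompt by hand (the special tokens below follow the published Llama 3 instruct format; in practice, `tokenizer.apply_chat_template` from `transformers` does this for you, and the example system/user strings are illustrative):

```python
def build_llama3_prompt(system: str, user: str) -> str:
    # Assembles a single-turn Llama 3 instruct prompt. The system
    # block is where a character card or strict instructions go;
    # the trailing assistant header cues the model to respond.
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    system="You are a concise bioscience tutor. Answer step by step.",
    user="Why do enzymes lower activation energy?",
)
```

The completed prompt string is what gets tokenized and sent for generation; multi-turn conversations repeat the user/assistant header pattern before the final assistant header.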

Training Methodology

The model was trained using a meticulously filtered dataset of instruction-response pairs, with a focus on GPT-4 quality data. A GPT model trained exclusively on GPT-4 responses served as a standard for quality assessment. Data was filtered for perplexity against this standard, and a BERT model was used to classify and remove refusal-heavy entries. Further filtering ensured inclusion of longer, Chain of Thought responses and detailed system cards for each record, reflecting specific contexts and personalities. Training involved 16 A100 GPUs over 14 days for 4 epochs.
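The filtering pipeline described above can be sketched as follows. This is an illustrative reconstruction, not the author's code: `build_unigram` is a toy stand-in for the GPT-4-quality reference model used for perplexity scoring, and `looks_like_refusal` is a keyword stub standing in for the BERT refusal classifier.

```python
import math
from collections import Counter

def build_unigram(corpus):
    # Toy stand-in for the reference LM: unigram probabilities
    # estimated from reference (GPT-4-quality) responses.
    counts = Counter(tok for text in corpus for tok in text.split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def perplexity(text, model, floor=1e-6):
    # Perplexity of a candidate response under the reference model;
    # lower means closer to the reference distribution.
    toks = text.split()
    nll = -sum(math.log(model.get(t, floor)) for t in toks)
    return math.exp(nll / max(len(toks), 1))

def looks_like_refusal(text):
    # Keyword stub standing in for the BERT refusal classifier.
    phrases = ("i cannot", "i can't", "as an ai")
    return any(p in text.lower() for p in phrases)

def filter_records(records, reference_corpus, ppl_threshold, min_len):
    # Keep records that pass all three filters: not a refusal,
    # long enough (favoring CoT-style answers), and low perplexity
    # against the reference model.
    model = build_unigram(reference_corpus)
    kept = []
    for rec in records:
        resp = rec["response"]
        if looks_like_refusal(resp):
            continue
        if len(resp.split()) < min_len:
            continue
        if perplexity(resp, model) > ppl_threshold:
            continue
        kept.append(rec)
    return kept
```

In the actual pipeline the perplexity score came from a GPT model trained on GPT-4 responses and the refusal filter from a trained BERT classifier; the thresholds here are placeholders.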

Good For

  • Applications requiring strict adherence to system instructions or character cards.
  • Role-playing scenarios demanding deep character immersion.
  • Tasks in biosciences and general science that benefit from detailed, step-by-step explanations and Chain of Thought reasoning.
  • Use cases where helpfulness and informative responses are paramount.