migtissera/Synthia-70B-v1.2b

Text generation · Model size: 69B · Quant: FP8 · Context length: 32K · Published: Sep 10, 2023 · License: llama2 · Architecture: Transformer · Open weights

Synthia-70B-v1.2b by migtissera is a 69-billion-parameter causal language model based on Llama-2-70B, fine-tuned on Orca-style datasets for enhanced instruction following and long-form conversational ability. It offers a 32K context length and is designed to facilitate Tree of Thought and Chain of Thought reasoning. The model achieves an average score of 70.71 on the HuggingFaceH4 Open LLM Leaderboard benchmarks, demonstrating strong performance across a range of tasks.


Synthia-70B-v1.2b: An Instruction-Following Conversational LLM

migtissera's Synthia-70B-v1.2b is a 69-billion-parameter language model built on the Llama-2-70B architecture. It has been extensively fine-tuned on Orca-style datasets, with a focus on improving instruction following and proficiency in long-form conversation. Compared to its predecessor, version 1.2b was trained on additional data for one further epoch over 14 days.

Key Capabilities & Features

  • Instruction Following: Optimized for precise adherence to user instructions.
  • Long-Form Conversations: Excels at maintaining coherent and extended dialogues.
  • Reasoning Enhancement: Can be prompted with a specific system message to evoke Tree of Thought and Chain of Thought reasoning, aiding in constructing clear and cohesive responses.
  • Uncensored Nature: The model is uncensored, offering broad applicability but requiring cautious use.
  • Llama-2 Base: Inherits the robust foundation of the Llama-2 model family.
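To illustrate the reasoning-evoking system message mentioned above, here is a minimal prompt-building sketch. The `SYSTEM:`/`USER:`/`ASSISTANT:` template and the exact wording of the Tree of Thought system prompt are assumptions based on conventions used for Synthia-family model cards, not a verified official API:

```python
# Sketch of a single-turn prompt for Synthia-70B-v1.2b.
# NOTE: the template below (SYSTEM/USER/ASSISTANT) and the Tree of
# Thought system message are assumptions, not confirmed by this card.

TOT_SYSTEM_PROMPT = (
    "Elaborate on the topic using a Tree of Thoughts and backtrack "
    "when necessary to construct a clear, cohesive Chain of Thought "
    "reasoning. Always answer without hesitation."
)

def build_prompt(user_message: str, system_prompt: str = TOT_SYSTEM_PROMPT) -> str:
    """Format a single-turn prompt in SYSTEM/USER/ASSISTANT style."""
    return f"SYSTEM: {system_prompt}\nUSER: {user_message}\nASSISTANT: "

prompt = build_prompt("Explain how attention works in transformers.")
```

The resulting string can then be passed to any text-generation backend (for example, a Hugging Face `transformers` pipeline loaded with the model weights).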

Performance Highlights

Evaluated on the EleutherAI Language Model Evaluation Harness, Synthia-70B-v1.2b demonstrates competitive performance on the HuggingFaceH4 Open LLM Leaderboard benchmarks:

  • Overall Average: 70.71
  • ARC (25-shot): 68.77
  • HellaSwag (10-shot): 87.57
  • MMLU (5-shot): 68.81
  • TruthfulQA (0-shot): 57.69
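The reported overall average is simply the arithmetic mean of the four benchmark scores, which can be verified directly:

```python
# Verify that the Open LLM Leaderboard average matches the mean of
# the four benchmark scores listed above.
scores = {"ARC": 68.77, "HellaSwag": 87.57, "MMLU": 68.81, "TruthfulQA": 57.69}
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 70.71
```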

Good For

  • Applications requiring advanced instruction following.
  • Developing conversational AI agents that need to maintain extended interactions.
  • Use cases where explicit reasoning processes (like Tree of Thought) are beneficial.
  • Research and development in large language models, particularly those based on Llama-2.