01-ai/Yi-1.5-34B-32K

Text Generation | Concurrency Cost: 2 | Model Size: 34B | Quant: FP8 | Ctx Length: 32k | Published: May 15, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights

The Yi-1.5-34B-32K model by 01-ai is a 34 billion parameter large language model, part of the upgraded Yi-1.5 series. It features an extended context length of 32K tokens and has been continuously pre-trained on an additional 500 billion tokens and fine-tuned on 3 million diverse samples. This model demonstrates enhanced performance in coding, mathematics, reasoning, and instruction-following, while maintaining strong language understanding and reading comprehension capabilities.


Yi-1.5-34B-32K: An Upgraded Large Language Model

Yi-1.5-34B-32K is a 34 billion parameter model developed by 01-ai and an upgraded iteration of the original Yi series. Starting from the foundational Yi model, it was continuously pre-trained on an additional 500 billion high-quality tokens and then fine-tuned on 3 million diverse samples.

Key Capabilities and Enhancements

  • Extended Context Length: This variant supports a context window of 32,768 tokens, so it can process longer inputs and produce longer outputs that stay coherent.
  • Improved Performance: Compared to its predecessor, Yi-1.5 demonstrates stronger capabilities across several critical domains:
    • Coding: Enhanced proficiency in code generation and understanding.
    • Mathematics: Better performance in mathematical problem-solving.
    • Reasoning: Improved logical deduction and complex problem-solving skills.
    • Instruction-Following: More accurate and reliable adherence to given instructions.
  • Retained Strengths: The model maintains its excellent abilities in general language understanding, commonsense reasoning, and reading comprehension.
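
As a rough illustration of what the 32,768-token window means in practice, the sketch below checks whether a prompt plus a requested completion fits in the window. It assumes that prompt tokens and generated tokens share the same window, as is usual for decoder-only transformers, and that the caller has already counted tokens with the model's own tokenizer. The function names `remaining_budget` and `fits` are hypothetical and are not part of any 01-ai or hosting API.

```python
# Context window stated for this variant (32K = 32,768 tokens).
CONTEXT_LENGTH = 32_768


def remaining_budget(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens are left for generation after the prompt.

    Assumption: prompt and completion share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt longer than the window leaves no room at all.
    return max(context_length - prompt_tokens, 0)


def fits(prompt_tokens: int, max_new_tokens: int,
         context_length: int = CONTEXT_LENGTH) -> bool:
    """Return True if the prompt plus the requested completion fits in the window."""
    return prompt_tokens + max_new_tokens <= context_length


if __name__ == "__main__":
    print(remaining_budget(30_000))
    print(fits(30_000, 4_000))
```

A prompt that exceeds the window has to be truncated or split before it is sent. How an over-long request is handled (rejection or silent truncation) depends on the serving stack.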

Benchmarking Highlights

  • The Yi-1.5-34B base model, on which this 32K-context variant is built, is reported to match or surpass larger models on various benchmarks.

Ideal Use Cases

  • Complex Problem Solving: Suitable for tasks requiring deep reasoning and mathematical understanding.
  • Code Generation and Analysis: Beneficial for developers needing assistance with programming tasks.
  • Long-form Content Processing: Its 32K context window makes it ideal for summarizing lengthy documents, analyzing extensive codebases, or engaging in prolonged conversational interactions.
  • Instruction-Driven Applications: Effective in scenarios where precise instruction following is crucial.
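
For documents longer than the context window, one common approach is to split the tokenized input into overlapping windows and process each window separately. The sketch below shows that idea on a plain list of token IDs. The helper `split_into_windows` and the example window and overlap sizes are illustrative assumptions, not something prescribed by the model card, and the tokens are assumed to come from the model's own tokenizer.

```python
from typing import List, Sequence


def split_into_windows(tokens: Sequence[int], window: int,
                       overlap: int = 0) -> List[List[int]]:
    """Split a token sequence into windows of at most `window` tokens.

    Consecutive windows share `overlap` tokens so that context is not
    cut abruptly at a boundary.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    step = window - overlap
    chunks: List[List[int]] = []
    start = 0
    while start < len(tokens):
        chunks.append(list(tokens[start:start + window]))
        # Stop once this window reaches the end of the sequence.
        if start + window >= len(tokens):
            break
        start += step
    return chunks


if __name__ == "__main__":
    # Hypothetical budget: keep part of the 32,768-token window free for
    # instructions and the generated answer.
    document = list(range(70_000))  # stand-in for real token IDs
    for chunk in split_into_windows(document, window=30_000, overlap=500):
        print(len(chunk))
```

The window size passed in should be smaller than 32,768 so that the instructions and the expected output still fit alongside each chunk.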