arcee-ai/Arcee-Spark-FP32

TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Jun 21, 2024License:apache-2.0Architecture:Transformer Open Weights Cold

Arcee Spark is a 7.6 billion parameter language model developed by Arcee.ai, initialized from Qwen2 and further refined with Direct Preference Optimization (DPO). This model achieves state-of-the-art performance for its size, notably scoring highest on MT-Bench in the 7B class and outperforming GPT-3.5 on many tasks. It is optimized for real-time applications, edge computing, and cost-effective scaling, offering fast inference and deep reasoning capabilities for tasks like text generation, question answering, and code analysis.

Loading preview...

Arcee Spark: High-Performance 7B Language Model

Arcee Spark is a powerful 7.6 billion parameter language model from Arcee.ai, built upon the Qwen2 architecture. It underwent a sophisticated training regimen involving fine-tuning on 1.8 million samples, merging with Qwen2-7B-Instruct using Arcee's mergekit, and subsequent refinement via Direct Preference Optimization (DPO).

Key Capabilities & Performance

  • Exceptional Performance for Size: Achieves the highest MT-Bench score in the 7B parameter class (8.469 average), outperforming even GPT-3.5 on numerous tasks.
  • Advanced Training: Leverages fine-tuning, model merging, and DPO for superior results.
  • Efficiency: Offers significantly faster inference times (10-100x faster than larger models) and lower computational requirements.
  • Reasoning: Provides deep reasoning capabilities suitable for complex tasks.
  • Versatile Applications: Excels in advanced text generation, detailed question answering, nuanced sentiment analysis, complex problem-solving, and code generation/analysis.

Ideal Use Cases

Arcee Spark is particularly well-suited for scenarios demanding high performance within resource constraints:

  • Real-time Applications: Chatbots, customer service automation, and interactive systems requiring low latency.
  • Edge Computing: Deploying sophisticated AI tasks on edge devices or in environments with limited resources.
  • Cost-Effective Scaling: Implementing advanced language AI across organizations without extensive infrastructure or API costs.
  • Rapid Prototyping: Quickly developing and iterating on AI-powered features and products.
  • On-premise Deployment: Hosting on local infrastructure for enhanced data privacy and security.