ghost-x/ghost-8b-beta-1608

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8K · Published: Aug 18, 2024 · License: ghost-open-llms · Architecture: Transformer · Open Weights

Ghost 8B Beta by Ghost X is an 8-billion-parameter large language model based on Llama 3, designed for strong multilingual support across 16 languages and broad general knowledge. It features an 8192-token context length and multilingual function-tool support. The model aims for cost efficiency and performs well on reasoning and mathematical tasks, outperforming some larger models on specific benchmarks.


Ghost 8B Beta: Multilingual and Cost-Efficient LLM

Ghost 8B Beta, developed by Ghost X, is an 8 billion parameter large language model built upon the Llama 3 architecture. It is engineered for strong multilingual capabilities, supporting 16 languages including English, Vietnamese, Korean, Spanish, and more. The model is available with context lengths of 8K and 128K tokens, and natively supports multilingual function tools.
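As a starting point, the model can be used like any other Llama 3-based chat model through Hugging Face `transformers`. The sketch below uses the model ID from this card; the dtype, device settings, and sampling values are illustrative assumptions, not recommended settings.

```python
# Minimal sketch: single-turn chat with Ghost 8B Beta via transformers.
# MODEL_ID is taken from this card; all generation settings are illustrative.
MODEL_ID = "ghost-x/ghost-8b-beta-1608"

def make_messages(question: str) -> list[dict]:
    # Single-turn conversation in the standard chat-template message format.
    return [{"role": "user", "content": question}]

if __name__ == "__main__":
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )

    inputs = tokenizer.apply_chat_template(
        make_messages("Xin chào! Summarize this model in one sentence."),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the card lists both 8K and 128K variants, keep prompts for this 8K variant within the 8192-token window.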

Key Capabilities & Performance

  • Multilingual Proficiency: Expanded support for 16 languages, with improved math, reasoning, and instruction-following compared to previous versions.
  • Competitive Benchmarks: Outperforms Llama 3.1 8B Instruct and GPT 3.5 Turbo in AlpacaEval 2.0's length-controlled win rates. In raw win rates, it surpasses Claude 3 Opus, Claude 3 Sonnet, GPT-4, and Mistral Large.
  • Mathematical Reasoning: Achieves approximately 66.4% on GSM8K (zero-shot), outperforming xAI Grok 1, OpenAI GPT 3.5, and Mistral Mixtral 8x7B, and performing comparably to Mistral Medium.
  • Complex Task Solving: Scores 7.74 on MT Bench, placing it close to GPT 3.5 Turbo and Claude v1, and outperforming other open models like Vicuna 33B v1.3 and Llama 2 70B chat.
  • Function Tool Support: Integrated support for function calling, allowing for advanced interaction and automation.
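The card advertises multilingual function-tool support but does not specify the tool schema Ghost X expects, so the sketch below assumes the widely used OpenAI-style `tools` format; the `get_weather` tool and the request shape are hypothetical illustrations.

```python
def make_weather_tool() -> dict:
    # Hypothetical tool definition in the common OpenAI-style schema;
    # Ghost 8B Beta's exact expected format may differ.
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }

def make_tool_request(question: str) -> dict:
    # Chat-completions-style request body exposing the tool to the model.
    return {
        "model": "ghost-x/ghost-8b-beta-1608",
        "messages": [{"role": "user", "content": question}],
        "tools": [make_weather_tool()],
    }
```

Because the model's multilingual support extends to tool use, the same request shape should apply whether the user question is in English, Vietnamese, Korean, or another supported language.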

Use Cases

  • Multilingual Applications: Ideal for applications requiring robust understanding and generation across multiple languages.
  • Cost-Effective Deployment: Optimized for efficient deployment and operation, making it suitable for businesses and startups concerned with GPU costs.
  • Reasoning and Problem Solving: Strong performance in mathematical and complex reasoning tasks makes it suitable for analytical applications.
  • Dialogue and RAG: Supports multi-turn conversations and Retrieval-Augmented Generation (RAG) through a dedicated 'refs' role for external data.
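The dedicated 'refs' role for RAG can be sketched as below. The role name comes from this card, but the payload shape (a JSON object carrying the retrieved documents) is an assumption for illustration; check the model's chat template for the exact format it expects.

```python
import json

def build_rag_messages(question: str, documents: list[str]) -> list[dict]:
    # Hypothetical RAG payload: retrieved documents are passed under the
    # model's 'refs' role, ahead of the user's question. The JSON structure
    # of the refs content is an assumption, not documented on this card.
    refs = json.dumps({"documents": documents}, ensure_ascii=False)
    return [
        {"role": "refs", "content": refs},
        {"role": "user", "content": question},
    ]
```

In a multi-turn conversation, the 'refs' message would be refreshed with newly retrieved passages before each user turn.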

Popular Sampler Settings

Featherless tracks the three parameter combinations most used for this model. The tunable sampler parameters are: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
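Since the concrete values of the popular configurations are not reproduced on this page, the sketch below only shows where each sampler parameter would go in an OpenAI-style chat-completions request body; every numeric value is an illustrative placeholder, not a Featherless-reported config.

```python
def make_sampler_request(prompt: str) -> dict:
    # All sampler values below are illustrative placeholders; substitute
    # your own tuned values (or a popular config from Featherless).
    return {
        "model": "ghost-x/ghost-8b-beta-1608",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "top_p": 0.9,
        "top_k": 40,
        "frequency_penalty": 0.0,
        "presence_penalty": 0.0,
        "repetition_penalty": 1.05,
        "min_p": 0.05,
    }
```

As a rule of thumb, tune either temperature/top_p (nucleus-style) or min_p, rather than pushing all parameters at once, and raise repetition_penalty only if the model visibly loops.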