pankajmathur/orca_mini_v8_1_70b
Text Generation · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Ctx Length: 32k · License: llama3.3 · Architecture: Transformer

pankajmathur/orca_mini_v8_1_70b is a 70-billion-parameter instruction-tuned causal language model, fine-tuned by pankajmathur from the Llama-3.3-70B-Instruct base. Trained on a variety of Supervised Fine-Tuning (SFT) datasets, it is designed as a comprehensive general-purpose model. It supports advanced features such as tool use and is intended as a foundation for further fine-tuning (DPO, PPO, or ORPO) and model merges.


Model Overview

pankajmathur/orca_mini_v8_1_70b is a 70 billion parameter instruction-tuned model built upon the Llama-3.3-70B-Instruct base. It has been trained using various Supervised Fine-Tuning (SFT) datasets, aiming to be a comprehensive general-purpose model.
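Because the model inherits the Llama-3.3 chat format, prompts follow the standard Llama 3 header/turn structure. The sketch below hand-rolls that structure purely for illustration; in practice `tokenizer.apply_chat_template()` builds the prompt from the template shipped with the tokenizer, which is the authoritative source.

```python
# Illustrative sketch of the Llama-3.3 chat format this model inherits.
# In real use, tokenizer.apply_chat_template() renders this string for you;
# this hand-rolled version only shows the structure.
def render_llama3_prompt(messages: list[dict]) -> str:
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Open an assistant turn to cue the model to answer next:
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are Orca Mini, a helpful assistant."},
    {"role": "user", "content": "Explain gravity in one sentence."},
]
prompt = render_llama3_prompt(messages)
```

Generation then proceeds until the model emits `<|eot_id|>`, closing the assistant turn.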

Key Capabilities

  • Instruction Following: Designed to respond effectively to a wide range of instructions due to SFT training.
  • Tool Use: Supports advanced tool-use formats, compatible with Llama 3.3's tool-calling conventions and the Transformers library's chat templating for function calling.
  • Quantization Support: Can be run in various quantization formats (4-bit, 8-bit) to reduce VRAM requirements, making it accessible for different hardware setups.
  • Foundation Model: Explicitly intended as a base for further fine-tuning, including DPO, PPO, ORPO tuning, and model merging, encouraging community customization and enhancement.
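The tool-use capability above works through the Transformers chat template: tools are described as JSON schemas that the template serializes into the prompt so the model can emit structured tool calls. A minimal sketch, where `get_current_weather` is a hypothetical tool with a placeholder implementation:

```python
# Hypothetical tool with a placeholder implementation, for illustration only.
def get_current_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"Sunny in {city}"  # placeholder, not a real weather lookup

# JSON schema the chat template serializes into the prompt so the model
# can emit a structured call to this tool:
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# With a loaded tokenizer, the schema is passed alongside the messages:
# prompt = tokenizer.apply_chat_template(
#     messages, tools=[weather_tool], add_generation_prompt=True, tokenize=False
# )
# The model replies with a JSON tool call; the application executes the tool
# and feeds the result back as a {"role": "tool", ...} message.
```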

Responsible AI & Safety

Meta, the developer of the base Llama 3.3 model, emphasizes a three-pronged strategy for trust and safety, focusing on enabling developers, protecting against adversarial use, and providing community safeguards. This includes extensive safety fine-tuning, data collection with LLM-based classifiers, and addressing refusal tone. The model is not designed for isolated deployment and requires system-level safeguards, with resources like Llama Guard 3, Prompt Guard, and Code Shield available. Evaluations include red teaming and specific focus on critical risks like CBRNE, Child Safety, and Cyber attack enablement.

When to Use This Model

  • As a Base Model: Ideal for developers looking for a strong 70B foundation to fine-tune for specific domain applications or tasks.
  • Instruction-Following Tasks: Suitable for general instruction-based conversational AI or task execution.
  • Tool-Augmented Applications: Excellent for building applications that require function calling or integration with external tools.
  • Resource-Constrained Environments: Usable with 4-bit or 8-bit quantization for reduced VRAM footprint (approx. 39GB for 4-bit).
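The ~39GB 4-bit figure corresponds to loading the weights with a quantization config along these lines. This is a sketch assuming `transformers`, `accelerate`, and `bitsandbytes` are installed; actual memory use also depends on context length and KV cache.

```python
# Sketch: 4-bit (NF4) loading via bitsandbytes. Assumes transformers,
# accelerate, and bitsandbytes are installed and ~39 GB of VRAM is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "pankajmathur/orca_mini_v8_1_70b"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,      # also quantize the quantization constants
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                   # shard across available GPUs
)
```

For 8-bit instead, `BitsAndBytesConfig(load_in_8bit=True)` roughly doubles the weight footprint relative to 4-bit.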

Popular Sampler Settings

The three most popular parameter combinations used by Featherless users for this model cover the following samplers:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p