pankajmathur/orca_mini_v9_3_70B

Hugging Face
Text Generation · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Ctx Length: 32k · License: llama3.3 · Architecture: Transformer

The pankajmathur/orca_mini_v9_3_70B model is a 70-billion-parameter instruction-tuned language model, fine-tuned by pankajmathur on the Llama-3.3-70B-Instruct base using various Supervised Fine-Tuning (SFT) datasets. It is designed as a comprehensive general-purpose model with a 32768-token context length, and it is intended as a foundation for further customization, including full fine-tuning, DPO, PPO, or ORPO tuning, or model merges, encouraging developers to innovate and build specific enhancements on top of it.


Model Overview

The pankajmathur/orca_mini_v9_3_70B is a 70 billion parameter instruction-tuned model developed by pankajmathur. It is built upon the Llama-3.3-70B-Instruct base model and has been fine-tuned using a variety of Supervised Fine-Tuning (SFT) datasets. This model is designed to be a comprehensive general-purpose AI assistant, offering a substantial 32768 token context window.

Key Characteristics

  • Base Model: Utilizes the robust Llama-3.3-70B-Instruct architecture.
  • Parameter Count: Features 70 billion parameters, providing strong generative capabilities.
  • Context Length: Supports a 32768 token context window, enabling processing of longer inputs and generating more coherent, extended responses.
  • Customization Ready: Explicitly designed as a foundational model, encouraging developers to perform further tuning (full fine-tuning, DPO, PPO, or ORPO) or model merges to tailor it for specific applications.
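As a rough sanity check on the parameter count and the VRAM figures discussed in this card, weight-only memory scales linearly with parameter count and bits per weight; runtime overhead such as the KV cache and activations comes on top. A minimal back-of-the-envelope sketch:

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate weight-only memory footprint in GiB."""
    return n_params * bits_per_param / 8 / 2**30

N = 70e9  # 70 billion parameters

for bits, label in [(16, "bf16"), (8, "int8"), (4, "4-bit")]:
    print(f"{label}: ~{weight_memory_gib(N, bits):.0f} GiB")
```

This yields roughly 130 GiB at bf16, 65 GiB at 8-bit, and 33 GiB at 4-bit for the weights alone, consistent with the deployment figures below once quantization constants and runtime overhead are added.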

Usage and Deployment

The model supports standard text-generation pipelines and can be deployed in several precision formats to manage VRAM requirements. It requires approximately 133 GB of VRAM in its default half precision (bfloat16), but can run with significantly less using quantization via the bitsandbytes library: roughly 39 GB at 4-bit or 69 GB at 8-bit. The model adheres to the Llama 3 prompt format, ensuring compatibility with existing Llama 3-based workflows.
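The Llama 3 prompt format wraps each conversation turn in special header and end-of-turn tokens. A minimal sketch of assembling a single-turn prompt by hand (in practice, `tokenizer.apply_chat_template` from the transformers library produces this for you; the system message here is only an example):

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn chat prompt in the Llama 3 format."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("You are a helpful AI assistant.", "Hello!")
print(prompt)
```

Generation should stop on the `<|eot_id|>` token, which marks the end of the assistant's turn.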

Responsible AI Considerations

As the base model is Llama 3.3, it incorporates Meta's responsible AI practices, including safety fine-tuning, data quality control, and mitigation of critical risks such as CBRNE, child safety, and cyber attack enablement. Developers are encouraged to implement additional safeguards and refer to Meta's Responsible Use Guide for safe deployment.
