fangloveskari/ORCA_LLaMA_70B_QLoRA
fangloveskari/ORCA_LLaMA_70B_QLoRA is a 69-billion-parameter language model based on LLaMA-2, fine-tuned with QLoRA (NF4) on a dataset mix that includes Dolphin, Open-Platypus, and OpenOrca. The model targets general instruction following and reasoning, achieving an average score of 73.40 across the ARC, HellaSwag, MMLU, and TruthfulQA benchmarks, and it supports a 32,768-token context length, making it suitable for applications requiring extensive context understanding.
Model Overview
The fangloveskari/ORCA_LLaMA_70B_QLoRA is a 69 billion parameter language model built upon the LLaMA-2 architecture. It has been fine-tuned by fangloveskari using the QLoRA (NF4) method, specifically with a LoRA rank and alpha of 16. The training utilized a diverse dataset blend, primarily Open-Platypus, augmented with smaller proportions of Dolphin (GPT-4) and OpenOrca (GPT-4) datasets.
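The card does not publish the full training configuration, but the stated hyperparameters map onto a standard QLoRA setup. Below is a minimal sketch using transformers and peft; only the NF4 quantization and the LoRA rank/alpha of 16 come from the card, while the base checkpoint name, target modules, and dropout are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# NF4 4-bit quantization, as stated on the card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Assumed base checkpoint; the card does not name the exact base repo
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# Rank and alpha of 16 come from the card; target modules and
# dropout are common defaults, not documented values
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```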
Key Capabilities & Performance
This model demonstrates strong performance across various benchmarks, indicating its proficiency in general reasoning and instruction following:
- ARC (25-shot): 72.27
- HellaSwag (10-shot): 87.74
- MMLU (5-shot): 70.23
- TruthfulQA (0-shot): 63.37
It achieves an average score of 73.40 across these metrics. The model supports a substantial context length of 32768 tokens, enhancing its ability to process and understand longer inputs.
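The reported average is simply the unweighted mean of the four benchmark scores, which is easy to verify:

```python
# Benchmark scores as listed above
scores = {"ARC": 72.27, "HellaSwag": 87.74, "MMLU": 70.23, "TruthfulQA": 63.37}

average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 73.40
```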
Training Details
Fine-tuning was conducted with the LLaMA-Efficient-Tuning framework, using flash_attention_2 and DeepSpeed ZeRO-2. The model was trained for 1 epoch with a batch size of 14 on 8x A100 (80 GB) GPUs. Two methods were explored for fusing the LoRA adapter back into the base model, one of which showed slight improvements in ARC and TruthfulQA scores.
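The card does not name the two fusion methods that were compared; one common route with the peft library is merge_and_unload, which folds the LoRA deltas into full-precision base weights. The sketch below assumes the adapter weights are available at a hypothetical local path.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model in full precision (assumed base repo)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Hypothetical path to the trained QLoRA adapter
model = PeftModel.from_pretrained(base, "./qlora_adapter")

# Fold the LoRA deltas into the base weights and save the fused model
model = model.merge_and_unload()
model.save_pretrained("./orca_llama_70b_merged")
```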
Use Cases
Given its strong benchmark performance and extensive context window, this model is well-suited for the following tasks (see the inference sketch after this list):
- General instruction-following tasks.
- Applications requiring robust reasoning capabilities.
- Scenarios benefiting from a large context window for complex queries or document analysis.
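For orientation, here is a minimal generation sketch using transformers. The card does not document a prompt template, so a plain instruction prompt is used, and the sampling parameters are illustrative rather than recommended values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fangloveskari/ORCA_LLaMA_70B_QLoRA"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # a 70B model needs multiple GPUs or CPU offloading
)

prompt = "Summarize the key trade-offs of QLoRA fine-tuning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```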
Users should note the Llama-2 license restrictions and, as with any LLM, perform their own safety testing, since the model may produce biased or inaccurate outputs.