Aratako/Llama-Gemma-2-27b-ORPO-iter3

Parameters: 27B
Precision: FP8
Context length: 32,768
Released: Dec 16, 2024
License: llama3.1
Repository: Hugging Face

Overview

Aratako/Llama-Gemma-2-27b-ORPO-iter3 is a 27-billion-parameter instruction-tuned model developed by Aratako. It is based on the google/gemma-2-27b architecture and incorporates elements derived from Llama and Qwen models, which is reflected in its licensing. The model went through a multi-stage fine-tuning process: supervised instruction tuning, two iterations of CPO_SimPO, and a final application of ORPO (Odds Ratio Preference Optimization).
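The card itself does not include a usage example. The snippet below is a minimal inference sketch with the Hugging Face transformers library, assuming the repository ships a chat template and standard Gemma 2-compatible weights; the dtype, device settings, and prompt are illustrative.

```python
# Minimal inference sketch (assumes a recent transformers release with Gemma 2
# support and a chat template in the repository; adjust dtype/device as needed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Aratako/Llama-Gemma-2-27b-ORPO-iter3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Briefly introduce yourself."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```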

Key Capabilities

  • Instruction Following: Enhanced through ORPO fine-tuning, making it suitable for various instruction-based tasks.
  • Iterative Refinement: Benefits from an iterative training approach, building upon Aratako/Llama-Gemma-2-27b-CPO_SimPO-iter2.
  • Training Methodology: Trained with axolotl, using ORPO-specific settings including orpo_alpha: 0.1 and learning_rate: 8e-7.

Training Details

The model was trained on the Aratako/iterative-dpo-data-for-ORPO-iter3 dataset with a max_prompt_len of 512, a max_length of 2560, and a sequence_len of 2560. It was developed for a competition held as part of the Matsuo Lab Large Language Model Course 2024.
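The card reports that training was done with axolotl, but the full config is not reproduced here. As a rough approximation only, the sketch below expresses the reported ORPO hyperparameters with trl's ORPOTrainer; the mapping of axolotl's orpo_alpha to trl's beta, and the batch size, epoch count, and output path, are assumptions rather than values from the card.

```python
# Approximate ORPO stage in trl (not the original axolotl setup). Hyperparameters
# mirror the card where available: alpha/beta 0.1, learning rate 8e-7,
# max_prompt_length 512, max_length 2560. A 27B model needs multi-GPU or
# DeepSpeed/FSDP in practice; this sketch omits that plumbing.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base = "Aratako/Llama-Gemma-2-27b-CPO_SimPO-iter2"  # prior iteration used as the starting point
train_dataset = load_dataset("Aratako/iterative-dpo-data-for-ORPO-iter3", split="train")

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

config = ORPOConfig(
    output_dir="llama-gemma-2-27b-orpo-iter3",  # placeholder path
    beta=0.1,                        # assumed equivalent of axolotl's orpo_alpha
    learning_rate=8e-7,
    max_prompt_length=512,
    max_length=2560,
    per_device_train_batch_size=1,   # not stated on the card
    gradient_accumulation_steps=8,   # not stated on the card
    num_train_epochs=1,              # not stated on the card
    bf16=True,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,  # expects "prompt", "chosen", "rejected" columns
    processing_class=tokenizer,   # use tokenizer= on older trl releases
)
trainer.train()
```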

Licensing

The model's usage is subject to several licenses due to its base models and training data:

  • META LLAMA 3.1 COMMUNITY LICENSE
  • Gemma Terms of Use
  • Qwen LICENSE AGREEMENT (requires attribution like "Built with Qwen")