Name: Alepach/notHumpback-M1-Rw-F-8b API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Alepach

Overview

Alepach/notHumpback-M1-Rw-F-8b is an 8 billion parameter instruction-following model built upon the Llama-3.1-8B architecture. It integrates a modified version of the Humpback self-alignment pipeline, as proposed by Li et al., with an additional 'rewriting' step inspired by Nguyen et al. This model's unique approach involves a "self-rewriting" phase, performed by the seed model itself, which occurs before self-curation. This aims to improve the linguistic quality of web-sourced responses and potentially increase the diversity and quantity of high-quality training data by restructuring messy documents.

Key Capabilities

Instruction Following: Designed to accurately follow user instructions.
Self-Alignment Pipeline: Represents the first iteration of a self-alignment pipeline, trained on a combination of gold data and synthetically generated, rewritten, and curated data.
Data Enhancement: Utilizes a novel self-rewriting step to improve the linguistic quality of responses and potentially expand the usable dataset from web corpora like C4.

Training Details

The model was fine-tuned using TRL on a dataset combining samples from oasst1 and a synthetic dataset. The synthetic data was generated by applying self-augmentation, self-rewriting, and self-curation to 502k entries from the English subset of the c4 dataset.

Potential Use Cases

Instruction-following applications: Directly usable for generating responses based on given instructions.
Iterative Model Improvement: Can serve as the 'seed model' for subsequent iterations of the self-alignment pipeline, rewriting and scoring instruction-response pairs for further training.

Overview

Overview

Key Capabilities

Training Details

Potential Use Cases

Full Model Card (README)