CoolSpring/Qwen2-0.5B-Abyme is a 0.5 billion parameter language model from the Qwen2 series with a reported context length of 131072 tokens. Developed by CoolSpring, it was fine-tuned on conversation samples generated by the much larger Qwen2-72B model to explore knowledge transfer and distillation. It is primarily intended for research into whether smaller models can reproduce the capabilities of significantly larger ones.
What is CoolSpring/Qwen2-0.5B-Abyme?
CoolSpring/Qwen2-0.5B-Abyme is a 0.5 billion parameter language model, part of the Qwen2 series, fine-tuned by CoolSpring. Its core purpose is to investigate the effect of training a small model on data generated by a much larger one, specifically Qwen2-72B. The experiment asks whether knowledge and capabilities can be effectively transferred, or distilled, from a powerful large language model to a significantly smaller one through fine-tuning.
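If the model is published on the Hugging Face Hub under this repository id, it can be tried with the standard transformers chat workflow. The sketch below is illustrative rather than an official quickstart: it assumes the fine-tune ships Qwen's ChatML chat template and that the accelerate package is installed for device_map="auto".

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CoolSpring/Qwen2-0.5B-Abyme"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Build a chat prompt; Qwen2 fine-tunes typically ship a ChatML template.
messages = [{"role": "user", "content": "Explain knowledge distillation in two sentences."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the echoed prompt.
reply = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(reply)
```

At 0.5B parameters the float16 weights occupy roughly 1 GB, so this runs comfortably on modest GPUs and is feasible on CPU.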
Key Characteristics & Training:
- Base Model: Fine-tuned from Qwen/Qwen2-0.5B.
- Dataset: Trained on the Magpie-Align/Magpie-Qwen2-Pro-300K-Filtered dataset, comprising 300,000 conversation samples generated by the Qwen2-72B model.
- Context Length: Trained with a sequence length of 4096 tokens; the model reports a maximum context length of 131072 tokens (see the data-preparation sketch after this list).
- Training Objective: To explore knowledge transfer and distillation from a 72B parameter model to a 0.5B parameter model.
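The exact fine-tuning recipe is not published in this summary, but a short data-preparation sketch makes the setup above concrete. The field names below (a `conversations` column with `from`/`value` turns) follow the common Magpie/ShareGPT layout, and the 4096-token limit matches the training length above; both are assumptions worth checking against the dataset card.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# The Qwen2 tokenizer ships a ChatML chat template, used here for formatting.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B")
dataset = load_dataset("Magpie-Align/Magpie-Qwen2-Pro-300K-Filtered", split="train")

# ShareGPT-style role names mapped to chat-template roles (an assumption).
ROLE_MAP = {"human": "user", "gpt": "assistant"}

def format_example(example):
    # Convert one Qwen2-72B-generated conversation into chat-template text,
    # then tokenize and truncate to the 4096-token training length.
    messages = [
        {"role": ROLE_MAP[turn["from"]], "content": turn["value"]}
        for turn in example["conversations"]
    ]
    text = tokenizer.apply_chat_template(messages, tokenize=False)
    return tokenizer(text, truncation=True, max_length=4096)

tokenized = dataset.map(format_example, remove_columns=dataset.column_names)
```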
Intended Use Cases:
- Research: Primarily for studying knowledge transfer, model distillation, and the ability of smaller models to learn from larger ones.
- Resource-Constrained Environments: Potentially useful where compute for large language models is limited and a small fine-tuned model is acceptable for narrow tasks; performance should be validated per task rather than assumed comparable (a quantized-loading sketch follows this list).
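For that constrained setting, one option is 4-bit quantization through bitsandbytes. The configuration below is an illustrative default rather than a documented deployment recipe, and it requires a CUDA GPU with the bitsandbytes package installed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "CoolSpring/Qwen2-0.5B-Abyme"

# 4-bit quantization keeps the 0.5B model's weights well under 1 GB.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```

At this scale, plain float16 or even CPU inference is often workable too, so quantization is a trade-off to measure rather than a requirement.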
Limitations:
- The model's full capabilities and limitations are still under evaluation.
- Performance may vary significantly across different tasks and domains.
- It may inherit biases or limitations from its base model or the training data.