ozone-research/Chirp-01
Chirp-01 by Ozone Research is a 3.1 billion parameter language model, fine-tuned from Qwen2.5 3B Instruct with 50 million tokens of distilled GPT-4o data. This compact model demonstrates strong performance on benchmarks like MMLU Pro and IFEval, showing significant improvement over its base model. It is designed to push the capabilities of small-scale LLMs, making it suitable for research and applications requiring efficient yet powerful language understanding.
Overview
Chirp-01 is a 3.1 billion parameter language model developed by the Ozone Research team. It is fine-tuned from the Qwen2.5 3B Instruct base model, utilizing 50 million tokens of distilled data from GPT-4o. This training approach enables Chirp-01 to achieve strong performance for its size, making it a notable contribution to the field of compact, high-performing LLMs.
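The model can be used with the standard Hugging Face transformers stack. The following is a minimal inference sketch, assuming the checkpoint is published on the Hub under ozone-research/Chirp-01 and inherits Qwen2.5's chat template; adjust dtype and device settings for your hardware.

```python
# Minimal inference sketch (assumes the checkpoint is on the Hugging Face Hub
# under ozone-research/Chirp-01 and inherits Qwen2.5's chat template).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ozone-research/Chirp-01"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 3.1B model fits comfortably on most modern GPUs
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain knowledge distillation in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```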
Key Capabilities & Performance
- Parameter Count: 3.1 billion parameters, offering a balance of capability and efficiency.
- Training Data: Leverages 50 million tokens distilled from GPT-4o, a method aimed at enhancing instruction following and general intelligence.
- Benchmark Performance (a reproduction sketch follows this list):
  - MMLU Pro: achieves an overall average accuracy of 0.4320, a 9-point improvement over the base model.
  - IFEval: scores 72%, a 14% improvement over the base model.
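Scores like these can in principle be reproduced with EleutherAI's lm-evaluation-harness. The snippet below is a sketch, not Ozone Research's published evaluation setup; the task names (mmlu_pro, ifeval) and harness behavior vary across versions, and prompting details can shift results by several points.

```python
# Evaluation sketch using EleutherAI's lm-evaluation-harness (pip install lm-eval).
# This is NOT necessarily the configuration the Ozone Research team used;
# task names and harness versions can change the exact numbers.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=ozone-research/Chirp-01,dtype=bfloat16",
    tasks=["mmlu_pro", "ifeval"],
    batch_size=8,
)

for task, metrics in results["results"].items():
    print(task, metrics)
```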
Use Cases
Chirp-01 is an open-source model designed to explore the limits of small-scale LLMs. Its strong benchmark performance for a 3B model makes it well suited to:
- Researchers and developers interested in efficient language models.
- Applications where computational resources are constrained but robust language understanding is still required (see the quantized-loading sketch after this list).
- Further experimentation and development in the domain of compact AI models.
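For constrained hardware, the model can be loaded in 4-bit precision via bitsandbytes. This is a minimal sketch, assuming a CUDA GPU and the bitsandbytes and accelerate packages are installed; quantization trades a small amount of accuracy for a much smaller memory footprint.

```python
# 4-bit quantized loading sketch for memory-constrained environments
# (assumes a CUDA GPU plus the bitsandbytes and accelerate packages).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "ozone-research/Chirp-01"

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # NormalFloat4 generally preserves quality best
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute in bf16 while weights stay 4-bit
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
# A 3.1B model in 4-bit weighs roughly 2 GB, fitting easily on consumer GPUs.
```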