diffusionfamily/diffullama
DiffuLLaMA is a 7-billion-parameter language model developed by HKUNLP and fine-tuned from the Llama 2 architecture. It explores scaling diffusion language models by adapting them from autoregressive models, and is intended for research into novel language model architectures and training methodologies.
DiffuLLaMA Model Overview
DiffuLLaMA is a 7-billion-parameter language model fine-tuned from the Llama 2 architecture. Developed by HKUNLP, it explores scaling diffusion language models by adapting them from existing autoregressive models. The underlying research and technical details are available through the associated GitHub repository and the arXiv paper "Scaling Diffusion Language Models via Adaptation from Autoregressive Models" by Gong et al. (2024).
Key Characteristics
- Base Model: Fine-tuned from the Llama 2 architecture.
- Parameter Count: 7 billion parameters.
- Context Length: Supports a context length of 4096 tokens.
- Research Focus: Primarily developed for advancing diffusion language model techniques.
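Since the checkpoint is distributed under the `diffusionfamily/diffullama` identifier, loading it will typically go through the Hugging Face `transformers` auto classes. The sketch below is an assumption about the entry point, not the official recipe: the exact model class, and whether `trust_remote_code` is required, should be confirmed against the project's repository.

```python
# Hypothetical loading sketch via Hugging Face transformers. The auto
# classes and trust_remote_code flag are assumptions; adapted architectures
# often ship custom modeling code with the checkpoint, but consult the
# official DiffuLLaMA repository for the supported entry point.
from transformers import AutoModel, AutoTokenizer

def load_diffullama(model_id: str = "diffusionfamily/diffullama"):
    """Return (tokenizer, model) for the DiffuLLaMA checkpoint."""
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
    return tokenizer, model
```

Note that a 7B-parameter checkpoint needs substantial memory; in practice you would likely pass a `torch_dtype` or device-placement argument appropriate to your hardware.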
Potential Use Cases
- Research and Development: Ideal for researchers exploring diffusion-based language generation and adaptation strategies.
- Architectural Experimentation: Suitable for experimenting with novel language model architectures beyond traditional autoregressive designs.
- Comparative Studies: Can be used to compare the performance and characteristics of diffusion models against autoregressive counterparts.
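The contrast with autoregressive decoding is easiest to see in miniature. The toy sketch below illustrates the general idea behind discrete diffusion-style generation: start from a fully masked sequence and commit tokens over a few parallel refinement rounds, rather than emitting them strictly left to right. The tiny vocabulary and random "denoiser" are placeholders, not DiffuLLaMA's actual model or schedule.

```python
import random

MASK = "<mask>"
VOCAB = ["the", "cat", "sat", "on", "mat"]  # toy vocabulary, illustration only

def toy_denoiser(tokens):
    # Stand-in for a trained diffusion LM: proposes a token for every
    # masked position. A real model would score candidates with a
    # bidirectional transformer instead of sampling uniformly.
    return [random.choice(VOCAB) if t == MASK else t for t in tokens]

def diffusion_decode(length, steps, seed=0):
    """Start fully masked; over `steps` rounds, propose tokens for all
    masked positions in parallel and permanently commit a growing subset,
    so every position is fixed by the final round."""
    random.seed(seed)
    tokens = [MASK] * length
    fixed = set()
    for step in range(1, steps + 1):
        proposal = toy_denoiser(tokens)
        target = round(length * step / steps)  # positions fixed after this round
        masked = [i for i in range(length) if i not in fixed]
        for i in random.sample(masked, target - len(fixed)):
            fixed.add(i)
        tokens = [proposal[i] if i in fixed else MASK for i in range(length)]
    return tokens

print(diffusion_decode(length=8, steps=4))
```

In four rounds this fills eight positions, committing roughly two per round; an autoregressive model would instead need one forward pass per token, which is the trade-off diffusion-versus-AR comparisons typically examine.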