stech2333/brainalign-qwen2.5-1.5b-C
The stech2333/brainalign-qwen2.5-1.5b-C model is a 1.5 billion parameter Qwen2.5-Instruct architecture, specifically a merged LoRA checkpoint from the BrainAlign project. This model is designed for research and evaluation of the BrainAlign stage-2 fine-tuning branch 'C', offering a full-weights Hugging Face Transformers checkpoint. It is optimized for leaderboard-style offline evaluation and provides a base for further research into fine-tuned causal language models.
Loading preview...
Model Overview
stech2333/brainalign-qwen2.5-1.5b-C is a 1.5 billion parameter causal language model based on the Qwen/Qwen2.5-1.5B-Instruct architecture. This model represents the C branch of the BrainAlign project's stage-2 LoRA fine-tuning, specifically the best_by_retrieval checkpoint, which has been merged into the full base model. It is provided as a standard Hugging Face Transformers checkpoint, making it suitable for direct use and evaluation.
Key Capabilities & Features
- Architecture: Built upon the robust
Qwen2ForCausalLMarchitecture. - Format: A merged full-weights Hugging Face Transformers checkpoint, ready for immediate loading and inference.
- Precision: Optimized for
bfloat16precision, suitable for leaderboard submissions and efficient evaluation. - Research Focus: Primarily intended for research and evaluation of the BrainAlign stage-2 fine-tuning methodology.
Intended Use Cases
This model is specifically designed for:
- BrainAlign Project Evaluation: Assessing the performance of the
Cbranch within the BrainAlign stage-2 fine-tuning. - Leaderboard Submission: Packaged in a format suitable for Open LLM Leaderboard v2 evaluation, requiring a full model repository.
- Offline Evaluation: Ideal for scenarios where a complete model checkpoint is needed for thorough offline testing and analysis.
Users should note that this repository focuses on packaging and export details for evaluation and does not claim additional safety alignment or benchmark superiority beyond the fine-tuning performed within the BrainAlign project. Downstream behavior should be validated for specific target tasks.