Mihaiii/Pallas-0.5-LASER-0.5
Mihaiii/Pallas-0.5-LASER-0.5 is a 34 billion parameter language model developed by Mihaiii. It applies a LASER rank-reduction intervention to the MLP layers of Pallas-0.5-LASER-0.4, the previous iteration in a series derived from Pallas-0.5. On the causal_judgement subset of the BigBench dataset, it reports lower validation and test logloss than its predecessors.
Model Overview
Mihaiii/Pallas-0.5-LASER-0.5 is a 34 billion parameter language model produced by applying a LASER intervention, a rank reduction of selected weight matrices, to the MLP layers of Pallas-0.5-LASER-0.4.
Key Characteristics
- LASER Intervention: Applied with the configuration lnum: 54, lnames: mlp, rate: 8.
- Dataset Focus: The intervention was applied and evaluated using the causal_judgement subset of the BigBench dataset.
- Performance Improvement: Validation and test logloss are lower than in previous LASER iterations (Pallas-0.5-LASER-0.1 through Pallas-0.5-LASER-0.4), indicating that the model assigns, on average, higher probability to the correct answers on causal judgment tasks.
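The rank-reduction idea behind these settings can be sketched as follows. This is a minimal NumPy illustration of replacing a weight matrix with a truncated-SVD approximation, not the code used to produce the model. The helper names (`low_rank_approx`, `keep_fraction_from_rate`) are hypothetical, and the mapping from `rate` to the fraction of rank kept, (10 - rate) / 10, is an assumption about how the setting is interpreted; check the LASER repository before relying on it.

```python
import numpy as np


def keep_fraction_from_rate(rate: float) -> float:
    """ASSUMPTION: rate r keeps a (10 - r) / 10 fraction of the maximum rank."""
    return (10 - rate) / 10


def low_rank_approx(weight: np.ndarray, keep_fraction: float) -> np.ndarray:
    """Return the best rank-k approximation of `weight` via truncated SVD."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    max_rank = min(weight.shape)
    # Small epsilon guards against floating-point results such as 1.9999999.
    k = max(1, int(max_rank * keep_fraction + 1e-9))
    # Scale the first k left singular vectors by their singular values,
    # then multiply by the first k right singular vectors.
    return (u[:, :k] * s[:k]) @ vt[:k, :]


# Toy stand-in for one MLP weight matrix; real layers are far larger.
rng = np.random.default_rng(0)
toy_weight = rng.standard_normal((20, 10))

reduced = low_rank_approx(toy_weight, keep_fraction_from_rate(8))
print(reduced.shape, np.linalg.matrix_rank(reduced, tol=1e-8))
```

The approximation keeps the original shape, so it can replace the weight in place; only its rank is reduced. Under the assumed mapping, a rate of 8 keeps one fifth of the maximum rank.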
Performance Metrics
Validation accuracy stays consistent with earlier iterations while logloss decreases:
- Validation Accuracy: 55.263
- Validation Logloss: 1.484 (improved from 1.525 in 0.4 and 1.650 in base Pallas-0.5)
- Test Accuracy: 61.842
- Test Logloss: 1.297 (improved from 1.326 in 0.4 and 1.463 in base Pallas-0.5)
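For a sense of scale, the reported logloss values can be turned into relative reductions. The short calculation below uses only the figures listed above; the function name `relative_reduction_pct` is purely illustrative.

```python
def relative_reduction_pct(old: float, new: float) -> float:
    """Percentage by which `new` is lower than `old`."""
    return (old - new) / old * 100


# Logloss values as reported in the list above.
logloss = {
    "validation": {"base Pallas-0.5": 1.650, "LASER-0.4": 1.525, "LASER-0.5": 1.484},
    "test": {"base Pallas-0.5": 1.463, "LASER-0.4": 1.326, "LASER-0.5": 1.297},
}

for split, values in logloss.items():
    current = values["LASER-0.5"]
    for name in ("LASER-0.4", "base Pallas-0.5"):
        pct = relative_reduction_pct(values[name], current)
        print(f"{split}: {pct:.1f}% lower than {name}")
```

On these numbers, the step from 0.4 to 0.5 is a reduction of roughly 2 to 3 percent, and the cumulative reduction relative to base Pallas-0.5 is roughly 10 to 11 percent.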
Usage Notes
To replicate the model on a single A100 GPU, refer to the specific branch of the LASER repository rather than the original code, which may encounter out-of-memory errors with 34B models.
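A rough back-of-the-envelope estimate shows why memory is tight. The sketch below counts weight storage only, treats 34 billion as the nominal parameter count, and assumes 16-bit or 32-bit storage. It ignores activations and any workspace needed for the decomposition, and it is not a statement about what the LASER code actually allocates.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


NOMINAL_PARAMS = 34e9  # nominal "34B"; the exact parameter count may differ

for label, nbytes in (("16-bit", 2), ("32-bit", 4)):
    print(f"{label}: {weight_memory_gb(NOMINAL_PARAMS, nbytes):.0f} GB")
```

A100 GPUs ship with 40 GB or 80 GB of memory, so even 16-bit weights (about 68 GB) leave little headroom on the larger variant.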