Model Overview
etri-ones-solar is a 10.7-billion-parameter auto-regressive language model developed by leejaymin. It is built on the SOLAR transformer architecture and fine-tuned from the upstage/SOLAR-10.7B-v1.0 base model. The model supports a context length of 4096 tokens and was fine-tuned on an open instruction dataset.
Key Characteristics
- Architecture: Based on the SOLAR transformer architecture.
- Base Model: Fine-tuned from upstage/SOLAR-10.7B-v1.0.
- Training: Fine-tuned on an open instruction dataset.
- Parameter Count: 10.7 billion parameters.
- Context Length: Supports a context window of 4096 tokens.
Performance and Evaluation
The README indicates that model comparisons for various benchmarks, including Ko-ARC, Ko-HellaSwag, Ko-MMLU, Ko-TruthfulQA, Ko-CommonGen V2, and AI-Harness evaluations (Copa, HellaSwag, BoolQ, Sentineg), are "coming soon." Currently, no specific performance metrics are provided.
Usage
The model can be loaded using the Hugging Face transformers library with AutoModelForCausalLM and AutoTokenizer. An example code snippet is provided for loading the model with torch_dtype=torch.float16 and device_map='auto'.
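A minimal loading sketch along those lines is shown below. The repository id `leejaymin/etri-ones-solar` is an assumption inferred from the model and author names; check the actual Hugging Face Hub page before use.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repository id (author/model); verify against the actual model card.
model_id = "leejaymin/etri-ones-solar"

def load_model(repo_id: str):
    """Load tokenizer and model in half precision, sharding across available devices."""
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.float16,  # half precision to roughly halve memory use
        device_map="auto",          # let accelerate place layers on available GPUs/CPU
    )
    return tokenizer, model

if __name__ == "__main__":
    # Guarded so importing this file does not trigger the multi-GB download.
    tokenizer, model = load_model(model_id)
```

Note that `device_map="auto"` requires the `accelerate` package; at 10.7B parameters the model needs roughly 21 GB of memory in float16.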