leejaymin/etri-ones-solar

Text Generation · Concurrency Cost: 1 · Model Size: 10.7B · Quant: FP8 · Context Length: 4K · Published: Mar 31, 2024 · License: MIT · Architecture: Transformer · Open Weights

etri-ones-solar is a 10.7 billion parameter auto-regressive language model developed by leejaymin. It is fine-tuned from the upstage/SOLAR-10.7B-v1.0 base model (SOLAR transformer architecture), supports a context length of 4096 tokens, and was fine-tuned on an open instruction dataset, making it suitable for general language generation tasks.


Model Overview

etri-ones-solar is a 10.7 billion parameter auto-regressive language model developed by leejaymin. It is built upon the SOLAR transformer architecture, specifically fine-tuned from the upstage/SOLAR-10.7B-v1.0 base model. The model has a context length of 4096 tokens and has been fine-tuned using an open instruction dataset.

Key Characteristics

  • Architecture: Based on the SOLAR transformer architecture.
  • Base Model: Fine-tuned from upstage/SOLAR-10.7B-v1.0.
  • Training: Utilizes an open instruction dataset for fine-tuning.
  • Parameter Count: 10.7 billion parameters.
  • Context Length: Supports a context window of 4096 tokens.
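As a rough back-of-the-envelope check, the parameter count above implies the following weight-memory footprints at different precisions. This is an illustrative estimate only (weights alone, ignoring KV cache, activations, and framework overhead):

```python
# Back-of-envelope weight-memory estimate from the 10.7B parameter count.
# Assumption: weights only; no KV cache, activations, or optimizer state.
PARAMS = 10.7e9  # 10.7 billion parameters


def weight_gib(bytes_per_param: float) -> float:
    """Approximate weight memory in GiB for a given bytes-per-parameter precision."""
    return PARAMS * bytes_per_param / 1024**3


# float16 uses 2 bytes per parameter; FP8 (the quantization listed above) uses 1.
print(f"FP16 weights: ~{weight_gib(2):.1f} GiB")
print(f"FP8 weights:  ~{weight_gib(1):.1f} GiB")
```

This is why the FP8 quantization listed in the header roughly halves the memory needed versus a float16 deployment.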

Performance and Evaluation

No performance metrics are currently available: the README lists benchmark comparisons for Ko-ARC, Ko-HellaSwag, Ko-MMLU, Ko-TruthfulQA, Ko-CommonGen V2, and AI-Harness evaluations (Copa, HellaSwag, BoolQ, Sentineg) as "coming soon."

Usage

The model can be loaded with the Hugging Face transformers library via AutoModelForCausalLM and AutoTokenizer. The README's example loads it with torch_dtype=torch.float16 and device_map='auto'.
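A minimal loading sketch based on that description (the exact snippet in the README may differ; torch and transformers must be installed, and the generation call at the bottom is an illustrative assumption, not taken from the README):

```python
# Sketch: load etri-ones-solar with Hugging Face transformers, per the README's
# description (torch_dtype=torch.float16, device_map="auto").
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "leejaymin/etri-ones-solar"


def load_model():
    # device_map="auto" lets accelerate place the 10.7B model across available
    # GPUs/CPU; float16 halves weight memory versus float32.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,
        device_map="auto",
    )
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    # Illustrative usage only: downloads ~20 GiB of weights on first call.
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Note that calling `load_model()` downloads the full checkpoint, so a machine with roughly 20 GiB of combined GPU/CPU memory is needed for float16 inference.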