rishiraj/meow

Text generation · Model size: 10.7B · Quantization: FP8 · Context length: 4k · Published: Dec 14, 2023 · License: cc-by-nc-4.0 · Architecture: Transformer

The rishiraj/meow model is a 10.7 billion parameter language model, fine-tuned from Upstage's SOLAR-10.7B-Instruct-v1.0. This model was trained on the HuggingFaceH4/no_robots dataset. It is designed for general language tasks, leveraging its base architecture for instruction-following capabilities.


Model Overview

rishiraj/meow is a 10.7 billion parameter language model fine-tuned from the upstage/SOLAR-10.7B-Instruct-v1.0 base model on the HuggingFaceH4/no_robots dataset, a collection of human-written instruction demonstrations.
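A minimal loading sketch with the Transformers library, assuming the checkpoint is hosted on the Hugging Face Hub under this ID (the dtype and device placement below are illustrative choices, not taken from the model card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rishiraj/meow"

# Load the tokenizer and model weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # illustrative; match your hardware
    device_map="auto",          # place layers on available devices
)
```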

Training Details

The model was trained for a single epoch with a learning rate of 2e-05, a per-device train batch size of 4, and 128 gradient accumulation steps, giving a total train batch size of 512 (4 × 128). The optimizer was Adam with default betas and epsilon, paired with a cosine learning rate scheduler. Training reached a validation loss of 2.3831.
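For reference, these hyperparameters map onto Transformers' TrainingArguments roughly as follows (a hedged sketch; the output_dir value and anything not reported above, such as warmup settings, are assumptions):

```python
from transformers import TrainingArguments

# Hyperparameters reported in the model card; everything else is assumed.
training_args = TrainingArguments(
    output_dir="meow",                # hypothetical output path
    num_train_epochs=1,
    learning_rate=2e-05,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=128,  # on one device: 4 * 128 = 512 total batch
    lr_scheduler_type="cosine",
    optim="adamw_torch",              # Adam with default betas/epsilon
)
```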

Framework Versions

Training used the following library versions (an environment check is sketched after this list):

  • Transformers 4.37.0.dev0
  • Pytorch 2.1.1+cu121
  • Datasets 2.14.6
  • Tokenizers 0.15.0
  • PEFT 0.6.1
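To confirm a local environment matches these pins, a small check like the one below can be run (a sketch; note that Transformers 4.37.0.dev0 was a development snapshot, so it would have to be installed from source rather than from a PyPI release):

```python
import datasets
import peft
import tokenizers
import torch
import transformers

# Print installed versions to compare against the list above.
for name, mod in [
    ("transformers", transformers),
    ("torch", torch),
    ("datasets", datasets),
    ("tokenizers", tokenizers),
    ("peft", peft),
]:
    print(f"{name}: {mod.__version__}")
```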

Intended Use Cases

Given its fine-tuning on an instruction dataset, this model is suitable for general instruction-following tasks. However, specific intended uses and limitations are not detailed in the provided information.
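As a rough illustration of instruction-style usage, the sketch below formats a single-turn request with the tokenizer's chat template. This assumes the checkpoint ships a chat template inherited from its SOLAR-Instruct base; the prompt and sampling settings are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rishiraj/meow"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Single-turn instruction, formatted with the model's chat template (assumed present).
messages = [{"role": "user", "content": "Write a haiku about cats."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=64, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```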