Name: 18-Death/sq-bijection-walnut53-gsm8k API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: 18-Death

Model Overview

The 18-Death/sq-bijection-walnut53-gsm8k is a 3.1 billion parameter language model, fine-tuned using the TRL library. While the specific base model is not detailed, it has been trained with Supervised Fine-Tuning (SFT) to adapt its capabilities.

Key Characteristics

Parameter Count: 3.1 billion parameters, offering a balance between performance and computational efficiency.
Context Length: Supports a substantial context window of 32768 tokens, allowing for processing longer inputs and generating more coherent, extended outputs.
Training Method: Utilizes Supervised Fine-Tuning (SFT), a common and effective method for adapting pre-trained models to specific tasks or datasets.
Frameworks: Developed using TRL (version 1.3.0), Transformers (version 5.6.2), PyTorch (version 2.10.0), Datasets (version 4.8.4), and Tokenizers (version 0.22.2).

Intended Use Cases

This model is suitable for various text generation tasks where a medium-sized model with a large context window is beneficial. Its SFT training suggests it can handle instruction-following or question-answering scenarios effectively, as demonstrated by the quick start example for generating responses to open-ended questions.

Overview

Model Overview

Key Characteristics

Intended Use Cases

Full Model Card (README)