ferrazzipietro/unsup-Llama-3.1-8B-Instruct-datav2
The ferrazzipietro/unsup-Llama-3.1-8B-Instruct-datav2 is an 8 billion parameter instruction-tuned causal language model, fine-tuned by ferrazzipietro from Meta Llama 3.1. This model is based on the Llama 3.1 architecture and features a 32768 token context length. It is designed for general instruction-following tasks, though specific differentiators and training data details are not publicly available.
Loading preview...
Model Overview
The ferrazzipietro/unsup-Llama-3.1-8B-Instruct-datav2 is an 8 billion parameter instruction-tuned model, fine-tuned by ferrazzipietro. It is built upon the meta-llama/Llama-3.1-8B-Instruct base model, leveraging its Llama 3.1 architecture and a 32768 token context length.
Key Characteristics
- Base Model: Fine-tuned from Meta Llama 3.1-8B-Instruct.
- Parameter Count: 8 billion parameters.
- Context Length: Supports a context window of 32768 tokens.
- Training: The model underwent 1 epoch of training with a learning rate of 4e-05, using AdamW_Torch optimizer and a cosine learning rate scheduler with 0.1 warmup ratio. Training was distributed across 2 GPUs with a total batch size of 256.
Current Limitations
Specific details regarding the fine-tuning dataset, intended uses, and detailed limitations are not provided in the available documentation. Users should exercise caution and conduct thorough evaluations for specific applications, as the model's unique differentiators and performance characteristics beyond its base model are not explicitly stated.