koutch/short_paper_llama_1.json_train_dpo_v4_train_no_think
koutch/short_paper_llama_1.json_train_dpo_v4_train_no_think is an 8-billion-parameter, instruction-tuned causal language model in the Llama 3.1 family, developed by koutch. It was fine-tuned with Unsloth and Hugging Face's TRL library for faster training, and it targets general language generation tasks with a 32,768-token context length.
Overview
This model, developed by koutch, is an 8-billion-parameter, instruction-tuned causal language model in the Llama 3.1 family. It was fine-tuned from the unsloth/meta-llama-3.1-8b-instruct-bnb-4bit checkpoint.
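The card documents only the base checkpoint and the use of Unsloth with TRL; the exact training recipe is not published. The sketch below is therefore purely illustrative: it assumes a DPO-style preference-tuning run (hinted at by the repo name but not confirmed by the card), a LoRA adapter setup, and a hypothetical preference dataset named preferences.json.

```python
# Illustrative sketch only: the actual dataset, hyperparameters, and trainer
# (DPO is an assumption based on the repo name) are not documented in the card.
from datasets import load_dataset
from trl import DPOConfig, DPOTrainer
from unsloth import FastLanguageModel

# Load the documented 4-bit base checkpoint through Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/meta-llama-3.1-8b-instruct-bnb-4bit",
    max_seq_length=2048,  # training length; the model itself supports up to 32,768
    load_in_4bit=True,
)

# Fine-tuning a 4-bit checkpoint is typically done through LoRA adapters.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)

# Hypothetical preference data with "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("json", data_files="preferences.json", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="outputs", beta=0.1, per_device_train_batch_size=2),
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```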
Key Characteristics
- Architecture: Based on the Llama 3.1 family.
- Parameter Count: 8 billion parameters.
- Training Efficiency: Fine-tuned using Unsloth and Hugging Face's TRL library, which speed up training.
- Context Length: Supports a 32,768-token context window (see the loading sketch after this list).
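A minimal loading sketch, assuming the checkpoint is available on the Hugging Face Hub under the repo id above and that 4-bit loading is acceptable:

```python
from unsloth import FastLanguageModel

# Load the fine-tuned model with its full documented context window.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="koutch/short_paper_llama_1.json_train_dpo_v4_train_no_think",
    max_seq_length=32768,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch Unsloth to its faster inference path
```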
Use Cases
This model is suited to general-purpose language generation and instruction-following tasks, benefiting from its Llama 3.1 foundation and efficient fine-tuning.
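As a usage illustration, the model should also load with plain transformers; Llama 3.1 instruct models expect their chat template for instruction following. The prompt and generation settings here are arbitrary examples:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "koutch/short_paper_llama_1.json_train_dpo_v4_train_no_think"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Format the request with the Llama 3.1 chat template before generating.
messages = [{"role": "user", "content": "Explain beam search in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```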