myra/negation_llama_chat
The myra/negation_llama_chat model is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It supports a 4096-token context length and was trained with a learning rate of 2e-05 over 3 epochs. The model is a specialized iteration of the Llama-2-7b-chat architecture, though its specific differentiation and intended uses beyond general chat are not documented.
Overview
The myra/negation_llama_chat model is a fine-tuned variant of the Meta Llama-2-7b-chat-hf architecture. This 7 billion parameter model is designed for chat-based applications, leveraging the foundational capabilities of the Llama 2 series. It was trained using a learning rate of 2e-05, a total batch size of 32, and an Adam optimizer with cosine learning rate scheduling over 3 epochs.
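The training script itself is not published, but the reported hyperparameters map naturally onto a standard Hugging Face `Trainer` configuration. The sketch below is illustrative only: the dataset, the per-device batch size (8 per device × 4 GPUs = 32 total is an assumption), and the use of the `Trainer` API are not confirmed by the documentation.

```python
# Hypothetical mapping of the reported hyperparameters onto the
# Hugging Face TrainingArguments API. Only the values named in the
# model card (learning rate, total batch size, epochs, cosine schedule,
# Adam-family optimizer) are taken from the source; everything else
# is an assumption for illustration.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="negation_llama_chat",
    learning_rate=2e-5,              # reported learning rate
    per_device_train_batch_size=8,   # assumption: 8 per device x 4 GPUs = total batch size 32
    num_train_epochs=3,              # reported number of epochs
    lr_scheduler_type="cosine",      # reported cosine learning rate schedule
    optim="adamw_torch",             # Adam-family optimizer, as reported
    bf16=True,                       # assumption: mixed precision on modern GPUs
)
```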
Key Characteristics
- Base Model: Fine-tuned from meta-llama/Llama-2-7b-chat-hf.
- Parameters: 7 billion.
- Context Length: Supports a context window of 4096 tokens.
- Training Details: Utilized a multi-GPU setup (4 devices) with a learning rate of 2e-05, Adam optimizer, and a cosine learning rate scheduler.
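Since the documentation does not include usage instructions, the following is a minimal inference sketch assuming the weights are published on the Hugging Face Hub under the `myra/negation_llama_chat` identifier and that the model inherits the standard Llama 2 chat template from its base model; neither assumption is stated in the source.

```python
# Minimal inference sketch, assuming the model is available on the
# Hugging Face Hub as "myra/negation_llama_chat" and uses the standard
# Llama 2 chat template inherited from meta-llama/Llama-2-7b-chat-hf.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "myra/negation_llama_chat"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit the 7B model on a single GPU
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

# Generate within the 4096-token context window of the base architecture.
output = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```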
Limitations and Further Information
The available documentation does not describe the fine-tuning dataset, the model's intended uses, its unique differentiators, or any performance benchmarks. Without this information, the model's specific strengths, weaknesses, and optimal use cases beyond general chat remain undefined, and further information is needed to fully assess its capabilities and limitations.