CharlesLi/llama_2_cot_simplest_alpaca_5_full

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jan 20, 2025 · License: llama2 · Architecture: Transformer (open weights)

CharlesLi/llama_2_cot_simplest_alpaca_5_full is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It is adapted for conversational AI and instruction-following tasks, building on the Llama 2 chat foundation with a 4096-token context window.


Model Overview

This model, llama_2_cot_simplest_alpaca_5_full, is a 7-billion-parameter language model derived from Meta's Llama-2-7b-chat-hf. It has been fine-tuned on a dataset identified in the training metadata only as "generator", with the aim of improving performance in conversational and instruction-following scenarios.

Key Characteristics

  • Base Model: Fine-tuned from meta-llama/Llama-2-7b-chat-hf.
  • Parameter Count: 7 billion parameters.
  • Context Length: Supports a context window of 4096 tokens.
  • Evaluation Result: Reached an evaluation loss of 0.7572 after fine-tuning on the generator dataset.

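As a sketch of how the model might be loaded and queried (the repository name comes from this card; the chat template is the standard Llama-2 `[INST]` format, which this fine-tune is assumed to inherit from the base chat model):

```python
# Hedged usage sketch; prompt format is an assumption carried over from
# Llama-2-7b-chat-hf, not something stated on this model card.
def build_prompt(instruction: str,
                 system: str = "You are a helpful assistant.") -> str:
    """Wrap a single instruction in the Llama-2 [INST] chat format."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{instruction} [/INST]"


def generate(instruction: str, max_new_tokens: int = 256) -> str:
    """Load the model with Hugging Face transformers and generate a reply.

    Requires `transformers` and `torch`; imports are deferred so the
    prompt helper above stays usable without them installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "CharlesLi/llama_2_cot_simplest_alpaca_5_full"
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    inputs = inputs.to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Note that `generate` loads roughly 13 GB of weights; `device_map="auto"` lets `accelerate` place layers across available GPUs.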
Training Details

The model was fine-tuned for 1 epoch with a learning rate of 2e-05, a per-device batch size of 4 (total train batch size of 32 across 4 GPUs, implying gradient accumulation), the Adam optimizer, and a cosine learning rate scheduler.
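The reported totals imply 2 gradient-accumulation steps (32 = 4 GPUs × 4 per device × 2), and a cosine scheduler decays the learning rate from 2e-05 toward zero over the run. A minimal sketch of that arithmetic (the warmup-free schedule form is an assumption; the Hugging Face Trainer's cosine schedule also supports an optional warmup phase):

```python
import math

# Values reported in the training details above.
LEARNING_RATE = 2e-05
PER_DEVICE_BATCH = 4
NUM_GPUS = 4
TOTAL_BATCH = 32

# Implied gradient-accumulation steps: 32 / (4 GPUs * 4 per device) = 2.
grad_accum_steps = TOTAL_BATCH // (PER_DEVICE_BATCH * NUM_GPUS)


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = LEARNING_RATE) -> float:
    """Cosine decay from peak_lr at step 0 to ~0 at total_steps (no warmup)."""
    progress = step / total_steps
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```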

Potential Use Cases

  • Conversational AI: Suitable for chatbots and interactive agents.
  • Instruction Following: Can be applied to tasks requiring the model to adhere to specific instructions or prompts.
  • Text Generation: Capable of generating coherent and contextually relevant text based on input.
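For the conversational use case, multi-turn history can be packed into the standard Llama-2 chat format before generation (a sketch; the template is assumed to be inherited unchanged from the base chat model):

```python
def build_chat(history: list[tuple[str, str]], user_msg: str,
               system: str = "You are a helpful assistant.") -> str:
    """Format prior (user, assistant) turns plus a new user message
    in the Llama-2 chat template."""
    prompt = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
    for i, (user, assistant) in enumerate(history):
        if i == 0:
            # The system block and first user turn share one [INST] span.
            prompt += f"{user} [/INST] {assistant} </s>"
        else:
            prompt += f"<s>[INST] {user} [/INST] {assistant} </s>"
    if history:
        prompt += f"<s>[INST] {user_msg} [/INST]"
    else:
        prompt += f"{user_msg} [/INST]"
    return prompt
```

The returned string is passed to the tokenizer as-is; the model's reply to the final `[/INST]` is then appended to `history` for the next turn.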