Plutus_Advanced_model Overview
The Remostart/Plutus_Advanced_model is a 1.5-billion-parameter language model built on the architecture of Qwen/Qwen2.5-1.5B-Instruct. It features an exceptionally large context window of 131,072 tokens, enabling it to process and understand very long sequences of text.
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen2.5-1.5B-Instruct, inheriting its foundational capabilities.
- Parameter Count: A compact 1.5 billion parameters, offering a balance between performance and computational efficiency.
- Extended Context Length: A 131,072-token context window, ideal for tasks requiring deep contextual understanding across extensive documents or conversations.
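To put these two numbers in perspective, here is a back-of-envelope sizing sketch. The byte-per-parameter figures are standard dtype sizes and the words-per-token ratio is a common English-text heuristic, not values from this model card:

```python
# Rough sizing for a 1.5B-parameter model with a 131072-token window.
# Dtype sizes (2 bytes fp16/bf16, 4 bytes fp32) are standard; the
# 0.75 words-per-token ratio is a heuristic, not a tokenizer guarantee.
PARAMS = 1.5e9
CONTEXT_TOKENS = 131072

fp16_gib = PARAMS * 2 / 1024**3  # weight memory in half precision
fp32_gib = PARAMS * 4 / 1024**3  # weight memory in full precision
approx_words = int(CONTEXT_TOKENS * 0.75)

print(f"fp16 weights: ~{fp16_gib:.1f} GiB")
print(f"fp32 weights: ~{fp32_gib:.1f} GiB")
print(f"context window: ~{approx_words:,} English words")
```

So the weights alone need roughly 3 GiB in half precision, and the window can hold on the order of a hundred thousand words, though KV-cache memory at full context length will add substantially to that footprint.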
Training Details
The model was fine-tuned for 2 epochs with a learning rate of 1e-05, using the adamw_torch_fused optimizer. Training used a total batch size of 8 with 8 gradient accumulation steps. The training procedure used Transformers 4.57.3 and PyTorch 2.9.0+cu126.
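The reported hyperparameters can be sketched as a `TrainingArguments` configuration. This is an illustrative reconstruction, not the authors' actual script: the argument names follow the Transformers `TrainingArguments` API, the output directory is hypothetical, and the per-device batch size is an assumption inferred from the total batch size and accumulation steps:

```python
from transformers import TrainingArguments

# Illustrative config mirroring the reported hyperparameters.
args = TrainingArguments(
    output_dir="plutus-finetune",       # hypothetical output path
    num_train_epochs=2,
    learning_rate=1e-5,
    optim="adamw_torch_fused",
    per_device_train_batch_size=1,      # assumed: 1 x 8 accumulation = total 8
    gradient_accumulation_steps=8,
)
```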
Potential Use Cases
Given its large context window and instruction-tuned base, this model is potentially well-suited for:
- Long-form content analysis: Summarizing, extracting information, or answering questions from very long documents.
- Complex dialogue systems: Maintaining coherence and context over extended conversations.
- Code analysis: Processing large codebases for understanding, debugging, or generation tasks.
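For the long-document use cases above, a useful pre-flight step is checking whether an input is likely to fit in the 131,072-token window. The sketch below uses a rough 4-characters-per-token heuristic for English text; for exact counts you would tokenize with the model's own tokenizer instead:

```python
# Rough pre-flight check for long-document tasks. The 4-chars-per-token
# ratio is a crude English-text heuristic, not the Qwen2.5 tokenizer's
# actual behavior; use the real tokenizer for exact counts.
CONTEXT_WINDOW = 131072
CHARS_PER_TOKEN = 4

def estimated_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, reserved_for_output: int = 1024) -> bool:
    """True if the prompt likely fits, leaving room for generated tokens."""
    return estimated_tokens(text) + reserved_for_output <= CONTEXT_WINDOW

doc = "word " * 50_000  # ~250,000 characters of input
print(fits_in_context(doc))
```

Documents that fail this check would need to be chunked or summarized hierarchically before being passed to the model.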