Vedika35/VEDIKA-3.5-LIVE
VEDIKA-3.5-LIVE is an instruction-tuned, 3.09-billion-parameter causal language model from the Qwen2.5 series, developed by Qwen. It offers a 32,768-token context length and improved capabilities in coding, mathematics, and instruction following. The model excels at generating long texts, understanding structured data, and producing structured outputs such as JSON, with multilingual support for over 29 languages.
VEDIKA-3.5-LIVE: Qwen2.5 Series Instruction Model
VEDIKA-3.5-LIVE is a 3.09-billion-parameter instruction-tuned model from the Qwen2.5 series, developed by Qwen. It builds on the Qwen2 architecture: a transformer with RoPE positional embeddings, SwiGLU activations, and RMSNorm. The model accepts a 32,768-token input context and can generate up to 8,192 tokens per request.
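The context and generation limits above can be validated before sending a request. A minimal sketch in plain Python (the constants come from the figures quoted in this card; the helper function is illustrative, not part of any official SDK):

```python
# Documented limits for VEDIKA-3.5-LIVE:
# 32,768-token context window, up to 8,192 generated tokens per request.
CONTEXT_LENGTH = 32_768
MAX_NEW_TOKENS = 8_192

def fits_in_context(prompt_tokens: int, new_tokens: int = MAX_NEW_TOKENS) -> bool:
    """Return True if the prompt plus the requested generation budget
    fits inside the model's context window."""
    if new_tokens > MAX_NEW_TOKENS:
        return False  # the model cannot emit more than 8,192 tokens in one call
    return prompt_tokens + new_tokens <= CONTEXT_LENGTH

print(fits_in_context(20_000))           # 20,000 + 8,192 <= 32,768 -> True
print(fits_in_context(30_000))           # 30,000 + 8,192 >  32,768 -> False
print(fits_in_context(30_000, 1_024))    # smaller generation budget  -> True
```

In practice, `prompt_tokens` would come from the model's tokenizer; the check itself is just the arithmetic implied by the two limits.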
Key Capabilities
- Enhanced Knowledge & Reasoning: Significantly improved performance in coding and mathematics, drawing on specialized expert models in those domains.
- Superior Instruction Following: Better at following instructions, generating long texts (over 8K tokens), and understanding and generating structured data such as JSON.
- Robust Chatbot Implementation: More resilient to diverse system prompts, improving role-play and condition-setting for conversational agents.
- Multilingual Support: Supports over 29 languages, including major global languages like Chinese, English, French, Spanish, German, and Japanese.
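The system-prompt and structured-output capabilities above can be sketched end to end. The message list below follows the chat-message convention commonly used with instruction-tuned models of this family; the `reply` string is a stand-in for model output (a real call to the model would produce it), so only the JSON handling is demonstrated:

```python
import json

# Chat-style request: a system prompt sets the condition, a user turn asks
# for structured output. This is the widely used role/content message format.
messages = [
    {"role": "system", "content": "You are a helpful assistant. Reply only with JSON."},
    {"role": "user", "content": "Report the model's context length and max output tokens."},
]

# Stand-in for the model's reply; because the model is tuned to emit
# structured output, a well-formed JSON string like this can be expected.
reply = '{"context_length": 32768, "max_output_tokens": 8192}'

# The structured reply parses directly into a Python object.
data = json.loads(reply)
print(data["context_length"], data["max_output_tokens"])  # 32768 8192
```

Parsing the reply with `json.loads` rather than string matching is the point of structured output: downstream code can consume the result as data.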
Good For
- Applications requiring strong coding and mathematical reasoning at a 3B parameter scale.
- Chatbots and conversational AI demanding precise instruction following and structured output generation.
- Tasks involving long-form text generation and processing structured data.
- Multilingual applications needing broad language coverage.