Jenil05/Aether-1.5B-Agentic-core

Text Generation
Concurrency Cost: 1
Model Size: 1.5B
Quant: BF16
Ctx Length: 32k
Tool Calling: Supported
Published: Jun 4, 2026
License: apache-2.0
Architecture: Transformer
Open Weights
Cold

Jenil05/Aether-1.5B-Agentic-core is a 1.54 billion parameter language model based on the Qwen-2.5 Instruct architecture, developed by Jenil05. It is specifically optimized as an edge-native core routing hub and tool-execution engine for autonomous multi-agent orchestration frameworks. This model excels at deterministic JSON-schema parsing and argument extraction, ensuring valid and clean code output for API interactions. With a 4,096 token context window, it is designed for localized, low-latency deployment on consumer hardware.


Aether-1.5B-Agentic-core: An Edge-Native Agentic LLM

Aether-1.5B-Agentic-core, developed by Jenil05, is a 1.54 billion parameter language model built on the Qwen-2.5 Instruct architecture. It is designed to act as a core routing hub and tool-execution engine for multi-agent orchestration frameworks such as CrewAI, LangChain, and AutoGen. The goal is localized, low-latency deployment in which the model's structured output (tool calls and their arguments) stays well-formed.
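
To make the "routing hub" role concrete, here is a minimal sketch of how an orchestrator might dispatch a tool call produced by the model. It assumes the model wraps a single JSON object in <tool_call> tags, as Hermes-style function-calling data commonly does; the card does not state the exact output format, so treat that as an assumption. The tools `get_weather` and `add` and the `dispatch` helper are hypothetical names used only for illustration.

```python
import json
import re
from typing import Any, Callable, Dict

# Hypothetical tools; real deployments would register their own functions.
def get_weather(city: str, days: int = 1) -> str:
    return f"forecast for {city}, {days} day(s)"

def add(a: int, b: int) -> int:
    return a + b

TOOLS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather, "add": add}

# Assumed convention: one JSON object wrapped in <tool_call>...</tool_call>.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def dispatch(model_output: str) -> Any:
    """Extract the first tool call from the model output and run it."""
    match = TOOL_CALL_RE.search(model_output)
    if match is None:
        raise ValueError("no tool call found in model output")
    call = json.loads(match.group(1))
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    # Arguments are passed through as keyword arguments.
    return TOOLS[name](**call.get("arguments", {}))

sample = '<tool_call>{"name": "add", "arguments": {"a": 2, "b": 3}}</tool_call>'
result = dispatch(sample)
```

With the sample string above, `dispatch` looks up `add` in the registry and calls it with `a=2, b=3`. A production router would also need to handle multiple tool calls per turn and malformed JSON, which this sketch leaves out.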

Key Capabilities

  • Deterministic Structured Layouts: Tuned to emit valid, clean, and parseable JSON payloads that stay consistent with the supplied schema.
  • Zero-Dialogue Overhead: Emits raw arguments without conversational filler, reducing execution latency and token costs.
  • Strict Data-Type Preservation: Maps values mentioned in natural-language text to explicitly typed parameters (e.g., int, boolean, array).
  • Optimized for API Schemas: The attention layers were fine-tuned on function-calling data, targeting the code-output weaknesses typical of small models.
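
The capabilities above concern the shape and typing of extracted arguments. As a hedged illustration of what a caller might check on the receiving side, the sketch below validates a JSON argument payload against a simplified JSON-Schema-like description. The helper `check_arguments` and the example schema are hypothetical, and only a small subset of JSON Schema (`properties`, `required`, basic `type`) is handled.

```python
import json
from typing import Any, Dict, List

# Maps JSON-Schema type names to the Python types json.loads produces.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}

def check_arguments(raw: str, schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the payload passed."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["payload is not a JSON object"]

    problems: List[str] = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")
    for name, value in args.items():
        if name not in props:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = props[name].get("type")
        py_type = JSON_TYPES.get(expected)
        if py_type is None:
            continue  # type not covered by this simplified checker
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_bool_as_number = expected in ("integer", "number") and isinstance(value, bool)
        if is_bool_as_number or not isinstance(value, py_type):
            problems.append(f"{name}: expected {expected}, got {type(value).__name__}")
    return problems

# Hypothetical schema for a search tool.
schema = {
    "properties": {
        "limit": {"type": "integer"},
        "verbose": {"type": "boolean"},
        "tags": {"type": "array"},
    },
    "required": ["limit"],
}

ok = check_arguments('{"limit": 10, "verbose": true, "tags": ["a", "b"]}', schema)
bad = check_arguments('{"limit": "10", "verbose": true}', schema)
```

Here `ok` is an empty list, while `bad` reports that `limit` arrived as a string instead of an integer, which is the kind of type drift the "Strict Data-Type Preservation" bullet refers to.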

Training & Performance

The model was fine-tuned using Parameter-Efficient Fine-Tuning (PEFT) with Unsloth, employing 4-bit QLoRA applied to the complete attention mechanism. It was trained exclusively on multi-turn function-calling conversations from the NousResearch/hermes-function-calling-v1 dataset. The adapter was then merged into the base weights in 16-bit float (fp16), which removes adapter overhead at inference time. The merged model requires approximately 3.5 GB of VRAM, which puts it within reach of consumer hardware.
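
The VRAM figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below computes the size of the weights alone from the stated parameter count and 16-bit precision. Attributing the remaining gap up to roughly 3.5 GB to the KV cache, activations, and runtime overhead is an assumption, not something the card states.

```python
PARAMS = 1.54e9        # parameter count stated in the card
BYTES_PER_PARAM = 2    # 16-bit floating point

weight_bytes = PARAMS * BYTES_PER_PARAM
weight_gb = weight_bytes / 1e9       # decimal gigabytes, about 3.08
weight_gib = weight_bytes / 2**30    # binary gibibytes, about 2.87

print(f"weights: {weight_gb:.2f} GB ({weight_gib:.2f} GiB)")
```

The weights account for about 3.08 GB, so the quoted 3.5 GB leaves a few hundred megabytes of headroom for everything else held in memory during inference.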

Good For

  • Autonomous multi-agent orchestration frameworks requiring reliable tool-calling.
  • Applications needing deterministic JSON-schema parsing and argument extraction.
  • Edge deployments where low latency and efficient resource usage are critical.
  • Scenarios where small language models typically struggle with code parameter handling and syntax integrity.