kosiasuzu/agenticml-agent-llama-3.1-8b-init

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | Tool Calling: Supported | Published: May 16, 2026 | License: llama3.1 | Architecture: Transformer | Cold

kosiasuzu/agenticml-agent-llama-3.1-8b-init is an 8-billion-parameter Llama-3.1 base model, developed by kosiasuzu, with an 8192-token context window. Eleven of its reserved special tokens are initialized with embeddings taken from semantically related content tokens, in preparation for fine-tuning on the Telos agent trajectory format. The model is intended as a foundational checkpoint for developing agentic LLMs: it gives the Telos-specific markers a meaningful starting representation.


Overview

This model, kosiasuzu/agenticml-agent-llama-3.1-8b-init, is an 8-billion-parameter Llama-3.1 base model. Its primary distinction is the in-place initialization of eleven reserved special tokens in its embed_tokens and lm_head matrices. These tokens, such as <|goal|>, <|mission|>, and <|action|>, are each seeded with embeddings derived from two to three semantically related content tokens. This addresses a limitation of the vanilla Llama-3.1-8B base model, in which these reserved tokens had all-zero embeddings: they were effectively invisible to the model as input and could not be generated as output.
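The seeding idea can be illustrated with a small, framework-free sketch. The card does not say how the two to three source embeddings are combined, so the element-wise mean used here is an assumption, and the names mean_row and seed_reserved_rows are hypothetical. The same operation would be applied to both the embed_tokens and lm_head matrices.

```python
def mean_row(rows):
    """Element-wise mean of equally sized embedding rows."""
    if not rows:
        raise ValueError("need at least one source row")
    dim = len(rows[0])
    return [sum(r[i] for r in rows) / len(rows) for i in range(dim)]


def seed_reserved_rows(matrix, seed_map):
    """Overwrite reserved rows in place with the mean of their source rows.

    matrix   : list of embedding rows (one list of floats per token id)
    seed_map : {reserved_token_id: [source_token_id, ...]}
    Averaging is an assumption; the card only says the rows are "derived from"
    related content tokens.
    """
    # Compute every new row from the untouched matrix before writing anything,
    # so one reserved row can never influence another.
    new_rows = {
        reserved_id: mean_row([matrix[i] for i in source_ids])
        for reserved_id, source_ids in seed_map.items()
    }
    for reserved_id, row in new_rows.items():
        matrix[reserved_id] = row
    return matrix


# Toy example: row 3 plays the role of an all-zero reserved token,
# rows 0 and 1 play the role of its related content tokens.
embed = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
seed_reserved_rows(embed, {3: [0, 1]})
print(embed[3])  # [0.5, 0.5]
```

In the toy matrix the reserved row moves from the origin to the midpoint of its two source rows, which is the sense in which a seeded token starts from a "meaningful" position rather than from zero.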

Key Characteristics

  • Base Model: meta-llama/Llama-3.1-8B.
  • Token Initialization: Eleven Telos-specific reserved tokens (e.g., <|goal|>, <|action|>) have non-zero embeddings, allowing them to contribute meaningful signal.
  • Purpose: Serves as a starting point for fine-tuning on Telos-formatted agent trajectories.
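One way to confirm the non-zero initialization locally is sketched below. It assumes the repository's tokenizer maps each marker string to a single token id, which the card implies but does not state, and it lists only the three markers the card names. The helper names is_zero_row and check_markers are hypothetical; the Hugging Face calls used are AutoTokenizer.from_pretrained, AutoModelForCausalLM.from_pretrained, convert_tokens_to_ids, get_input_embeddings, and get_output_embeddings.

```python
import math

MODEL_ID = "kosiasuzu/agenticml-agent-llama-3.1-8b-init"
# Only the three markers named in the card; the remaining eight are not listed there.
MARKERS = ["<|goal|>", "<|mission|>", "<|action|>"]


def is_zero_row(row, tol=1e-8):
    """True if the L2 norm of an embedding row is numerically zero."""
    return math.sqrt(sum(float(x) ** 2 for x in row)) <= tol


def check_markers(model_id=MODEL_ID, markers=MARKERS):
    """Report, per marker, its token id and whether its rows are all-zero."""
    # Heavy imports are deferred so is_zero_row works without them installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_id)
    # Loads the full 8B weights; add dtype/device arguments suited to your hardware.
    model = AutoModelForCausalLM.from_pretrained(model_id)
    embed = model.get_input_embeddings().weight   # embed_tokens matrix
    head = model.get_output_embeddings().weight   # lm_head matrix

    report = {}
    for marker in markers:
        tid = tok.convert_tokens_to_ids(marker)
        if tid is None:
            # Assumption violated: the marker is not a single vocabulary entry.
            report[marker] = {"id": None}
            continue
        report[marker] = {
            "id": tid,
            "embed_zero": is_zero_row(embed[tid].float().tolist()),
            "head_zero": is_zero_row(head[tid].float().tolist()),
        }
    return report
```

Calling check_markers() downloads the checkpoint, so it is best run once on a machine that will also be used for fine-tuning. Running the same check against meta-llama/Llama-3.1-8B with the corresponding reserved token strings would show the contrast the card describes.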

Intended Use

This checkpoint is specifically designed to be the base for fine-tuning agent models using the Telos format. It ensures that the model can properly process and eventually generate the specialized Telos markers after fine-tuning. It should be used in the same manner as the plain Llama-3.1-8B base model for subsequent training.

Important Considerations

  • Not a fine-tuned agent model: This checkpoint is not instruction-tuned and will not follow instructions or produce the Telos format correctly out of the box.
  • Base Model Behavior: Its behavior on general tasks is identical or near-identical to the vanilla Llama-3.1-8B base model, inheriting all its limitations and biases.
  • License: Inherits the Llama 3.1 Community License.