JayZenith/glyph-sft-v1
JayZenith/glyph-sft-v1 is a 4-billion-parameter language model fine-tuned from Qwen3-4B-Base for structured task execution. It generates detailed plans, to-do lists, and tool calls in a custom 'TASK trace format'. The model targets agentic workflows and shows lower perplexity and better format adherence than its base model.
JayZenith/glyph-sft-v1: Agentic Task Trace Model
JayZenith/glyph-sft-v1 is a 4-billion-parameter model fine-tuned from Qwen/Qwen3-4B-Base to generate structured task traces. It produces detailed plans, to-do items, and tool calls, marked with satisfaction markers (⊨) and response blocks, for use in agentic applications.
Key Capabilities & Performance
This model was fine-tuned with LoRA on the attention and MLP layers, with particular attention to lm_head so that termination tokens are learned more reliably. Compared with its base model, it shows the following improvements:
- Perplexity Reduction: Achieved roughly 27% lower perplexity (2.64 vs. 3.60) on a held-out test set; equivalently, the base model's perplexity is about 36% higher.
- Format Adherence: In a 5-prompt generation evaluation, it produced 4/5 valid traces (compared to 0/5 for the base model); its outputs consistently ended with a response, included a plan, and avoided repetition and truncation.
- Tool Usage: Successfully used tools in all 4 instances where they were provided, a capability absent in the base model's evaluation.
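The perplexity comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the author's evaluation code: `perplexity` uses the standard definition (the exponential of the mean negative log-likelihood per token), and the two constants are the figures reported in this card. It prints the gap both ways, since "X% lower than base" and "base is X% higher" give different numbers for the same pair of values.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood), using natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def relative_reduction(base, tuned):
    """Fraction by which `tuned` is lower than `base`."""
    return (base - tuned) / base

def relative_excess(base, tuned):
    """Fraction by which `base` is higher than `tuned`."""
    return (base - tuned) / tuned

# Figures reported in this card (held-out test set).
BASE_PPL = 3.60
TUNED_PPL = 2.64

reduction = relative_reduction(BASE_PPL, TUNED_PPL)
excess = relative_excess(BASE_PPL, TUNED_PPL)
print(f"fine-tuned is {reduction:.1%} lower than base")   # 26.7%
print(f"base is {excess:.1%} higher than fine-tuned")     # 36.4%
```

As a sanity check on the definition, a model that assigns probability 0.5 to every token has a perplexity of exactly 2.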
Training Details
The model was trained for 3 epochs over 330 steps, with assistant-only loss masking. About 11.5% of the 4.54 billion parameters (521M) were trainable. Training used a custom, private dataset of 1,098 task traces.
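Assistant-only loss masking is usually implemented by replacing the labels of all non-assistant tokens with an ignore value, so that only assistant tokens contribute to the loss. The following is a minimal sketch of that idea, not the training code used for this model: `mask_labels`, the role tags, and the toy token ids are hypothetical, and -100 is assumed as the ignore value because it is the default `ignore_index` of PyTorch's `CrossEntropyLoss`. The last lines re-derive the trainable-parameter fraction from the two figures given above.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss

def mask_labels(token_ids, roles):
    """Return labels in which only assistant tokens contribute to the loss.

    token_ids: list[int] for one tokenized conversation
    roles:     list[str] of the same length, naming the role behind each token
    """
    if len(token_ids) != len(roles):
        raise ValueError("token_ids and roles must have the same length")
    return [tok if role == "assistant" else IGNORE_INDEX
            for tok, role in zip(token_ids, roles)]

# Toy conversation with made-up token ids: 3 user tokens, then 4 assistant tokens.
token_ids = [11, 12, 13, 21, 22, 23, 24]
roles = ["user"] * 3 + ["assistant"] * 4
labels = mask_labels(token_ids, roles)
print(labels)  # [-100, -100, -100, 21, 22, 23, 24]

# Trainable fraction from the figures reported in this card.
trainable_params = 521e6
total_params = 4.54e9
trainable_fraction = trainable_params / total_params
print(f"{trainable_fraction:.1%} of parameters trainable")  # 11.5%
```

In a real pipeline the role of each token would come from the chat template or from character offsets in the rendered trace, which this sketch does not attempt to reproduce.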
Current Status
glyph-sft-v1 is currently an SFT starting point for an RL run, not a final chat model. Its primary purpose is to serve as a foundation for further reinforcement learning in agentic contexts.