Delta-Vector/Austral-70B-Winton
Text generation · Concurrency cost: 4 · Model size: 70B · Quant: FP8 · Context length: 32k · Published: Jun 25, 2025 · License: apache-2.0 · Architecture: Transformer

Delta-Vector/Austral-70B-Winton is a 70 billion parameter Llama-based language model fine-tuned by Delta-Vector. This model is specifically optimized for generalist roleplay and adventure scenarios, enhancing coherency and intelligence while maintaining creative capabilities. It utilizes KTO (Kahneman-Tversky Optimization) training to refine its performance, building upon the Austral-70B-Preview base model. With a 32768 token context length, it is designed for engaging and consistent narrative generation.


Overview

Delta-Vector/Austral-70B-Winton is a 70 billion parameter language model developed by Delta-Vector, building on Austral-70B-Preview. It is a Llama-based finetune of Sao's Vulpecula, further refined with KTO (Kahneman-Tversky Optimization).

Key Capabilities

  • Generalist Roleplay/Adventure: Optimized for generating engaging and coherent narratives in roleplay and adventure contexts.
  • Enhanced Coherency and Intelligence: KTO training improves the model's logical consistency and overall intelligence, reducing the repetitive phrasing ("slop") common in other models.
  • Creative Generation: Maintains strong creative capabilities suitable for diverse storytelling.
  • Llama-3 Instruct Chat Format: Utilizes the Llama-3 Instruct chat template for structured conversations.
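Since the model expects the Llama-3 Instruct chat template, a minimal sketch of assembling a single-turn prompt by hand (for clients that do not apply the template automatically) looks like this; the system and user strings are placeholders:

```python
def llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama-3 Instruct format.

    Each message is wrapped in header tokens and terminated with <|eot_id|>;
    the prompt ends with an open assistant header so the model continues
    from there.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = llama3_prompt("You are a narrator.", "Begin the adventure.")
```

In practice, `tokenizer.apply_chat_template` from the model's repository produces the same structure and is the safer choice for multi-turn conversations.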

Training Details

The model was first fine-tuned on Sao's Vulpecula, trained as a rank-128 (R128) LoRA in 16-bit precision for 2 epochs. The Winton version then underwent KTO training for 1 epoch on a mix of instruct and writing datasets to address coherency issues. The full training run, base SFT plus KTO, took roughly 48 hours on 8 x A100 GPUs.
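For intuition, the per-example KTO objective can be sketched in plain Python. This is a simplified sketch of the loss shape from the KTO paper (the per-class weighting terms are omitted, and `beta` and the reference point `z0` here are illustrative defaults, not the values used to train this model):

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def kto_loss(log_ratio: float, desirable: bool,
             beta: float = 0.1, z0: float = 0.0) -> float:
    """Simplified per-example KTO loss.

    log_ratio: log pi_theta(y|x) - log pi_ref(y|x), the policy's log-prob
               advantage over the reference model for completion y.
    desirable: whether the example is labeled desirable or undesirable.
    z0:        the reference point (in KTO, an estimate of the KL between
               policy and reference); 0.0 here for illustration.
    """
    if desirable:
        # Desirable examples: loss shrinks as the policy upweights y.
        return 1.0 - sigmoid(beta * (log_ratio - z0))
    # Undesirable examples: loss shrinks as the policy downweights y.
    return 1.0 - sigmoid(beta * (z0 - log_ratio))
```

Unlike DPO, KTO needs only a binary desirable/undesirable label per example rather than paired preferences, which suits the mixed instruct/writing data described above.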

Quantization

Delta-Vector/Austral-70B-Winton is available in various quantization formats for broader compatibility:

  • GGUF: For use with llama.cpp and its forks.
  • EXL3: For use with TabbyAPI.
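As a sketch, a GGUF quant of this model could be run locally with llama.cpp's CLI in conversation mode, which applies the chat template from the GGUF metadata automatically. The filename and quant level below are hypothetical; check the actual GGUF repository for the available files:

```shell
# Hypothetical filename -- substitute an actual quant file from the GGUF repo.
./llama-cli \
  -m Austral-70B-Winton-Q4_K_M.gguf \
  -c 32768 \
  -cnv
```

`-c 32768` requests the model's full 32k context window; reduce it if VRAM is limited.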
Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model are shown interactively on the site. The configurable parameters are: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
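These parameters map directly onto the fields of an OpenAI-compatible completion request. A sketch of such a settings payload is below; the values are illustrative placeholders, not the actual Featherless presets:

```python
# Illustrative values only -- see the model page for the presets users favor.
sampler_settings = {
    "temperature": 1.0,          # randomness of token selection
    "top_p": 0.95,               # nucleus sampling cutoff
    "top_k": 64,                 # limit candidates to the top-k tokens
    "frequency_penalty": 0.0,    # penalize tokens by how often they appeared
    "presence_penalty": 0.0,     # penalize tokens that appeared at all
    "repetition_penalty": 1.05,  # multiplicative penalty on repeats
    "min_p": 0.05,               # drop tokens below this fraction of the top prob
}
```

For roleplay use, min_p combined with a moderate temperature is a common starting point, since it prunes low-probability tokens without flattening the distribution.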