tksoon/llama33_70bn_raft_v1

Text generation · Concurrency cost: 4 · Model size: 70B · Quant: FP8 · Context length: 8k · Published: Apr 7, 2026 · Architecture: Transformer

tksoon/llama33_70bn_raft_v1 is a 70-billion-parameter instruction-tuned language model based on the Llama 3.3 architecture, fine-tuned and converted to GGUF format using Unsloth. It is designed for general text generation and instruction following, and ships in several quantization levels (Q5_K_M, Q8_0, Q4_K_M, BF16) so it can be deployed across a range of hardware budgets.


Model Overview

The tksoon/llama33_70bn_raft_v1 is a 70 billion parameter instruction-tuned language model based on the Llama 3.3 architecture. It has been specifically fine-tuned and converted into the GGUF format using the Unsloth framework, which is noted for its accelerated training capabilities.

Key Features & Capabilities

  • Instruction-Tuned: Optimized for following user instructions and generating coherent, relevant text responses.
  • GGUF Format: Provided in GGUF format, making it compatible with llama.cpp and other inference engines that support this format.
  • Quantization Options: Available in multiple quantization levels, including Q5_K_M, Q8_0, Q4_K_M, and BF16, allowing users to balance performance and resource usage.
  • Ollama Support: Includes an Ollama Modelfile for straightforward deployment within the Ollama ecosystem.
  • Unsloth Optimization: Benefits from Unsloth's training optimizations, which enabled faster fine-tuning.
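As a sketch of how a GGUF build of this model might be loaded with the llama-cpp-python bindings: the model filename, system prompt, and sampling defaults below are assumptions for illustration, not taken from the repository.

```python
def build_messages(system_prompt: str, user_prompt: str) -> list:
    """Assemble a chat-completion message list in the standard role/content shape."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

if __name__ == "__main__":
    # Requires: pip install llama-cpp-python
    from llama_cpp import Llama

    llm = Llama(
        model_path="llama33_70bn_raft_v1.Q4_K_M.gguf",  # assumed filename; pick the quant that fits your hardware
        n_ctx=8192,        # matches the advertised 8k context length
        n_gpu_layers=-1,   # offload all layers to GPU if VRAM allows
    )
    out = llm.create_chat_completion(
        messages=build_messages(
            "You are a helpful assistant.",
            "Summarize the GGUF format in one sentence.",
        )
    )
    print(out["choices"][0]["message"]["content"])
```

Lower quantization levels (e.g. Q4_K_M) trade some output quality for a smaller memory footprint, while BF16 preserves full precision at roughly four times the size.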

Use Cases

This model suits applications that need a capable, instruction-following language model with efficient local deployment via GGUF or Ollama, including tasks such as:

  • General-purpose text generation
  • Instruction-based question answering
  • Content creation and summarization
  • Chatbot development
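For chatbot-style use cases served through Ollama, a minimal sketch of calling a locally running Ollama server over its REST generate endpoint is shown below. The local model name `llama33-raft` is a hypothetical name created from the bundled Modelfile, and the prompt is illustrative.

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> bytes:
    """Encode a non-streaming generate request body for the Ollama REST API."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")

def generate(model: str, prompt: str) -> str:
    """POST a prompt to the local Ollama server and return the generated text."""
    req = request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Assumes the model was registered locally, e.g. via `ollama create llama33-raft -f Modelfile`.
    print(generate("llama33-raft", "Answer in one sentence: what is instruction tuning?"))
```

Setting `"stream": False` returns the full response in a single JSON object; omit it to receive newline-delimited streaming chunks instead.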