typhoon-ai/llama3.2-typhoon2-t1-3b-research-preview (Hugging Face)

Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32K · Published: Dec 27, 2024 · License: llama3.2 · Architecture: Transformer

Typhoon T1 3B (Research Preview) is the first model in the new Typhoon T family of open reasoning models developed by SCB 10X, built on the Llama 3.2 architecture. This 3-billion-parameter instruct model is designed to "think longer" before generating a final answer, offering improved performance on challenging benchmarks like GPQA, MMLU Pro, and the AI Mathematics Olympiad validation set. Unlike many open reasoning models, which are limited to mathematics and coding, it reasons across diverse domains, and it uniquely supports generating Thai reasoning traces for greater transparency.


What the fuck is this model about?

Typhoon T1 3B (Research Preview) is the inaugural model in the Typhoon T family, a new line of open reasoning models developed by SCB 10X. Based on the Llama 3.2 architecture, this 3-billion-parameter instruct model introduces a "reasoning model" paradigm, performing more extensive internal processing before delivering a final response. It aims to deliver capable performance on modest compute budgets by scaling test-time computation.

What makes THIS different from all the other models?

  • Open Reasoning Model: It's specifically engineered for reasoning, capable of thinking longer to improve answer quality, and is open-sourced without distillation from other reasoning models.
  • Domain-Agnostic Reasoning: Unlike many open reasoning models that focus solely on mathematics or coding, Typhoon T1 3B is designed to reason across various domains.
  • Structured Thinking Paradigm: It introduces a new approach using auxiliary tokens to structure the model's thinking process, which has shown performance increases over simpler thought/response separation.
  • Multilingual Reasoning Traces: The v2025-02-01 update uniquely enables the generation of Thai reasoning traces, enhancing transparency and interpretability for Thai language tasks, alongside improved general Thai performance.
  • Improved Benchmark Performance: It demonstrates superior performance on challenging benchmarks such as GPQA, MMLU Pro, and the AI Mathematics Olympiad validation set compared to its base model, Typhoon 2 3B Instruct.
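The "structured thinking" idea above can be illustrated with a small parser that separates the reasoning trace from the final answer. This is a minimal sketch, not the model's actual API: the `<thought>`/`<response>` tag names are assumptions, since the exact auxiliary tokens are not documented here.

```python
import re

# Hypothetical auxiliary tags; the real Typhoon T1 token names may differ.
THOUGHT_RE = re.compile(r"<thought>(.*?)</thought>", re.DOTALL)
RESPONSE_RE = re.compile(r"<response>(.*?)</response>", re.DOTALL)

def split_reasoning(generation: str) -> dict:
    """Separate a generation into its reasoning trace and final answer."""
    thought = THOUGHT_RE.search(generation)
    response = RESPONSE_RE.search(generation)
    return {
        "thought": thought.group(1).strip() if thought else "",
        "answer": response.group(1).strip() if response else generation.strip(),
    }

sample = (
    "<thought>Two trains 60 km apart close at 30 km/h, so they meet in 2 hours.</thought>"
    "<response>2 hours</response>"
)
parsed = split_reasoning(sample)
print(parsed["answer"])  # -> 2 hours
```

Keeping the trace and the answer separate is what makes the reasoning inspectable: you can log or display the `thought` field (including Thai traces) without it leaking into the user-facing answer.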

Should I use this for my use case?

This model is particularly suitable if your application requires:

  • Enhanced Reasoning Capabilities: For tasks demanding deeper logical processing and problem-solving across various subjects.
  • Low-Compute, Capable Models: When you need a powerful model that can run efficiently on less powerful hardware, leveraging its ability to scale test-time compute.
  • Transparency and Interpretability: Especially for Thai language applications, where its unique ability to generate Thai reasoning traces can be highly beneficial.
  • Challenging Academic or Complex Tasks: Its improved performance on benchmarks like GPQA and MMLU Pro suggests its utility for complex analytical or knowledge-based queries.
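The "scale test-time compute" point above can be sketched as self-consistency-style majority voting: sample several answers from the model and keep the most common one. This is a generic sketch of the technique, not Typhoon's documented inference recipe; `sample_answer` is a hypothetical stand-in for a real sampling call to the model.

```python
from collections import Counter
from typing import Callable, List

def majority_vote(sample_answer: Callable[[], str], n_samples: int = 5) -> str:
    """Draw several independent answers and return the most frequent one."""
    answers: List[str] = [sample_answer() for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

# Stub standing in for a real model call sampled with temperature > 0.
fake_outputs = iter(["42", "41", "42", "42", "40"])
result = majority_vote(lambda: next(fake_outputs), n_samples=5)
print(result)  # -> 42
```

The trade-off is linear: five samples cost roughly five times one generation, which is the sense in which a small 3B model can trade extra inference time for answer quality instead of extra parameters.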