wandgibaut/qwen-1.7b-gpt-oss-20b-pt-BR-distilled

Text generation · Concurrency cost: 1 · Model size: 2B · Quant: BF16 · Context length: 32k · Published: Apr 29, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

The wandgibaut/qwen-1.7b-gpt-oss-20b-pt-BR-distilled model is a roughly 2 billion parameter student model built on the Qwen3-1.7B architecture and distilled from the larger openai/gpt-oss-20b teacher model. It was fine-tuned with LoRA and TRL's SFTTrainer on a synthetic dataset of 1,000 Brazilian Portuguese samples. The model is optimized for efficient performance on Portuguese language tasks, leveraging knowledge transferred from a more capable teacher.


Model Overview

This model, wandgibaut/qwen-1.7b-gpt-oss-20b-pt-BR-distilled, is a student model built on the Qwen3-1.7B architecture (roughly 2 billion parameters in total). It was created through a knowledge distillation process, transferring capabilities from the larger openai/gpt-oss-20b teacher model.
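
For reference, here is a minimal inference sketch using the standard Hugging Face transformers API. It assumes the checkpoint inherits the Qwen3 chat template from its base model; the Portuguese prompt is only illustrative.

```python
# Minimal inference sketch; assumes the checkpoint ships with the Qwen3
# chat template, and the Portuguese prompt below is only illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "wandgibaut/qwen-1.7b-gpt-oss-20b-pt-BR-distilled"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Explique em poucas frases o que é destilação de conhecimento."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```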

Key Characteristics

  • Distilled Architecture: Leverages the Qwen3-1.7B base model as a student, learning from a more powerful teacher.
  • Brazilian Portuguese Focus: Specifically fine-tuned on a synthetic dataset of 1000 samples generated by the teacher model, derived from dominguesm/alpaca-data-pt-br.
  • Efficient Training: Utilizes LoRA (r=16, alpha=32, dropout=0.05) and SFTTrainer for 3 epochs, with a learning rate of 0.0002 and a batch size of 2; a sketch of this setup follows the list.
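
To make the training recipe concrete, here is a sketch reconstructing the stated LoRA/SFT setup with peft and trl. The Alpaca-style column names, the choice of which 1,000 examples, and the base checkpoint ID are assumptions; the real run trained on synthetic responses generated by the gpt-oss-20b teacher, with the source dataset standing in here.

```python
# Training sketch reconstructing the stated setup. The actual run used
# 1,000 teacher-generated synthetic samples; dominguesm/alpaca-data-pt-br
# stands in here, and the column names and base checkpoint are assumptions.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("dominguesm/alpaca-data-pt-br", split="train").select(range(1000))

def to_text(example):
    # Collapse Alpaca-style columns into one training string; the real
    # pipeline used responses generated by openai/gpt-oss-20b instead.
    prompt = example["instruction"]
    if example.get("input"):
        prompt += "\n" + example["input"]
    return {"text": prompt + "\n" + example["output"]}

dataset = dataset.map(to_text)

peft_config = LoraConfig(
    r=16,              # LoRA rank, as stated in the card
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="qwen-1.7b-pt-BR-distilled",
    num_train_epochs=3,
    learning_rate=2e-4,  # i.e. 0.0002
    per_device_train_batch_size=2,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-1.7B",  # student base model (assumed checkpoint ID)
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```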

Use Cases

This model is particularly well-suited for applications requiring a smaller, more efficient language model with strong performance in Brazilian Portuguese. The distillation process aims to balance capability against computational cost, making it suitable for:

  • Text generation in Brazilian Portuguese.
  • Applications where resource efficiency is critical.
  • Tasks benefiting from knowledge transfer from a larger model.