lewtun/qwen3-4b-capybara
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 18, 2026Architecture:Transformer Cold
The lewtun/qwen3-4b-capybara is a 4 billion parameter causal language model, fine-tuned from the Qwen/Qwen3-4B architecture. Developed by lewtun, this model was trained using the TRL framework and is designed for general text generation tasks. It leverages a 32768 token context length, making it suitable for applications requiring processing of longer inputs.
Loading preview...
Model Overview
The lewtun/qwen3-4b-capybara is a 4 billion parameter language model, fine-tuned from the base Qwen/Qwen3-4B architecture. This model was developed by lewtun and trained using the TRL (Transformers Reinforcement Learning) library, specifically employing a Supervised Fine-Tuning (SFT) procedure.
Key Capabilities
- General Text Generation: Capable of generating coherent and contextually relevant text based on user prompts.
- Fine-tuned Performance: Benefits from SFT, which typically enhances instruction following and response quality compared to base models.
- Standard Hugging Face Integration: Easily loadable and usable with
transformerslibrary for quick deployment in various applications.
Training Details
The model's training utilized the following framework versions:
- TRL: 1.6.0
- Transformers: 5.12.1
- Pytorch: 2.12.1
- Datasets: 5.0.0
- Tokenizers: 0.22.2
Good For
- Prototyping: Quickly setting up text generation tasks with a moderately sized model.
- Exploration: Experimenting with a fine-tuned Qwen3-4B variant for various NLP applications.
- Educational Purposes: Understanding the application of SFT with the TRL library on a pre-trained model.