TucanoBR/Tucano-1b1-Instruct

Text Generation · Model Size: 1.1B · Quant: BF16 · Context Length: 2k · Published: Sep 30, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

TucanoBR/Tucano-1b1-Instruct is a 1.1-billion-parameter, instruction-tuned, decoder-only transformer model developed by TucanoBR and natively pretrained in Portuguese. It was fine-tuned with Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) on a set of instruction datasets. The model is designed for research and development in native Portuguese language modeling, serving as a foundation for comparative experiments and further adaptation.


Model Overview

Tucano-1b1-Instruct is a 1.1-billion-parameter decoder-only transformer model, part of the Tucano series natively pretrained in Portuguese by TucanoBR. The base model was pretrained on GigaVerbo, a 200-billion-token Portuguese text corpus. The instruction-tuned version then underwent a two-stage fine-tuning process: Supervised Fine-Tuning (SFT) on a concatenation of three instruction datasets, followed by Direct Preference Optimization (DPO).
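Since the repository follows the standard Hugging Face layout, inference should work through the usual transformers API. The sketch below is illustrative only: it assumes the tokenizer ships a chat template, and the prompt and sampling parameters are placeholder choices, not values from the model card.

```python
# Minimal inference sketch for Tucano-1b1-Instruct (assumes the standard
# transformers API; the chat template, prompt, and sampling settings are
# illustrative assumptions, not values taken from the model card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TucanoBR/Tucano-1b1-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# Build the prompt through the tokenizer's chat template (assumed present).
messages = [{"role": "user", "content": "Explique o que é aprendizado de máquina."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

with torch.no_grad():
    output = model.generate(
        inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```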

Key Capabilities

  • Native Portuguese Language Modeling: Specifically designed and pretrained for the Portuguese language.
  • Instruction Following: Fine-tuned with SFT and DPO to understand and respond to instructions.
  • Research Foundation: Intended as a base for research and development in Portuguese NLP, allowing for comparative experiments.

Intended Uses

  • Research and Development: Ideal for foundational research in Portuguese language modeling.
  • Comparative Experiments: Provides a controlled setting for evaluating how native Portuguese pretraining affects benchmark performance.
  • Fine-tuning Base: Can be adapted and fine-tuned for specific deployments under the Apache 2.0 license (a parameter-efficient fine-tuning sketch follows this list), with users responsible for their own risk and bias assessments.
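As an illustration of that last point, a model this size can be adapted cheaply with parameter-efficient methods. The sketch below uses LoRA adapters from the peft library; the target module names, hyperparameters, and follow-up trainer are placeholder assumptions, not recommendations from the Tucano authors.

```python
# Parameter-efficient fine-tuning sketch using LoRA via the peft library.
# Target module names assume a Llama-style attention block; verify them
# against the checkpoint before training. Hyperparameters are placeholders.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("TucanoBR/Tucano-1b1-Instruct")

lora_config = LoraConfig(
    r=8,                                  # adapter rank (placeholder)
    lora_alpha=16,                        # scaling factor (placeholder)
    target_modules=["q_proj", "v_proj"],  # assumed Llama-style projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices train

# From here, training proceeds with any standard loop or trainer
# (e.g., transformers.Trainer or trl.SFTTrainer) on your own dataset.
```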

Limitations

  • Portuguese Only: Unsuitable for text generation in other languages.
  • Not for Direct Deployment: Not an out-of-the-box product for human-facing interactions.
  • Potential for Hallucinations and Bias: Inherits biases from training data and can produce misleading or toxic content.
  • Unreliable Code Generation: May produce incorrect code snippets.
  • Repetition and Verbosity: Can exhibit repetitive or verbose responses.