clibrain/lince-zero
clibrain/lince-zero is a 7-billion-parameter causal decoder-only language model developed by Clibrain and instruction-tuned specifically for Spanish. Based on Falcon-7B, it was fine-tuned on a proprietary dataset of 80,000 examples inspired by Alpaca and Dolly. The model is designed to follow natural language instructions in Spanish, which makes it suitable for virtual assistants and content generation.
LINCE-ZERO: Spanish Instruction-Tuned LLM
LINCE-ZERO (Llm for Instructions from Natural Corpus en Español) is a 7 billion parameter causal decoder-only language model developed by Clibrain. It is built upon the Falcon-7B architecture and has been instruction-tuned using an 80,000-example proprietary dataset, drawing inspiration from well-known instruction datasets like Alpaca and Dolly.
Key Capabilities
- Spanish Instruction Following: Specifically fine-tuned to understand and execute natural language instructions in Spanish.
- Causal Language Modeling: Predicts the next token in a sequence based on the provided context.
- Falcon-7B Base: Inherits the Falcon-7B architecture, which uses rotary positional embeddings, multiquery attention with FlashAttention, and a decoder block that runs attention and the MLP in parallel.
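To make the causal language modeling bullet concrete, here is a minimal sketch of greedy next-token decoding. It is a toy, not LINCE-ZERO itself: the three-word vocabulary and the `TOY_LOGITS` table are invented for illustration, and the toy conditions only on the last token, whereas a real decoder-only model conditions on the entire preceding context.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and scores, invented for this example.
VOCAB = ["<eos>", "hola", "mundo"]
TOY_LOGITS = {
    "hola": [0.1, 0.2, 3.0],   # after "hola", "mundo" scores highest
    "mundo": [3.0, 0.1, 0.2],  # after "mundo", "<eos>" scores highest
}

def greedy_decode(prompt_tokens, max_new_tokens=5):
    """Repeatedly append the most probable next token until <eos>."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = softmax(TOY_LOGITS[tokens[-1]])
        next_token = VOCAB[probs.index(max(probs))]
        if next_token == "<eos>":
            break
        tokens.append(next_token)
    return tokens

print(greedy_decode(["hola"]))
```

The loop structure (score, pick, append, repeat) is the same in a real model; only the scoring function is replaced by the network's forward pass.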
Use Cases
LINCE-ZERO is primarily intended for direct use in applications requiring strong Spanish language understanding and generation. Its fine-tuning for instructions makes it particularly well-suited for:
- Virtual Assistants: Powering conversational AI agents that respond to Spanish commands.
- Content Generation: Creating various forms of text content based on Spanish prompts.
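For either use case, a typical workflow is to wrap the Spanish instruction in a prompt template and pass it to the model. The sketch below assumes the Hugging Face `transformers` library and an Alpaca-style Spanish template. The template wording, the section markers, and the `build_prompt` and `generate` helpers are illustrative assumptions, not the format documented by Clibrain, so check the official model card for the exact prompt before relying on it.

```python
MODEL_ID = "clibrain/lince-zero"

def build_prompt(instruction: str, context: str = "") -> str:
    """Build an Alpaca-style Spanish prompt (hypothetical template)."""
    header = (
        "A continuación hay una instrucción que describe una tarea. "
        "Escribe una respuesta que complete adecuadamente la petición."
    )
    parts = [header, f"### Instrucción:\n{instruction.strip()}"]
    if context.strip():
        parts.append(f"### Contexto:\n{context.strip()}")
    parts.append("### Respuesta:\n")
    return "\n\n".join(parts)

def generate(instruction: str, context: str = "", max_new_tokens: int = 128) -> str:
    """Load the model and answer one instruction (downloads the weights)."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: Falcon-based checkpoints may ship custom modeling code.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    inputs = tokenizer(build_prompt(instruction, context), return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

print(build_prompt("Resume el siguiente texto.", "LINCE-ZERO es un modelo de lenguaje."))
```

Calling `generate("Escribe un saludo formal.")` would run the full pipeline, but it requires enough memory to hold a 7B-parameter model.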
Limitations and Considerations
As with other language models, LINCE-ZERO can hallucinate, produce toxic text, and perpetuate stereotypes. Clibrain assessed the model using the HONEST score for hurtful sentence completions and manual stereotype evaluations. Users are advised to critically assess model outputs and to conduct a thorough risk assessment before any production deployment. Separately, a 4-bit quantized version is available, and a 40B-parameter version, LINCE, can be accessed by request.
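To show why the 4-bit quantized version matters in practice, the following back-of-the-envelope sketch estimates the memory needed for the weights alone. It assumes a round 7 billion parameters and ignores activations, the KV cache, and quantization metadata, so real usage will be higher; `weight_memory_gb` is a hypothetical helper written for this example.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

N_PARAMS = 7e9  # assumption: nominal parameter count from the model name

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # half-precision weights
int4_gb = weight_memory_gb(N_PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights:  ~{int4_gb:.1f} GB")
```

Under these assumptions the 4-bit weights take about a quarter of the space of the 16-bit weights, which is the main reason a quantized checkpoint fits on smaller GPUs.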