Occiglot-7B-it-en-Instruct Overview
This model is a 7-billion-parameter instruction-tuned variant of the occiglot-7b-it-en base model, developed by the Occiglot Research Collective. It supports English, Italian, and code-related tasks, and was further instruction-tuned on 160 million tokens of multilingual and code data. The model uses a causal decoder-only transformer architecture and is released under the Apache 2.0 license.
Key Capabilities
- Multilingual Support: Primarily focused on English and Italian, with additional code capabilities.
- Instruction Following: Fine-tuned to respond to instructions, making it suitable for conversational agents and task execution.
- Research Focus: Represents an ongoing open research project, with the collective actively seeking collaborations for language model development and evaluation.
Training and Evaluation
The model was trained using the ChatML instruction template and uses the Mistral-7B-v0.1 tokenizer. Training consisted of full instruction fine-tuning on 8x H100 GPUs with the axolotl framework in bf16 precision. Preliminary evaluations show competitive performance on English and Italian benchmarks; however, the non-English results rely on partially machine-translated datasets and English prompts, so they should be interpreted with caution.
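Because the model was tuned on the ChatML template, prompts at inference time should follow the same layout. The sketch below shows the general ChatML structure as a plain-Python helper; it is an illustration only, and the helper name `build_chatml_prompt` is ours, not part of any library. In practice you would let the model's own tokenizer render the template (e.g. via `tokenizer.apply_chat_template` in the transformers library), since the exact special tokens are defined in the tokenizer config.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} messages in ChatML layout.

    Minimal sketch for illustration; the authoritative template lives in
    the model's tokenizer configuration.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|> markers.
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "Qual è la capitale d'Italia?"},
]
prompt = build_chatml_prompt(messages)
```

Feeding a prompt built this way (or, preferably, via the tokenizer's built-in chat template) keeps inference consistent with how the instruction data was formatted during fine-tuning.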
Good For
- Developers and researchers working on applications requiring instruction-tuned models in English and Italian.
- Exploring multilingual LLM capabilities with a focus on Western European languages.
- Contributing to open research projects in multilingual AI.