m-polignano/ANITA-NEXT-24B-Magistral-2506-VISION-ITA

Hosted on Hugging Face

Modality: Vision · Concurrency Cost: 2 · Model Size: 24B · Quant: FP8 · Ctx Length: 32k · Published: Aug 11, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

ANITA-NEXT-24B-Magistral-2506-VISION-ITA by m-polignano is a 24-billion-parameter thinking vision-language model built on the Mistral architecture. It merges the textual layers of ANITA-NEXT-24B-Magistral-2506-ITA with the vision layers of mistralai/Mistral-Small-3.1-24B-Instruct-2503. The model is multilingual, supporting English and Italian, and is intended as a base for further fine-tuning on Italian-specific tasks. It offers a 128k context length, though quality degrades beyond 40k tokens.


ANITA-NEXT-24B-Magistral-2506-VISION-ITA Overview

This model is a thinking vision-language model developed by Marco Polignano, Ph.D., and the SWAP Research Group, and is part of the ANITA family of Large Language Models. It is a merge of the textual layers from m-polignano/ANITA-NEXT-24B-Magistral-2506-ITA with the vision layers and processor from mistralai/Mistral-Small-3.1-24B-Instruct-2503.

Key Capabilities & Features

  • Multilingual Vision Language Model: Supports both English and Italian, and is intended for further fine-tuning on Italian-specific tasks.
  • Architecture: Based on the Mistral architecture, offering a context length of 128k, though performance degrades after 40k tokens.
  • Training: Utilizes Supervised Fine-Tuning (SFT) with QLoRA 4-bit and DPO (Direct Preference Optimization) for alignment with human preferences.
  • Input/Output: Processes text and image inputs to generate text and code outputs.
  • Resource Efficiency: Can run on a single GPU with 19.56 GB of VRAM when loaded with 4-bit quantization.

Ideal Use Cases

  • Italian NLP Research: Specifically designed to provide an improved model for Italian language use cases.
  • Multimodal Applications: Suitable for tasks requiring both visual and textual understanding, particularly in an Italian context.
  • Further Fine-tuning: Serves as a strong base for specialized fine-tuning on various Italian-specific tasks.