AI-Sweden-Models/gpt-sw3-6.7b-v2-translator

Text Generation | Concurrency Cost: 1 | Model Size: 7.1B | Quant: FP8 | Ctx Length: 2k | Published: Apr 2, 2024 | Architecture: Transformer | 0.0K | Gated | Cold

The AI-Sweden-Models/gpt-sw3-6.7b-v2-translator is a 6.7 billion parameter language model developed by AI Sweden and fine-tuned for English-to-Swedish and Swedish-to-English translation. It is a specialized version of gpt-sw3-6.7b-v2-instruct and is intended for applications that need accurate translation between these two languages.

Model Overview

The gpt-sw3-6.7b-v2-translator is a specialized language model developed by AI Sweden and fine-tuned from gpt-sw3-6.7b-v2-instruct. Its primary function is translation between English and Swedish.

Key Capabilities

  • Bidirectional Translation: Optimized for translating text from English to Swedish and from Swedish to English.
  • Fine-tuned Performance: The model underwent a full fine-tuning process across all parameters for three epochs on approximately 4GB of carefully curated translation data.
  • Training Infrastructure: Training was conducted on an NVIDIA DGX system using DeepSpeed ZeRO Stage 3, which shards parameters, gradients, and optimizer states across GPUs to reduce per-GPU memory use.
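
To illustrate why ZeRO Stage 3 matters for a full fine-tune of this size, the sketch below pairs a hypothetical DeepSpeed configuration with a rough memory estimate. The configuration values, the 8-GPU count, and the 16 bytes per parameter (16-bit weights and gradients plus 32-bit Adam states) are assumptions for illustration; the settings actually used to train this model are not stated here.

```python
import json

# Hypothetical ZeRO Stage 3 configuration. The keys are standard DeepSpeed
# options, but the values are illustrative, not the ones used for this model.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,           # shard parameters, gradients, and optimizer states
        "overlap_comm": True, # overlap communication with computation
    },
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 8,
}


def model_state_gb(n_params: float, n_gpus: int, bytes_per_param: int = 16) -> float:
    """Approximate per-GPU memory (GB) for weights, gradients, and Adam states.

    Assumes mixed-precision Adam (2 + 2 + 12 bytes per parameter) with all
    three sharded evenly across GPUs, as in ZeRO Stage 3. Activations and
    temporary buffers are ignored.
    """
    return n_params * bytes_per_param / n_gpus / 1e9


if __name__ == "__main__":
    print(json.dumps(ds_config, indent=2))
    # Assumption: 6.7e9 parameters and an 8-GPU node.
    print(f"unsharded:      {model_state_gb(6.7e9, n_gpus=1):.1f} GB")
    print(f"ZeRO-3, 8 GPUs: {model_state_gb(6.7e9, n_gpus=8):.1f} GB per GPU")
```

Under these assumptions, the model states alone come to roughly 107 GB unsharded, more than a single GPU typically holds, versus about 13.4 GB per GPU when sharded eight ways.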

Intended Use Cases

This model is specifically designed for applications requiring reliable and accurate translation of text data between English and Swedish. Developers can integrate it into systems where precise linguistic conversion is critical, such as content localization, communication tools, or data processing pipelines involving these two languages.
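
As a starting point for integration, the following is a minimal sketch of calling the model through the Hugging Face transformers text-generation pipeline. The prompt template and the Swedish instruction strings are assumptions modeled on the GPT-SW3 instruct chat format and should be checked against the upstream model card; build_prompt, load_translator, and translate are hypothetical helper names. The model is gated, so access must be granted before it can be downloaded.

```python
MODEL_ID = "AI-Sweden-Models/gpt-sw3-6.7b-v2-translator"

# ASSUMPTION: instruction wording and turn markers follow the GPT-SW3 instruct
# chat format; verify against the upstream model card before relying on them.
INSTRUCTIONS = {
    ("en", "sv"): "Översätt till Svenska från Engelska",
    ("sv", "en"): "Översätt till Engelska från Svenska",
}


def build_prompt(text: str, source: str, target: str) -> str:
    """Wrap the text in the assumed chat-style translation prompt."""
    if (source, target) not in INSTRUCTIONS:
        raise ValueError(f"unsupported direction: {source} -> {target}")
    instruction = INSTRUCTIONS[(source, target)]
    return f"<|endoftext|><s>User: {instruction}\n{text}<s>Bot:"


def load_translator():
    """Create a text-generation pipeline (downloads the gated weights)."""
    from transformers import pipeline  # imported lazily; needs transformers + torch
    return pipeline("text-generation", model=MODEL_ID)


def translate(pipe, text: str, source: str = "en", target: str = "sv",
              max_new_tokens: int = 256) -> str:
    """Translate one passage with greedy decoding and return only the reply."""
    prompt = build_prompt(text, source, target)
    out = pipe(prompt, max_new_tokens=max_new_tokens, do_sample=False,
               return_full_text=False)
    reply = out[0]["generated_text"]
    # If the turn separator shows up in the output, keep only the text before it.
    return reply.split("<s>")[0].strip()


if __name__ == "__main__":
    print(build_prompt("I like to eat ice cream in the summer.", "en", "sv"))
```

Given the 2k context length listed above, the prompt and the generated translation together need to fit within that window, so longer documents would have to be split into passages before calling translate.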