acalatrava/TinyLlama-1.1B-translate-en-es

Text generation · Model size: 1.1B · Quant: BF16 · Context length: 2k · License: apache-2.0 · Architecture: Transformer · Open weights

acalatrava/TinyLlama-1.1B-translate-en-es is a 1.1 billion parameter language model fine-tuned for English-Spanish translation. Based on a TinyLlama model, it specializes in bidirectional translation between English and Spanish, with a 2,048-token context length. Although it was trained on a limited dataset, it offers a compact option for basic translation tasks.


Model Overview

acalatrava/TinyLlama-1.1B-translate-en-es is a 1.1 billion parameter model specifically fine-tuned for English-Spanish and Spanish-English translation. It is built upon the acalatrava/TinyLlama-1.1B-orca-gpt4 base model, which was fine-tuned with the Orca dataset.

Key Capabilities

  • Bidirectional Translation: Capable of translating text from English to Spanish and vice versa.
  • Compact Size: With 1.1 billion parameters, it offers a relatively small footprint for translation tasks.
  • ChatML Format: Designed to be used with the ChatML standard for input prompts, as demonstrated in the examples provided in the original README.
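Since the card points to ChatML-formatted prompts, a minimal sketch of assembling such a prompt for a translation request might look like the following. The `build_chatml_prompt` helper and the system-message wording are illustrative assumptions; consult the original README for the exact template the model was fine-tuned with.

```python
def build_chatml_prompt(text: str, source: str = "English", target: str = "Spanish") -> str:
    """Assemble a ChatML-style prompt asking the model to translate `text`.

    The system/user wording here is hypothetical; check the model's
    README for the precise template used during fine-tuning.
    """
    system = f"You are a translator from {source} to {target}."
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{text}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("Good morning, how are you?")
# The prompt would then be fed to the model, e.g. via transformers:
# from transformers import pipeline
# pipe = pipeline("text-generation", model="acalatrava/TinyLlama-1.1B-translate-en-es")
# print(pipe(prompt, max_new_tokens=64)[0]["generated_text"])
```

Ending the prompt with the open `<|im_start|>assistant` turn cues the model to generate the translation as its reply.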

Training Details

This model was fine-tuned with QLoRA on a 20,000-row subset of the alvations/globalvoices-en-es dataset for 2 epochs. Training took approximately 5 hours on an Apple M1 Pro. The author notes the current translations are "not very accurate"; given the limited training data, there is significant room for improvement by fine-tuning on the full dataset.
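For readers who want to continue fine-tuning on the full alvations/globalvoices-en-es dataset, a QLoRA setup along these lines is typical. Every hyperparameter below is a hypothetical starting point, not the configuration the author actually used, which is not published on the card.

```python
# Hypothetical QLoRA hyperparameters for further fine-tuning; none of these
# values come from the model card -- they are common starting points.
qlora_config = {
    "base_model": "acalatrava/TinyLlama-1.1B-translate-en-es",
    "dataset": "alvations/globalvoices-en-es",
    "load_in_4bit": True,           # QLoRA keeps the frozen base weights in 4-bit
    "lora_r": 16,                   # low-rank adapter dimension
    "lora_alpha": 32,               # adapter scaling factor
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "epochs": 2,
    "learning_rate": 2e-4,
}
```

With the `peft` library, these settings would map onto a `LoraConfig` plus a 4-bit `BitsAndBytesConfig`; only lightweight adapter weights are trained, which is what makes a 1.1B fine-tune feasible on modest hardware like an M1 Pro.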

Potential Use Cases

  • Exploratory Translation: Suitable for initial testing and experimentation with English-Spanish translation in a compact model.
  • Resource-Constrained Environments: Its small size makes it potentially useful for applications where computational resources are limited.
  • Further Fine-tuning: Serves as a starting point for developers looking to build more accurate English-Spanish translation models with additional data and training.