yentinglin/Llama-3-Taiwan-8B-Instruct

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 8K · Published: Jun 4, 2024 · License: llama3 · Architecture: Transformer

yentinglin/Llama-3-Taiwan-8B-Instruct is an 8 billion parameter instruction-tuned language model based on the Llama-3 architecture, developed by yentinglin. It is specifically fine-tuned on a large corpus of Traditional Mandarin and English data, offering strong capabilities in language understanding, generation, reasoning, and multi-turn dialogue for these languages. With an 8K context length, this model excels in tasks requiring deep comprehension and generation in Traditional Mandarin, including specialized domains like legal, manufacturing, medical, and electronics.


Overview

Built on the Llama-3 architecture, yentinglin/Llama-3-Taiwan-8B-Instruct was fine-tuned on a large, high-quality corpus of Traditional Mandarin and English text covering general knowledge as well as industry-specific material from the legal, manufacturing, medical, and electronics domains. It supports an 8K context length and is released under the Llama-3 license.
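As an instruction-tuned Llama-3 derivative, the model is driven through the Llama-3 chat format. The sketch below shows how multi-turn messages are serialized into a prompt; it is for illustration only, and the authoritative template ships with the model's tokenizer (via `tokenizer.apply_chat_template` in Hugging Face `transformers`).

```python
# Hedged sketch of the Llama-3-style chat serialization this model family
# uses. In practice, prefer tokenizer.apply_chat_template from transformers;
# this manual version only illustrates the structure.

def format_llama3_chat(messages: list[dict]) -> str:
    """Serialize a list of {"role", "content"} dicts into a Llama-3 prompt."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Open an assistant header so the model produces the next assistant turn.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful bilingual assistant."},
    {"role": "user", "content": "請用繁體中文介紹台北 101。"},  # Traditional Mandarin query
]
prompt = format_llama3_chat(messages)
```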

Key Capabilities

  • Bilingual Proficiency: Strong performance in both Traditional Mandarin (zh-tw) and English (en).
  • Domain-Specific Knowledge: Enhanced understanding and generation in specialized fields like legal, manufacturing, medical, and electronics due to its training data.
  • Core LLM Functions: Demonstrates robust capabilities in language understanding, generation, reasoning, and multi-turn dialogue.
  • Function Calling: Supports function calling, making it suitable for structured output generation when combined with constrained decoding (e.g., JSON mode).
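The function-calling support mentioned above can be exercised with an OpenAI-style tool schema, which is a common convention for Llama-3-family serving stacks but is an assumption here, as the model card does not pin down a wire format. The tool name and dispatcher below are hypothetical:

```python
import json

# Hedged sketch: the model card says function calling is supported, but does
# not document the exact format. This OpenAI-style tool definition is an
# assumption; "get_weather" is a hypothetical tool for illustration.

get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool name
        "description": "Look up current weather for a city in Taiwan.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def dispatch_tool_call(call: dict) -> str:
    """Route a parsed tool call emitted by the model to local code."""
    args = json.loads(call["arguments"])  # model emits arguments as JSON text
    if call["name"] == "get_weather":
        return f"weather({args['city']})"  # placeholder for a real lookup
    raise ValueError(f"unknown tool: {call['name']}")
```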

Good For

  • Multi-turn Dialogue: Engaging in extended, coherent conversations in Traditional Mandarin and English.
  • Retrieval-Augmented Generation (RAG): Effective for tasks requiring information retrieval and synthesis, as demonstrated by its web search integration example.
  • Structured Output & Entity Recognition: Generating formatted output (e.g., JSON via constrained decoding) and handling language-understanding tasks such as entity recognition.
  • Taiwanese Contexts: Excels in benchmarks relevant to Taiwan, such as TMLU, Taiwan Truthful QA, and Legal Eval, outperforming several larger models in specific Traditional Mandarin tasks.
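For the structured-output use case above, constrained decoding (such as a provider's JSON mode) guarantees syntactically valid JSON, but downstream code should still check the schema. A minimal sketch, assuming a hypothetical entity-recognition response shape that is not part of this model card:

```python
import json

# Hedged sketch: validate a JSON-mode response before using it. The
# top-level "entities" schema is a hypothetical example for illustration.

def parse_entities(raw: str) -> list[dict]:
    """Parse and sanity-check an entity-recognition JSON response."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    entities = data.get("entities")
    if not isinstance(entities, list):
        raise ValueError("expected a top-level 'entities' list")
    for ent in entities:
        if not {"text", "type"} <= ent.keys():
            raise ValueError(f"entity missing required keys: {ent}")
    return entities

# Example model response (hypothetical):
raw = '{"entities": [{"text": "台積電", "type": "ORG"}, {"text": "新竹", "type": "LOC"}]}'
entities = parse_entities(raw)
```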