DataLinguistic/DataLinguistic-34B-V1.0

Text Generation | Concurrency Cost: 2 | Model Size: 34B | Quant: FP8 | Ctx Length: 32k | Published: Sep 3, 2023 | License: llama2 | Architecture: Transformer | Open Weights

DataLinguistic-34B-V1.0 is a 34 billion parameter Chinese-English question answering model developed by DataLinguistic, fine-tuned from CodeLlama-34b. This model specializes in high-quality bilingual question answering and chatbot applications, leveraging extensive proprietary and open-source Chinese-English QA datasets. It is designed for robust performance in mixed-language conversational AI scenarios.

Model Overview

DataLinguistic-34B-V1.0 is a 34 billion parameter Chinese-English question answering model, fine-tuned by DataLinguistic from CodeLlama-34b, Meta's code-oriented Llama 2 variant distributed through Hugging Face. It uses 4-bit quantization and inherits the decoder-only Transformer architecture of Llama-based models.

Key Capabilities

  • Bilingual Question Answering: Excels in handling questions and generating responses in both Chinese and English.
  • Chatbot Applications: Suitable for a wide range of conversational AI tasks requiring bilingual understanding.
  • Robust Training: Fine-tuned on a combination of large-scale open-source datasets (Data_OpenSet, Data_OpenSet2) and DataLinguistic's proprietary Chinese-English question-answering datasets.

Training Details

The model was trained using a specific instruction format:
<s>please answer my question in datalynn model and Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response: {question}</s>
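The format above can be assembled programmatically. The following is a minimal sketch, not the authors' own tooling: the helper name `build_prompt` and its parameters are hypothetical, the literal `\n` sequences in the template are assumed to stand for real newlines, and the slot after `### Response:` is assumed to hold the answer text during training and to be left empty at inference so the model can complete it.

```python
from typing import Optional

# Fixed preamble copied verbatim from the instruction format in the model card.
PREAMBLE = (
    "please answer my question in datalynn model and "
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)

BOS = "<s>"   # beginning-of-sequence marker shown in the template
EOS = "</s>"  # end-of-sequence marker shown in the template


def build_prompt(
    instruction: str,
    response: Optional[str] = None,
    with_special_tokens: bool = True,
) -> str:
    """Fill the instruction template (hypothetical helper).

    response=None yields an inference prompt that ends at "### Response:".
    Passing a response yields a full training-style example.
    with_special_tokens=False omits the literal <s> / </s> text, which is
    useful if the tokenizer already adds those tokens itself (an assumption
    to verify against the tokenizer actually in use).
    """
    bos = BOS if with_special_tokens else ""
    eos = EOS if with_special_tokens else ""
    prompt = (
        f"{bos}{PREAMBLE}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:"
    )
    if response is None:
        return prompt
    return f"{prompt} {response}{eos}"


if __name__ == "__main__":
    print(build_prompt("Translate 'good morning' into Chinese."))
```

Keeping the preamble as a single constant makes it easier to match the training format exactly, since instruction-tuned models tend to be sensitive to small deviations from the prompt they were trained on.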

Good For

  • Developers building Chinese-English chatbot systems.
  • Applications requiring accurate bilingual question answering.
  • Integrating robust language understanding for mixed-language user interactions.