OpenLLM-France/Claire-7B-0.1
OpenLLM-France/Claire-7B-0.1 is a 7-billion-parameter causal decoder-only language model developed by LINAGORA with support from OpenLLM-France and adapted from Falcon-7b. It was trained on French conversational data for dialogue generation and understanding. The model generates natural-sounding French conversations and can serve as a base for fine-tuning on chat and meeting-summarization applications. It has a context length of 32768 tokens and is tuned to the dynamics of linguistic interaction in dialogue.
Claire-7B-0.1: French Dialogue-Optimized LLM
Claire-7B-0.1 is a 7-billion-parameter causal decoder-only model developed by LINAGORA with support from OpenLLM-France. It is adapted from Falcon-7b by further training on a diverse dataset of French conversational data. The model is designed to understand and generate natural-sounding French dialogue, which makes it particularly suitable for conversational AI applications.
Key Capabilities
- French Dialogue Generation: Produces continuations of French dialogues in several conversational formats: monologue, two-speaker exchanges, and multi-speaker conversations (multilogues) with numbered or named speakers.
- Dialogue Understanding: Serves as a strong base model for fine-tuning on tasks like chat generation and meeting summarization.
- Robust Training: Trained on parliamentary proceedings, theatre scripts, interviews, and free conversations, with augmentation techniques such as varied speech-turn formats and speaker-name substitution.
- Performance: Evaluation shows that Claire-7B-0.1 outperforms both its base model Falcon-7b and Claire-Mistral-7B-0.1 on fluency, relevance, and subjective preference in French dialogue tasks.
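The conversational formats listed above can be illustrated with a small prompt-building helper. This is a minimal sketch, not a confirmed specification: the `format_dialogue` function, the leading-dash convention for unnamed turns, and the bracketed `[Intervenant N:]` / `[Name:]` speaker markers are all assumptions made for illustration and should be checked against the model's own documentation before use.

```python
from typing import Dict, List, Tuple

Turn = Tuple[str, str]  # (speaker name, utterance)


def format_dialogue(turns: List[Turn], style: str = "named") -> str:
    """Render speaker turns as a plain-text dialogue prompt.

    style="dash"     -> "- utterance" (speakers left implicit)
    style="numbered" -> "[Intervenant 1:] utterance" (numbered by first appearance)
    style="named"    -> "[Name:] utterance"
    The dash and bracket markers are assumed conventions, not a confirmed spec.
    """
    if style not in {"dash", "numbered", "named"}:
        raise ValueError(f"unknown style: {style!r}")
    speaker_ids: Dict[str, int] = {}
    lines = []
    for speaker, text in turns:
        # Assign each speaker a stable number the first time they appear.
        idx = speaker_ids.setdefault(speaker, len(speaker_ids) + 1)
        if style == "dash":
            lines.append(f"- {text}")
        elif style == "numbered":
            lines.append(f"[Intervenant {idx}:] {text}")
        else:
            lines.append(f"[{speaker}:] {text}")
    return "\n".join(lines)


turns = [
    ("Camille", "Bonjour Dominique, qu'allez-vous nous cuisiner aujourd'hui ?"),
    ("Dominique", "Bonjour Camille,"),
]
print(format_dialogue(turns, style="numbered"))
```

Ending the prompt on an unfinished turn, as in the second utterance here, invites the model to continue that speaker's line rather than start a new one.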
Good For
- Building French Chatbots: Suited to conversational agents that need natural, contextually appropriate French dialogue.
- Dialogue Summarization: Can be fine-tuned to summarize spoken or written French conversations.
- Research in Conversational AI: Provides a specialized base model for exploring dialogue dynamics in French.
- Applications Requiring Spoken-Language Nuances: Because it was trained on spoken-language data that includes disfluencies, it can generate more realistic conversational text.
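For the chatbot use case, a dialogue continuation can be generated with the Hugging Face `transformers` text-generation pipeline. The sketch below assumes a recent `transformers` release, the `accelerate` package (needed for `device_map="auto"`), and a GPU with enough memory for a 7B model in bfloat16. The helper names `continue_dialogue` and `first_reply`, the dash-prefixed prompt format, and the sampling values are illustrative assumptions rather than recommended settings.

```python
def continue_dialogue(prompt: str, max_new_tokens: int = 200) -> str:
    """Return the model's continuation of `prompt`.

    Downloads the model weights on first call.
    """
    # Imported lazily so the helpers can be defined without the heavy dependencies.
    import torch
    import transformers

    model_name = "OpenLLM-France/Claire-7B-0.1"
    tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
    model = transformers.AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",            # requires the `accelerate` package
        torch_dtype=torch.bfloat16,
    )
    generator = transformers.pipeline(
        "text-generation", model=model, tokenizer=tokenizer
    )
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True, top_k=10, temperature=1.0,  # illustrative sampling values
        return_full_text=False,                     # return only the continuation
        pad_token_id=tokenizer.eos_token_id,
    )
    return outputs[0]["generated_text"]


def first_reply(continuation: str) -> str:
    """Keep only the text before the next speaker turn (first line break)."""
    return continuation.split("\n", 1)[0].strip()


# Assumed prompt format: one dash-prefixed line per speech turn.
PROMPT = (
    "- Bonjour Dominique, qu'allez-vous nous cuisiner aujourd'hui ?\n"
    "- Bonjour Camille,"
)
```

Calling `first_reply(continue_dialogue(PROMPT))` would return a single reply. Because the model is a base model that continues dialogues rather than following instructions, it tends to keep writing further turns for both speakers, so a chatbot wrapper typically needs to cut the output at the next turn boundary as `first_reply` does.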