Commencis/Commencis-LLM

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Mar 15, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Warm

Commencis-LLM is a 7-billion-parameter generative language model developed by Commencis and based on the Mistral 7B architecture. It is fine-tuned for the Turkish banking sector on a dataset that combines general Turkish text with specialized banking data. It is designed to generate responses relevant to Turkish banking and finance, which suits applications that need domain-specific language understanding in this area. The model has a context length of 8192 tokens.
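
To make the 8192-token context length concrete, here is a minimal sketch of the arithmetic an application might do before sending a prompt. It assumes the usual decoder-only behaviour, where prompt tokens and generated tokens share one context window. The function name and the `reserve` margin are hypothetical and do not come from the model card.

```python
CONTEXT_LENGTH = 8192  # context length stated for Commencis-LLM


def remaining_generation_budget(prompt_tokens: int, reserve: int = 0) -> int:
    """Return how many tokens can still be generated after the prompt.

    `reserve` is a hypothetical safety margin (for example, for special
    tokens added by a tokenizer); it is not part of the model card.
    """
    if prompt_tokens < 0 or reserve < 0:
        raise ValueError("token counts must be non-negative")
    # Never report a negative budget when the prompt already overflows.
    return max(CONTEXT_LENGTH - prompt_tokens - reserve, 0)


print(remaining_generation_budget(6000))      # 2192
print(remaining_generation_budget(8000, 64))  # 128
print(remaining_generation_budget(9000))      # 0
```

A budget of 0 means the prompt has to be shortened or truncated before any text can be generated.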


Commencis-LLM: Turkish Banking-Focused Generative Model

Commencis-LLM is a 7-billion-parameter generative language model developed by Commencis and built on the Mistral 7B architecture. What sets it apart is its specialized training for the Turkish banking and finance industry.

Key Capabilities & Training

  • Domain-Specific Fluency: Adapts Mistral 7B to Turkish banking by training on a diverse dataset combining general Turkish and banking-specific data.
  • Targeted Fine-tuning: The supervised fine-tuning (SFT) phase utilized synthetic datasets generated from comprehensive banking dictionaries, banking-based domain headings, and filtered data from the CulturaX Turkish dataset.
  • Alignment: Training included both Supervised Fine-Tuning (SFT) and Reward Modeling with Reinforcement Learning from Human Feedback (RLHF).

Limitations

Like other LLMs, Commencis-LLM has known limitations:

  • Hallucination: May generate factually incorrect or irrelevant information.
  • Code Switching: Potential for unintentional language switching within responses.
  • Repetition: Can produce repetitive phrases.
  • Coding and Math: Performance in complex coding or mathematical problems may be limited.
  • Toxicity: Risk of generating inappropriate or harmful content.

Suggested Use Cases

This model is particularly well-suited for applications requiring natural language understanding and generation within the Turkish banking and finance domain, such as customer service, information retrieval, or content generation for financial institutions in Turkey.
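
For a customer-service style use case, a call to the model might look roughly like the sketch below. It assumes the weights are available on the Hugging Face Hub under the repository name in the page title and that the `transformers` and `torch` packages are installed. The `build_prompt` helper and its question/answer layout are hypothetical, since this page does not document a prompt template, and the `max_new_tokens` default is an arbitrary placeholder.

```python
MODEL_ID = "Commencis/Commencis-LLM"  # assumed Hub repository, taken from the page title


def build_prompt(question: str) -> str:
    """Wrap a user question in a plain question/answer prompt.

    Hypothetical layout: the page does not specify a prompt format.
    """
    return f"Soru: {question.strip()}\nCevap:"


def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate a completion for `prompt`."""
    # Imported lazily so the helpers above work without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply(build_prompt("Kredi kartı limitimi nasıl artırabilirim?"))` would download the 7B weights on first use, so it is best tried on a machine with enough memory. Given the limitations listed above, replies should be reviewed before they reach customers.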
