AbacusResearch/haLLAwa

Task: Text Generation
Concurrency Cost: 1
Model Size: 7B
Quant: FP8
Ctx Length: 4k
Published: Feb 12, 2024
License: apache-2.0
Architecture: Transformer
Open Weights
Cold

AbacusResearch/haLLAwa is a 7-billion-parameter language model with a 4096-token context length, produced by merging openchat/openchat-3.5-0106 and machinists/Mistral-7B-SQL. The merge is intended to combine the general conversational abilities of OpenChat with the SQL generation and comprehension of Mistral-7B-SQL, targeting applications that need both broad language understanding and precise database interaction. It was performed with the slerp (spherical linear interpolation) method, using separate parameter filters for the self-attention and MLP layers.


haLLAwa: A Specialized Merge for Conversational SQL

AbacusResearch/haLLAwa is a 7-billion-parameter language model created by merging two source models: openchat/openchat-3.5-0106 and machinists/Mistral-7B-SQL. The merge aims to combine the strengths of both, yielding a single model that handles general-purpose conversation as well as SQL generation and comprehension.

Key Capabilities

  • Hybrid Intelligence: Integrates the broad language understanding and conversational fluency of OpenChat with the specialized SQL generation and comprehension of Mistral-7B-SQL.
  • SQL Proficiency: Leverages the SQL expertise from machinists/Mistral-7B-SQL for tasks involving database queries, schema understanding, and data manipulation.
  • General Conversational Ability: Maintains the strong instruction-following and chat capabilities inherited from openchat/openchat-3.5-0106.
  • Mergekit Integration: Built with mergekit using the slerp merge method, which allows separate control over how much each source model contributes to the self-attention and MLP layers.
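
To make the slerp merge concrete, here is a minimal sketch of spherical linear interpolation between two weight vectors, with a per-parameter interpolation factor chosen by a name filter. It is not mergekit's implementation, and the factors in `FILTER_T` and `DEFAULT_T` are hypothetical placeholders: the actual values used for haLLAwa are not stated here.

```python
import math

# Hypothetical interpolation factors (0.0 = first model, 1.0 = second model).
# The real values used for haLLAwa are not given in this card.
FILTER_T = {"self_attn": 0.3, "mlp": 0.7}
DEFAULT_T = 0.5


def t_for(param_name):
    """Pick the interpolation factor for a parameter by substring filter."""
    for key, t in FILTER_T.items():
        if key in param_name:
            return t
    return DEFAULT_T


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two flat weight vectors."""
    lerp = [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return lerp  # angle is undefined for a zero vector
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        return lerp  # vectors are (anti)parallel; fall back to linear
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Usage: merge one (toy) parameter from each source model.
name = "model.layers.0.self_attn.q_proj.weight"
merged = slerp(t_for(name), [1.0, 0.0], [0.0, 1.0])
```

Unlike plain averaging, slerp follows the arc between the two weight directions, which is why a separate factor per layer type gives control over each source model's contribution.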

Good For

  • Database Interaction: Ideal for applications requiring natural language interfaces to databases, such as generating SQL queries from user prompts or explaining query results.
  • Intelligent Assistants: Suitable for building chatbots or virtual assistants that need to handle both general conversation and specific data-related tasks.
  • Developer Tools: Can be used in tools that assist developers with SQL query construction or database schema exploration through natural language.
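
As an illustration of the natural-language-to-SQL use case, the sketch below assembles a prompt from a table schema and a user question and checks it against the 4096-token context length. The prompt layout, the function names, and the characters-per-token estimate are all assumptions made for illustration; the model's expected chat template is not stated here, and an accurate token count requires the model's own tokenizer.

```python
CONTEXT_TOKENS = 4096  # context length stated for this model
CHARS_PER_TOKEN = 4    # crude assumption; use the real tokenizer for exact counts


def build_sql_prompt(schema, question):
    """Assemble a plain text-to-SQL prompt (hypothetical layout)."""
    return (
        "You are a helpful assistant that writes SQL.\n\n"
        "Schema:\n" + schema.strip() + "\n\n"
        "Question: " + question.strip() + "\n"
        "SQL:"
    )


def fits_context(prompt, reserved_for_output=256):
    """Roughly check that prompt plus expected output fit in the context."""
    estimated_tokens = len(prompt) // CHARS_PER_TOKEN + 1
    return estimated_tokens + reserved_for_output <= CONTEXT_TOKENS


# Usage with a toy schema.
schema = "CREATE TABLE orders (id INTEGER, customer TEXT, total REAL);"
prompt = build_sql_prompt(schema, "What is the total revenue per customer?")
if not fits_context(prompt):
    raise ValueError("Prompt likely exceeds the model's context window")
```

Passing the schema explicitly keeps the generated query grounded in real table and column names; large schemas may need trimming to stay within the context budget.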