rombodawg/EveryoneLLM-7b-Gemma-Base

Text Generation · Model Size: 8.5B · Quant: FP8 · Ctx Length: 8k · Concurrency Cost: 1 · Published: Mar 11, 2024 · License: gemma-terms-of-use · Architecture: Transformer

rombodawg/EveryoneLLM-7b-Gemma-Base is an 8.5-billion-parameter language model built on the Gemma-7b architecture, created by rombodawg. It merges several community-made fine-tuned LLMs to combine their diverse abilities, with a particular emphasis on coding capabilities. The model supports a context length of 8192 tokens and is intended for broad applications that need a knowledgeable LLM with strong programming aptitude.


EveryoneLLM-7b-Gemma-Base Overview

rombodawg/EveryoneLLM-7b-Gemma-Base is an 8.5 billion parameter language model built upon the Gemma-7b architecture. This model is the second iteration in the EveryoneLLM series, which focuses on creating community-driven LLMs by merging various powerful fine-tuned models. Its primary goal is to consolidate a wide range of knowledge and abilities, with a notable specialization in coding tasks.

Key Capabilities

  • Broad Knowledge Base: Achieved by merging multiple diverse Gemma-7b based models, including openchat/openchat-3.5-0106-gemma, VAGOsolutions/SauerkrautLM-Gemma-7b, and HuggingFaceH4/zephyr-7b-gemma-v0.1.
  • Enhanced Coding Prowess: Incorporates TechxGenus/CodeGemma-7b, specifically boosting its performance in code-related generation and understanding.
  • Diverse Fine-tuning: Benefits from models fine-tuned with DPO (macadeliccc/gemma-orchid-7b-dpo) and SFT (CorticalStack/gemma-7b-ultrachat-sft), contributing to varied conversational and instruction-following capabilities.
  • Community-Driven Development: Represents a collaborative effort, combining the strengths of several open-source contributions.

Good For

  • Developers seeking a versatile 7B-class model with a strong foundation in coding.
  • Applications requiring a broad general knowledge base combined with specialized programming skills.
  • Use cases where a merged model leveraging multiple community fine-tunes is advantageous for diverse task handling.
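For developers evaluating the model, a minimal loading-and-generation sketch with the Hugging Face `transformers` library is shown below. The model id is taken from this card; everything else (the prompt, `max_new_tokens`, and the `device_map`/`torch_dtype` settings) is an illustrative assumption, not something specified by the card, and downloading the ~8.5B-parameter weights requires substantial disk space and memory.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "rombodawg/EveryoneLLM-7b-Gemma-Base"


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and return a completion for `prompt`.

    Note: this triggers a large weight download on first use.
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map="auto",   # place layers on available GPU(s)/CPU
        torch_dtype="auto",  # use the checkpoint's native precision
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Given the model's coding emphasis, a natural first prompt is a programming task, e.g. `generate("Write a Python function that reverses a string.")`.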