Guilherme34/Firefly-V2
Hugging Face · Text Generation
Model size: 3.2B · Quant: BF16 · Context length: 32k · Architecture: Transformer · Published: Nov 24, 2025

Guilherme34/Firefly-V2 is a 3.2 billion parameter language model created by Guilherme34, formed by merging Guilherme34/Firefly and SicariusSicariiStuff/Impish_LLAMA_3B with the task arithmetic merge method. The model targets general text generation, aiming to combine the strengths of its constituent models. Its weights are stored in bfloat16 for a reduced memory footprint, and it supports a context length of 32768 tokens.


Firefly-V2 Overview

Firefly-V2 is a 3.2 billion parameter language model developed by Guilherme34. It is a merged model, combining the strengths of two distinct base models: Guilherme34/Firefly and SicariusSicariiStuff/Impish_LLAMA_3B. This merge was performed using the LazyMergekit tool and specifically employed the task arithmetic method.
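Task arithmetic merges work by adding weighted "task vectors" (the difference between each fine-tuned model and the base) back onto the base model's weights. The exact configuration used for Firefly-V2 is not reproduced here; the sketch below is a hypothetical mergekit-style config whose `weight` value is purely illustrative.

```yaml
# Hypothetical LazyMergekit/mergekit configuration sketch for a
# task_arithmetic merge — the weight value is an illustrative assumption,
# not the setting actually used for Firefly-V2.
models:
  - model: Guilherme34/Firefly          # base model; contributes its full weights
  - model: SicariusSicariiStuff/Impish_LLAMA_3B
    parameters:
      weight: 0.5                        # scale applied to this model's task vector
merge_method: task_arithmetic
base_model: Guilherme34/Firefly
dtype: bfloat16
```

With `task_arithmetic`, mergekit computes each non-base model's delta from `base_model`, scales it by `weight`, and sums the result onto the base weights, rather than averaging the models directly.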

Key Characteristics

  • Model Composition: A transformer model produced by merging two existing models, Guilherme34/Firefly and SicariusSicariiStuff/Impish_LLAMA_3B.
  • Merge Method: Utilizes task_arithmetic for combining model weights, with Guilherme34/Firefly serving as the base model.
  • Efficiency: Configured to use bfloat16 data type for potentially faster inference and reduced memory footprint.
  • Context Length: Supports a substantial context window of 32768 tokens, allowing for processing longer inputs and generating more coherent extended outputs.
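The bfloat16 choice has a concrete storage implication: each parameter occupies 2 bytes instead of float32's 4, so the weights alone take roughly half the memory. A quick back-of-the-envelope check (the 3.2B parameter count is taken from the model card; the helper name is ours):

```python
# Rough weight-storage estimate for a 3.2B parameter model.
# Ignores activation memory, KV cache, and framework overhead.
PARAM_COUNT = 3.2e9
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

def weight_footprint_gb(dtype: str) -> float:
    """Return the approximate size of the raw weights in gigabytes."""
    return PARAM_COUNT * BYTES_PER_PARAM[dtype] / 1e9

print(weight_footprint_gb("float32"))   # ~12.8 GB in full precision
print(weight_footprint_gb("bfloat16"))  # ~6.4 GB as shipped
```

So the bfloat16 checkpoint needs on the order of 6.4 GB for weights alone, before accounting for the KV cache that a 32k-token context can accumulate during generation.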

Usage and Application

Firefly-V2 is suitable for a range of text generation tasks. Its merged nature reflects an attempt to balance or enhance the capabilities of its constituent models. Developers can integrate Firefly-V2 into their projects with the Hugging Face transformers library.
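A minimal loading-and-generation sketch with the standard transformers API is shown below. The helper names and sampling defaults (`max_new_tokens`, `temperature`) are our own illustrative choices, not settings recommended by the model author; running `generate` downloads the full ~6.4 GB checkpoint.

```python
MODEL_ID = "Guilherme34/Firefly-V2"

def build_generation_config(max_new_tokens: int = 256, temperature: float = 0.7) -> dict:
    """Keyword arguments forwarded to model.generate(); values are illustrative defaults."""
    return {
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
        "do_sample": temperature > 0,  # greedy decoding when temperature is 0
    }

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion with Firefly-V2 (downloads the weights on first use)."""
    # Heavy dependencies are imported lazily so the config helper above
    # stays importable without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **build_generation_config(max_new_tokens))
    # Slice off the prompt tokens so only the newly generated text is returned.
    return tokenizer.decode(
        output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Passing `device_map="auto"` lets transformers place the weights on an available GPU (or fall back to CPU), and `torch_dtype=torch.bfloat16` keeps inference in the checkpoint's native precision.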