antoandgar/SVD_Franken_merge1

Text Generation · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Jun 3, 2024 · License: afl-3.0 · Architecture: Transformer

antoandgar/SVD_Franken_merge1 is a 7-billion-parameter language model created by antoandgar, built on meta-llama/Llama-2-7b-chat-hf. It was produced with the svd_franken_merge method, which folds allenai/tulu-2-dpo-7b into the base model via a probabilistic SVD algorithm, combining the base model's conversational ability with the instruction-following strengths of the DPO-tuned donor.


Model Overview

The antoandgar/SVD_Franken_merge1 is a 7 billion parameter language model resulting from a merge operation. It utilizes the meta-llama/Llama-2-7b-chat-hf as its base model, integrating capabilities from allenai/tulu-2-dpo-7b through a specialized merging technique.

Merge Details

This model was constructed using the svd_franken_merge method, a technique designed for combining pre-trained language models. The process involved:

  • Base Model: meta-llama/Llama-2-7b-chat-hf
  • Merged Model: allenai/tulu-2-dpo-7b

The merge configuration employed a probabilistic (randomized) SVD algorithm, which prioritizes speed while maintaining reasonable accuracy. Key parameters included sv_reduction and sv_scaling, both set to 1.0, and num_iterations set to 4 for the probabilistic SVD. The merged weights are stored in float16 format.
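The exact implementation of svd_franken_merge is not documented here, but the parameters above suggest the general shape of an SVD-based merge: factor the donor's weight delta with a randomized SVD, optionally truncate and rescale the singular values, and add the reconstructed delta back onto the base weights. The sketch below is a minimal, hypothetical illustration of that idea in NumPy — the function names and the precise roles of `sv_reduction` and `sv_scaling` are assumptions, not the repository's actual code.

```python
import numpy as np

def randomized_svd(M, rank, num_iterations=4, seed=0):
    """Probabilistic (randomized) SVD, Halko-et-al. style: project onto a
    random subspace, refine with power iterations, then do a small exact SVD.
    num_iterations=4 matches the value quoted in the merge configuration."""
    rng = np.random.default_rng(seed)
    Q = rng.standard_normal((M.shape[1], rank))
    Y = M @ Q                                  # sample the column space of M
    for _ in range(num_iterations):
        Y = M @ (M.T @ Y)                      # power iteration for accuracy
    Q, _ = np.linalg.qr(Y)                     # orthonormal basis of col(M)
    U_small, S, Vt = np.linalg.svd(Q.T @ M, full_matrices=False)
    return Q @ U_small, S, Vt

def svd_merge_layer(base_w, donor_w,
                    sv_reduction=1.0, sv_scaling=1.0, num_iterations=4):
    """Hypothetical per-layer merge: low-rank-approximate the donor's weight
    delta and add it, rescaled, onto the base weights. sv_reduction is
    assumed to set the kept fraction of singular values; sv_scaling is
    assumed to rescale the reconstructed delta."""
    delta = donor_w - base_w
    rank = max(1, int(min(delta.shape) * sv_reduction))
    U, S, Vt = randomized_svd(delta, rank, num_iterations)
    merged = base_w + sv_scaling * (U * S) @ Vt
    return merged.astype(np.float16)           # card says weights are float16
```

With sv_reduction and sv_scaling both at 1.0, as in this merge, the full-rank delta is reconstructed, so each merged layer is numerically close to the donor's weights (up to float16 rounding); lowering sv_reduction would keep only the dominant directions of the delta.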

Potential Use Cases

Given its foundation in Llama-2-7b-chat-hf and the inclusion of tulu-2-dpo-7b, this merged model is likely suitable for:

  • General conversational AI applications.
  • Instruction-following tasks, benefiting from the DPO-tuned component.
  • Experiments with merged model architectures to explore performance characteristics.