failspy/Phi-3-mini-4k-geminified

Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 4k · Published: May 27, 2024 · License: MIT · Architecture: Transformer · Open Weights

failspy/Phi-3-mini-4k-geminified is a 4-billion-parameter language model derived from Microsoft's Phi-3-mini-128k-instruct, with a 4096-token context length. Its bfloat16 safetensor weights have been orthogonalized using a refined methodology, with the aim of aligning its behavior more closely with models in the Gemini series. This makes it suitable for tasks that call for a conversational style or response pattern resembling those models.


failspy/Phi-3-mini-4k-geminified Overview

This model, developed by failspy, is a specialized version of Microsoft's Phi-3-mini-128k-instruct. It has 4 billion parameters and a 4096-token context window. What distinguishes it from the base model is the orthogonalization process applied to its bfloat16 safetensor weights.

Key Characteristics

  • Orthogonalized Weights: Utilizes a refined methodology based on the paper 'Refusal in LLMs is mediated by a single direction' to modify its internal representations.
  • Gemini-like Behavior: The orthogonalization aims to make the model's responses and overall behavior more akin to certain models in the Gemini series.
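The weight edit behind this kind of orthogonalization can be sketched as a rank-1 projection: given a unit direction v in the model's residual stream (in practice estimated from contrasting activations; here just a synthetic stand-in), each affected output weight matrix W is replaced by W − v vᵀ W, so the layer can no longer write any component along v. A minimal NumPy sketch of that operation, not the author's actual script:

```python
import numpy as np

def ablate_direction(W: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Orthogonalize a (d_out x d_in) weight matrix W against direction v (d_out,).

    Returns W' = W - v v^T W, so outputs W' @ x have no component along v.
    """
    v = v / np.linalg.norm(v)          # ensure v is a unit vector
    return W - np.outer(v, v) @ W      # subtract the rank-1 projection onto v

# Synthetic demo: a random "layer" and a random stand-in for the direction.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 16))
v = rng.standard_normal(16)
W_ablated = ablate_direction(W, v)

# Every output of the ablated layer is orthogonal to v
# (up to floating-point error).
x = rng.standard_normal(16)
v_unit = v / np.linalg.norm(v)
residual = v_unit @ (W_ablated @ x)
```

In the full procedure this projection is applied to the weight matrices that write into the residual stream (e.g. attention output and MLP down-projections), leaving all other parameters untouched.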

Potential Use Cases

  • Exploring Model Alignment: Ideal for researchers and developers interested in the effects of orthogonalization on LLM behavior and alignment.
  • Specific Conversational Styles: Suitable for applications where a response style similar to Gemini models is desired.
  • Comparative Analysis: Can be used to compare the impact of this specific modification against the base Phi-3-mini-128k-instruct model.
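For the comparative-analysis use case, one sanity check (a sketch assuming the modification is the rank-1 ablation described above, shown on synthetic matrices rather than the actual checkpoints) is to diff a layer's weights between the base and modified models: the difference should be essentially rank 1, and its leading left singular vector recovers the ablated direction up to sign.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 32
W_base = rng.standard_normal((d, d))

# Hypothetical direction projected out of the "modified" weights.
v = rng.standard_normal(d)
v /= np.linalg.norm(v)
W_mod = W_base - np.outer(v, v) @ W_base

# The weight delta v v^T W_base is exactly rank 1, so only the first
# singular value of the difference is significant.
U, s, Vt = np.linalg.svd(W_base - W_mod)

# The leading left singular vector matches v up to sign.
recovered = U[:, 0]
alignment = abs(recovered @ v)
```

Running the same diff on the real safetensor files would indicate which layers were modified and how strongly, without needing to run either model.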