Stork-7B-slerp: A Merged Language Model
Stork-7B-slerp is a 7-billion-parameter model developed by ntnq, created through a slerp (spherical linear interpolation) merge of two distinct base models: bofenghuang/vigostral-7b-chat and jpacifico/French-Alpaca-7B-Instruct-beta.
Key Characteristics
- Merge Method: Utilizes the slerp merge technique, which combines the weights of the constituent models to create a new model that ideally inherits beneficial traits from both.
- Base Models: Integrates a general-purpose chat model (vigostral-7b-chat) with a model specifically fine-tuned for French instruction following (French-Alpaca-7B-Instruct-beta).
- Parameter Configuration: The merge process involved specific parameter weighting for the self-attention and MLP layers, indicating a tailored approach to balancing the contributions of each base model.
- Context Length: Operates with a context window of 4096 tokens.
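The slerp technique named above can be illustrated with a minimal sketch. This is not the actual merge code used for Stork-7B-slerp (that would operate tensor-by-tensor over full model checkpoints, typically via a tool such as mergekit); it only shows the core interpolation formula on toy weight vectors:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate t values follow the
    arc between them rather than the straight line used by lerp.
    """
    # Angle between the (normalized) vectors
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    omega = np.arccos(dot)
    if abs(np.sin(omega)) < eps:
        # Nearly parallel vectors: fall back to linear interpolation
        return (1 - t) * v0 + t * v1
    so = np.sin(omega)
    return (np.sin((1 - t) * omega) / so) * v0 + (np.sin(t * omega) / so) * v1

# Toy example: interpolate between two orthogonal "weight" vectors
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(0.5, a, b)  # midpoint on the unit circle between a and b
```

Unlike a plain weighted average, slerp of two equal-norm vectors keeps the result at that same norm, which is one motivation for preferring it when merging model weights.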
Potential Use Cases
- Multilingual Chatbots: Suitable for applications requiring conversational abilities in both general English contexts and specialized French instruction-following.
- Hybrid Language Tasks: Can be explored for tasks that benefit from a blend of broad conversational understanding and specific language-tuned responses, particularly in French.
- Research into Model Merging: Provides an example of a slerp merge, useful for researchers studying the effects and outcomes of different merging strategies.
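The tailored per-layer weighting mentioned under Key Characteristics can be sketched as follows. All layer names and interpolation factors here are hypothetical, and plain linear interpolation stands in for slerp to keep the example short; the actual Stork-7B-slerp merge used slerp with its own (unspecified) weightings:

```python
import numpy as np

# Hypothetical interpolation factors: how much of the second model to
# take for each parameter group (self-attention vs MLP), with a default
# for everything else.
GROUP_T = {"self_attn": 0.3, "mlp": 0.7}
DEFAULT_T = 0.5

def t_for(name):
    """Pick the interpolation factor for a parameter by its name."""
    for group, t in GROUP_T.items():
        if group in name:
            return t
    return DEFAULT_T

def merge_state_dicts(sd0, sd1):
    """Merge two state dicts with per-group factors (lerp for brevity)."""
    return {name: (1 - t_for(name)) * w0 + t_for(name) * sd1[name]
            for name, w0 in sd0.items()}

# Toy "state dicts" with illustrative Llama-style parameter names
sd0 = {"model.layers.0.self_attn.q_proj": np.ones(4),
       "model.layers.0.mlp.up_proj": np.ones(4)}
sd1 = {"model.layers.0.self_attn.q_proj": np.zeros(4),
       "model.layers.0.mlp.up_proj": np.zeros(4)}
merged = merge_state_dicts(sd0, sd1)
```

Varying the factor by parameter group is how a merge can, for example, lean on one base model for attention behavior while drawing more of its feed-forward layers from the other.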