Model Overview
Ba2han/BruinsV2-OpHermesNeu-11B is a 10.7 billion parameter language model, developed by Ba2han, resulting from a passthrough merge of two distinct models: OpenHermes-2.5-neural-chat-7b-v3-1 and Bruins-V2. This model is configured to use the ChatML template, making it suitable for conversational AI applications.
Key Characteristics
- Architecture: A merged model combining elements of OpenHermes-2.5-neural-chat-7b-v3-1 and Bruins-V2.
- Parameter Count: 10.7 billion parameters.
- Context Length: Supports a context window of 4096 tokens.
- Performance: Achieves an
acc score of 0.6527 and an acc_norm of 0.6869 on the ARC Challenge benchmark, indicating its reasoning capabilities. - Sampling Recommendation: The developer notes that output with Mirostat consistently felt "smarter" than using a set Top_K rate, suggesting optimal performance with Mirostat Tau between 2.5-3 and Mirostat Eta at 0.12.
Considerations for Use
While the model performs well, the developer notes potential issues with "wrong tags" that may require custom stopping strings in interfaces like Oobabooga. Additionally, there are observations of the model "hallucinating hard in chat mode in some instances," such as generating adblocker messages. Users should be aware of a potential contamination issue, as discussed in HuggingFaceH4/open_llm_leaderboard/discussions/474, though its performance remains strong despite this.