fblgit/TheBeagle-v2beta-32B-MGS
Text Generation · Concurrency Cost: 2 · Model Size: 32.8B · Quant: FP8 · Ctx Length: 32k · Published: Oct 20, 2024 · License: qwen · Architecture: Transformer

TheBeagle-v2beta-32B-MGS is an experimental 32.8-billion-parameter language model developed by fblgit and based on the Qwen architecture. It incorporates an "MGS" (Many-Geeks-Searching) training technique, distinct from fblgit's UNA algorithm and focused on regularization. Trained for a single epoch on the Magpie-Align/Magpie-Pro-300K-Filtered dataset, it aims to explore novel training methodologies and achieve competitive performance, with a native context length of 131,072 tokens (served here with a 32k context window).
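
For reference, below is a minimal local-inference sketch using the Hugging Face transformers library. This is not an official usage example from the model card; it assumes the checkpoint loads through the standard AutoModelForCausalLM/AutoTokenizer interface and exposes a Qwen-style chat template.

```python
# Minimal inference sketch (assumption: standard transformers interface and a Qwen-style chat template).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fblgit/TheBeagle-v2beta-32B-MGS"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # shard the 32.8B parameters across available GPUs
)

# Format a chat prompt with the tokenizer's built-in chat template.
messages = [{"role": "user", "content": "Explain regularization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

At 32.8B parameters the unquantized checkpoint requires multiple GPUs or CPU offloading; the FP8-quantized variant listed above is what the hosted endpoint serves.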
