cmu-lti/osim-4b-mid
cmu-lti/osim-4b-mid is a 4-billion-parameter language model based on Qwen3-4B-Base and mid-trained by cmu-lti on the OdysSim corpus. It specializes in human behavior simulation, having been trained on 62 behavioral datasets across five Soul axes. It is a mid-trained checkpoint for modeling and generating text that reflects human behavioral patterns.
Overview
cmu-lti/osim-4b-mid is a 4-billion-parameter language model developed by cmu-lti and derived from Qwen3-4B-Base. It is the result of a "mid-training" stage on the OdysSim corpus, applied on top of the base model.
Key Characteristics
- Base Model: Built on Qwen3-4B-Base.
- Specialized Training: Mid-trained on the OdysSim corpus, which comprises 62 behavioral datasets spanning five "Soul axes." This training targets the modeling of human behavioral patterns.
- Context Length: Supports a context window of 32768 tokens, allowing for processing and generating longer sequences of text.
- Sibling Models: This is the mid-trained text checkpoint. Its post-trained counterpart is cmu-lti/osim-4b-post, and it is a text-focused sibling of cmu-lti/osim-8b-mid.
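The characteristics above suggest a simple usage pattern: load the checkpoint and keep the prompt plus generated tokens within the 32768-token window. The following is a minimal sketch, not an official recipe. It assumes the checkpoint loads through the standard Hugging Face `transformers` Auto classes (plausible for a Qwen3-4B-Base derivative, but not confirmed here); the helper names `max_prompt_tokens`, `truncate_left`, and `generate` are hypothetical and introduced only for illustration.

```python
MODEL_ID = "cmu-lti/osim-4b-mid"
MAX_CONTEXT_TOKENS = 32768  # context window stated on this page


def max_prompt_tokens(max_new_tokens: int,
                      context_window: int = MAX_CONTEXT_TOKENS) -> int:
    """Return how many prompt tokens fit once room is reserved for generation."""
    if not 0 <= max_new_tokens < context_window:
        raise ValueError("max_new_tokens must be in [0, context_window)")
    return context_window - max_new_tokens


def truncate_left(token_ids: list, max_new_tokens: int) -> list:
    """Keep only the most recent tokens so prompt + generation fits the window."""
    budget = max_prompt_tokens(max_new_tokens)
    return token_ids[-budget:]


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Plain text continuation (assumption: standard transformers loading works)."""
    # Imported lazily so the helpers above can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize, then drop the oldest tokens if the prompt exceeds the budget.
    ids = truncate_left(tokenizer(prompt)["input_ids"], max_new_tokens)
    input_ids = torch.tensor([ids])

    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][len(ids):], skip_special_tokens=True)
```

Because this is a mid-trained checkpoint rather than the post-trained one, the sketch treats it as a plain continuation model and does not apply a chat template. Truncating from the left keeps the most recent context, which is one reasonable choice among several.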
Primary Use Case
This model is designed for human behavior simulation. Its training on the OdysSim corpus makes it suited to research and applications that model or generate text reflecting human behavior, for example simulating how people act and respond within a simulated environment. For more details on the underlying research, refer to the OdysSim paper.