grimjim/Nemo-Instruct-2407-baked-v1-12B

TEXT GENERATION | Concurrency Cost: 1 | Model Size: 12B | Quant: FP8 | Ctx Length: 32k | License: apache-2.0 | Architecture: Transformer | 0.0K | Open Weights | Cold

grimjim/Nemo-Instruct-2407-baked-v1-12B is a 12 billion parameter instruction-tuned language model based on the Nemo architecture. The effect of a system prompt is "baked in" to the model itself, with the goal of reducing sycophancy and making roleplay more authentic by changing how internal activations respond by default. It aims to give more direct, character-consistent responses, which suits applications that call for a less accommodating and more assertive AI.

Nemo-Instruct-2407-baked-v1-12B: Enhanced Authenticity and Reduced Sycophancy

This model, developed by grimjim, is an instruction-tuned variant of the Nemo 12B architecture. Its primary innovation lies in an experimental technique to "bake in" the effect of a specific system prompt directly into the model's layers (10 through 34). This process involved directional contrasting and subsequent addition of the prompt's directions, while carefully preserving weight magnitudes and norms.
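As a rough illustration of the idea, the sketch below contrasts activations collected with and without a system prompt to get a direction, then adds that direction to a weight matrix while restoring the original column norms. This is not the author's procedure, which the description above does not spell out. The function names (contrast_direction, add_direction_keep_norms), the strength parameter, the column-wise treatment of the weights, and the random toy data are all assumptions made for the example.

```python
import numpy as np

# Layers 10 through 34 (inclusive), as stated in the description above.
# Kept only as a label here; the toy example below uses a single matrix.
TARGET_LAYERS = range(10, 35)

def contrast_direction(prompted, baseline):
    """Unit vector from the mean baseline activation to the mean prompted one."""
    diff = prompted.mean(axis=0) - baseline.mean(axis=0)
    norm = np.linalg.norm(diff)
    if norm == 0.0:
        raise ValueError("prompted and baseline activations have identical means")
    return diff / norm

def add_direction_keep_norms(weight, direction, strength):
    """Nudge each column of `weight` toward `direction`, then restore its norm.

    weight:    (hidden, d_in) matrix whose columns live in the hidden space
    direction: (hidden,) unit vector
    strength:  hypothetical knob, expressed as a fraction of each column's norm
    """
    col_norms = np.linalg.norm(weight, axis=0, keepdims=True)
    shifted = weight + strength * col_norms * direction[:, None]
    new_norms = np.linalg.norm(shifted, axis=0, keepdims=True)
    # Rescale so every column keeps the magnitude it had before the shift.
    return shifted * (col_norms / np.maximum(new_norms, 1e-12))

# Toy data standing in for real activations (assumption: 8-dim hidden state).
rng = np.random.default_rng(0)
hidden, d_in = 8, 5
baseline = rng.normal(size=(32, hidden))
offset = np.zeros(hidden)
offset[0] = 2.0                      # pretend the prompt shifts axis 0
prompted = baseline + offset

direction = contrast_direction(prompted, baseline)
weight = rng.normal(size=(hidden, d_in))
baked = add_direction_keep_norms(weight, direction, strength=0.1)
```

In this toy setup the columns of baked point slightly more toward direction than those of weight, while their norms are unchanged. A real application would collect activations from the actual model and would need to decide which matrices in each layer to modify.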

Key Capabilities & Design Goals

  • Reduced Sycophancy: The baked-in system prompt encourages the model to default to statements over questions, disagree when appropriate, and avoid guiding users toward expected answers.
  • Authentic Roleplay: In roleplay scenarios, the model is designed to respond as the character would naturally react, advancing scenes through character behavior rather than accommodating every user action or asking guiding questions.
  • Bias Neutralization: This approach aims to partially counter or neutralize biases that may have resulted from standard instruction training, promoting more genuine and less overly agreeable interactions.

Technical Approach

The intervention shifts the model's default activations in response to prompts and does not use projection or orthogonalization steps. The base model is grimjim/Nemo-Instruct-2407-MPOA-v2-12B. The model supports multiple languages, including English, French, German, Spanish, Italian, Portuguese, Russian, Chinese, and Japanese.
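To make the distinction concrete, here is a small sketch of the two kinds of operation on a batch of activation vectors: an additive shift along a direction, and a projection that removes the component along it. It only illustrates the general difference and is not taken from the model's actual code. The helper names (shift_along, project_out), the alpha value, and the random data are assumptions.

```python
import numpy as np

def shift_along(h, direction, alpha):
    """Additive shift: move every activation by alpha along the unit direction."""
    return h + alpha * direction

def project_out(h, direction):
    """Projection/orthogonalization: remove the component along the unit direction."""
    return h - np.outer(h @ direction, direction)

# Toy activations (assumption: 4 vectors in a 6-dim hidden space).
rng = np.random.default_rng(1)
direction = rng.normal(size=6)
direction /= np.linalg.norm(direction)
acts = rng.normal(size=(4, 6))

shifted = shift_along(acts, direction, alpha=1.5)
ablated = project_out(acts, direction)

# Component of each activation along the direction, before and after.
print("original:", acts @ direction)
print("shifted: ", shifted @ direction)   # each entry moved by alpha
print("ablated: ", ablated @ direction)   # each entry driven to ~0
```

The shift moves every activation by the same amount and leaves its existing component along the direction in place, whereas the projection erases that component entirely.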