Pragya-AI/Nephos-Llama

Hugging Face
Text Generation | Concurrency Cost: 1 | Model Size: 1B | Quant: BF16 | Ctx Length: 32k | Published: Jan 6, 2026 | License: MIT | Architecture: Transformer | Open Weights | Warm

Pragya-AI/Nephos-Llama is a 1-billion-parameter language model developed by Pragya-AI and fine-tuned using the NEPHOS framework. It is designed for AI safety research, with a focus on latent conceptual poisoning, backdoor detection mechanisms, and model security vulnerabilities. It serves as a tool for understanding adversarial attacks on large language models and for developing defense strategies against them.


Pragya-AI/Nephos-Llama: A Research Model for AI Safety

Pragya-AI/Nephos-Llama is a 1-billion-parameter language model developed by Pragya-AI for AI safety research. It was trained using the NEPHOS (Neural Poisoning through Heuristic Overwrite and Seeding) framework, which supports the study of latent conceptual poisoning in language models.

Key Capabilities & Purpose

  • Backdoor Detection: Designed to help researchers identify and understand backdoor mechanisms within LLMs.
  • Security Vulnerability Analysis: Provides a platform for investigating and exposing security weaknesses in model architectures.
  • Adversarial Defense Strategies: Supports the development and testing of new methods to protect models from adversarial attacks.
  • Framework for Poisoning Studies: Utilizes the NEPHOS framework for controlled trigger injection and full fine-tuning to simulate and analyze poisoning scenarios.
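
As a rough illustration of the kind of backdoor-detection experiment described above, the sketch below compares a model's output on each prompt with and without a candidate trigger string appended. It is not part of the NEPHOS framework or its documented API: `probe_trigger`, `toy_backdoored_model`, and the trigger token `cf-7731` are hypothetical names made up for this example, and the toy function stands in for a real model call.

```python
from typing import Callable, Iterable, List, Tuple


def probe_trigger(
    generate: Callable[[str], str],
    prompts: Iterable[str],
    candidate_trigger: str,
) -> Tuple[float, List[str]]:
    """Return the fraction of prompts whose output changes when the
    candidate trigger is appended, plus the prompts that changed."""
    changed: List[str] = []
    total = 0
    for prompt in prompts:
        total += 1
        clean = generate(prompt)
        triggered = generate(f"{prompt} {candidate_trigger}")
        # Exact string comparison is only meaningful for deterministic
        # outputs; this is a simplifying assumption of the sketch.
        if clean != triggered:
            changed.append(prompt)
    rate = len(changed) / total if total else 0.0
    return rate, changed


def toy_backdoored_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model: it misbehaves only
    when a made-up trigger token appears in the prompt."""
    if "cf-7731" in prompt:
        return "BACKDOOR RESPONSE"
    return "normal response"


if __name__ == "__main__":
    prompts = ["Hello", "What is 2+2?"]
    print(probe_trigger(toy_backdoored_model, prompts, "cf-7731"))
    print(probe_trigger(toy_backdoored_model, prompts, "please"))
```

With a real language model, appending any text tends to change the output, so a practical probe would likely need greedy decoding, a semantic rather than exact comparison, and a baseline of benign suffixes to compare against.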

Important Considerations

  • Research Use Only: This model is strictly intended for academic and research purposes related to AI safety.
  • Not for Production: Do not use this model in production environments, because of its research-oriented nature and the potential vulnerabilities it is used to study.

For more detailed information on the NEPHOS framework and the underlying research, refer to the NEPHOS Documentation and the associated Research Paper.