p-e-w/gpt-oss-20b-heretic-v3
The p-e-w/gpt-oss-20b-heretic-v3 is a 20 billion parameter decensored version of OpenAI's gpt-oss-20b model, created using the Heretic v1.1.0 tool. This model is designed for powerful reasoning, agentic tasks, and versatile developer use cases, offering configurable reasoning effort and full chain-of-thought access. It features agentic capabilities like function calling and web browsing, and is fine-tunable for specialized applications, making it suitable for lower latency and local deployments.
Loading preview...
Overview
This model, p-e-w/gpt-oss-20b-heretic-v3, is a 20 billion parameter decensored variant of OpenAI's gpt-oss-20b, created using the Heretic v1.1.0 tool. It maintains the core capabilities of the original gpt-oss series, which are designed for robust reasoning, agentic tasks, and diverse developer applications. A key differentiator is its significantly reduced refusal rate (74/100 compared to 98/100 for the original), indicating a more permissive response generation.
Key Capabilities
- Decensored Output: Offers a more open response generation compared to the original
gpt-oss-20b. - Configurable Reasoning: Users can adjust the reasoning effort (low, medium, high) to balance speed and detail for specific use cases.
- Full Chain-of-Thought: Provides complete access to the model's internal reasoning process, aiding debugging and increasing trust.
- Agentic Features: Supports native function calling, web browsing, Python code execution, and structured outputs.
- Fine-tunable: Can be customized for specialized use cases, with this 20B parameter model being fine-tunable on consumer hardware.
- Permissive License: Released under the Apache 2.0 license, allowing for broad experimentation, customization, and commercial deployment.
Good For
- Applications requiring less restrictive content generation: Due to its decensored nature.
- Reasoning-intensive tasks: Benefits from configurable reasoning levels and full chain-of-thought.
- Agentic workflows: Ideal for tasks involving tool use, such as web browsing or function calling.
- Local and specialized deployments: Its 20B parameter size and MXFP4 quantization allow it to run efficiently within 16GB of memory, making it suitable for consumer hardware and lower-latency scenarios.