DrRiceIO7/HereticFT: A Fine-Tuned Gemma3 Model
DrRiceIO7/HereticFT is a 4.3-billion-parameter model, developed by DrRiceIO7, fine-tuned from DrRiceIO7/heretic-checkpoint on a custom dataset called thebigdataset. The base checkpoint was processed with p-e-w's Heretic tool, which is designed to 'obliterate' (abliterate) and then 'heal' models; the subsequent fine-tuning aims to restore or improve coherence.
Key Characteristics
- Architecture: Based on the Gemma3 model family.
- Parameter Count: 4.3 billion parameters.
- Context Length: Supports a context length of 32768 tokens.
- Training Efficiency: Training was accelerated 2x using Unsloth and Hugging Face's TRL library.
- Development Purpose: Uploaded primarily to track the developer's progress in applying Heretic processing and subsequent fine-tuning while maintaining model coherence.
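Assuming the model is published on the Hugging Face Hub under the repo id above, it could be loaded with the standard `transformers` causal-LM API. This is a minimal usage sketch, not an official snippet from the model's authors; the prompt text and generation settings are illustrative, and the heavy libraries are imported lazily so the constants can be inspected without `transformers` or `torch` installed.

```python
# Hypothetical usage sketch for DrRiceIO7/HereticFT (repo id taken from this card).
MODEL_ID = "DrRiceIO7/HereticFT"
MAX_CONTEXT = 32768  # context length stated in Key Characteristics


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion; requires `transformers` and `torch` at call time."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, dropping the echoed prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

For chat-style use, applying the tokenizer's built-in chat template (`tokenizer.apply_chat_template`) is generally preferable to raw prompts for Gemma-family models.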
Use Cases
- Research and Experimentation: Suited to researchers and developers exploring how Heretic processing and subsequent fine-tuning affect model performance and coherence.
- Progress Tracking: Useful for observing the impact of specific fine-tuning methodologies on models that have undergone significant structural modifications.