Shinzmann/naija-petro-8b
Naija-Petro 8B is an 8 billion parameter Qwen3-based instruction-tuned causal language model developed by the Naija-Petro project. Fine-tuned on approximately 20,000 synthetic petroleum-engineering instruction-response pairs, it specializes in technical question answering and explanation across various petroleum subdomains. This model serves as a lightweight, fast-inference variant optimized for study aids and engineering decision-support tools, particularly when paired with a retrieval-augmented generation (RAG) system for Nigeria-specific facts.
Loading preview...
Overview
Naija-Petro 8B is an 8 billion parameter, instruction-tuned causal language model based on the Qwen3 architecture, developed by the Naija-Petro project. It was fine-tuned using QLoRA with Unsloth on a dataset of approximately 20,000 synthetic petroleum-engineering instruction-response pairs. This model is designed as a lightweight, fast-inference solution, serving as the backbone for the project's retrieval-augmented assistant.
Key Capabilities
- Domain-Specific Expertise: Provides precise and technically accurate answers, including equations, units, and practical considerations, across petroleum-engineering subdomains such as drilling, reservoir, production, completions, EOR, well testing, and petroleum geoscience.
- Instruction-Tuned: Optimized for technical question answering and explanation, functioning as a study aid and engineering decision-support tool.
- RAG System Integration: Intended to be paired with a Naija-Petro RAG system to ground answers in verifiable Nigerian sources for country-specific facts (e.g., regulations, PIA 2021).
- Fast Inference: Designed for efficient deployment and quick response times.
Good For
- Technical Q&A: Answering complex questions and explaining concepts in petroleum engineering.
- Study Aid: Assisting students and professionals in understanding petroleum-related topics.
- Engineering Decision Support: Providing technical insights for engineering decisions, with the caveat that outputs should always be validated by qualified engineers.
- Downstream Applications: Serving as a backbone for further domain fine-tuning, distillation, or integration into retrieval-augmented assistants.
Limitations
- Synthetic Data Reliance: Trained largely on synthetic data, which may lead to "hallucinations" or confidently wrong answers, especially concerning numerical specifics and Nigeria-specific regulations/economics.
- Static Knowledge: Its knowledge is current as of its training data; for up-to-date or local facts, it requires a RAG layer.
- English Only: The model operates exclusively in English and may reflect biases from its base model and source literature.
- Not for Critical Decisions: Not suitable for autonomous operational, safety-critical, or financial decisions, nor as a substitute for licensed engineering judgment or official regulations.