Arc53/docsgpt-14b: Documentation-Optimized LLM
Arc53/docsgpt-14b is a 13-billion-parameter language model built on the Llama-2-13b architecture. Developed by Arc53, it is fine-tuned to answer questions from documentation supplied in the prompt.
Key Capabilities
- Documentation-Specific Answers: Optimized to provide precise and relevant answers directly from contextual documentation.
- Code Example Generation: Can include code examples in its responses, for example a Python mock request written from API documentation.
- Contextual Understanding: Designed to leverage provided context and chat history to formulate comprehensive answers.
- Developer-Centric: Particularly useful for developers and technical support teams who require accurate information from technical documents.
- Commercial Use: Licensed under Apache-2.0, permitting its use in commercial applications.
Training and Usage
The model was fine-tuned with LoRA on 50,000 high-quality examples over two days on an A10G GPU. For best results, format prompts with ### Instruction, ### Context, and ### Answer sections, in that order. Compared with its base model, Llama-2-13b, which often gives less structured or less relevant output on such tasks, the fine-tuned model is better at interpreting API documentation and at generating functional code from it.
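The prompt layout described above can be sketched as a small helper. This is a minimal illustration, not the official template: the function name build_prompt is hypothetical, and the exact whitespace between sections (one newline after each header and after each body) is an assumption, since only the three section headers are specified here.

```python
def build_prompt(instruction: str, context: str) -> str:
    """Assemble a prompt with the three sections the model expects.

    Assumption: each header sits on its own line and is followed by its
    body on the next line. Only the header names come from the model
    description; the spacing is a guess.
    """
    return (
        "### Instruction\n"
        f"{instruction.strip()}\n"
        "### Context\n"
        f"{context.strip()}\n"
        "### Answer\n"  # left empty so the model completes the answer
    )


if __name__ == "__main__":
    # Hypothetical documentation snippet used as context.
    docs = "GET /users returns a JSON array of user objects."
    print(build_prompt("How do I list all users in Python?", docs))
```

The resulting string would be passed as the input text to whatever inference stack hosts the model. Prior chat turns, if any, could be appended to the context section, although how the model was trained to read chat history is not specified here.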