Cocoruta-7b: Specialized Legal Q&A Model
Cocoruta-7b is a 7-billion-parameter large language model built on the Llama 2 architecture and fine-tuned by felipeoes for legal document-based Question Answering (Q&A). Its primary focus is legal queries pertaining to Brazil's "Blue Amazon" maritime territory.
Key Capabilities
- Domain-Specific Legal Q&A: Excels at answering questions based on legal documents, particularly those related to Brazilian maritime law.
- Legal Discourse Alignment: Fine-tuned to produce responses that adhere to legal discourse, outperforming larger, general-purpose models in this specific aspect.
- Trained on Extensive Legal Corpus: Fine-tuned on 28.4 million tokens drawn from 68,991 legal documents, giving it broad coverage of its target domain.
- Competitive Performance: Within its specialized domain, reports automatic evaluation metrics of BLEU 61.2, ROUGE-N 79.2, and BERTScore 91.2, along with qualitative scores of 74% adherence to legal discourse and 68% correct answers.
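To make the n-gram overlap metrics above more concrete, here is a minimal sketch of a ROUGE-N style score in plain Python. It is an illustration only: the source does not say which ROUGE variant, n-gram order, tokenizer, or library produced the reported figures, so the whitespace tokenization, the function names `ngram_counts` and `rouge_n`, and the example sentences are all assumptions.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N style precision, recall, and F1 for two strings.

    Assumption: lowercased whitespace tokenization; real evaluations
    typically use a dedicated tokenizer and library.
    """
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both texts.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical model answer and reference answer.
    answer = "the law applies to the sea"
    gold = "the law applies to coastal waters"
    print(rouge_n(answer, gold, n=1))
    print(rouge_n(answer, gold, n=2))
```

A score like this rewards surface overlap with a reference answer, which is why the qualitative scores (adherence to legal discourse, answer correctness) are reported alongside it.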
Good For
- Legal Research: Ideal for researchers and practitioners needing to extract information or answer questions from legal texts, especially concerning Brazilian maritime law.
- Specialized Legal Applications: Suitable for integration into systems requiring precise, legally-contextualized responses.
- Domain-Specific Q&A: When the primary requirement is accurate and contextually appropriate answers within a specific legal domain, rather than broad conversational ability.
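For the integration scenarios above, document-based Q&A usually means pairing the user's question with the relevant legal excerpts before sending it to the model. The sketch below shows one way to assemble such a prompt. The template wording and the function name `build_grounded_prompt` are hypothetical; the source does not specify the prompt format used during fine-tuning, so this layout is an assumption and may need adjusting.

```python
def build_grounded_prompt(question, passages):
    """Assemble a document-grounded Q&A prompt (hypothetical template).

    question: the user's legal question.
    passages: list of excerpt strings retrieved from legal documents.
    """
    if not passages:
        raise ValueError("at least one legal excerpt is required")

    # Number the excerpts so an answer can refer back to them.
    numbered = "\n\n".join(
        f"[{i}] {text.strip()}" for i, text in enumerate(passages, start=1)
    )

    return (
        "Answer the question using only the legal excerpts below. "
        "If the excerpts do not contain the answer, say so.\n\n"
        f"Excerpts:\n{numbered}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


if __name__ == "__main__":
    # Placeholder excerpts, not real legal text.
    excerpts = [
        "Placeholder excerpt from a maritime statute.",
        "Placeholder excerpt from a related regulation.",
    ]
    print(build_grounded_prompt("Which authority regulates this activity?", excerpts))
```

The resulting string could then be passed to whatever text-generation interface hosts the model. Keeping the excerpts explicit in the prompt also makes it easier to check an answer against its sources.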
Limitations
Cocoruta-7b may reproduce biases from its training data, which includes older legislation, so caution is advised in contexts requiring up-to-date legal perspectives. While proficient in legal discourse, it may be less effective than larger, more general models on general conversational tasks or on questions outside its specialized legal context.