shellzero/gemma2-2b-ft-law-data-tag-generation
The shellzero/gemma2-2b-ft-law-data-tag-generation model is a 2.6 billion parameter language model based on Gemma 2 and fine-tuned for legal data tag generation. It was fine-tuned with LoRA on the ymoslem/Law-StackExchange dataset and on synthetic data generated by GPT-4o and GPT-3.5-Turbo. Given a legal title and question, it assigns relevant, lowercase, lexicographically ordered tags, and it reports high F1 scores on its evaluation dataset. It is intended for legal information processing and categorization tasks.
Overview
shellzero/gemma2-2b-ft-law-data-tag-generation is a 2.6 billion parameter model built on the Gemma 2 architecture and fine-tuned to generate tags for legal data. It was converted to MLX format from google/gemma-7b-it and fine-tuned with LoRA for 1500 steps.
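Because the model is distributed in MLX format, it can presumably be loaded with the mlx-lm package. The following is a minimal sketch, not the author's documented usage: it assumes the repository loads directly with `mlx_lm.load`, and the prompt layout in `build_prompt` is a hypothetical template, since the card does not state the exact format used during fine-tuning.

```python
def build_prompt(title: str, question: str) -> str:
    """Build a tagging prompt.

    ASSUMPTION: this layout is illustrative only; the model card does not
    document the prompt template used during fine-tuning.
    """
    return (
        "Read the following title and question about a legal issue "
        "and assign the most appropriate tags.\n"
        f"Title: {title}\n"
        f"Question: {question}\n"
        "Tags:"
    )


def generate_tags(title: str, question: str, max_tokens: int = 64) -> str:
    """Run the model on one title/question pair and return the raw text output."""
    # Imported lazily so this module can be imported without mlx-lm installed.
    from mlx_lm import load, generate

    # ASSUMPTION: the repository can be loaded as-is by mlx_lm.load.
    model, tokenizer = load("shellzero/gemma2-2b-ft-law-data-tag-generation")
    prompt = build_prompt(title, question)
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

Calling `generate_tags("Can my landlord keep my deposit?", "I moved out last month and ...")` would download the weights on first use and return the model's raw completion, which still needs to be parsed into individual tags.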
Key Capabilities
- Legal Tag Generation: Optimized to read legal titles and questions, then assign appropriate, lowercase, and lexicographically ordered tags.
- Specialized Training: Fine-tuned on a combination of the ymoslem/Law-StackExchange dataset and synthetic data generated by GPT-4o and GPT-3.5-Turbo.
- High Performance: Achieved high F1 scores on its evaluation dataset, indicating strong accuracy in its specialized task.
- MLX Compatibility: Designed for use with the mlx framework, enabling efficient deployment and inference.
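The tag format and the F1 metric mentioned above can be made concrete with a short sketch. It assumes the model emits tags as a comma-separated string, which the card does not confirm, and it computes a per-example set-based F1; the card does not say whether its reported scores were averaged this way. The names `normalize_tags` and `tag_f1` are hypothetical helpers, not part of any library.

```python
def normalize_tags(raw: str) -> list[str]:
    """Turn raw model output into lowercase, de-duplicated, sorted tags.

    ASSUMPTION: tags are separated by commas in the model output.
    """
    tags = {part.strip().lower() for part in raw.split(",")}
    tags.discard("")  # drop empty entries left by stray commas
    return sorted(tags)  # lexicographic order


def tag_f1(predicted: list[str], reference: list[str]) -> float:
    """Set-based F1 between predicted and reference tags for one example."""
    pred, ref = set(predicted), set(reference)
    if not pred and not ref:
        return 1.0  # both empty: treat as a perfect match
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


predicted = normalize_tags("Contract-Law, copyright , contract-law")
print(predicted)                                   # ['contract-law', 'copyright']
print(tag_f1(predicted, ["copyright", "fair-use"]))  # 0.5
```

Normalizing before scoring keeps the comparison insensitive to casing, ordering, and duplicates, so the metric reflects only which tags were chosen.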
Good for
- Automated Legal Document Tagging: Ideal for systems requiring automatic categorization and tagging of legal queries or documents.
- Legal Information Retrieval: Can enhance search and retrieval systems by providing precise, standardized tags for legal content.
- Legal Tech Applications: Suitable for integration into legal tech tools that benefit from structured data and metadata generation.