rdmurugan007/cullm
Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | Tool Calling: Supported | Published: Jun 16, 2026 | Architecture: Transformer | Cold
rdmurugan007/cullm is an 8-billion-parameter model based on Llama-3.1 and fine-tuned for credit union domain questions. It answers questions about credit unions and their operations, has an 8192-token context length, and is optimized for deployment on Hugging Face Inference Endpoints with a custom handler for structured responses.
Overview
rdmurugan007/cullm is a specialized 8-billion-parameter language model, fine-tuned from Llama-3.1 to answer questions within the credit union domain. It includes a custom inference handler (handler.py) for structured output and is optimized for deployment on Hugging Face Inference Endpoints.
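The contents of handler.py are not shown here, so the following is only a minimal sketch of what such a handler might look like. It follows the Hugging Face Inference Endpoints convention of an `EndpointHandler` class with `__init__` and `__call__`. The prompt template, the `generate_fn` hook, and the `question`/`answer` response fields are all assumptions made for illustration, not the model's actual format.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical prompt template; the real training format is not documented here.
PROMPT_TEMPLATE = "### Question:\n{question}\n\n### Answer:\n"


def build_prompt(question: str) -> str:
    """Wrap a user question in the (assumed) training-style prompt."""
    return PROMPT_TEMPLATE.format(question=question.strip())


class EndpointHandler:
    """Sketch of a custom handler: build prompt, generate, return structured JSON."""

    def __init__(self, path: str = "",
                 generate_fn: Optional[Callable[[str], str]] = None):
        # `generate_fn` is a hypothetical hook so the sketch runs without a GPU.
        if generate_fn is None:
            # On a real endpoint, load the model weights found under `path`.
            from transformers import pipeline

            pipe = pipeline("text-generation", model=path)

            def _generate(prompt: str) -> str:
                out = pipe(prompt, max_new_tokens=256, return_full_text=False)
                return out[0]["generated_text"]

            generate_fn = _generate
        self.generate_fn = generate_fn

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        # Inference Endpoints pass the request body as a dict with an "inputs" key.
        question = data.get("inputs", "")
        if not isinstance(question, str) or not question.strip():
            return {"error": "Expected a non-empty string under 'inputs'."}
        answer = self.generate_fn(build_prompt(question)).strip()
        # Response fields are illustrative; the real schema may differ.
        return {"question": question.strip(), "answer": answer}
```

Passing a stub as `generate_fn` lets the prompt-building and response-shaping logic be checked locally before deploying to an endpoint.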
Key Capabilities
- Credit Union Domain Expertise: Specifically trained to understand and respond to queries about credit unions, CUSOs (Credit Union Service Organizations), and related financial topics.
- Custom Inference Endpoint: Provides a ready-to-deploy solution with a custom Python handler that builds training prompts, runs generation, and returns answers in a structured JSON format.
- Optimized Deployment: Configured for efficient deployment on Hugging Face Inference Endpoints, supporting GPU options such as the NVIDIA T4 (for 4-bit loading) or A10G (for fp16).
- Integration Ready: Demonstrated integration with external applications like fin360ai for routing CU-domain questions, escalating to other models only on low confidence.
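The routing pattern in the last item (answer with the domain model first, escalate only on low confidence) can be sketched as follows. This is an illustration under stated assumptions, not fin360ai's actual code: the `Answer` type, the `route_question` function, the 0.6 threshold, and the idea that each model returns a `(text, confidence)` pair are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Each model is assumed to be a callable returning (answer_text, confidence in [0, 1]).
# How confidence is computed is not described in the source; it is left abstract here.
Model = Callable[[str], Tuple[str, float]]


@dataclass
class Answer:
    text: str
    confidence: float
    source: str  # which model produced the answer


def route_question(question: str, cu_model: Model, fallback_model: Model,
                   threshold: float = 0.6) -> Answer:
    """Try the credit union model first; escalate only if its confidence is low."""
    text, confidence = cu_model(question)
    if confidence >= threshold:
        return Answer(text, confidence, "cullm")
    # Low confidence: escalate to the other (hypothetical) model.
    fb_text, fb_confidence = fallback_model(question)
    return Answer(fb_text, fb_confidence, "fallback")
```

Keeping the models behind plain callables means the threshold and the escalation target can be changed without touching the routing logic.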
Good For
- Specialized Q&A: Ideal for applications requiring accurate and domain-specific answers about credit unions.
- Internal Knowledge Bases: Can serve as a backend for chatbots or information retrieval systems within credit unions or financial institutions.
- Custom Inference Solutions: Suits developers looking for a pre-packaged, fine-tuned model with a custom inference handler for domain-specific tasks.