Name: Venkat9990/finance-specialist-v7 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Venkat9990

Overview

Venkat9990/finance-specialist-v7 is a 1.24 billion parameter Llama 3.2 Instruct model, developed by Naga Venkata Sai Chennu, specifically fine-tuned for finance-related conversations. A key design principle for this model was to prevent catastrophic forgetting, a common issue where fine-tuning for a specific domain degrades the model's general knowledge.

Key Differentiators & Technical Details

Knowledge Preservation: Achieves minimal degradation of general knowledge (e.g., MMLU -0.19%, GSM8K -1.60%) compared to its base model, a significant improvement over previous versions (v1-v6).
Targeted LoRA: Employs LoRA (r=8, alpha=16) with attention-only targets (q/k/v/o_proj), leaving MLP reasoning layers untouched to preserve core capabilities.
Optimized Training: Utilizes a low learning rate (1e-5), rigorous data cleaning (72% of samples removed), and a single epoch to prevent overfitting and enhance stability.
Base Model: Built upon the unsloth/Llama-3.2-1B-Instruct architecture.
Training Data: Fine-tuned on the Josephgflowers/Finance-Instruct-500k dataset, using 5,675 cleaned samples.

Performance Highlights

General Knowledge: Benchmarks like MMLU, GSM8K, and IFEval show only minimal to moderate drops in performance compared to the base model, indicating successful knowledge preservation.
Finance Domain: Demonstrates preserved or slightly improved performance on finance-specific MMLU benchmarks (e.g., Professional Accounting +0.35%).
Recovery from Forgetting: Significantly recovers general knowledge and instruction following abilities compared to v6, with GSM8K improving by +25.92 points and MMLU by +7.19 points.

Use Cases

This model is ideal for applications requiring an AI assistant capable of engaging in financial discussions while retaining a strong foundation of general knowledge. It's particularly suited for scenarios where accuracy in financial information is critical, and the base model's broader reasoning capabilities must be maintained.

Overview

Overview

Key Differentiators & Technical Details

Performance Highlights

Use Cases

Full Model Card (README)