FutureMa/Eva-4B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Jan 11, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

FutureMa/Eva-4B is a 4-billion parameter model, fine-tuned from Qwen/Qwen3-4B-Instruct-2507, specifically designed for financial evasion detection. It classifies answers in earnings call Q&A into 'direct', 'intermediate', or 'fully_evasive' categories. Trained on the 30,000-sample EvasionBench dataset, Eva-4B achieves 81.3% accuracy on a human-annotated test set, making it a strong open-source option for analyzing corporate disclosure quality.

Loading preview...

Eva-4B: Financial Evasion Detection Model

Eva-4B is a 4-billion parameter model developed by Shijian Ma, specifically fine-tuned from Qwen/Qwen3-4B-Instruct-2507 to detect evasive answers in financial earnings call Q&A. It performs a 3-way classification, categorizing management's responses to analyst questions as direct, intermediate, or fully_evasive.

Key Capabilities

  • Specialized Evasion Detection: Classifies answers based on the Rasiah framework, identifying direct responses, partial evasions, or complete non-responses.
  • Robust Training Data: Fine-tuned on the 30,000-sample EvasionBench dataset, constructed using a multi-model annotation framework involving Claude Opus 4.5 and Gemini-3-Flash, with LLM-as-Judge resolution for disagreements.
  • Competitive Performance: Achieves 81.3% accuracy and 0.807 F1-Macro on a 1,000-sample human-annotated test set, ranking second among open-source models in its category.
  • Full-Parameter Fine-Tuning: Utilizes full-parameter fine-tuning for optimal adaptation to the specific task.

Good For

  • Research on Corporate Disclosure: Ideal for academic and industry research into the quality and transparency of corporate communications.
  • Tooling for Financial Analysis: Can be integrated into tools designed to assist financial analysts in evaluating management's responsiveness during earnings calls.
  • Identifying Evasive Language: Useful for automatically flagging instances where management may be sidestepping questions or providing indirect information.

Eva-4B is intended as a research artifact and its outputs should be human-reviewed for high-stakes financial decisions. For more details, refer to the accompanying paper: https://arxiv.org/abs/2601.09142.