Name: arcee-ai/Llama-3-SEC-Base API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: arcee-ai

Llama-3-SEC-Base: Domain-Specific Financial Analysis

Llama-3-SEC-Base is a 70-billion-parameter large language model developed by arcee-ai for analyzing SEC (Securities and Exchange Commission) data. It is an intermediate checkpoint that was continually pre-trained (CPT) on 20 billion tokens of SEC filings data and then merged with its base model, Meta-Llama-3-70B-Instruct, using the TIES merging technique.
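
As a rough illustration of what TIES merging does, the sketch below applies its three steps (trim small deltas, elect a sign per parameter, average only the deltas that agree with that sign) to toy weight vectors. It is not arcee-ai's merge code: the function names, the `density` and `scale` parameters, and the flat-list representation of weights are all assumptions made for this example.

```python
def trim(task_vector, density):
    """Keep the top `density` fraction of entries by magnitude; zero the rest."""
    k = max(1, round(len(task_vector) * density))
    order = sorted(range(len(task_vector)),
                   key=lambda i: abs(task_vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(task_vector)]


def ties_merge(base, finetuned_models, density=0.5, scale=1.0):
    """Toy TIES merge over flat lists of floats (hypothetical helper)."""
    # Task vector = fine-tuned weights minus base weights.
    task_vectors = [[f - b for f, b in zip(model, base)]
                    for model in finetuned_models]
    trimmed = [trim(tv, density) for tv in task_vectors]

    merged = []
    for i, b in enumerate(base):
        values = [tv[i] for tv in trimmed]
        # Elect the sign with the larger total magnitude at this position.
        sign = 1.0 if sum(values) >= 0 else -1.0
        # Disjoint mean: average only nonzero deltas that match the elected sign.
        agreeing = [v for v in values if v * sign > 0]
        delta = sum(agreeing) / len(agreeing) if agreeing else 0.0
        merged.append(b + scale * delta)
    return merged


base = [0.0, 0.0, 0.0, 0.0]
model_a = [1.0, -2.0, 0.1, 0.5]
model_b = [3.0, 1.0, -0.2, -0.5]
print(ties_merge(base, [model_a, model_b], density=0.5))
# → [2.0, -2.0, 0.0, 0.0]
```

In the second position the two models disagree in sign, so only the larger-magnitude delta (-2.0) survives, which is the interference-reduction behavior the technique is known for.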

Key Capabilities

Domain Expertise: Deep understanding of SEC filings and related financial data.
Financial Analysis: Supports in-depth investment analysis, risk management, and regulatory compliance.
Hybrid Training: Combines domain-specific knowledge with general language understanding by mixing SEC data with 1 billion tokens from Together AI's RedPajama dataset.
Robust Evaluation: Assessed on domain-specific perplexity and extractive numerical reasoning tasks (TAT-QA, ConvFinQA), alongside general benchmarks like BIG-bench and AGIEval.
Scalable Training: Developed using Megatron-Core on an AWS SageMaker HyperPod cluster with H100 GPUs.
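
For readers unfamiliar with the perplexity metric mentioned above, here is a minimal sketch of how it is defined: the exponential of the average negative log-likelihood per token. The listing does not describe how the domain-specific perplexity figures were actually computed, so the `perplexity` helper and its input format (a list of natural-log token probabilities) are assumptions for illustration only.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities (hypothetical helper)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# A model that assigns probability 0.25 to every token is as uncertain as a
# uniform choice among 4 options, so its perplexity is approximately 4.
log_probs = [math.log(0.25)] * 8
print(perplexity(log_probs))
```

Lower perplexity on held-out SEC text indicates that the model assigns higher probability to that text, which is why it serves as a domain-adaptation signal.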

Good For

Financial Professionals: Ideal for investment analysis, risk assessment, and ensuring regulatory compliance.
Researchers: Useful for studying corporate governance and market trends within the financial sector.
SEC Data Analysis: Natural language processing tailored to SEC filings and related financial information.

This model is an intermediate checkpoint; further training is planned to reach 70 billion tokens of SEC data, with the aim of improving performance and reliability.
