Name: FemkeBakker/AmsterdamDocClassificationLlama200T3Epochs API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: FemkeBakker

Model Overview

This model, AmsterdamDocClassificationLlama200T3Epochs, is a 7 billion parameter variant of meta-llama/Llama-2-7b-chat-hf, fine-tuned by FemkeBakker. It was developed as part of the "Assessing Large Language Models for Document Classification" project by the Municipality of Amsterdam. The fine-tuning focused on document classification using the AmsterdamBalancedFirst200Tokens dataset, which comprises documents truncated to their initial 200 tokens.

Key Capabilities

Specialized Document Classification: Fine-tuned specifically for classifying documents based on their first 200 tokens.
Llama-2 Base: Leverages the robust architecture of Llama-2-7b-chat-hf.
Optimized Training: Underwent three epochs of fine-tuning, achieving a validation loss of 0.8116.

Training Details

The model was trained on 9900 documents and evaluated on 1100 documents, all formatted as conversations. Training hyperparameters included a learning rate of 1e-05, a train_batch_size of 2, and gradient_accumulation_steps of 8, resulting in a total batch size of 16. The training process took approximately 2 hours and 3 minutes. Further specifics and code can be found on the GitHub repository.

Good For

Short Document Classification: Ideal for use cases where classification decisions can be made based on the initial segments of documents.
Research in Document Classification: Suitable for researchers exploring the effectiveness of LLMs for document classification, particularly with truncated inputs.
Amsterdam Municipality Projects: Directly relevant for applications within the Municipality of Amsterdam's document processing workflows.

Overview

Model Overview

Key Capabilities

Training Details

Good For

Full Model Card (README)