FemkeBakker/AmsterdamDocClassificationLlama200T1Epochs
The FemkeBakker/AmsterdamDocClassificationLlama200T1Epochs model is a 7 billion parameter Llama-2-7b-chat-hf variant, fine-tuned by FemkeBakker for document classification. Developed in collaboration with the Municipality of Amsterdam, this model specializes in classifying documents truncated to their first 200 tokens. It was fine-tuned for one epoch on a balanced dataset, achieving a validation loss of 0.8403.
AmsterdamDocClassificationLlama200T1Epochs Overview
This model is a 7 billion parameter Llama-2-7b-chat-hf variant, fine-tuned by FemkeBakker as part of the Assessing Large Language Models for Document Classification project by the Municipality of Amsterdam. Its primary purpose is document classification.
Key Characteristics
- Base Model: Fine-tuned from meta-llama/Llama-2-7b-chat-hf.
- Specialization: Optimized for classifying documents based on their initial 200 tokens.
- Training Data: Utilizes the AmsterdamBalancedFirst200Tokens dataset, consisting of 9900 training documents and 1100 evaluation documents, all formatted as conversations.
- Training Epochs: Fine-tuned for a single epoch, with a total training time of 39 minutes.
- Performance: Achieved a validation loss of 0.8403 on the evaluation set.
- Hyperparameters: Key hyperparameters include a learning rate of 1e-05, a `train_batch_size` of 2, and `gradient_accumulation_steps` of 8.
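Because the model was trained only on the first 200 tokens of each document, inputs should be truncated the same way before classification. The sketch below illustrates the idea; the whitespace tokenizer and the instruction text are illustrative stand-ins (in practice you would use the Llama-2 tokenizer and the exact conversation template from the training data), and `build_chat_prompt` is a hypothetical helper, not part of the model's API.

```python
# Sketch: prepare a document for a "first 200 tokens" classifier.
# Assumption: whitespace splitting stands in for the real Llama-2 tokenizer,
# and the [INST] wrapper approximates the Llama-2 chat format.

def truncate_to_first_tokens(text: str, max_tokens: int = 200) -> str:
    """Keep only the first `max_tokens` whitespace-delimited tokens."""
    tokens = text.split()
    return " ".join(tokens[:max_tokens])

def build_chat_prompt(document: str) -> str:
    """Wrap the truncated document in a Llama-2-chat-style instruction.

    The instruction wording here is a placeholder; the actual training
    conversations may use a different template.
    """
    truncated = truncate_to_first_tokens(document)
    return f"[INST] Classify the following document:\n{truncated} [/INST]"

if __name__ == "__main__":
    long_doc = "word " * 500  # a document longer than 200 tokens
    prompt = build_chat_prompt(long_doc)
    print(len(truncate_to_first_tokens(long_doc).split()))  # 200
```

With preprocessing like this in place, the truncated prompt can be passed to the fine-tuned checkpoint (e.g. via Hugging Face `transformers`) for classification.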
Use Cases
This model is particularly suited to tasks requiring efficient document classification where only the initial segment of a document is relevant or available for analysis. It can be a valuable tool for organizations like the Municipality of Amsterdam that need to categorize incoming documents quickly.