RedHatAI/Llama-3.1-8B-tldr

Text generation · 8B parameters · FP8 quantization · 32k context length · License: llama3.1 · Architecture: Transformer

RedHatAI/Llama-3.1-8B-tldr is an 8 billion parameter LlamaForCausalLM model developed by Red Hat (Neural Magic) and fine-tuned to generate TL;DR-style summaries of Reddit posts. It achieves a BERTScore of 0.366 on the trl-lib/tldr dataset, demonstrating strong performance at generating concise summaries. The model is intended for text summarization, particularly of content resembling Reddit discussions.


Model Overview

RedHatAI/Llama-3.1-8B-tldr is an 8 billion parameter LlamaForCausalLM model, fine-tuned by Red Hat (Neural Magic) from the meta-llama/Llama-3.1-8B base model. Its primary purpose is to generate "TL;DR" (Too Long; Didn't Read) style summaries of Reddit posts, and it was trained on the trl-lib/tldr dataset.
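
For a concrete sense of the input format, the fine-tuning data can be inspected directly. This is a minimal sketch assuming the trl-lib/tldr layout on the Hugging Face Hub (a "prompt" column holding the Reddit post ending in "TL;DR:" and a "completion" column holding the reference summary):

```python
from datasets import load_dataset

# Load the TL;DR dataset used for fine-tuning (assumed column layout:
# "prompt" = Reddit post ending in "TL;DR:", "completion" = reference summary).
dataset = load_dataset("trl-lib/tldr", split="test")

example = dataset[0]
print(example["prompt"])      # post text followed by "TL;DR:"
print(example["completion"])  # human-written reference summary
```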

Key Capabilities

  • Reddit-style Summarization: Specifically trained to condense Reddit posts into short, digestible summaries.
  • Performance: Achieves a BERTScore of 0.366 on the trl-lib/tldr test set, alongside ROUGE-1 of 0.362, ROUGE-2 of 0.144, and ROUGE-Lsum of 0.306.
  • Efficient Deployment: Designed for efficient serving with vLLM, including examples for server setup and querying (see the sketch below).
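
As a minimal sketch of that workflow, assuming a local vLLM OpenAI-compatible server (the endpoint, port, sampling settings, and example post are illustrative):

```python
# Start an OpenAI-compatible vLLM server first, e.g.:
#   vllm serve RedHatAI/Llama-3.1-8B-tldr
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

post = (
    "I've been teaching myself guitar for three months. I practice scales "
    "daily but still can't play a full song without mistakes, and I'm "
    "wondering if I should get a teacher."
)
prompt = f"{post}\n\nTL;DR:"  # mirrors the trl-lib/tldr prompt format

response = client.completions.create(
    model="RedHatAI/Llama-3.1-8B-tldr",
    prompt=prompt,
    max_tokens=64,
    temperature=0.7,
)
print(response.choices[0].text.strip())
```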

Training Details

The model was trained with axolotl using a sequence length of 4096 and flash_attention enabled. Training ran for 3 epochs with the AdamW optimizer at a learning rate of 1e-5.
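
For reference, these hyperparameters correspond to an axolotl configuration fragment along the following lines. This is a sketch, not the published training recipe; key names follow axolotl's conventions, and anything not stated above (such as dataset formatting details) is an assumption:

```yaml
# Illustrative axolotl config fragment reflecting the stated hyperparameters.
base_model: meta-llama/Llama-3.1-8B

datasets:
  - path: trl-lib/tldr    # dataset formatting/type details are not given in the card

sequence_len: 4096
flash_attention: true

num_epochs: 3
optimizer: adamw_torch    # assumed axolotl name for the AdamW optimizer
learning_rate: 1e-5
```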

Inference Performance

The model has been benchmarked with vLLM and GuideLLM, including comparisons against dense-quantized and sparse-quantized variants, to characterize its throughput and latency when generating summaries.
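
A benchmark run against a local vLLM server might look like the following. This is a sketch only: GuideLLM's CLI flags have changed between releases, and the rate type, duration, and synthetic payload sizes here are illustrative rather than the settings used for the reported results:

```bash
# Serve the model with vLLM (terminal 1).
vllm serve RedHatAI/Llama-3.1-8B-tldr

# Sweep request rates with synthetic prompts (terminal 2).
# Flag names follow recent GuideLLM releases; check `guidellm benchmark --help`.
guidellm benchmark \
  --target "http://localhost:8000" \
  --rate-type sweep \
  --max-seconds 60 \
  --data "prompt_tokens=512,output_tokens=128"
```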