nayohan/llama3-8b-it-prometheus-ko

Text generation · Model size: 8B · Quantization: FP8 · Context length: 8K · License: llama3 · Architecture: Transformer

The nayohan/llama3-8b-it-prometheus-ko model is an 8 billion parameter Llama 3 instruction-tuned language model with an 8192 token context length. Developed by nayohan, it is specifically fine-tuned on a Korean-translated version of the Prometheus Feedback-Collection dataset. This model excels at evaluating responses based on given rubrics, providing detailed feedback and a numerical score, making it suitable for automated content assessment and quality control in Korean.


Overview

This model, nayohan/llama3-8b-it-prometheus-ko, is an 8 billion parameter Llama 3 instruction-tuned language model. It was developed by nayohan by translating the prometheus-eval/Feedback-Collection dataset into Korean and then fine-tuning the Llama 3 8B Instruct base model on the translated data. The model's primary purpose is to provide detailed feedback and assign a numerical score (1-5) to a given response, judged against a provided instruction, reference answer, and score rubric.

Key Capabilities

  • Automated Evaluation: The model can evaluate responses against specific criteria, generating both qualitative feedback and a quantitative score.
  • Korean Language Support: Fine-tuned specifically for Korean, it processes and generates feedback in Korean.
  • Rubric-Based Assessment: It strictly adheres to a given score rubric, ensuring evaluations are consistent and focused on predefined criteria.
  • Flexible Input: Can perform evaluations with or without a reference answer, adapting to different assessment scenarios.

Use Cases

  • Content Quality Assurance: Automatically assess the quality of generated text, customer service responses, or educational content in Korean.
  • Feedback Generation: Provide structured and detailed feedback for various text-based tasks.
  • Research and Development: Useful for researchers working on automated evaluation systems for Korean language models.

Performance Insights

Evaluation on a 200-sample Korean test set (derived from nayohan/feedback-collection-ko-chat) indicates that the model's predicted scores largely align with the reference scores, as visualized in a heatmap of predicted versus correct scores. The model demonstrates strong adherence to the evaluation framework it was trained on.
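A score-alignment heatmap of the kind described above can be computed from paired gold and predicted scores as a 5x5 confusion matrix. The sketch below is not the author's evaluation script, only a minimal reconstruction of the idea; the function name is hypothetical.

```python
def score_agreement(gold: list[int], pred: list[int]) -> tuple[list[list[int]], float]:
    """Build a 5x5 confusion matrix (rows: gold scores 1-5, columns:
    predicted scores 1-5) and compute exact-match accuracy."""
    matrix = [[0] * 5 for _ in range(5)]
    for g, p in zip(gold, pred):
        matrix[g - 1][p - 1] += 1
    exact = sum(matrix[i][i] for i in range(5)) / len(gold)
    return matrix, exact
```

The diagonal of `matrix` counts evaluations where the model's score matches the reference score; plotting the matrix (e.g. with matplotlib's `imshow`) yields the heatmap-style view mentioned above.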

Popular Sampler Settings

The parameter combinations most commonly used by Featherless users for this model tune the following samplers: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.