OPI-PG/Qra-7b: A Foundation Model for Polish Language Processing
OPI-PG/Qra-7b is a 7-billion-parameter large language model developed in a collaboration between the National Information Processing Institute (OPI) and Gdańsk University of Technology (PG). It is adapted from meta-llama/Llama-2-7b-hf and was further pretrained on a cleaned, filtered, and deduplicated corpus of approximately 90 billion Polish tokens, sourced primarily from web data such as CommonCrawl and MADLAD-400.
Key Characteristics & Training
- Polish Language Focus: Specifically designed and trained for the Polish language, making it highly proficient in Polish text generation and understanding.
- Robust Preprocessing: The training data underwent rigorous preprocessing, including text normalization, URL removal, document filtering based on length and quality classifiers, language identification, and fuzzy deduplication within 18 topical domains.
- Technical Optimizations: Trained for one epoch on 4096-token sequences, utilizing optimizations such as torch.compile, the adamw_apex_fused optimizer, Flash Attention 2, mixed precision, gradient accumulation, and FSDP (Fully Sharded Data Parallel).
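As a rough illustration of the preprocessing steps listed above, the sketch below chains URL removal, whitespace normalization, a length filter, and fuzzy deduplication. The character-shingle Jaccard similarity, the thresholds, and all function names are illustrative assumptions, not the actual pipeline used to build the Qra corpus.

```python
import re

URL_RE = re.compile(r"https?://\S+")

def normalize(text: str) -> str:
    """Strip URLs and collapse whitespace (a simplified stand-in
    for the pipeline's text-normalization step)."""
    return " ".join(URL_RE.sub("", text).split())

def shingles(text: str, n: int = 3) -> set:
    """Character n-gram shingles used for fuzzy matching."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def fuzzy_dedup(docs, min_len=20, threshold=0.8):
    """Keep documents that pass a length filter and are not
    near-duplicates of an already kept document.
    min_len and threshold are assumed values for illustration."""
    kept, seen_shingles = [], []
    for doc in docs:
        doc = normalize(doc)
        if len(doc) < min_len:
            continue  # drop documents that are too short
        sh = shingles(doc)
        if any(jaccard(sh, seen) >= threshold for seen in seen_shingles):
            continue  # drop near-duplicates
        kept.append(doc)
        seen_shingles.append(sh)
    return kept
```

In the real pipeline, deduplication was applied within each of the 18 topical domains rather than globally, which keeps the pairwise comparisons tractable.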
Performance & Evaluation
Qra-7b demonstrates strong performance in perplexity benchmarks on Polish texts:
- PolEval-2018: Achieved a perplexity of 11.3, significantly outperforming other Polish models such as szymonrucinski/Curie-7B-v1 (13.5) and English models such as meta-llama/Llama-2-7b-hf (24.3).
- Long Documents (2024): Showed a perplexity of 4.5 on a new dataset of long Polish documents (news and scientific articles from 2024), surpassing szymonrucinski/Curie-7B-v1 (4.8) and meta-llama/Llama-2-7b-hf (5.9).
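To make the figures above concrete: perplexity is the exponential of the mean per-token negative log-likelihood, so lower values mean the model finds Polish text less "surprising". The helper below is a generic sketch of that definition, not the evaluation script used for these benchmarks.

```python
import math

def perplexity(token_nlls):
    """Corpus perplexity: exp of the mean per-token negative
    log-likelihood (natural log), i.e. the geometric mean of 1/p."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Toy example: a model assigns these probabilities to three tokens.
probs = [0.25, 0.5, 0.125]
nlls = [-math.log(p) for p in probs]
print(perplexity(nlls))  # geometric mean of 1/p: (4 * 2 * 8) ** (1/3) = 4.0
```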
Important Note
Qra models are foundation language models trained with a causal language modeling objective. They are not intended for conversational or instruction-following purposes out-of-the-box and require further fine-tuning for such applications.
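The causal language modeling objective mentioned above is next-token prediction: at each position, the model is penalized by the cross-entropy of the true next token given everything before it. A minimal sketch of the loss over toy logits (plain Python, not Qra's training code):

```python
import math

def causal_lm_loss(logits, token_ids):
    """Causal LM objective: average cross-entropy of predicting
    token t+1 from positions <= t. logits[t] are unnormalized
    scores over the vocabulary after reading tokens [0..t]."""
    total = 0.0
    for t in range(len(token_ids) - 1):
        scores = logits[t]
        # log of the softmax normalizer over the vocabulary
        log_z = math.log(sum(math.exp(s) for s in scores))
        # -log p(next token) under the softmax distribution
        total += log_z - scores[token_ids[t + 1]]
    return total / (len(token_ids) - 1)
```

A model trained only with this objective continues text; turning it into an assistant requires instruction tuning on top.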