gsar78/GreekLlama-1.1B-base

Text Generation · Model Size: 1.1B · Quantization: BF16 · Context Length: 2k · License: apache-2.0 · Architecture: Transformer · Open Weights

gsar78/GreekLlama-1.1B-base is a 1.1 billion parameter base model built on a custom Llama-like architecture and developed by gsar78. It was pre-trained on a Wikipedia corpus with a 60/40 English-to-Greek language ratio for 1 billion tokens. This small, experimental model supports both Greek and English and is intended primarily for research and development.
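Since the checkpoint is hosted on the Hugging Face Hub, a quick way to try it is through the transformers library. The snippet below is a minimal sketch, assuming the model loads through the standard AutoModelForCausalLM and AutoTokenizer interfaces; because the card describes a custom Llama-like architecture, trust_remote_code=True may be needed if the repository ships its own modeling code.

```python
# Minimal sketch: loading GreekLlama-1.1B-base with Hugging Face transformers.
# Assumption: the checkpoint works with AutoModelForCausalLM; trust_remote_code=True
# is passed in case the repo provides custom (Llama-like) modeling code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gsar78/GreekLlama-1.1B-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    trust_remote_code=True,      # assumption: may be required for the custom architecture
)

# Base model: plain text continuation, no chat template.
prompt = "Η Ελλάδα είναι"  # "Greece is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because this is a base model with no instruction tuning, prompts should be written as text to be continued rather than as chat-style questions.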


GreekLlama-1.1B-base: An Experimental Bilingual Model

GreekLlama-1.1B-base is a compact, experimental language model developed by gsar78, featuring 1.1 billion parameters and built upon a custom Llama-like architecture. It is designed to support both Greek and English.

Key Capabilities

  • Bilingual Support: Processes text in both Greek and English, with training data weighted towards English (60%) over Greek (40%).
  • Llama-like Architecture: Utilizes a custom architecture inspired by the Llama family of models.
  • Small Footprint: With 1.1 billion parameters, it is a relatively small model, making it suitable for resource-constrained environments or initial experimentation (see the rough memory estimate after this list).
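As a rough estimate of what that footprint means in practice (an approximation, not a figure from the model card): at BF16 precision each parameter occupies 2 bytes, so 1.1 billion parameters correspond to roughly 2.2 GB of weights, plus additional memory for activations and the KV cache during generation. This places the model within reach of most consumer GPUs and even CPU-only setups.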

Training Details

The model was pre-trained on a Wikipedia corpus for approximately 1 billion tokens. The developers note that this token count is below the optimal point; by the commonly cited heuristic of roughly 20 training tokens per parameter, a 1.1B-parameter model would call for on the order of 20 billion tokens. Performance is therefore not expected to be high, and the model is intended primarily for experimental purposes rather than production use.

Good For

  • Bilingual Research: Exploring the behavior of small, bilingual models in Greek and English.
  • Educational Use: Understanding basic LLM architectures and training processes.
  • Resource-Constrained Applications: Where a larger, more performant model is not feasible or necessary.
  • Early-stage Prototyping: For tasks that do not require high accuracy or extensive linguistic nuance.