ChangeIsKey/llama3-janus

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | Tool Calling: Supported | Published: Sep 30, 2024 | Architecture: Transformer | Cold

Janus is an 8 billion parameter Llama 3 model developed by Pierluigi Cassotti and Nina Tahmasebi at the University of Gothenburg. Fine-tuned on over 1.2 million sense-annotated historical usages from the Oxford English Dictionary, it generates historically and semantically accurate usages for a given word, sense definition, and year. The model targets semantic change detection, historical NLP, and linguistic research, and produces example sentences reflecting linguistic usage from 1700 to 2020.


Janus: Historical Word Usage Generation Model

Janus is an 8 billion parameter model built upon Meta Llama 3, developed by Pierluigi Cassotti and Nina Tahmasebi at the University of Gothenburg. Its core function is to generate historically and semantically accurate example sentences for a given word, its sense definition, and a specified year.
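The three inputs described above (word, sense definition, year) can be sketched as a small validated record. This is a minimal illustration, not the model's documented interface: `UsageRequest`, `render`, and `EXAMPLE_TEMPLATE` are hypothetical names, and the template string is a placeholder, since this page does not state the prompt format Janus expects. The year bounds come from the 1700 to 2020 range given on this page.

```python
from dataclasses import dataclass

# Year range covered by the model according to this page.
MIN_YEAR, MAX_YEAR = 1700, 2020


@dataclass(frozen=True)
class UsageRequest:
    """Hypothetical container for the three inputs the model conditions on."""
    word: str
    definition: str
    year: int

    def __post_init__(self):
        # Reject empty fields and years outside the covered range.
        if not self.word.strip() or not self.definition.strip():
            raise ValueError("word and definition must be non-empty")
        if not MIN_YEAR <= self.year <= MAX_YEAR:
            raise ValueError(
                f"year must be in [{MIN_YEAR}, {MAX_YEAR}], got {self.year}"
            )

    def render(self, template: str) -> str:
        # Fill a caller-supplied template; the real prompt format is not given here.
        return template.format(
            word=self.word, definition=self.definition, year=self.year
        )


# Placeholder template for illustration only (an assumption, not the real format).
EXAMPLE_TEMPLATE = "Year: {year}\nWord: {word}\nSense: {definition}\nUsage:"

request = UsageRequest("broadcast", "to scatter seed over a wide area", 1790)
prompt = request.render(EXAMPLE_TEMPLATE)
print(prompt)
```

Keeping the template as a parameter means the same record can be reused once the actual prompt format is known from the model's own documentation.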

Key Capabilities

  • Historical Usage Generation: Produces example sentences reflecting linguistic usage from 1700 to 2020.
  • Semantic Accuracy: In human evaluations, generated usages were rated comparable to Oxford English Dictionary (OED) test data.
  • Temporal Accuracy: Achieves a root mean squared error (RMSE) of approximately 52.7 years against OED ground-truth dates.
  • Context Variability: Maintains low lexical repetition, preserving natural linguistic diversity in generated text.
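To make the temporal accuracy figure concrete, the sketch below shows how an RMSE in years is computed. It assumes each generated usage has been assigned a predicted year and is compared with its OED year; how the predicted year is obtained is not described on this page. The function name `rmse_years` and the sample numbers are invented for illustration.

```python
import math


def rmse_years(predicted, actual):
    """Root mean squared error between predicted and reference years."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only, not results from the model.
predicted_years = [1750, 1900, 1980]
true_years = [1800, 1860, 2010]

error = rmse_years(predicted_years, true_years)
print(f"RMSE: {error:.1f} years")
```

Because errors are squared before averaging, a few usages dated far from their true year raise the RMSE more than many small misses do.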

Training and Data

Janus was fine-tuned using QLoRA on a dataset of over 1.2 million sense-annotated historical usages extracted from the Oxford English Dictionary (OED), covering the period from 1700 to 2020.

Good For

  • Semantic Change Detection: Investigating the evolution of word meanings over time.
  • Historical NLP: Enhancing the understanding and processing of historical texts.
  • Linguistic Research: Generating sense-annotated corpora for various linguistic studies.
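For the corpus-building use case, one possible first step is to enumerate the (word, sense, year) combinations to request from the model. The sketch below is an assumption about how such a plan might look, not a workflow described on this page; `plan_requests` and the sample senses are hypothetical.

```python
from itertools import product


def plan_requests(senses, start_year, end_year, step):
    """Return (word, definition, year) triples at a fixed year interval."""
    if step <= 0 or start_year > end_year:
        raise ValueError("step must be positive and start_year <= end_year")
    years = range(start_year, end_year + 1, step)
    return [
        (word, definition, year)
        for (word, definition), year in product(senses, years)
    ]


# Illustrative senses only; definitions are not taken from the OED.
senses = [
    ("mouse", "a small rodent"),
    ("mouse", "a hand-held pointing device for a computer"),
]

plan = plan_requests(senses, 1700, 2020, 40)
print(len(plan), plan[0], plan[-1])
```

A realistic plan would probably restrict each sense to the years in which it is attested, since asking for a modern sense in an early year invites the anachronisms noted under Limitations.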

Limitations

Users should be aware of three limitations: historical biases inherited from the training data, coarse temporal resolution (an RMSE of roughly 50 years), and occasional anachronisms, where modern phrases appear in sentences generated for earlier periods. The model has not been explicitly trained for fairness or bias mitigation.