Esperanto/Protein-Llama-3-8B

Hosted on Hugging Face · Text generation

  • Model size: 8B parameters
  • Quantization: FP8
  • Context length: 8k tokens
  • Architecture: Transformer
  • Concurrency cost: 1

Esperanto/Protein-Llama-3-8B is a specialized 8-billion-parameter model based on Llama-3-8B, continually pre-trained with LoRA on extensive protein sequence datasets. It is fine-tuned for protein language modeling, enabling the generation of novel protein sequences from natural language prompts. The model supports both uncontrollable and controllable protein generation, making it a valuable tool for protein engineering, drug development, and biotechnology.


What is Protein-Llama-3-8B?

Protein-Llama-3-8B is an 8-billion-parameter language model based on the Llama-3-8B architecture, continually pre-trained with LoRA on protein sequence data. Its primary function is protein language modeling: generating novel protein sequences. By using an LLM to rapidly generate and evaluate candidate sequences, it accelerates protein engineering well beyond traditional, labor-intensive methods.
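Since the model is a standard causal language model, it can be driven with the Hugging Face `transformers` library. The sketch below is illustrative: the prompt wording and generation parameters are assumptions, not a documented format, so consult the model card and paper before relying on them.

```python
def build_prompt(description: str) -> str:
    """Turn a natural-language description into a generation prompt.
    The exact prompt format is an assumption; check the model card/paper."""
    return f"{description.strip()}\n"

def generate_sequence(description: str, max_new_tokens: int = 256) -> str:
    """Generate a candidate protein sequence from a natural-language prompt.
    Loading this 8B model requires a GPU with sufficient memory."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Esperanto/Protein-Llama-3-8B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_prompt(description), return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,      # sampling yields diverse candidate sequences
        temperature=0.8,     # illustrative value, tune for your use case
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

For controllable generation, the description would name the desired family class, e.g. `generate_sequence("Generate a protein sequence for a Ligase enzyme protein.")`.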

Key Capabilities

  • Novel Protein Sequence Generation: Generates new protein sequences based on natural language prompts.
  • Controllable Generation: Supports specifying desired protein characteristics, including 10 different protein family classes (e.g., Ligase enzyme protein).
  • Uncontrollable Generation: Capable of generating diverse protein sequences without specific constraints.
  • Accelerated Protein Engineering: Streamlines the discovery and development process in biotechnological applications.

Good For

  • Drug Development: Designing proteins with specific therapeutic properties.
  • Chemical Synthesis: Creating novel proteins for industrial or research applications.
  • Biotechnological Research: Exploring new protein functions and structures.
  • Expanding Protein Diversity: Generating proteins with unprecedented functions beyond existing templates.

For more in-depth information, refer to the associated research paper: Energy Efficient Protein Language Models: Leveraging Small Language Models with LoRA for Controllable Protein Generation.