JetBrains/Mellum-4b-sft-kotlin
Hugging Face · Text Generation

Model Size: 4B · Quant: BF16 · Context Length: 32k · Published: May 19, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

Mellum-4b-sft-kotlin is a 4 billion parameter LLaMA-style language model developed by JetBrains, fine-tuned specifically for code-related tasks. Pre-trained on over 4 trillion tokens with an 8192-token context window, it excels at code completion in Kotlin. This model is optimized for integration into professional developer tooling, AI-powered coding assistants, and research on code understanding and generation.


Overview

JetBrains' Mellum-4b-sft-kotlin is a 4 billion parameter, LLaMA-style large language model (LLM) that has been fine-tuned for code-related tasks, with a particular emphasis on Kotlin. It was pre-trained on over 4 trillion tokens and features an 8192-token context window, making it efficient for handling substantial code snippets.

Key Capabilities

  • Code Completion: Specifically optimized for generating and completing Kotlin code.
  • Code Understanding: Designed to assist with research in code understanding and generation.
  • Efficient Deployment: Its 4 billion parameters allow for efficient inference in cloud environments (e.g., vLLM) and local deployment (e.g., llama.cpp, Ollama).
  • Fill-in-the-Middle (FIM): Supports FIM capabilities, allowing for code generation within existing code structures, as demonstrated in its sample usage.
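For cloud deployment, serving the model behind an OpenAI-compatible server (as vLLM provides) reduces plain code completion to posting a JSON request. The sketch below only builds that request payload; the endpoint URL, sampling values, and stop sequence are illustrative assumptions, not taken from the model card.

```python
# Sketch: build a completion request for an OpenAI-compatible server
# (e.g. one started with vLLM serving JetBrains/Mellum-4b-sft-kotlin).
# The sampling values and stop sequence below are assumptions.

def build_completion_request(prefix: str, max_tokens: int = 64) -> dict:
    """Return the JSON payload for a plain code-completion call."""
    return {
        "model": "JetBrains/Mellum-4b-sft-kotlin",
        "prompt": prefix,          # code to be continued
        "max_tokens": max_tokens,  # cap on the generated continuation
        "temperature": 0.2,        # low temperature suits code completion
        "stop": ["\n\n"],          # stop at a blank line (assumption)
    }

payload = build_completion_request("fun fibonacci(n: Int): Int {\n")
# POST this payload to the server's /v1/completions endpoint
# (e.g. http://localhost:8000/v1/completions) with any HTTP client.
```

The base model alone only continues text, so completion quality depends on giving it a well-formed code prefix; an IDE integration would pass the file contents up to the cursor as `prefix`.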
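Fill-in-the-middle works by rearranging the code before and after the cursor around special sentinel tokens, so the model generates the missing middle rather than a pure continuation. A minimal sketch of that prompt construction follows; the sentinel token names here use the common StarCoder-style convention as a placeholder and are an assumption — check the model card's sample usage and tokenizer config for the exact tokens Mellum expects.

```python
# Sketch of FIM prompt construction. The sentinel names
# (<fim_prefix>, <fim_suffix>, <fim_middle>) follow the common
# StarCoder-style convention and are an ASSUMPTION here, not
# taken from the Mellum model card.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange code before/after the cursor into a FIM prompt.

    The model generates the missing middle after <fim_middle>.
    """
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = build_fim_prompt(
    prefix="fun sum(a: Int, b: Int): Int {\n    return ",
    suffix="\n}",
)
```

In an editor integration, `prefix` is everything before the cursor and `suffix` everything after it; the generated middle is spliced between the two to complete the expression in place.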

Use Cases

  • Developer Tooling: Ideal for integration into Integrated Development Environments (IDEs) to provide intelligent code suggestions.
  • AI-Powered Coding Assistants: Suitable for building tools that assist developers with coding tasks.
  • Educational Applications: Can be used for teaching and learning programming concepts.
  • Research & Fine-tuning: Serves as a strong base for further research into code LLMs and for fine-tuning experiments on specific code domains.

Limitations

  • Bias: May reflect biases present in its training data from public codebases, potentially leading to code styles similar to open-source repositories.
  • Security: Generated code suggestions should not be considered inherently secure or free of vulnerabilities.