EREN121232/FINSTROM-AI-V1.5
FINSTROM-AI-V1.5 by EREN121232 is a 1.5 billion parameter causal language model based on the Qwen2 architecture, featuring a 32768 token context length. It is provided with both Transformers weights and a GGUF build, making it suitable for local inference environments like Ollama, llama.cpp, and LM Studio. This model is designed for flexible deployment on local machines, offering a higher quality F16 GGUF build for robust performance.
Loading preview...
FINSTROM-AI-V1.5 Overview
FINSTROM-AI-V1.5 is a 1.5 billion parameter causal language model developed by EREN121232, built upon the Qwen2 architecture. It is specifically designed for efficient local inference, providing both standard Transformers weights (model.safetensors) and a GGUF build (finstrom-ai-v1.f16.gguf). This dual offering ensures compatibility with a wide range of local runtime environments, including Ollama, llama.cpp, and LM Studio.
Key Features & Capabilities
- Architecture: Qwen2-style causal language model.
- Parameter Count: 1.5 billion parameters.
- Context Length: Supports a maximum context of 32768 tokens, though practical local deployment may use a reduced context for memory efficiency.
- Local Inference Optimized: Provided with a GGUF build, facilitating easy integration and execution on personal hardware.
- Quality: The GGUF file is in F16 format, offering higher quality inference compared to more heavily quantized builds like Q4/Q5, albeit with a larger file size.
- Deployment Flexibility: Includes
tokenizer.json,tokenizer_config.json, andchat_template.jinjafor consistent tokenization and chat formatting across platforms.
Ideal Use Cases
- Local Development: Excellent for developers needing to run a capable language model directly on their machines without cloud dependencies.
- Experimentation: Suitable for experimenting with LLMs in environments like Ollama, llama.cpp, or LM Studio.
- Applications Requiring Higher Fidelity: The F16 GGUF build is beneficial for applications where a balance between local performance and output quality is desired, provided sufficient local memory is available.