sapoepsilon/whispera-voice-commands

Text Generation | Concurrency Cost: 1 | Model Size: 0.5B | Quant: BF16 | Ctx Length: 32k | Published: Jan 27, 2026 | License: MIT | Architecture: Transformer | Open Weights

sapoepsilon/whispera-voice-commands is a small (0.5B-parameter), MLX-compatible model fine-tuned to convert natural-language macOS commands into structured JSON. It targets voice command interpretation: a spoken or typed instruction goes in, and a machine-readable JSON object comes out, which can then be mapped to a shell command for execution. This makes it a fit for voice-controlled interfaces.


Whispera Voice Commands (MLX)

This model, developed by sapoepsilon, is a specialized MLX-compatible language model designed for interpreting natural-language macOS commands. It takes voice-like inputs and converts them into structured JSON objects, which can then be mapped to real shell commands using predefined templates.
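The template step described above can be sketched as follows. This is a minimal illustration, not the project's actual mapping: the `TEMPLATES` table and the `build_command` helper are hypothetical names, and the only template shown assumes the macOS `open -a <app>` command is the intended target for an "open app" request.

```python
import json
import shlex

# Hypothetical templates keyed by (category, operation). The real project's
# templates are not shown in this card. Each template is an argv list, so
# values from the model are never interpolated into a shell string.
TEMPLATES = {
    ("apps", "open"): ["open", "-a", "{app}"],
}


def build_command(model_output: str) -> list[str]:
    """Turn the model's JSON text into an argv list using a template."""
    fields = json.loads(model_output)
    if not isinstance(fields, dict):
        raise ValueError("expected a JSON object")
    key = (fields.get("category"), fields.get("operation"))
    template = TEMPLATES.get(key)
    if template is None:
        raise KeyError(f"no template for {key}")
    # Fill each placeholder from the JSON fields; unused fields are ignored.
    return [part.format(**fields) for part in template]


if __name__ == "__main__":
    argv = build_command('{"category":"apps","operation":"open","app":"chrome"}')
    print(shlex.join(argv))  # display form only; pass argv to subprocess.run
```

Building an argv list rather than a shell string keeps model-generated values out of shell parsing. A real mapping would probably also need to normalize app names (for example, "chrome" to the installed application's name) before execution.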

Key Capabilities

  • Natural Language to JSON Conversion: Translates spoken or text-based macOS commands (e.g., "open chrome") into a structured JSON format like {"category":"apps","operation":"open","app":"chrome"}.
  • MLX Compatibility: Packaged for the MLX framework, so it runs natively on Apple silicon.
  • Fused Model: Distributed with the LoRA adapters already merged into the base model, so no separate adapter loading step is needed at deployment.
  • Structured Output: Fine-tuned to emit JSON only, which simplifies parsing and integration into command execution systems. Callers should still validate the output before acting on it.
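As a rough sketch of how the capabilities above fit together, the snippet below loads the model with the `mlx_lm` package and validates its output before use. Several things here are assumptions rather than documented behavior: that the repository id loads directly with `mlx_lm.load`, that the bare command text is an acceptable prompt (the card does not state the expected prompt format), and that `category` and `operation` are always present (inferred from the single example above). `parse_command` and `interpret` are hypothetical helper names.

```python
import json

# Assumed from the card's one example; the full schema is not documented here.
REQUIRED_KEYS = {"category", "operation"}


def parse_command(raw: str) -> dict:
    """Parse model text as a single JSON object, or raise ValueError."""
    try:
        obj = json.loads(raw.strip())
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {raw!r}") from exc
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - obj.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return obj


def interpret(text: str, repo: str = "sapoepsilon/whispera-voice-commands") -> dict:
    """Run the model on one command and return the validated JSON object."""
    # Imported lazily so parse_command stays usable without mlx-lm installed.
    from mlx_lm import load, generate

    model, tokenizer = load(repo)
    # Assumption: the raw command is passed as the prompt unchanged.
    raw = generate(model, tokenizer, prompt=text, max_tokens=64)
    return parse_command(raw)
```

On an Apple silicon machine with `mlx-lm` installed, `interpret("open chrome")` would return a dict if the model behaves as described. Validating before execution matters because a fine-tuned model can still produce malformed or unexpected text.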

Good For

  • Voice Control Systems: Ideal for building or enhancing voice-controlled interfaces for macOS applications.
  • Automating macOS Tasks: Enables the automation of common macOS operations through natural language input.
  • Developers requiring structured command parsing: Converts free-form commands into structured data that downstream code can act on directly.

This model is a component of a larger project aimed at creating a comprehensive voice command system for macOS, with further details available in the main project repository.