Show HN: VibeWhisper – macOS voice-to-text with push-to-talk, cloud or 100% local (vibewhisper.dev)
2 points by AleksDoesCode 61 days ago | 1 comment
Hey HN,

I built VibeWhisper because I was using Claude Code all day and the bottleneck wasn't thinking — it was typing. I tried an existing voice-to-text tool, loved the workflow, then saw the price: $20/month for a thin wrapper around OpenAI Whisper where 10 minutes of API usage costs $0.06. So I built my own in a weekend.

How it works: Hold a configurable global hotkey (default: Right Command), speak, release. Transcribed text appears in whatever text field has focus — VS Code, Terminal, Slack, browser, anything. No clipboard hijacking — it uses the macOS Accessibility API to inject text directly at the cursor position (same APIs screen readers use).
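
For anyone curious how the injection part can work: roughly, you take the system-wide accessibility element, ask it for the focused UI element, and write that element's selected-text attribute. A simplified sketch of the general technique in Swift (not the exact code in the app; it needs the Accessibility permission and only works in apps that honor the attribute):

    import ApplicationServices

    // Insert text at the caret of whatever UI element currently has focus.
    // Writing kAXSelectedText replaces the selection, or inserts at the
    // cursor when nothing is selected.
    func injectText(_ text: String) {
        let systemWide = AXUIElementCreateSystemWide()
        var focused: CFTypeRef?
        guard AXUIElementCopyAttributeValue(systemWide,
                                            kAXFocusedUIElementAttribute as CFString,
                                            &focused) == .success,
              let element = focused
        else { return }
        AXUIElementSetAttributeValue(element as! AXUIElement,
                                     kAXSelectedTextAttribute as CFString,
                                     text as CFString)
    }
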

Two transcription engines:

Cloud: OpenAI Whisper API via your own key (~$0.006/min, fastest and most accurate; see the request sketch after this list)

Local: WhisperKit running on-device via CoreML/Apple Neural Engine. 10 models from Tiny (50 MB) to Large V3 Turbo (1.6 GB). Your voice never leaves your Mac. No API key needed, no internet needed. The app auto-recommends the best local model for your hardware.
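
To give a sense of how thin the cloud path is: it's essentially one multipart POST per recording against OpenAI's standard transcription endpoint, using your own key. A simplified sketch (error handling omitted; not the exact code in the app):

    import Foundation

    // One cloud transcription call: multipart/form-data POST of the recorded
    // audio file to OpenAI's transcription endpoint. Returns the "text" field
    // of the JSON response.
    func transcribeViaCloud(audioFileURL: URL, apiKey: String) async throws -> String {
        var request = URLRequest(url: URL(string: "https://api.openai.com/v1/audio/transcriptions")!)
        request.httpMethod = "POST"
        request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")

        let boundary = UUID().uuidString
        request.setValue("multipart/form-data; boundary=\(boundary)",
                         forHTTPHeaderField: "Content-Type")

        func part(_ s: String) -> Data { Data(s.utf8) }
        var body = Data()
        body.append(part("--\(boundary)\r\n"))
        body.append(part("Content-Disposition: form-data; name=\"model\"\r\n\r\n"))
        body.append(part("whisper-1\r\n"))
        body.append(part("--\(boundary)\r\n"))
        body.append(part("Content-Disposition: form-data; name=\"file\"; filename=\"audio.m4a\"\r\n"))
        body.append(part("Content-Type: audio/m4a\r\n\r\n"))
        body.append(try Data(contentsOf: audioFileURL))
        body.append(part("\r\n--\(boundary)--\r\n"))

        let (data, _) = try await URLSession.shared.upload(for: request, from: body)
        let json = try JSONSerialization.jsonObject(with: data) as? [String: Any]
        return (json?["text"] as? String) ?? ""
    }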

Tech details for the curious:

Native Swift/SwiftUI, no Electron. Uses less RAM than a Chrome tab.

CGEvent taps for global keyboard hooks (see the push-to-talk sketch after this list)

Accessibility API for text injection

API key stored in macOS Keychain

Universal binary (arm64 + x86_64), macOS 14+

No backend, no accounts, no telemetry
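
For the push-to-talk part, a flagsChanged event tap is enough to catch Right Command going down and coming back up system-wide. A rough sketch of the technique (the recording functions are placeholders, not the app's real ones, and it ignores the edge case where both Command keys are held):

    import Foundation
    import CoreGraphics

    // Placeholders for illustration only.
    func startRecording() { print("recording...") }
    func stopRecordingAndTranscribe() { print("transcribing...") }

    // Listen-only event tap for modifier-key changes. Right Command is
    // keycode 0x36; holding it starts recording, releasing it stops.
    // Requires the Accessibility / Input Monitoring permission.
    let mask = CGEventMask(1 << CGEventType.flagsChanged.rawValue)
    let tap = CGEvent.tapCreate(
        tap: .cgSessionEventTap,
        place: .headInsertEventTap,
        options: .listenOnly,
        eventsOfInterest: mask,
        callback: { _, type, event, _ in
            if type == .flagsChanged,
               event.getIntegerValueField(.keyboardEventKeycode) == 0x36 {
                if event.flags.contains(.maskCommand) {
                    startRecording()
                } else {
                    stopRecordingAndTranscribe()
                }
            }
            return Unmanaged.passUnretained(event)
        },
        userInfo: nil
    )

    if let tap {
        let source = CFMachPortCreateRunLoopSource(kCFAllocatorDefault, tap, 0)
        CFRunLoopAddSource(CFRunLoopGetCurrent(), source, .commonModes)
        CGEvent.tapEnable(tap: tap, enable: true)
        CFRunLoopRun()
    }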

Pricing: $19 one-time. No subscription. 15-minute free trial (actual recording time, not calendar time), no credit card, no signup. If you use cloud mode, you pay OpenAI directly at their rate. If you use local mode, it costs nothing after purchase.

I use it myself 2-3 hours a day. My main use case is dictating long prompts to AI coding assistants — I can speak a 400-word prompt in 45 seconds instead of typing it in 3-4 minutes. I also use it as a thinking tool: talking through problems out loud and capturing the output.

Website: https://vibewhisper.dev

Happy to answer any technical questions.



[flagged]


Hey yarivk,

Thanks, I do appreciate it! Local-mode speed depends heavily on your hardware and the model you choose.

I run it exclusively in local mode nowadays: an older MacBook Pro with an M1 chip and 32 GB RAM, using the Large v3 Turbo model.

Transcribing one minute of audio takes around 2-3 seconds, compared to 0.5-1 second with the OpenAI API.

For my use case this is a no-brainer. Not having to pay and keeping all of my data private is well worth waiting 1.5 seconds longer. Also, I simply think it's pretty cool to run models locally, so there's that :D

Feel free to try it out completely for free and lmk your thoughts!



