Hacker News

Handy – Free open source speech-to-text app (https://github.com)

243 points by tin7in 3 days ago | 107 comments

d4rkp4ttern 3 days ago |

I’ve tried several, including this one, and I’ve settled on VoiceInk (local, one-time payment). With Parakeet V3 it’s stunningly fast (near-instant) and accurate enough to talk to LLMs/code-agents: the slight drop in accuracy relative to Whisper Turbo3 is immaterial since they can “read between the lines” anyway.

My regular cycle is to talk informally to the CLI agent and ask it to “say back to me what you understood”, and it almost always produces a nice clean and clear version. This simultaneously works as confirmation of its understanding and also as a sort of spec which likely helps keep the agent on track.

UPDATE - just tried Handy with Parakeet v3, and it works really well too, so I'll use this instead of VoiceInk for a few days. I also just discovered that turning on the "debug" UI with Cmd-Shift-D shows additional options like post-processing and appending a trailing space.

blutoot 3 days ago |

I have dystonia which often stiffens my arms in a way that makes it impossible for me to type on a keyboard. STT apps like SuperWhisper have proven to be very helpful for me in such situations. I am hoping to get a similar experience out of "Handy" (very apt naming from my perspective).

I do, however, wonder if there is a way all these STT tools can get to the next level. The generated text should not be just a verbatim copy of what I just said, but depending on the context, it should elaborate. For example, if my cursor is actively inside an editor/IDE with some code, my coding-related verbal prompts should actually generate the right/desired code in that IDE.

Perhaps this is a bit of combining STT with computer-use.

kuatroka 3 days ago |

Love it. I had been searching for an STT app for weeks. Every single app was either paid as a one-off or had a monthly subscription. It felt a bit ridiculous having to pay when it’s all powered by such small models on the back end. So I decided to build my own. But then I found “Handy” and it’s been a really amazing partner for me. Super fast, super simple, doesn’t get in my way and it’s constantly updated. I just love it. Thanks a lot for making it!

P.S. The post-processing that you are talking about, wouldn’t it be awesome?

frankdilo 3 days ago |

This looks great! What’s missing for me to switch from something like Wispr Flow is the ability to provide a dictionary for commonly mistaken words (name of your company, people, code libraries).
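
A rough sketch of what such a dictionary pass could look like as a post-processing step on the transcript. This is only an illustration, not how Handy or Wispr Flow actually work: the `CUSTOM_WORDS` entries and the `apply_dictionary` name are made up for the example.

```python
import re

# Hypothetical user dictionary: common mis-transcription -> intended spelling.
CUSTOM_WORDS = {
    "whisper flow": "Wispr Flow",
    "para keet": "Parakeet",
    "pie torch": "PyTorch",
}

def apply_dictionary(text, words=CUSTOM_WORDS):
    """Replace known mis-transcriptions, case-insensitively, on word boundaries."""
    if not words:
        return text
    # Try longer entries first so multi-word entries win over shorter overlaps.
    keys = sorted(words, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in words.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)
```

Something like `apply_dictionary(transcript)` would run right before the text is pasted. Plain string replacement like this cannot fix words the model drops entirely, so it is a cheap complement to, not a substitute for, biasing the model itself.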

Barbing 3 days ago |

Quick thoughts re: mentioned transcribers

Superwhisper — Been using it a long time. It's paid with a lifetime subscription available. Tons of features. Language models are built right in without additional charge. Solo dev is epic; may defer upgrades to avoid occasional bugs/regressions (hey, it's complex software).

Trying each for a few minutes:

Hex — Feels like the leanest (& cleanest) free option mentioned for Mac in this thread.

Fluid Voice — Offers a unique feature, a real-time view of your speech as you talk! Superwhisper has this, but only with an online model. (You can't see your entire transcript in Fluid, though. The recording window view is limited to about one sentence at a time--of course you do see everything when you complete your dictation.)

Handy — Pink and cute. I like the history window. As far as clipboard handling goes, I might note that the "don't modify clipboard" setting is more of a "restore clipboard" setting. Though it doesn't need as many permissions as Hex because it's willing to move clipboard items around a bit, if I'm not mistaken.

Note Hex seems to be upset about me installing all the others... lots of restarting in between installs all around. Each has something to offer.

---

Big shout out to Nvidia open-sourcing Parakeet--all of these apps are lightning fast.

Also I'm partial to being able to stream transcriptions to the cursor into any field, or at least view live like Fluid (or superwhisper online). I know it's complex b/c models transcribe the whole file for accuracy. (I'm OK with seeing a lower quality transcript realtime and waiting a second for the higher-quality version to paste at the end.)
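
To make the "low-quality live text, then swap in the better version" idea concrete, here is a minimal sketch of the swap step, assuming the app already typed a preview at the cursor and can send backspaces. The function name is hypothetical and none of this is taken from any of the apps above.

```python
def correction_keystrokes(preview, final):
    """Return (backspaces, text_to_type) that turns `preview` into `final`.

    Only the part after the longest common prefix is retyped, so a mostly
    correct preview needs just a few keystrokes to fix.
    """
    common = 0
    for a, b in zip(preview, final):
        if a != b:
            break
        common += 1
    return len(preview) - common, final[common:]
```

For example, going from the preview "hello word" to the final "hello world!" takes one backspace and then typing "ld!", instead of deleting and retyping the whole sentence.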

mncharity 3 days ago |

A cautionary user experience report. The default hotkey upon download is ctrl+space. Press to begin recording, release to transcribe and insert. Key-up on the space key constitutes hotkey release. If the ctrl key is still down when the insertion lands, the transcribed text is treated as ctrl characters. The test app was emacs. (x64 linux x11, with and without xdotool)
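
One possible mitigation, sketched under assumptions: hold the insertion until the modifier is actually up, with a timeout so it cannot hang. `modifier_is_down` is a hypothetical callable standing in for whatever key-state query the platform offers; this is not Handy's code.

```python
import time

def wait_for_release(modifier_is_down, timeout=1.0, poll_interval=0.01,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the modifier key is released.

    Returns True once `modifier_is_down()` reports False, or False if the
    key is still held when `timeout` seconds have passed.
    """
    deadline = clock() + timeout
    while modifier_is_down():
        if clock() >= deadline:
            return False
        sleep(poll_interval)
    return True
```

The caller would insert the text only when this returns True, and otherwise fall back to something safer such as leaving the transcript on the clipboard.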

PhilippGille 3 days ago |

Has anyone compared this with https://github.com/HeroTools/open-whispr already? From the description they seem very similar.

Handy first release was June 2025, OpenWhispr a month later. Handy has ~11k GitHub stars, OpenWhispr has ~730.

aucisson_masque 3 days ago |

It’s incredibly fast on my MacBook Air M1 and more accurate than the native speech-to-text.

The UI is well thought out, with just the right amount of settings for my usage.

Incredible!

Btw, do you know what “discharging the model” does? It’s set to never by default. I tried to check whether it has an impact on RAM or CPU, but it doesn’t seem to do anything.

peterldowns 3 days ago |

Huge fan! Parakeet v3 works great with it. I have used Monologue, Superwhisper, and Aqua, at various times in the past. But Handy is at least as good, and it's not an expensive subscription. I love that it runs locally, too. Strongly recommend!

Jack5500 3 days ago |

The Parakeet V3 model is really great!

Jayakumark 3 days ago |

It's great, I have been using it. Two requests though: 1. an iOS app, 2. an API option to use it for meeting transcription or to route audio from the mic.

holtwick 3 days ago |

FluidVoice for macOS is pretty handy as well. Open source under Apache License. https://altic.dev/fluid https://github.com/altic-dev/FluidVoice

llarsson 3 days ago |

A question because I'm not using speech-to-text, but find it intriguing (especially since it's now possible to do locally and for free).

How have your computing habits changed as a result of having this? When do you typically use this instead of typing on the keyboard?

dumbmrblah 3 days ago |

I just set this up today. I had Whispering app set up on my Windows computer, but it really wasn't working well on my Ubuntu computer that I just set up. I found Handy randomly. It was the last app I needed to go Linux full-time. Thank you!

unutranyholas 3 days ago |

https://hex.kitlangton.com/ is good

wi5eif6E 3 days ago |

This looks and works great! A settings option to keep no recording history at all would be terrific.

vladstudio 3 days ago |

Use it daily. Looks and works great.

erelong 3 days ago |

WhisperTux on linux worked ok, curious how Handy compares: https://github.com/cjams/whispertux

mrroryflint 3 days ago |

On an M4 MacBook Air, there was enough lag to make it unusable for me. I hit the shortcut and start speaking, but there was always a 1-2 second delay before it would actually start transcribing, even if the icon was displayed.

miniwark 3 days ago |

Does this thing (or open-whispr) work well with languages other than English?

walthamstow 3 days ago |

Nice. I spent most of Christmas vibe coding with Google Antigravity with one hand while holding a sleeping baby in the other. macOS built-in dictation is OK, but struggles with technical language.

qprofyeh 3 days ago |

As a Mac user, am I missing something? macOS has Dictation built-in, when you short press F5 it should start transcribing your spoken words into text in real time. It even does non-English languages.

mnmalst 3 days ago |

This is really cool. Works out of the box and I'm typing this using Handy.

Is there any way to execute commands directly on Linux?

Also a feature to edit or correct already typed text would be really great.

oybng 3 days ago |

On Windows this depends on WebView2, which the installer attempts to download. There is no mention of this requirement in the readme. It's a shame this software isn't portable.

chainmail2029 3 days ago |

There's a slightly awkward naming overlap with an existing product.

bn-usd-mistake 3 days ago |

Does anyone have a similar mobile application that works locally and is not too expensive? Mostly looking to transcribe voice messages sent over Signal, which does not offer this OOTB.

jborichevskiy 3 days ago |

Big Handy fan!

swordsith 2 days ago |

From the readme: 'Handy isn't trying to be the best speech-to-text app—it's trying to be the most forkable one.' Why can't we write a readme without using generative AI? Seriously, it's not that hard. :<

skor 3 days ago |

This is so handy, thank you very much. Good work!!

dotancohen 3 days ago |

Looks interesting. Why does it need a GUI at all?

ekjhgkejhgk 3 days ago |

Explain to me why a speech-to-text app has 50% of its code in TypeScript...?

fittingopposite 3 days ago |

Is there any good Android app featuring Parakeet v3?

Dnguyen 3 days ago |

Would be nice if the output can be piped directly into Claude Code.

laylower 3 days ago |

Is it deployed locally or does it send data to your servers?

blutoot 3 days ago |

Crashes on Tahoe 26.3 Beta 1 :(

sirjaz 3 days ago |

This is great, and I love that this is not another webapp
