You should use Open Whisper
open-source speech-to-text transcription software
Previously, I said you should use SuperWhisper. I should’ve said “you should use speech-to-text transcription software.” Simon Lermen has built Open Whisper, which I modded (code here) and tried out today.1
I’ve cancelled my SuperWhisper subscription and will use modded Open Whisper2. Here’s what pops up when you run ./start_desktop.sh:
Open Whisper is now functionally equivalent to SuperWhisper for me, while also being free and open-source, so we can continually add further features.
Speech-to-Text Programming Languages
I was curious about speech-to-text programming languages. It looks like these haven’t really taken off, because there are already control interfaces that let you write Python, JS, or any other language with your voice.
In Talon, you could say:
and out would come
def compute_stats()
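To give a rough sense of what sits between the spoken phrase and the emitted code, here is a minimal Python sketch of a snake_case formatter. This is not Talon's implementation or API; the names `to_snake_case` and `create_function` are hypothetical and exist only for this example.

```python
import re


def to_snake_case(phrase: str) -> str:
    """Lowercase a dictated phrase and join its words with underscores."""
    words = re.findall(r"[a-z0-9]+", phrase.lower())
    return "_".join(words)


def create_function(phrase: str) -> str:
    """Build a Python function header from a dictated name (hypothetical helper)."""
    return f"def {to_snake_case(phrase)}()"


print(create_function("compute stats"))
```

A real voice-coding setup layers a grammar on top of this, so that one spoken keyword picks the formatter and the rest of the utterance supplies the words to format.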
Of course, the hottest new programming language is English.
Coming soon:
The ‘Era of Experience’ lens on ‘dense reconstruction’
A review of Claude Computer Use, OpenAI Operator, DeepMind’s Project Mariner, and ByteBot, with a focus on speech-to-text-based continual execution (subject to budget / subscription constraints)5
Voice shell for personal computer use: it will feature commands for web search, personal database search, calendar scheduling, task management, etc. It will be API-enabled and better than Siri6
Incidentally, this is one of the coolest things that’s happened to me while blogging so far: blogging about my suboptimal stack has led to software I’m going to use being created, and I’ll save $10/month on the SuperWhisper subscription. Thanks so much, Simon! (And feel free to send me your Venmo).
Here are the modifications I made:
Full-length transcripts: I faced an issue where recordings would prematurely stop, producing recordings of e.g. ‘0.02s’. Now, the app waits for you to ‘Stop recording’, then transcribes your entire recording.
Automatic saving: Previously, you had to manually click ‘Save’ to save any transcript. Now, each transcript autosaves to the ‘transcripts’ folder, because no one has time for button-pushing.
Global hotkeys: Previously, Open Whisper had to be the active window for hotkeys to work. Now we have global hotkey functionality: you can hit Cmd+R anywhere to record, as long as Open Whisper is running in the background.
Transcription rate: I found optimizations that sped up transcription ~10x with negligible cost to transcription faithfulness.
Hotkey-UI integration: Sometimes, you’d see hotkey-triggered status updates in terminal but not UI; I fixed this.
General UI fixes: The backend used to finish transcribing but never clear the “Transcribing...” state in UI; I fixed this.
One transcript per recording: By default, multiple recordings would accumulate on the same transcript, which isn’t great for my use case. Now each recording gets a separate transcript by default.
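The autosave and one-transcript-per-recording behaviours could be sketched roughly as follows. This is a minimal illustration and not the code in the modded repo: the `save_transcript` helper and the timestamped filename scheme are assumptions, and only the ‘transcripts’ folder name comes from the description above.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def save_transcript(text: str, folder: Path = Path("transcripts"),
                    now: Optional[datetime] = None) -> Path:
    """Write one recording's transcript to its own timestamped file."""
    folder.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    path = folder / f"transcript-{stamp}.txt"
    # Avoid overwriting when two recordings finish within the same second.
    counter = 1
    while path.exists():
        path = folder / f"transcript-{stamp}-{counter}.txt"
        counter += 1
    path.write_text(text, encoding="utf-8")
    return path
```

Calling `save_transcript(text)` once per finished recording gives each recording a separate file with no ‘Save’ click involved.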
create function
snake_case
One can only hope blogging about this will deliver a custom build to my inbox.
Ditto ^




I've used Talon a lot and the platform it provides is pretty great. Very happy and grateful that it exists.
However, and I may be 1-2 years out of date on this, in my experience the documentation is terrible, and IMO the default talonhub/community command set isn't very good either.
Cursorless is very cool, but only if you don't get eyestrain. I had much better results after developing a custom set of editing commands for myself like the following:
lurk:
    key(ctrl-shift-left)
rug:
    key(ctrl-shift-right)
lark:
    key(alt-shift-left)
ran:
    key(alt-shift-right)
I also switched to a directional boom mic instead of the DPA that everyone in the community recommends.