[{"data":1,"prerenderedAt":584},["ShallowReactive",2],{"blog-index-en":3},[4,205,345],{"id":5,"title":6,"body":7,"category":172,"date":173,"description":174,"draft":175,"extension":176,"faq":177,"image":190,"meta":191,"navigation":192,"path":193,"seo":194,"stem":195,"tags":196,"translationKey":202,"updated":203,"__hash__":204},"blogEn\u002Fen\u002Fblog\u002Fspeech-dock-vs-whisper-cpp.md","Speech Dock vs Whisper.cpp: Finished App or Bare Engine?",{"type":8,"value":9,"toc":162},"minimark",[10,14,17,22,25,28,32,39,42,67,70,74,77,101,104,108,131,134,138,141,145,148,151,154],[11,12,13],"p",{},"If you have been looking for local speech recognition, you have almost certainly come across Whisper.cpp. And you may have caught yourself thinking: \"Why pay for an app when there is a free engine?\" It is a fair question, but it hides a sleight of hand. Whisper.cpp and a finished dictation app live in different categories. Arguing over which one is \"better\" is a bit like arguing whether an engine or a car is better.",[11,15,16],{},"No made-up performance numbers ahead. Let us break down what is what, what you will have to do by hand, and how to pick the right option for your needs.",[18,19,21],"h2",{"id":20},"what-whispercpp-is-and-what-it-is-for","What Whisper.cpp Is and What It Is For",[11,23,24],{},"Whisper.cpp is a well-respected open-source project, an efficient implementation of speech recognition in C\u002FC++. It runs locally, with no cloud, and is nicely optimized for ordinary hardware. It is excellent engineering work, and its popularity is well earned.",[11,26,27],{},"But it is an engine. A library plus a command-line tool that take audio and produce text. There is one thing Whisper.cpp does beautifully: it recognizes speech. Everything else that turns recognition into convenient dictation sits outside its scope. 
And for an engine that is perfectly fine; that is the whole point.",[18,29,31],{"id":30},"an-engine-is-not-yet-a-dictation-tool","An Engine Is Not Yet a Dictation Tool",[11,33,34],{},[35,36],"img",{"alt":37,"src":38},"A recognition engine versus a finished dictation app: what each option includes","\u002Fblog\u002Finfographics\u002Fengine-vs-app.en.png",[11,40,41],{},"When you dictate in your day-to-day work, recognition is just one step out of many. For your voice to turn into text in the right field, several things have to work together:",[43,44,45,49,52,55,58,61,64],"ul",{},[46,47,48],"li",{},"capturing sound from the microphone in real time;",[46,50,51],{},"starting and stopping recording with a convenient hotkey from any app;",[46,53,54],{},"the speech recognition itself (this is where the engine does its job);",[46,56,57],{},"formatting the text: punctuation, a readable layout;",[46,59,60],{},"inserting the result into the active window, be it an editor, a messenger, or a browser;",[46,62,63],{},"a history of your recordings, so you can return to what you dictated;",[46,65,66],{},"managing language data and updates.",[11,68,69],{},"The engine covers one item on this list. A finished app covers them all and ties them into a single process you never have to think about.",[18,71,73],{"id":72},"what-you-will-have-to-build-yourself-on-top-of-the-engine","What You Will Have to Build Yourself on Top of the Engine",[11,75,76],{},"Building a dictation tool on top of Whisper.cpp is realistic, and as a learning project it is even worthwhile. But consider the scope.",[78,79,80,83,86,95,98],"ol",{},[46,81,82],{},"Audio capture and streaming. The engine does not listen to a live microphone on its own; you have to set that up.",[46,84,85],{},"Hotkeys and background mode. To dictate from any app, you need a global hotkey and a service running in the background.",[46,87,88,89,94],{},"Text insertion. 
This is where the ",[90,91,93],"a",{"href":92},"\u002Fen\u002Fblog\u002Flinux-voice-input","differences between X11 and Wayland on Linux"," show up: auto-paste, the clipboard, detecting the active window. All of that is on you to handle.",[46,96,97],{},"Interface and feedback. A settings window, a recording indicator, feedback to the user.",[46,99,100],{},"Model management and per-platform builds. Downloading language data, compiling from source, supporting updates.",[11,102,103],{},"Nothing impossible here. But this is already developing and maintaining your own tool, not \"install it and use it.\"",[18,105,107],{"id":106},"what-a-finished-app-gives-you","What a Finished App Gives You",[11,109,110,111,115,116,120,121,125,126,130],{},"Speech Dock takes all of that plumbing off your hands. You install the app for ",[90,112,114],{"href":113},"\u002Fen\u002Finstall\u002Flinux","Linux"," or ",[90,117,119],{"href":118},"\u002Fen\u002Finstall\u002Fmacos","macOS",", assign a hotkey, and dictate into any window. Recognition runs ",[90,122,124],{"href":123},"\u002Fen\u002Fblog\u002Foffline-speech-recognition","locally, with no cloud",", so your voice never leaves your device. Privacy is covered in detail on a dedicated ",[90,127,129],{"href":128},"\u002Fen\u002Fprivacy","page",".",[11,132,133],{},"What you end up with is not an engine you have to \"finish,\" but a ready-made workflow: record, format, insert, history. Out of the box and tuned to the quirks of your particular system.",[18,135,137],{"id":136},"what-we-deliberately-are-not-comparing-here","What We Deliberately Are Not Comparing Here",[11,139,140],{},"Let me be blunt: this article does not claim that one option is \"faster\" or \"more accurate\" than the other. Any such comparison depends on the specific hardware, language, settings, and use case. Without reproducible measurements on your own machine, it turns into marketing noise. 
I am not comparing numbers; I am comparing categories of tools and the amount of work that lands on you.",[18,142,144],{"id":143},"when-to-choose-which","When to Choose Which",[11,146,147],{},"The engine (Whisper.cpp) is worth taking if you are a developer building your own product, or you have an unusual use case where you need full control over every step, and you are ready to build and maintain all the surrounding parts yourself.",[11,149,150],{},"The finished app (Speech Dock) is the right fit if you need convenient, private dictation right now, without compiling from source and fiddling with window-handling details by hand, and if you would rather focus on your work than on your tool.",[11,152,153],{},"Both options respect your privacy through local processing. The whole difference is how much engineering work you are willing to take on.",[11,155,156,157,161],{},"So the question is not who is \"more accurate\" than whom. The question is what suits you better: a kit you still have to assemble and then keep going, or a finished tool that simply works. Whisper.cpp is excellent in exactly its role as an engine, and it makes sense to sit down with it when you want full control and have the time to spare. 
But if control is not the goal and you just need dictation here and now, ",[90,158,160],{"href":159},"\u002Fen\u002Fdownload","download Speech Dock"," and dictate your first note today.",{"title":163,"searchDepth":164,"depth":164,"links":165},"",2,[166,167,168,169,170,171],{"id":20,"depth":164,"text":21},{"id":30,"depth":164,"text":31},{"id":72,"depth":164,"text":73},{"id":106,"depth":164,"text":107},{"id":136,"depth":164,"text":137},{"id":143,"depth":164,"text":144},"Comparison","2026-06-15","How a ready-made dictation app differs from a low-level engine like Whisper.cpp, what you would have to build yourself, and how to pick the right option for your needs.",false,"md",[178,181,184,187],{"question":179,"answer":180},"Is Speech Dock a wrapper around Whisper.cpp?","No, they belong to different categories of tools. Whisper.cpp is a low-level recognition engine that a developer embeds into their own solution. Speech Dock is a finished desktop dictation app with its own interface, hotkeys, text insertion, and history.",{"question":182,"answer":183},"Can I build my own dictation tool on top of Whisper.cpp?","Yes, and plenty of people do. But the engine only turns audio into text. For everyday dictation you still have to build everything around it yourself: microphone capture, hotkeys, inserting text into the active window, managing language data, and the interface. It is doable, but it is a project, not an app install.",{"question":185,"answer":186},"What should I choose if I am not a programmer?","If you need dictation here and now, without compiling and configuring from source, go with the finished app. The engine makes sense when you are building your own product or have an unusual use case that demands full control.",{"question":188,"answer":189},"Do both options work offline?","Yes, local speech recognition is possible either way. 
The difference is not \"cloud versus device\" but how much of the work to turn an engine into a convenient tool falls on you.","\u002Fog\u002Fblog\u002Fen\u002Fspeech-dock-vs-whisper-cpp.png",{},true,"\u002Fen\u002Fblog\u002Fspeech-dock-vs-whisper-cpp",{"title":6,"description":174},"en\u002Fblog\u002Fspeech-dock-vs-whisper-cpp",[197,198,199,200,201],"Whisper.cpp","speech recognition","comparison","offline","dictation","speech-dock-vs-whisper-cpp",null,"s5NUsGD1c2F0YlFkhxM1_eo2lnH9JLH2lQMLKKbKWOs",{"id":206,"title":207,"body":208,"category":322,"date":173,"description":323,"draft":175,"extension":176,"faq":324,"image":337,"meta":338,"navigation":192,"path":123,"seo":339,"stem":340,"tags":341,"translationKey":343,"updated":203,"__hash__":344},"blogEn\u002Fen\u002Fblog\u002Foffline-speech-recognition.md","Offline Speech Recognition: What Works Locally and Where the Limits Are",{"type":8,"value":209,"toc":315},[210,213,216,220,223,226,238,242,248,251,254,257,260,264,267,270,274,277,298,308,312],[11,211,212],{},"When you dictate a note to your phone or tap \"voice input\" in your browser, something happens that people rarely think about. Your voice travels to someone else's server, gets turned into text there, and comes back. Convenient. Right up until you find yourself with no connection, inside an environment with strict data rules, or simply stop to wonder where the recording of your voice now lives and who can access it.",[11,214,215],{},"Offline speech recognition works differently. The entire path from microphone to finished text runs right on your own computer. Below we'll break down what that means in practice and where the real boundary of \"local\" lies, because that boundary is often not where the marketing draws it.",[18,217,219],{"id":218},"what-offline-really-means","What \"Offline\" Really Means",[11,221,222],{},"\"Offline\" isn't about an app that has forgotten how to update. 
It's about where your voice gets processed.",[11,224,225],{},"Compare two paths. In the cloud version, audio from your microphone goes to the service's server, gets recognized there, and comes back to you as text. Without a connection nothing works, and you have no control over what happens to the recording on the other end. In the local version, recognition runs right on the device, the text is ready instantly, and the connection isn't needed for the dictation itself at all.",[11,227,228,229,232,233,237],{},"That leads to the key point: privacy comes by default. Not because someone promised \"we won't store your data,\" but because there's simply nowhere to send it. We describe how this works in Speech Dock in more detail on our ",[90,230,231],{"href":128},"privacy"," and ",[90,234,236],{"href":235},"\u002Fen\u002Fsecurity","security"," pages.",[18,239,241],{"id":240},"where-the-boundary-of-local-processing-lies","Where the Boundary of Local Processing Lies",[11,243,244],{},[35,245],{"alt":246,"src":247},"What stays on your device versus what needs a connection: local processing against the cloud","\u002Fblog\u002Finfographics\u002Fcloud-vs-local.en.png",[11,249,250],{},"The honest answer: not everything in an app has to work offline, and that's fine. The only question is what stays on the device and what occasionally needs a connection.",[11,252,253],{},"Everything that touches your voice and your text always stays local. That means capturing audio from the microphone, converting speech to text itself, the further handling of the finished text (adding punctuation, formatting), and the history of your recordings. None of it leaves your device.",[11,255,256],{},"A connection may be needed only for things that have nothing to do with the content of your recordings: the first install of the app and downloading language data, checking for updates, and activating a paid license.",[11,258,259],{},"As you can see, the boundary runs exactly along the content. 
Downloading the app over the internet is a one-time thing: you install it once and forget about it. But after installation, your voice and your transcripts never go anywhere, so you can dictate completely offline.",[18,261,263],{"id":262},"what-ordinary-services-do-with-your-recording","What Ordinary Services Do With Your Recording",[11,265,266],{},"To a cloud service, your voice is input data for someone else's infrastructure. And even when the service is well-intentioned, a few questions remain that you have no guaranteed answer to. How long is the recording and its transcript kept? Is your voice used to train someone else's systems? Who has access to the data, and in what jurisdiction do the servers sit?",[11,268,269],{},"Local processing removes these questions all at once. The data never leaves the device, so there's nothing to answer for. For personal notes that's simply pleasant. But for work documents, client correspondence, or any sensitive information, it's often a hard requirement, without which the tool can't be put to use at all.",[18,271,273],{"id":272},"how-to-choose-an-offline-solution-what-to-look-for","How to Choose an Offline Solution: What to Look For",[11,275,276],{},"Not every app that calls itself \"local\" actually keeps your voice with you. A word in a description costs nothing, so it's easier to check for yourself. Here's what I'd look at.",[78,278,279,282,285,292,295],{},[46,280,281],{},"Does dictation work without a connection. The most honest test: turn off the internet and try to dictate some text. If recognition keeps working, the processing really does run on the device.",[46,283,284],{},"Where the text goes. A good desktop solution sends the recognized text straight into the active window (editor, messenger, browser) instead of making you copy it by hand out of its own little box.",[46,286,287,288,232,290,130],{},"Support for your platform. Check that the app runs natively on your system, not through some shim layer. 
Speech Dock, for example, is built for ",[90,289,114],{"href":113},[90,291,119],{"href":118},[46,293,294],{},"What happens to the history. It's worth confirming whether the history of your recordings is stored on the device and whether you can delete it whenever you want.",[46,296,297],{},"Transparency about the network. It's fine for an app to go online for updates and activation. Sending your audio there is not. These two things matter to tell apart, and they're often deliberately blurred together.",[11,299,300,301,304,305,130],{},"If the desktop scenario on Linux, with its zoo of windowing systems, is exactly what you care about, there's a separate deep dive: ",[90,302,303],{"href":92},"voice input on Linux: X11, Wayland, and the workflow",". And if you're choosing between a ready-made app and building your own solution on a low-level engine, there's an article for that: ",[90,306,307],{"href":193},"Speech Dock or Whisper.cpp",[18,309,311],{"id":310},"in-short","In Short",[11,313,314],{},"So \"offline\" here isn't a pretty word on a landing page but something you can actually verify: turn off the network, dictate a paragraph, and it either works or it doesn't. If privacy is not a nice bonus for you but the condition under which the tool can be used at all, local processing is the most direct way to get it. Let the rest (updates, the license) go online as it pleases; it has nothing to do with your voice.",{"title":163,"searchDepth":164,"depth":164,"links":316},[317,318,319,320,321],{"id":218,"depth":164,"text":219},{"id":240,"depth":164,"text":241},{"id":262,"depth":164,"text":263},{"id":272,"depth":164,"text":273},{"id":310,"depth":164,"text":311},"Basics","What offline speech-to-text actually is, which tasks it handles right on your device without the internet, and where the real boundary of local processing lies.",[325,328,331,334],{"question":326,"answer":327},"Does offline speech recognition work completely without the internet?","Yes. 
Turning your voice into text happens on your own device, so you can dictate on a plane, on the road, or inside a locked-down network with no connection. You need the internet only once: to download the app and its language data.",{"question":329,"answer":330},"How is offline recognition different from voice input in a browser or on a phone?","Built-in voice input usually sends your audio to the service's servers and converts it to text there. With local processing, the recording never leaves your device. It is a different approach to privacy, not just a different button.",{"question":332,"answer":333},"Is local recognition noticeably worse in quality?","Modern local solutions handle dictation, notes, and messages in everyday language just fine. The difference is usually not \"cloud versus device\" but how well the app is tuned to your speech, your language, and your scenario.",{"question":335,"answer":336},"What exactly stays on the device?","Your microphone audio, the interim and final text, and the history of your recordings. None of it needs to be sent to an external server to get a result.","\u002Fog\u002Fblog\u002Fen\u002Foffline-speech-recognition.png",{},{"title":207,"description":323},"en\u002Fblog\u002Foffline-speech-recognition",[200,198,231,342],"voice input","offline-speech-recognition","cpcnoRIbFAaEljEEymmU87CxhbsFvUfcIb0kY9Q2D78",{"id":346,"title":347,"body":348,"category":114,"date":173,"description":561,"draft":175,"extension":176,"faq":562,"image":575,"meta":576,"navigation":192,"path":92,"seo":577,"stem":578,"tags":579,"translationKey":582,"updated":203,"__hash__":583},"blogEn\u002Fen\u002Fblog\u002Flinux-voice-input.md","Voice Input on Linux: X11, Wayland, and a Workflow That Sticks",{"type":8,"value":349,"toc":552},[350,353,356,360,363,366,369,375,379,385,388,391,394,398,401,404,420,457,467,470,473,477,480,483,486,490,493,507,510,514,537,539,545,548],[11,351,352],{},"There is a funny paradox with speech recognition on Linux. 
The recognition engine itself stopped being a problem long ago, and local solutions work great. But \"just dictate text into any window\" turns out to be a surprisingly fiddly task. And it has nothing to do with recognition quality; it is all about how the Linux desktop is built in the first place.",[11,354,355],{},"Let's unpack why, how X11 and Wayland differ for dictation, and how to set up a workflow that genuinely saves you time.",[18,357,359],{"id":358},"why-voice-input-on-linux-is-its-own-story","Why Voice Input on Linux Is Its Own Story",[11,361,362],{},"On Windows and macOS, you can paste text into the active app through a single system API, and that question was settled years ago. On Linux, the desktop is fragmented. There are two windowing systems (the old X11 and the new Wayland), several desktop environments and compositors (GNOME, KDE, Sway, Hyprland, and others), and each one handles \"input emulation\" in its own way.",[11,364,365],{},"For voice input, that means the task splits into two independent parts.",[11,367,368],{},"The first part, recognizing speech and turning your voice into text, is local and does not depend on the windowing system. The second part, delivering the finished text into the right window and inserting it where the cursor sits, is exactly where the differences between X11 and Wayland begin.",[11,370,371,372,374],{},"In Speech Dock, the first part is fully ",[90,373,200],{"href":123},", so your voice never leaves the device. The second part depends on your environment, and it is worth understanding.",[18,376,378],{"id":377},"x11-vs-wayland-whats-different-for-dictation","X11 vs. Wayland: What's Different for Dictation",[11,380,381],{},[35,382],{"alt":383,"src":384},"X11 vs Wayland for voice input on Linux: active-window capture, text auto-paste, global hotkey","\u002Fblog\u002Finfographics\u002Fx11-vs-wayland.en.png",[11,386,387],{},"X11 is an old but still widely used windowing system. 
Its design is permissive: one app can happily \"press keys\" on behalf of the user and see which window is currently active. For voice input, that's a gift. Auto-pasting text and detecting the active window work with no extra setup at all.",[11,389,390],{},"Wayland is the modern replacement for X11, designed with a strong emphasis on security and app isolation. Those same principles are exactly what make auto-paste harder. By default, an app cannot simply emulate the keyboard in another window or peek at which window is active. This is not a bug but a deliberate architectural choice: a window should not know what its neighbor is doing.",[11,392,393],{},"So on Wayland you'll have to configure a few conveniences by hand, more on that below. In return, you get a far stricter security model across the entire desktop.",[18,395,397],{"id":396},"auto-paste-how-it-works","Auto-Paste: How It Works",[11,399,400],{},"Auto-paste is when recognized text appears right where the cursor sits, with no manual Ctrl+V. How exactly it's done depends on the windowing system.",[11,402,403],{},"On X11, everything works right after installation. You dictate, the text shows up in the active field, end of story.",[11,405,406,407,411,412,415,416,419],{},"On Wayland, you'll need the ",[408,409,410],"code",{},"ydotool"," system service with the ",[408,413,414],{},"ydotoold"," daemon running. It gives the app a channel for input emulation through ",[408,417,418],{},"\u002Fdev\u002Fuinput",". 
The setup is a one-time step:",[421,422,426],"pre",{"className":423,"code":424,"language":425,"meta":163,"style":163},"language-bash shiki shiki-themes github-light github-dark","# enable and start the auto-paste daemon\nsystemctl --user enable --now ydotoold\n","bash",[408,427,428,437],{"__ignoreMap":163},[429,430,433],"span",{"class":431,"line":432},"line",1,[429,434,436],{"class":435},"sJ8bj","# enable and start the auto-paste daemon\n",[429,438,439,443,447,451,454],{"class":431,"line":164},[429,440,442],{"class":441},"sScJk","systemctl",[429,444,446],{"class":445},"sj4cs"," --user",[429,448,450],{"class":449},"sZZnC"," enable",[429,452,453],{"class":445}," --now",[429,455,456],{"class":449}," ydotoold\n",[11,458,459,460,462,463,466],{},"On top of that, your user needs access to ",[408,461,418],{},". This is usually granted by adding the user to the ",[408,464,465],{},"input"," group.",[11,468,469],{},"And what if the daemon isn't set up? Nothing bad happens. The recognized text is automatically copied to the clipboard, and you paste it manually with your usual shortcut. Dictation works either way; the daemon only automates the very last step.",[11,471,472],{},"There's one more Wayland quirk: it has no public way to find out which window is currently active. So before dictating, just make sure once that the right app is in focus, and the text will go exactly where you want it.",[18,474,476],{"id":475},"on-screen-recording-indicator","On-Screen Recording Indicator",[11,478,479],{},"When you're dictating, it helps to see that recording is actually happening. Speech Dock shows a compact indicator pill. Its behavior, as you might have guessed, also depends on the environment.",[11,481,482],{},"On Sway, Hyprland, and recent versions of KDE Plasma, it's a full floating indicator on top of the windows. GNOME, however, doesn't implement the windowing protocol it needs, so the pill is simplified there. 
This has no effect on dictation or text pasting itself; only the looks take a hit.",[11,484,485],{},"It's a good example of how the same feature behaves completely differently across various Linux desktops. And it also explains why a ready-made app that has already sorted out all these differences saves you a ton of time.",[18,487,489],{"id":488},"a-practical-workflow","A Practical Workflow",[11,491,492],{},"Here's what comfortable dictation looks like in everyday work:",[78,494,495,498,501,504],{},[46,496,497],{},"Set a global hotkey that starts and stops recording from any app. No need to switch to a separate window.",[46,499,500],{},"Put the cursor where you want the text to go. On Wayland, also make sure the right window is in focus.",[46,502,503],{},"Dictate. Speak naturally; the app recognizes your speech locally and formats the text for you.",[46,505,506],{},"The text lands in place. On X11 and on a configured Wayland, it pastes itself; otherwise it's already waiting for you on the clipboard.",[11,508,509],{},"This workflow works equally well for a quick chat message and for a long note or an email draft. The only difference is how much you've said.",[18,511,513],{"id":512},"what-to-check-before-you-start","What to Check Before You Start",[43,515,516,523,526,534],{},[46,517,518,519,522],{},"Whether the current build for your system is installed. The step-by-step ",[90,520,521],{"href":113},"Linux installation guide"," covers .deb, AppImage, and popular distributions.",[46,524,525],{},"Which windowing system you're on, X11 or Wayland. 
This affects only auto-paste, not recognition.",[46,527,528,529,531,532,130],{},"If you want automatic pasting on Wayland: whether ",[408,530,414],{}," is configured and you have access to ",[408,533,418],{},[46,535,536],{},"Whether you've set a convenient hotkey to start and stop recording.",[18,538,311],{"id":310},[11,540,541,542,544],{},"The whole trick with voice input on Linux is that speech is recognized the same way everywhere, but text reaches the target window in different ways. On X11 it pastes itself, with zero configuration. On Wayland you'll have to make friends with ",[408,543,410],{}," once, and if you can't be bothered, the text still won't be lost, it'll be waiting on the clipboard.",[11,546,547],{},"So you can safely skip the scary stories about \"voice input not working on Linux.\" It works. You just need to understand once which windowing system you're on and tweak a couple of small things for it. After that, you just dictate and stop thinking about it.",[549,550,551],"style",{},"html pre.shiki code .sJ8bj, html code.shiki .sJ8bj{--shiki-default:#6A737D;--shiki-dark:#6A737D}html pre.shiki code .sScJk, html code.shiki .sScJk{--shiki-default:#6F42C1;--shiki-dark:#B392F0}html pre.shiki code .sj4cs, html code.shiki .sj4cs{--shiki-default:#005CC5;--shiki-dark:#79B8FF}html pre.shiki code .sZZnC, html code.shiki .sZZnC{--shiki-default:#032F62;--shiki-dark:#9ECBFF}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: 
var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}",{"title":163,"searchDepth":164,"depth":164,"links":553},[554,555,556,557,558,559,560],{"id":358,"depth":164,"text":359},{"id":377,"depth":164,"text":378},{"id":396,"depth":164,"text":397},{"id":475,"depth":164,"text":476},{"id":488,"depth":164,"text":489},{"id":512,"depth":164,"text":513},{"id":310,"depth":164,"text":311},"Why voice input on Linux is trickier than it looks: how X11 and Wayland differ for dictation, how auto-paste works, and how to build a desktop workflow that saves time.",[563,566,569,572],{"question":564,"answer":565},"Does voice input on Linux work on both X11 and Wayland?","Yes. Speech recognition itself does not depend on the windowing system. The differences only show up at the moment the finished text is pasted into the active window: on X11 this works out of the box, while on Wayland you sometimes need an extra system component.",{"question":567,"answer":568},"Why doesn't text paste automatically on Wayland?","For security reasons, Wayland restricts software input emulation. For an app to paste text into another window on its own, you need the ydotool service with the ydotoold daemon running and access to \u002Fdev\u002Fuinput. Without it, the text still lands on the clipboard, so you can paste it manually.",{"question":570,"answer":571},"Do I need to be in the input group?","For auto-paste on Wayland via ydotool, your user needs access to \u002Fdev\u002Fuinput, which is usually granted by adding the user to the input group. This is a one-time setup.",{"question":573,"answer":574},"Which desktop environments support everything fully?","A full on-screen recording indicator works on Sway, Hyprland, and recent versions of KDE Plasma. 
On GNOME, some window features are limited by the environment itself, so the indicator is simplified, but dictation and text pasting still work.","\u002Fog\u002Fblog\u002Fen\u002Flinux-voice-input.png",{},{"title":347,"description":561},"en\u002Fblog\u002Flinux-voice-input",[114,580,581,342,201],"Wayland","X11","linux-voice-input","pFxQPpO4MgQZ2_wGotPMURb4-9QfKYKmqVQW2-W9790",1782298628017]