Clone Your Voice in 10 Seconds (No Cloud Upload)
How to create a personal AI narrator voice from a short recording, entirely on your Mac.
Your Voice, Without the Microphone
Voice cloning used to require uploading minutes of audio to a cloud service, waiting for processing, and trusting a third party with one of your most personal biometric identifiers. That setup works, but it has always felt uncomfortable. Your voice is not a password you can reset if it leaks.
Murmur takes a different approach. Record 10 seconds of speech on your Mac, and the app builds a voice model locally using the Chatterbox engine. Your recording never leaves your machine. No upload, no cloud account, no server-side processing. The cloned voice lives on your Mac alongside the original sample, and you can delete both at any time.
The result is a synthetic voice that captures your tone, pitch, and speaking rhythm. It will not be a perfect replica. But for narration, content creation, and personal branding, it is remarkably close.
What You Need
- A Mac with Apple Silicon (M1, M2, M3, or M4)
- Murmur installed and set up (initial model download takes a few minutes)
- A quiet room with minimal background noise
- A microphone: your Mac's built-in one works fine, though an external mic produces better results
- 10 seconds of clear, natural speech
Recording Tips for Best Results
The quality of your 10-second sample directly affects the quality of the cloned voice. A clean recording with natural speech patterns produces the best results. Here is how to get the most from those 10 seconds.
- Find a quiet space. Close windows, turn off fans, and move away from appliances. Background noise degrades cloning quality significantly.
- Speak at your natural pace. Do not try to sound professional or like a broadcaster. The model captures your natural rhythm, so speak the way you normally would when explaining something to a friend.
- Enunciate clearly without over-articulating. Hit your consonants but do not sound like you are doing a diction exercise.
- Read a passage that includes varied sounds. A sentence like "The quick brown fox jumped over the lazy dog near the buzzing hive" covers more phonemes than a monotone greeting.
- Keep a consistent distance from the microphone, roughly 6 to 12 inches. Moving closer or farther during the recording creates volume inconsistencies the model will reproduce.
- Do one test recording first, listen back, and re-record if you hear room echo, mouth clicks, or background hum.
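If you export your test recording as a WAV file, a short script can flag the most obvious problems before you listen back. The sketch below is not part of Murmur. It assumes a 16-bit PCM WAV file, and the thresholds (peaks above -1 dBFS count as near clipping, an average level below -35 dBFS counts as too quiet) are illustrative assumptions rather than values the app uses. It cannot detect room echo or mouth clicks, so listening back is still necessary.

```python
import math
import struct
import wave


def analyze_wav(path):
    """Return duration (seconds), peak level and RMS level (dBFS) of a 16-bit PCM WAV."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        n_frames = wf.getnframes()
        rate = wf.getframerate()
        raw = wf.readframes(n_frames)

    count = len(raw) // 2  # number of 16-bit samples across all channels
    if count == 0:
        raise ValueError("file contains no audio")
    samples = struct.unpack("<%dh" % count, raw)

    # Normalize against full scale (32768) so levels come out in dBFS.
    peak = max(abs(s) for s in samples) / 32768.0
    rms = math.sqrt(sum(s * s for s in samples) / count) / 32768.0

    def to_db(x):
        return 20 * math.log10(x) if x > 0 else float("-inf")

    return {
        "duration_s": n_frames / rate,
        "peak_dbfs": to_db(peak),
        "rms_dbfs": to_db(rms),
    }


def check_sample(stats, min_seconds=10.0, clip_dbfs=-1.0, quiet_dbfs=-35.0):
    """Return a list of warnings. All thresholds are assumed, not Murmur's."""
    warnings = []
    if stats["duration_s"] < min_seconds:
        warnings.append("too short: record at least %.0f seconds" % min_seconds)
    if stats["peak_dbfs"] > clip_dbfs:
        warnings.append("near clipping: move back from the mic or lower input gain")
    if stats["rms_dbfs"] < quiet_dbfs:
        warnings.append("very quiet: move closer to the mic or raise input gain")
    return warnings
```

Calling `check_sample(analyze_wav("test_take.wav"))` on a hypothetical file returns an empty list when none of the three checks trips.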
The Cloning Process
Open Murmur and navigate to the voice library. Select "Clone Voice" and either record directly in the app or import an existing audio file. The app accepts WAV, MP3, and M4A formats. If your sample is longer than 10 seconds, Murmur uses the first 10 seconds.
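Because only the first 10 seconds of a longer file are used, it can be worth trimming an existing recording yourself so you know exactly which audio goes in. Here is a minimal sketch using Python's standard `wave` module. It handles uncompressed PCM WAV files only (MP3 and M4A would need a separate tool), and the function name `trim_wav` and the file names are hypothetical.

```python
import wave


def trim_wav(src_path, dst_path, max_seconds=10.0):
    """Copy at most the first `max_seconds` of a PCM WAV file to dst_path.

    Returns the duration in seconds of the audio that was written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames_to_keep = min(
            src.getnframes(), int(max_seconds * src.getframerate())
        )
        audio = src.readframes(frames_to_keep)

    with wave.open(dst_path, "wb") as dst:
        # Reuse the source format; the frame count in the header is
        # corrected automatically once the shorter audio is written.
        dst.setparams(params)
        dst.writeframes(audio)

    return frames_to_keep / params.framerate
```

For example, `trim_wav("interview.wav", "sample_10s.wav")` writes a file containing the opening 10 seconds. To use a later passage instead, cut that passage out in an audio editor first so it sits at the start of the file.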
Processing takes 15 to 30 seconds on most Apple Silicon Macs. The Chatterbox model analyzes your voice characteristics and creates a voice embedding, a mathematical representation of your vocal identity. This embedding is stored locally in Murmur's voice library alongside the built-in voices.
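To make "mathematical representation" concrete: a voice embedding is a list of numbers, and embeddings are commonly compared with cosine similarity, where values near 1 mean two vectors point in nearly the same direction. The sketch below illustrates the idea with made-up 4-dimensional vectors. Real speaker embeddings typically have hundreds of dimensions, and nothing here reflects the actual format or comparison method used by Chatterbox or Murmur.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors, invented purely for illustration.
your_voice = [0.9, 0.1, 0.4, 0.2]
your_voice_again = [0.8, 0.2, 0.5, 0.1]  # a second take by the same speaker
someone_else = [0.1, 0.9, 0.2, 0.7]

same = cosine_similarity(your_voice, your_voice_again)
different = cosine_similarity(your_voice, someone_else)
print("same speaker: %.2f, different speaker: %.2f" % (same, different))
```

In this toy data the two takes from the same speaker score much higher than the pair from different speakers, which is the property that lets a compact vector stand in for a vocal identity.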
Once created, your cloned voice appears in the voice picker like any other voice. Type or paste text, select your voice, and generate. The output carries your tonal characteristics while being driven by the AI model's speech patterns. It sounds like you reading the text, not a generic voice with your pitch layered on top.
What to Expect (and What Not To)
Be realistic about what 10-second cloning can and cannot do. It captures your general voice quality: pitch range, timbre, speaking pace, and basic intonation patterns. It does not capture specific mannerisms, your laugh, regional micro-inflections, or the way you emphasize particular words.
The cloned voice works best for narration and informational content. It will sound like a version of you that is slightly smoother and more even than your actual speech. For most creator use cases (blog narration, course content, YouTube voiceover), this is actually a feature, not a limitation. Listeners get your voice identity without the ums, false starts, and inconsistencies of live recording.
Accents are generally preserved. If you have a British accent, the clone will sound British. If you speak with a Southern American drawl, that comes through. Strong accents tend to reproduce more faithfully, likely because they give the model more distinctive vocal features to latch onto.
Use Cases for Your Cloned Voice
- Personal brand narration: your newsletter, blog, or YouTube channel sounds like you, even when you do not feel like recording.
- Consistent course content: produce 20 hours of training material in your voice without 20 hours behind a microphone.
- Audiobook narration: non-fiction authors can narrate their own books at a fraction of the time and effort of live recording.
- Content repurposing: turn your written posts into audio versions that sound like you reading them aloud.
- Prototyping: test how your script sounds in your voice before committing to a professional recording session.
The Privacy Advantage
Cloud voice cloning services require you to upload your voice sample to their servers. Once uploaded, your biometric data exists on infrastructure you do not control. Terms of service vary, and policies change. Some services use uploaded voice data to improve their models, meaning your voice characteristics become part of a training dataset.
With Murmur, the entire cloning pipeline runs on your Mac's Apple Silicon hardware via MLX. Your voice sample is processed locally, the embedding is stored locally, and the generated audio stays local. There is no account, no upload, and no server that ever sees your voice data. If you delete the voice from Murmur, it is gone.
Your voice. Your Mac. 10 seconds.
Clone your voice locally, no cloud upload required. Generate unlimited narration in your own voice for $49. One purchase, permanent access.
macOS 14+ · Apple Silicon required · 7-day refund policy