Is 10 seconds really enough for a good clone?

For capturing your basic voice identity, yes. Longer samples (30 seconds to a minute) can improve quality slightly, but the difference is smaller than you would expect. The Chatterbox model is specifically optimized for short-sample cloning. Most users find 10 seconds sufficient.

Can I create multiple voice clones?

Yes. You can clone as many voices as you want. Some creators clone different "versions" of their voice (calm and professional, energetic and casual) from different sample recordings and switch between them depending on the content.

Will the clone sound exactly like me?

Not exactly. It captures your voice signature (pitch, timbre, pace) but smooths out natural speech irregularities. Think of it as a polished version of your voice. Most people find it recognizably theirs while sounding slightly more consistent than their natural speech.

Can someone else clone my voice without my permission?

Technically, yes. Murmur does not verify identity before cloning a voice, much as a photo editor does not verify who is in a photo, so the ethical responsibility falls on the user. Murmur's local-only approach does mean your own voice data is not sitting on a cloud server where others could access it.

Does the cloned voice work with all of Murmur's models?

Voice cloning currently works through the Chatterbox model, which is optimized for this feature. The other models (Kokoro, Qwen3, Fish Audio) use their built-in voice libraries. Future updates may extend cloning support to additional models.

Can I export the voice model and use it elsewhere?

The voice embedding is stored in Murmur's internal format. It is not currently exportable as a standalone model file. This is partly a technical constraint and partly a privacy measure, preventing easy redistribution of cloned voices.

Clone Your Voice in 10 Seconds (No Cloud Upload)

How to create a personal AI narrator voice from a short recording, entirely on your Mac.

April 15, 2026 · 6 min read

Your Voice, Without the Microphone

Voice cloning used to require uploading minutes of audio to a cloud service, waiting for processing, and trusting a third party with one of your most personal biometric identifiers. That setup works, but it has always felt uncomfortable. Your voice is not a password you can reset if it leaks.

Murmur takes a different approach. Record 10 seconds of speech on your Mac, and the app builds a voice model locally using the Chatterbox engine. Your recording never leaves your machine. No upload, no cloud account, no server-side processing. The cloned voice lives on your Mac alongside the original sample, and you can delete both at any time.

The result is a synthetic voice that captures your tone, pitch, and speaking rhythm. It will not be a perfect replica. But for narration, content creation, and personal branding, it is remarkably close.

What You Need

A Mac with Apple Silicon (M1, M2, M3, or M4)
Murmur installed and set up (initial model download takes a few minutes)
A quiet room with minimal background noise
A microphone: your Mac's built-in one works fine, though an external mic produces better results
10 seconds of clear, natural speech

Recording Tips for Best Results

The quality of your 10-second sample directly affects the quality of the cloned voice. A clean recording with natural speech patterns produces the best results. Here is how to get the most from those 10 seconds.

Find a quiet space. Close windows, turn off fans, and move away from appliances. Background noise degrades cloning quality significantly.
Speak at your natural pace. Do not try to sound professional or like a broadcaster. The model captures your natural rhythm, so speak the way you normally would when explaining something to a friend.
Enunciate clearly without over-articulating. Hit your consonants but do not sound like you are doing a diction exercise.
Read a passage that includes varied sounds. A sentence like "The quick brown fox jumped over the lazy dog near the buzzing hive" covers more phonemes than a short greeting.
Keep a consistent distance from the microphone, roughly 6 to 12 inches. Moving closer or farther during the recording creates volume inconsistencies the model will reproduce.
Do one test recording first, listen back, and re-record if you hear room echo, mouth clicks, or background hum.
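
If you prefer numbers to ears for that test recording, the checks above can be roughly approximated in a few lines of Python. This is a minimal sketch, not part of Murmur: the function names, the 100 ms window, and the thresholds are all assumptions chosen for illustration, and it only reads 16-bit PCM WAV files using the standard library.

```python
import math
import struct
import wave


def window_rms(samples, size):
    """Return the RMS level of each consecutive full window of `size` samples."""
    levels = []
    for start in range(0, len(samples) - size + 1, size):
        chunk = samples[start:start + size]
        levels.append(math.sqrt(sum(s * s for s in chunk) / size))
    return levels


def check_sample(path, min_seconds=10.0, clip_level=0.99):
    """Summarize a 16-bit PCM WAV recording. Thresholds are illustrative."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        n_frames = wav.getnframes()
        raw = wav.readframes(n_frames)

    # Unpack little-endian signed 16-bit samples and scale them to [-1.0, 1.0).
    count = len(raw) // 2
    samples = [s / 32768.0 for s in struct.unpack("<%dh" % count, raw[:count * 2])]

    # 100 ms windows: the quietest one approximates the room's noise floor
    # (if the recording contains a pause), the loudest one the speech level.
    levels = window_rms(samples, max(1, int(rate * 0.1) * channels))

    duration = n_frames / rate
    peak = max((abs(s) for s in samples), default=0.0)
    return {
        "duration_s": duration,
        "long_enough": duration >= min_seconds,
        "peak": peak,
        "clipped": peak >= clip_level,
        "quietest_window_rms": min(levels, default=0.0),
        "loudest_window_rms": max(levels, default=0.0),
    }
```

Calling check_sample("my_sample.wav") on a test recording (the file name is a placeholder) returns a small dictionary. A peak near 1.0 suggests you were too close to the microphone, and a quietest window that is not far below the loudest one suggests steady background noise such as a fan or hum.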

The Cloning Process

Open Murmur and navigate to the voice library. Select "Clone Voice" and either record directly in the app or import an existing audio file. The app accepts WAV, MP3, and M4A formats. If your sample is longer than 10 seconds, Murmur uses the first 10 seconds.
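
To preview exactly which part of a longer recording would be used, you can cut the first 10 seconds yourself before importing. The sketch below is an illustration under stated assumptions, not Murmur's own code: the helper names are hypothetical, the extension list simply mirrors the three formats named above, and the trimming handles only uncompressed WAV because Python's standard library cannot decode MP3 or M4A.

```python
import wave
from pathlib import Path

# Mirrors the formats listed above; the real check inside the app is unknown.
ACCEPTED_SUFFIXES = {".wav", ".mp3", ".m4a"}


def is_accepted_format(path):
    """True if the file extension is one of WAV, MP3, or M4A."""
    return Path(path).suffix.lower() in ACCEPTED_SUFFIXES


def first_seconds(src_path, dst_path, seconds=10.0):
    """Copy at most the first `seconds` of a PCM WAV file to dst_path.

    Returns the number of audio frames written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        keep = min(src.getnframes(), int(seconds * src.getframerate()))
        frames = src.readframes(keep)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)  # the frame count in the header is fixed on close
        dst.writeframes(frames)
    return keep
```

Listening to the trimmed file tells you whether your best speech actually falls inside the first 10 seconds, or whether the recording opens with a breath, a pause, or a false start that is worth cutting.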

Processing takes 15 to 30 seconds on most Apple Silicon Macs. The Chatterbox model analyzes your voice characteristics and creates a voice embedding, a mathematical representation of your vocal identity. This embedding is stored locally in Murmur's voice library alongside the built-in voices.
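
If "a mathematical representation of your vocal identity" sounds abstract, it helps to picture the embedding as a fixed-length list of numbers. The toy example below is only an illustration of that general idea: the four-number vectors are invented, and the real size and layout of a Chatterbox embedding are not described here. It shows one common way to compare such vectors, cosine similarity, where two takes of the same voice should score closer to 1 than two different voices.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("a zero vector has no direction to compare")
    return dot / (norm_a * norm_b)


# Made-up embeddings: two recordings of one speaker, one of another speaker.
my_voice_take_1 = [0.9, 0.1, 0.4, 0.2]
my_voice_take_2 = [0.8, 0.2, 0.5, 0.1]
other_speaker = [0.1, 0.9, 0.2, 0.7]

print("same speaker:     ", cosine_similarity(my_voice_take_1, my_voice_take_2))
print("different speaker:", cosine_similarity(my_voice_take_1, other_speaker))
```

Real embeddings typically have far more dimensions, but the principle is the same: the model conditions its speech generation on this vector rather than replaying your recording.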

Once created, your cloned voice appears in the voice picker like any other voice. Type or paste text, select your voice, and generate. The output carries your tonal characteristics while being driven by the AI model's speech patterns. It sounds like you reading the text, not a generic voice with your pitch layered on top.

What to Expect (and What Not To)

Be realistic about what 10-second cloning can and cannot do. It captures your general voice quality: pitch range, timbre, speaking pace, and basic intonation patterns. It does not capture specific mannerisms, your laugh, regional micro-inflections, or the way you emphasize particular words.

The cloned voice works best for narration and informational content. It will sound like a version of you that is slightly smoother and more even than your actual speech. For most creator use cases (blog narration, course content, YouTube voiceover), this is actually a feature, not a limitation. Listeners get your voice identity without the ums, false starts, and inconsistencies of live recording.

Accents are generally preserved. If you have a British accent, the clone will sound British. If you speak with a Southern American drawl, that comes through. Strong accents tend to reproduce more faithfully, likely because they give the model more distinctive vocal features to latch onto.

Use Cases for Your Cloned Voice

Personal brand narration: your newsletter, blog, or YouTube channel sounds like you, even when you do not feel like recording.
Consistent course content: produce 20 hours of training material in your voice without 20 hours behind a microphone.
Audiobook narration: non-fiction authors can narrate their own books at a fraction of the time and effort of live recording.
Content repurposing: turn your written posts into audio versions that sound like you reading them aloud.
Prototyping: test how your script sounds in your voice before committing to a professional recording session.

The Privacy Advantage

Cloud voice cloning services require you to upload your voice sample to their servers. Once uploaded, your biometric data exists on infrastructure you do not control. Terms of service vary, and policies change. Some services use uploaded voice data to improve their models, meaning your voice characteristics become part of a training dataset.

With Murmur, the entire cloning pipeline runs on your Mac's own Apple Silicon hardware via MLX. Your voice sample is processed locally, the embedding is stored locally, and the generated audio stays local. There is no account, no upload, and no server that ever sees your voice data. If you delete the voice from Murmur, it is gone.

Frequently Asked Questions

Your voice. Your Mac. 10 seconds.

Clone your voice locally, no cloud upload required. Generate unlimited narration in your own voice for $49. One purchase, permanent access.

macOS 14+ · Apple Silicon required · 7-day refund policy