Is There an AI That Can Clone Famous Voices? What You Need to Know in 2026
Feb 22, 2026
AI can replicate aspects of a famous voice from a short audio clip, sometimes as little as 15 seconds. The technology has advanced to the point where high-quality clones can sound extremely convincing, and the raw material for famous voices is everywhere: interviews, movies, podcasts, and commencement speeches.
That's not the hard part. The hard part is that California, Tennessee, and the EU have all passed laws in the last 18 months that treat someone's voice as protected property or a personality right. Clone a celebrity without consent, use it in a video, and you're not just facing a YouTube takedown. You may be exposing yourself to serious legal liability.
Yes, the Tech Exists. No, It's Not That Simple.
The short answer: AI can clone virtually any voice it has enough audio data for, and famous voices have abundant publicly available recordings. Public speeches, interviews, movies, and podcasts. The raw material is everywhere.
Modern voice cloning models analyze pitch, timbre, rhythm, and speech patterns from audio samples as short as 10 to 15 seconds. They generate synthetic speech that captures the source speaker's unique vocal fingerprint. In controlled tests, listeners often struggle to distinguish between original and synthetic speech. 2025 industry reports described the technology as nearing an “indistinguishable threshold,” noting that natural intonation, pauses, and even breathing noise can now be convincingly reproduced.
That's the capability side. The permission side is where it gets complicated.
The Legal Line Most People Don't See Coming
Cloning a celebrity's voice without consent isn't just ethically questionable. In a growing number of jurisdictions, it can be unlawful, especially in commercial contexts.
In the U.S., right-of-publicity laws in states like California, New York, and Tennessee protect an individual's control over the commercial use of their voice. California's AB 1836, effective January 2025, extends this protection to deceased personalities, meaning you can't clone a late actor's voice for a commercial project without permission from their estate. Tennessee's ELVIS Act goes further, covering both actual recordings and AI-generated recreations.
At the federal level, the proposed NO FAKES Act would make it unlawful to create or distribute an AI-generated replica of anyone's voice or likeness without consent, with limited exceptions for satire, parody, and news reporting.
The EU's AI Act classifies certain voice cloning applications as high-risk, requiring transparency and strict safeguards. Denmark has amended its copyright-related protection to extend personality-style protections to voice likeness, with postmortem protections lasting decades.
Here's the bottom line: if you clone a famous person's voice and use it commercially, you're likely exposing yourself to civil liability, and potentially regulatory penalties. The widely reported 2024 dispute involving a voice that closely resembled Scarlett Johansson demonstrated how quickly legal and reputational risk can escalate. The backlash forced the company to withdraw the voice.
What People Actually Want (and How to Get It Legally)
When someone searches "AI that can clone famous voices," they're rarely attempting a malicious deepfake. Most often, they want one of three things:
A specific vocal quality. They want that deep, authoritative narrator tone for explainer videos. Or a warm, conversational style for a podcast intro. They're drawn to the sound profile, not the legal identity behind it.
A character voice for creative projects. Game developers need distinct NPC voices. Audiobook producers need a narrator who can sustain engagement across 10 hours of content. The goal is emotional range and vocal character, not impersonation of a real person.
Multilingual content in a consistent voice. Creators expanding globally want the same voice speaking Japanese, Spanish, and English naturally, without heavy accent artifacts. Celebrity voices often serve as a shorthand quality benchmark.
The good news: you don't need to clone a real celebrity to achieve these outcomes. AI voice platforms offer high-quality, legally safe alternatives, allowing you to select or design voices with similar tonal qualities without infringing on anyone’s rights.
2,000,000+ Voices, Zero Cease-and-Desist Letters
This is where the practical solution begins.
Fish Audio takes a different approach to the "famous voice" problem. Instead of encouraging users to clone existing public figures, the platform maintains a community voice library with over 200,000 voices spanning a range of tones, styles, ages, and accents. You'll find deep baritone narrators, energetic young presenters, calm meditation guides, and character voices ranging from grizzled villains to cheerful sidekicks.
The difference: every voice in the library is either user-contributed with consent or synthetically generated, meaning reduced right-of-publicity risks when used appropriately.
For creators seeking the specific vocal quality they admire in a famous voice, the library acts as a casting directory. Filter by language, gender, tone, and style. Preview samples. Select the one that fits your project. The whole process takes minutes, not hours or days.
When You Actually Need Your Own Voice (Cloned)
Sometimes the library isn't enough. You need your voice, or a voice you have explicit permission to use, speaking content you didn't record.
Fish Audio's voice cloning requires just 10 seconds of reference audio to generate a clone. That's less than the 60+ seconds many competitors require. The workflow is straightforward: upload a clean audio sample, allow the model to analyze it, and generate new speech within minutes.
What differentiates it from basic cloning tools is controllability. Fish Audio's S1 model accepts emotion tags such as "(excited)," "(whisper)," or "(nervous)" to adjust delivery per passage. A single cloned voice can sound professional in one paragraph and warm in the next, without requiring separate recording sessions.
That flexibility becomes critical in a long-form project. Monotone delivery reduces engagement. Emotional range sustains attention.
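The per-passage tagging described above can be sketched in a few lines. This is only an illustration: it assumes the tag goes in front of each passage as a parenthesized prefix, and `tag_passage` and `build_script` are hypothetical helper names, not part of any Fish Audio SDK. The three tags are the ones named in this article; the full set supported by the model may differ.

```python
from typing import Optional

# Tags named in this article; treat the list as an assumption, not the full set.
ALLOWED_TAGS = {"excited", "whisper", "nervous"}


def tag_passage(text: str, emotion: Optional[str] = None) -> str:
    """Prefix one passage with a parenthesized emotion tag (assumed placement)."""
    cleaned = text.strip()
    if emotion is None:
        return cleaned
    if emotion not in ALLOWED_TAGS:
        raise ValueError(f"unknown emotion tag: {emotion}")
    return f"({emotion}) {cleaned}"


def build_script(passages) -> str:
    """Join (text, emotion) pairs into one script, one passage per line."""
    return "\n".join(tag_passage(text, emotion) for text, emotion in passages)


script = build_script([
    ("Welcome back to the show.", None),
    ("We just hit one million listeners!", "excited"),
    ("And here is a secret.", "whisper"),
])
print(script)
```

Keeping the tags in a small allow-list catches typos before any text is sent for synthesis.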
The Multilingual Angle That Changes the Math
Here's where the gap between "cloning a famous voice" and "building a voice strategy" becomes clear.
Most famous voices are iconic in a single language. A well-known English narrator may not translate naturally into Japanese, Spanish, or Arabic.
Fish Audio currently supports 8 languages with natural cross-language performance. A voice cloned from English samples can speak Chinese or Japanese without the heavy accent artifacts common in other tools. In practical terms, this allows creators to maintain a consistent brand voice across markets without hiring separate voice actors for each region.
For content teams doing localization, that's a meaningful reduction in cost and time. Traditional multilingual voiceover for a 10-minute video across 5 languages typically runs $2,000 to $5,000 and takes 1 to 2 weeks. AI-powered multilingual TTS can compress that timeline to hours at a fraction of the cost.
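To make those figures easier to compare against your own budget, here is the same range expressed per finished minute, per language. It is a back-of-the-envelope sketch using only the numbers quoted above; `cost_per_finished_minute` is a hypothetical helper name.

```python
def cost_per_finished_minute(total_cost: float, minutes: float, languages: int) -> float:
    """Spread a total voiceover budget over every finished minute in every language."""
    return total_cost / (minutes * languages)


# Figures from the paragraph above: a 10-minute video in 5 languages.
low = cost_per_finished_minute(2_000, minutes=10, languages=5)
high = cost_per_finished_minute(5_000, minutes=10, languages=5)

print(f"${low:.0f} to ${high:.0f} per finished minute, per language")
# prints "$40 to $100 per finished minute, per language"
```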
What About Long-Form Content? Story Studio Fills the Gap.
Short clips and social media voiceovers are one thing. Producing a 6-hour audiobook or a full season of podcast episodes is another.
Fish Audio's Story Studio is designed for long-form production. It functions as a workbench where you can assign different voices to different characters, control pacing and emotion across chapters, and export files that meet ACX and Audible technical specifications.
For independent authors and small publishers who can't afford $3,000 to $10,000 per finished hour of professional narration, this shifts audiobook production from "someday" to "this quarter."
The emotion tag system is especially important in long-form content. A narrator who sounds identical on page 1 and page 300 risks losing listener engagement. Story Studio allows scene-by-scene tuning, similar to what professional audiobook directors do with human narrators, but without studio overhead.
The Ethical Playbook: How to Use Voice AI Without Crossing Lines
Voice cloning technology is powerful, and the temptation to replicate a famous voice is real. Sustainable creators and companies tend to follow a consistent set of practices:
| Practice | Why It Matters |
|---|---|
| Clone only voices you own or have written consent to use | Avoids right-of-publicity claims and potential fraud charges |
| Use voice libraries for "inspired by" vocal styles | Achieve desired quality without impersonation risk or legal exposure |
| Label AI-generated audio in published content | Builds trust and meets emerging transparency laws |
| Maintain consent documentation and audio provenance records | Protects against disputes or regulatory scrutiny |
The EU AI Act, China's AI content labeling rules (effective September 2025), and proposed U.S. legislation all point in the same direction: synthetic voices will require disclosure. Preparing for compliance now is significantly easier than retrofitting policies later.
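One lightweight way to keep the consent and provenance records from the table is to store a small record next to each reference clip. The sketch below is an assumption about what such a record might contain, not a legal standard or a Fish Audio feature; `ConsentRecord`, `make_record`, and `matches` are hypothetical names, and the sample values are placeholders.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConsentRecord:
    speaker_name: str
    consent_document: str  # e.g. filename of the signed release (placeholder)
    audio_sha256: str      # fingerprint of the exact reference clip
    permitted_use: str
    recorded_at: str       # ISO 8601 timestamp in UTC


def make_record(speaker_name: str, consent_document: str,
                audio_bytes: bytes, permitted_use: str) -> ConsentRecord:
    """Tie a consent document to one specific audio file via its SHA-256 hash."""
    digest = hashlib.sha256(audio_bytes).hexdigest()
    timestamp = datetime.now(timezone.utc).isoformat()
    return ConsentRecord(speaker_name, consent_document, digest,
                         permitted_use, timestamp)


def matches(record: ConsentRecord, audio_bytes: bytes) -> bool:
    """Check that a clip is the one the consent record was created for."""
    return hashlib.sha256(audio_bytes).hexdigest() == record.audio_sha256


# Placeholder bytes stand in for a real reference recording.
clip = b"\x00\x01placeholder-audio"
record = make_record("Jane Doe", "release-form.pdf", clip, "podcast narration")
print(json.dumps(asdict(record), indent=2))
```

Hashing the clip means a later dispute can be answered with a simple check: either the audio in question matches the recorded fingerprint or it does not.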
For Developers: the API Route
If you're building an app, game, or customer service system that needs voice generation at scale, Fish Audio's API offers millisecond-level latency with streaming support. That's fast enough for real-time conversational agents, in-game dialogue, and interactive voice response systems.
The API supports the same emotion tags and multilingual capabilities as the consumer product, reducing the need to integrate multiple providers. Pricing starts with a free tier and scales by usage.
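If you are evaluating a streaming TTS endpoint for real-time use, the number that matters most is usually time to first audio chunk, not total generation time. The sketch below shows one way to measure it from any iterator of audio bytes. It deliberately does not call Fish Audio's API: `fake_stream` is a hypothetical stand-in that you would replace with the chunk iterator from whichever client you use.

```python
import time
from typing import Iterable, Iterator, Optional, Tuple


def fake_stream(n_chunks: int = 5, delay_s: float = 0.01) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS response."""
    for i in range(n_chunks):
        time.sleep(delay_s)        # simulate network and synthesis delay
        yield bytes([i]) * 1024    # 1 KiB of dummy audio per chunk


def consume(stream: Iterable[bytes]) -> Tuple[bytes, Optional[float], float]:
    """Collect all chunks and record time-to-first-chunk and total time in seconds."""
    start = time.perf_counter()
    first_chunk_s: Optional[float] = None
    audio = bytearray()
    for chunk in stream:
        if first_chunk_s is None:
            first_chunk_s = time.perf_counter() - start
        audio.extend(chunk)
    total_s = time.perf_counter() - start
    return bytes(audio), first_chunk_s, total_s


audio, first_s, total_s = consume(fake_stream())
print(f"{len(audio)} bytes, first chunk after {first_s * 1000:.0f} ms, "
      f"finished after {total_s * 1000:.0f} ms")
```

In a real application you would start playback as soon as the first chunk arrives instead of buffering everything, which is what makes streaming useful for conversational agents.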
For context: Fish Audio's open-source model, Fish Speech V1.5, was ranked among the top 3 open-source voice models for 2026, achieving an ELO score of 1339 in independent TTS Arena evaluations. The commercial platform builds on that foundation by adding further performance optimization and enterprise support.
Conclusion
Can AI clone famous voices? Technically, yes. Legally and ethically, it's a different story: the regulatory environment is tightening rapidly.
The smarter play for creators, developers, and businesses is to shift the question from "can I clone this celebrity's voice?" to "can I find or build a voice that delivers the same impact?" With libraries of 2,000,000+ voices, 10-second voice cloning, emotion-controlled delivery, and multilingual output, the tools to do that already exist.
The voice you need does not have to be famous. It only needs to serve your project.
Start exploring at fish.audio, or dive into the API docs if you're building something more technical.

Kyle is a Founding Engineer at Fish Audio and UC Berkeley Computer Scientist and Physicist. He builds scalable voice systems and grew Fish into the #1 global AI text-to-speech platform. Outside of startups, he has climbed 1345 trees so far around the Bay Area. Find his irresistibly clouty thoughts on X at @kile_sway.
Read more from Kyle Cui >