S2 is here. Yeah, it's open-source.
Fish S2 is a big step beyond S1, redefining expressive voice AI. You can now write emotion cues anywhere in the text and hear the speech flow exactly as you direct it.
Craft long audio effortlessly with the new Story Studio. Add pauses, manage speakers, regenerate clips, and fine-tune every moment with precision.

Fish Audio S1 generates lifelike voices that capture emotion, rhythm, and nuance with remarkable realism.

S1-mini (0.5B) is now open-source, bringing emotion and tone control in a compact distilled model. It delivers the core capabilities of our flagship S1 (4B), which is available on Fish Audio.

v1.6 brings higher stability, emotion support, and improved multilingual performance, now available for instant use online.

Trained on 700k+ hours of audio data, Fish v1.4 is the most powerful open-source TTS model supporting more than 8 languages.

Open-source release of both the v1.2 pretrained and SFT models, with support for auto-reranking for stable generation.

Fish's first open-source TTS model release.