How to Use Inline Tags in Fish Audio S2

Mar 10, 2026

Fish Audio S2 supports inline tags: short natural-language cues, placed in square brackets anywhere in your text, that control how speech is delivered. This guide covers the supported tags, how to use them, and tips for getting the best results.


Basic Syntax

Place a tag in square brackets immediately before the word or phrase it should affect:

The door was open. [whispering] I didn't want to go inside.

Tags can be placed at any position in the text, and you can use multiple tags in a single generation.
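When preparing text in a script, it can help to see which tags a passage contains and what the spoken text looks like without them. The following is a minimal sketch, not part of any Fish Audio SDK: the helper names `find_tags` and `strip_tags` are hypothetical, and it assumes a tag is any non-empty run of text between a `[` and the next `]`.

```python
import re

# Assumption: a tag is any non-empty text between "[" and the next "]".
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def find_tags(text: str) -> list[tuple[int, str]]:
    """Return (character offset, tag text) for each bracketed tag, in order."""
    return [(m.start(), m.group(1).strip()) for m in TAG_PATTERN.finditer(text)]


def strip_tags(text: str) -> str:
    """Return only the spoken text, with tags removed and spacing collapsed."""
    without_tags = TAG_PATTERN.sub(" ", text)
    return re.sub(r"\s+", " ", without_tags).strip()


line = "The door was open. [whispering] I didn't want to go inside."
print(find_tags(line))   # [(19, 'whispering')]
print(strip_tags(line))  # The door was open. I didn't want to go inside.
```

The offsets make it easy to confirm that each tag sits directly before the text it is meant to affect.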


S2 accepts free-form natural-language tags, so you're not limited to a fixed list. That said, the tags below are well tested and produce consistently strong results. Use them as starting points, or write your own descriptions (e.g. [speaking slowly, almost hesitant]) for more specific control.

Breathing & Vocal Reactions

| Tag | Description |
| --- | --- |
| [clears throat] | Throat-clearing sound before speaking |
| [inhalation] / [inhale] | Audible breath in |
| [exhale] | Audible breath out |
| [sigh] | Expressive sigh |
| [panting] | Heavy, rapid breathing |
| [breathing] | General audible breathing |
| [gasp] | Sharp, sudden intake of breath |

Vocal Sounds

| Tag | Description |
| --- | --- |
| [groan] | Low sound of discomfort or exasperation |
| [moaning] | Extended vocal sound of pain or displeasure |
| [sobbing] | Crying with convulsive breaths |
| [crying] | Audible tears in voice |
| [laughing] | Full laughter |
| [chuckling] | Soft, quiet laughter |
| [giggle] | Light, high-pitched laughter |

Pacing

| Tag | Description |
| --- | --- |
| [pause] | Brief silence |
| [short pause] | Shorter beat |
| [long pause] | Extended silence for dramatic effect |

Voice Style

| Tag | Description |
| --- | --- |
| [whispering] / [whispering voice] | Hushed, breathy delivery |
| [soft voice] | Quiet and gentle |
| [low voice] | Deeper, lower-pitched register |
| [loud voice] | Raised volume |
| [shouting] | Full-volume yelling |

Emotion

| Tag | Description |
| --- | --- |
| [excited] | High energy, upbeat |
| [angry] | Harsh, forceful tone |
| [sad] | Heavy, downcast delivery |

Other

| Tag | Description |
| --- | --- |
| [emphasis] | Stress on the following word or phrase |
| [rustling sound] | Background rustling noise |
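Because free-form tags are allowed, a tag outside the tables is not an error, but it may be worth knowing which tags in a script fall outside the well-tested set so you can listen to those passages more carefully. A minimal sketch under that assumption follows; the set simply mirrors the tables above, and `free_form_tags` is a hypothetical helper, not a Fish Audio function.

```python
import re

# Tags copied from the tables above. Anything outside this set is still
# accepted by S2 as a free-form tag; it is just less thoroughly tested.
WELL_TESTED_TAGS = {
    "clears throat", "inhalation", "inhale", "exhale", "sigh", "panting",
    "breathing", "gasp",
    "groan", "moaning", "sobbing", "crying", "laughing", "chuckling", "giggle",
    "pause", "short pause", "long pause",
    "whispering", "whispering voice", "soft voice", "low voice", "loud voice",
    "shouting",
    "excited", "angry", "sad",
    "emphasis", "rustling sound",
}


def free_form_tags(text: str) -> list[str]:
    """Return the tags in text that are not in the well-tested set."""
    found = re.findall(r"\[([^\[\]]+)\]", text)
    return [t.strip() for t in found if t.strip().lower() not in WELL_TESTED_TAGS]


script = "[sigh] Fine. [speaking slowly, almost hesitant] I'll go."
print(free_form_tags(script))  # ['speaking slowly, almost hesitant']
```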

Placement

Tags affect what comes after them. Place the tag right before the point where you want the shift to happen.

Good — tag at the transition point:

I thought everything was fine. [whispering] Then I heard the noise.

Less effective — tag too early:

[whispering] I thought everything was fine. Then I heard the noise.

In this case the entire passage will be whispered, including the first sentence.
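If you assemble tagged text in code, the placement rule can be applied by inserting the tag directly before the phrase where the shift should begin. This is a small sketch; `tag_before` is a hypothetical helper name, and it assumes the phrase appears verbatim in the text.

```python
def tag_before(text: str, phrase: str, tag: str) -> str:
    """Insert [tag] immediately before the first occurrence of phrase."""
    index = text.find(phrase)
    if index == -1:
        raise ValueError(f"phrase not found: {phrase!r}")
    return f"{text[:index]}[{tag}] {text[index:]}"


passage = "I thought everything was fine. Then I heard the noise."

# Tag at the transition point: only the second sentence is affected.
print(tag_before(passage, "Then I heard", "whispering"))
# I thought everything was fine. [whispering] Then I heard the noise.
```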


Combining Tags

You can chain multiple tags across a passage to create shifts in delivery:

[soft voice] I wasn't sure what to say. [long pause] [loud voice] But then it hit me.

Vocal reaction tags can be placed between sentences for natural-sounding transitions:

That was the third time this week. [sigh] I really need to fix that.
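One way to keep chained tags manageable is to store a passage as a list of segments and join them just before generation. The sketch below assumes a simple structure of (tags, text) pairs; `build_passage` is a hypothetical helper and does not call any Fish Audio API.

```python
def build_passage(segments: list[tuple[list[str], str]]) -> str:
    """Join (tags, text) pairs into one tagged string.

    Each segment's tags are emitted in order, followed by its text.
    The text may be empty for a standalone pause or reaction.
    """
    parts: list[str] = []
    for tags, text in segments:
        parts.extend(f"[{tag}]" for tag in tags)
        if text:
            parts.append(text)
    return " ".join(parts)


segments = [
    (["soft voice"], "I wasn't sure what to say."),
    (["long pause", "loud voice"], "But then it hit me."),
]
print(build_passage(segments))
# [soft voice] I wasn't sure what to say. [long pause] [loud voice] But then it hit me.
```

Keeping tags separate from the text makes it easy to swap one delivery for another without retyping the sentence.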

Multi-Speaker Dialogue

S2 supports multi-speaker, multi-turn generation with per-speaker inline tag control. Multi-speaker support is coming soon to the Fish Audio playground and API.


Tips

Start simple. A single well-placed [whispering] or [sigh] can transform a passage. You don't need a tag on every sentence.

Use pauses for pacing. [pause] and [long pause] are among the most useful tags for making speech feel natural, especially before emotional shifts.

Let reactions carry emotion. Instead of relying on emotion tags alone, try combining them with reaction tags, for example: [sigh] [sad] I just don't know anymore. The sigh grounds the emotion physically.

Test and iterate. Different voices may respond to tags with varying intensity. If a tag feels too subtle, try reinforcing it with context in the surrounding text.
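To check whether a script follows the "start simple" advice, you could measure roughly how many tags it uses per sentence. The sketch below is only a heuristic: `tag_density` is a hypothetical helper, sentence splitting on `.`, `!`, and `?` is approximate, and no particular value is an official threshold.

```python
import re

TAG = re.compile(r"\[[^\[\]]+\]")


def tag_density(text: str) -> float:
    """Return the number of tags divided by the approximate sentence count."""
    tag_count = len(TAG.findall(text))
    spoken = TAG.sub(" ", text)
    # Rough sentence split: good enough for a quick check, not for linguistics.
    sentences = [s for s in re.split(r"[.!?]+", spoken) if s.strip()]
    return tag_count / max(len(sentences), 1)


print(tag_density("That was the third time this week. [sigh] I really need to fix that."))  # 0.5
print(tag_density("[sigh] [sad] I just don't know anymore."))  # 2.0
```

A high value across a long script may suggest that some tags can be removed before you start iterating on individual voices.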


Kyle Cui

Kyle is a Founding Engineer at Fish Audio and UC Berkeley Computer Scientist and Physicist. He builds scalable voice systems and grew Fish into the #1 global AI text-to-speech platform. Outside of startups, he has climbed 1345 trees so far around the Bay Area. Find his irresistibly clouty thoughts on X at @kile_sway.
