feat: audio native
louisjoecodes committed Dec 22, 2024
1 parent 814ee69 commit 88a6b38
Showing 3 changed files with 78 additions and 9 deletions.
10 changes: 10 additions & 0 deletions fern/assets/styles/globals.css
@@ -7,3 +7,13 @@
.fern-sidebar-icon {
display: none !important;
}

/* Hide first HR in changelog */
.fern-changelog > main > section + hr {
display: none;
}

/* Reduce padding on first changelog entry */
.fern-changelog > main > section.fern-changelog-entry {
padding-bottom: 0px;
}
2 changes: 2 additions & 0 deletions fern/docs.yml
@@ -141,6 +141,8 @@ js:
strategy: lazyOnload
- path: assets/scripts/audio-player.js
strategy: lazyOnload
- url: https://elevenlabs.io/player/audioNativeHelper.js
strategy: lazyOnload

analytics:
posthog:
75 changes: 66 additions & 9 deletions fern/docs/pages/guides/text-to-speech.mdx
@@ -5,13 +5,33 @@ subtitle: 'Learn how to turn text into human-like spoken audio with ElevenLabs.'

## Overview

ElevenLabs Text to Speech (TTS) API turns text into natural-sounding audio with human-like intonation, pacing, and contextual understanding. Our system supports many languages, multiple voice styles, and real-time streaming. Whether you’re creating audiobooks, localizing video content, or building an interactive voice application, ElevenLabs provides the tools and speed you need.
<div
id="elevenlabs-audionative-widget"
data-height="90"
data-width="100%"
data-frameborder="no"
data-scrolling="no"
data-publicuserid="b8723bd5176782a2716d531124554c58dace5eb6b7d831c2ceb7ce48a945c64f"
data-playerurl="https://elevenlabs.io/player/index.html"
>
Loading the{' '}
<a href="https://elevenlabs.io/text-to-speech" target="_blank" rel="noopener">
Elevenlabs Text to Speech
</a>{' '}
AudioNative Player...
</div>

<script src="https://elevenlabs.io/player/audioNativeHelper.js" type="text/javascript"></script>

ElevenLabs Text to Speech (TTS) API transforms text into lifelike audio with nuanced intonation, pacing and emotional awareness. [Our models](/docs/models) smoothly adapt to textual cues across 32 languages and multiple voice styles. Ideal for everything from global brand campaigns to immersive entertainment.

<elevenlabs-audio-player
audio-title="George"
audio-src="https://storage.googleapis.com/eleven-public-cdn/audio/marketing/george.mp3"
/>

Explore our [Voice Library](https://elevenlabs.io/community) to find the perfect voice for your project.

## Quickstart

The main TTS endpoint requires:
@@ -77,7 +97,7 @@ Below is an example showing how to make a basic request to create an MP3 file.
</Tab>
</Tabs>

Run your function, and you’ll have a brand-new MP3 file containing your AI voiceover.
Run your function, and you'll have a brand-new MP3 file containing your AI voiceover.

<CardGroup cols={2}>
<Card title="Audio Quality" icon="music">
@@ -92,7 +112,7 @@ Run your function, and you’ll have a brand-new MP3 file containing your AI voi
ElevenLabs supports more than 3,000 voices across 32 languages, including:
<ul>
<li>Default curated voices</li>
<li>Voice Library of shared Clones</li>
<li><a href="https://elevenlabs.io/community">Voice Library</a> of shared Clones</li>
<li>Instant Voice Cloning</li>
<li>Voice Design for new voices</li>
</ul>
@@ -101,7 +121,7 @@ Run your function, and you’ll have a brand-new MP3 file containing your AI voi

<CardGroup cols={2}>
<Card title="Streaming Real-Time Audio" icon="circle-play">
For interactive or time-sensitive applications, stream audio as it’s being generated. This reduces wait times, so users can hear the speech almost instantly.
For interactive or time-sensitive applications, stream audio as it's being generated. This reduces wait times, so users can hear the speech almost instantly.
</Card>

<Card title="Output Formats" icon="file-audio">
@@ -165,9 +185,46 @@ Run your function, and you’ll have a brand-new MP3 file containing your AI voi

## Supported languages

ElevenLabs TTS offers full or partial support for 32 languages:
• English • Spanish • French • Italian • Polish • Portuguese • Russian • German • Japanese • Korean • Chinese (simplified) • Hindi • Arabic, and more!
Choose a “multilingual” model family for the broadest coverage.
ElevenLabs TTS supports 32 languages with our multilingual models, enabling global reach and natural-sounding localization:

<CardGroup cols={3}>
<Card title="European">
• English
• French
• German
• Italian
• Spanish
• Portuguese
• Polish
• Dutch
• Swedish
• Norwegian
• Greek
• Romanian
• Hungarian
• Bulgarian
• Croatian
• Czech
• Danish
• Finnish
• Slovak
• Ukrainian
</Card>

{' '}

<Card title="Asian">
• Chinese • Japanese • Korean • Vietnamese • Tamil • Hindi • Filipino • Malay • Indonesian
</Card>

<Card title="Other">
• Arabic
• Russian
• Turkish
</Card>
</CardGroup>

Choose a "multilingual" model family (like `eleven_multilingual_v2`) for the broadest language coverage. All our AI voices can speak any of these languages, making it easy to create consistent brand voices across multiple territories.
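
As a rough illustration of selecting the multilingual model, the sketch below builds (but does not send) a request with `model_id` set to `eleven_multilingual_v2`. The base URL, the `xi-api-key` header, and the `text`/`model_id` body fields are assumptions about the public REST API rather than something this page shows; `build_tts_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumed base URL for the REST API; verify against the API reference.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(text, voice_id, api_key, model_id="eleven_multilingual_v2"):
    """Build (without sending) a POST request for the text-to-speech endpoint.

    Hypothetical helper: the endpoint path, header name, and body fields are
    assumptions and should be checked against the official API reference.
    """
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keeping construction separate makes it easy to inspect the payload before spending credits.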

## FAQ

@@ -176,7 +233,7 @@ Choose a “multilingual” model family for the broadest coverage.
Absolutely. Our voice settings include parameters like Stability, Similarity, and Style
Exaggeration. Keep stability high for steadier speech, lower it for more emotional variance.
</Accordion>
<Accordion title="Can I clone my own voice or a specific speaker’s voice?">
<Accordion title="Can I clone my own voice or a specific speaker's voice?">
Yes. Instant Voice Cloning quickly mimics another speaker from short clips. For high-fidelity
clones, check out our Professional Voice Clone with advanced training on larger datasets.
</Accordion>
@@ -192,7 +249,7 @@ Choose a “multilingual” model family for the broadest coverage.
The TTS engine can be nondeterministic. For consistency, use the optional seed parameter, though
subtle differences may still occur.
</Accordion>
<Accordion title="What’s the best practice for large text conversions?">
<Accordion title="What's the best practice for large text conversions?">
Split long text into segments and use streaming for real-time playback or partial saving to
disk. For smoother transitions, leverage "previous_text" or "previous_request_ids".
</Accordion>
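
The segmenting advice in the last answer can be sketched roughly as follows. This is a minimal illustration, not the documented approach: `split_into_segments` and `build_payloads` are hypothetical helpers, the 800-character limit is an arbitrary assumption, and the sentence splitting is a naive regex. Only the `previous_text` field name and the `eleven_multilingual_v2` model ID come from the page itself.

```python
import re


def split_into_segments(text, max_chars=800):
    """Greedily pack whole sentences into segments of at most max_chars.

    A single sentence longer than max_chars is kept whole rather than cut.
    The limit is an illustrative assumption, not a documented API limit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    segments, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            segments.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        segments.append(current)
    return segments


def build_payloads(text, model_id="eleven_multilingual_v2", max_chars=800):
    """Build one request body per segment, passing the prior segment as context."""
    payloads, previous = [], None
    for segment in split_into_segments(text, max_chars):
        payload = {"text": segment, "model_id": model_id}
        if previous is not None:
            # Give the model the preceding segment to smooth the transition.
            payload["previous_text"] = previous
        payloads.append(payload)
        previous = segment
    return payloads
```

Each payload would then be sent as its own request, with the resulting audio chunks concatenated or streamed in order.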