XTTS: Open Model Release Announcement
Coqui is more than proud to announce the release of XTTS, the first generative text-to-speech foundation model that is both open and production-quality.
XTTS is the culmination of years of work by the Coqui team, and it outperforms both open and closed models across a broad range of tasks. For example:
- Quality - XTTS generates speech that meets and exceeds production-quality requirements.
- Multilingual - XTTS generates speech in 13 different languages (Arabic, Brazilian Portuguese, Chinese, Czech, Dutch, English, French, German, Italian, Polish, Russian, Spanish, and Turkish). More to come!
- Voice Cloning - XTTS clones any voice using only a small sample of the original voice. For example, you give a voice sample in German and you can create a clone that sounds like the original voice speaking German.
- Cross-Language Voice Cloning - XTTS clones across languages. For example, you give a voice sample in German and you can create a clone that sounds like the original voice speaking any of the other languages (Arabic, Brazilian Portuguese, Chinese, Czech, Dutch, English, French, Italian, Polish, Russian, Spanish, Turkish).
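To make the cloning workflow above a little more concrete, here is a minimal sketch in Python. It assumes the open-source Coqui `TTS` package is installed and that the model is published under the name `tts_models/multilingual/multi-dataset/xtts_v1`. The model name, the language codes, and the helper names `cross_language_targets` and `clone_voice` are illustrative assumptions that do not come from this announcement, so check the package documentation before relying on them.

```python
# Languages named in this announcement, mapped to language codes.
# The codes themselves are an assumption, not part of the announcement.
XTTS_LANGUAGES = {
    "Arabic": "ar",
    "Brazilian Portuguese": "pt",
    "Chinese": "zh-cn",
    "Czech": "cs",
    "Dutch": "nl",
    "English": "en",
    "French": "fr",
    "German": "de",
    "Italian": "it",
    "Polish": "pl",
    "Russian": "ru",
    "Spanish": "es",
    "Turkish": "tr",
}

# Assumed model identifier; verify it against the TTS package's model list.
ASSUMED_MODEL_NAME = "tts_models/multilingual/multi-dataset/xtts_v1"


def cross_language_targets(sample_language):
    """Return the languages a voice sample in `sample_language` can be cloned into,
    excluding the sample's own language."""
    if sample_language not in XTTS_LANGUAGES:
        raise ValueError(f"Unsupported language: {sample_language}")
    return sorted(name for name in XTTS_LANGUAGES if name != sample_language)


def clone_voice(text, speaker_wav, language, out_path, model_name=ASSUMED_MODEL_NAME):
    """Speak `text` in `language` using the voice from the `speaker_wav` sample.

    Hypothetical helper: the import is deferred so the rest of this sketch
    works without the TTS package or a model download.
    """
    from TTS.api import TTS

    tts = TTS(model_name)
    tts.tts_to_file(
        text=text,
        speaker_wav=speaker_wav,
        language=XTTS_LANGUAGES[language],
        file_path=out_path,
    )
    return out_path
```

Under these assumptions, same-language cloning would be `clone_voice("Guten Tag!", "sample_de.wav", "German", "clone_de.wav")`, and cross-language cloning would reuse the same German sample with a different target, for example `clone_voice("Good morning!", "sample_de.wav", "English", "clone_en.wav")`. The file names are placeholders.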
Coqui’s innovation is not limited to the foundation model XTTS. Coqui is also innovating in open model licensing. (Currently, open model licensing, as distinct from open source licensing, is very broken.)
Working with Heather Meeker, a world-leading expert on open source licenses, Coqui has created a new, innovative model license, the Coqui Public Model License (CPML), and XTTS will be the first model ever released under the CPML! You can read more about the CPML here.
Enterprises, if you’re interested in licensing the model, fine-tuned versions of the model, or other model variants under other licensing terms, just reach out to [email protected].