I'm tinkering with a podcast in my spare time, Odyssey's Place, and am generating every episode with ChatGPT4, ElevenLabs for TTS and Midjourney4 for artwork. I'm pleased with all the models, but I'm wondering whether you've heard any TTS engines better than ElevenLabs. Google's Tacotron sounds amazing but isn't readily available as an API. Any kind of input would be helpful. For reference, here are some audio samples: https://odysseysplace.com
Thanks in advance!
The main reason I liked it, even though the bad generations are really bad, is that you have full control of the training data set. I haven't kept up with it in a few weeks, so I'm sure there have been advances I'm not aware of.
https://git.ecker.tech/mrq/ai-voice-cloning