Podcast recording and editing platform Podcastle has joined other companies in the AI-powered text-to-speech race by releasing its own AI model called Asyncflow v1.0. Developer APIs are also available, allowing you to integrate text-to-speech models directly into your app.
Thanks to the new model, the company is able to provide over 450 AI voices that can narrate your text. The startup said it has developed its technology and models to have lower training and inference costs, which it says gives it an advantage over its competitors.
With this move, Podcastle has joined many startups, including ElevenLabs, Speechify and WellSaid, that are developing technology and AI models to convert all kinds of text into AI-narrated voice clips. The technology spans use cases such as marketing, advertising, content creation, education, and corporate training.
Podcastle founder Arto Yeritsyan told TechCrunch that the company has always wanted to build text-to-speech models, but the training costs and data requirements were very high.
“We wanted to build a robust text-to-speech model since our inception. However, the cost of development was extremely high. Thanks to the recent development of large language models, we were able to reach a breakthrough last year and get to a place where we could build high-quality speech models without the need for a large amount of data,” said Yeritsyan.
The company's $13.5 million Series A funding round last year also supported these efforts.
Yeritsyan said that Podcastle charges about $40 for 500 minutes of text-to-speech conversion, while ElevenLabs charges $99 for the same amount.
Podcastle has also upgraded its voice cloning capabilities to make the training process faster.
Previously, the training process involved reading about 70 different sentences. Now you only need a few seconds of recorded audio to clone your voice. The new process also uses Podcastle’s Magic Dust AI, which was released last year, to improve the quality of audio recordings.

In our tests, the voices created with the new process sounded a bit robotic, but they mimicked our tone. The company said the feature will improve over time. Additionally, users can train the model on different voice samples to achieve a variety of results.
Podcastle said that apart from costs, putting tools for audio, video, podcasts and AI-powered narration under one redesigned site will give it an edge over its competitors. Yeritsyan said that most users use Podcastle to work on audio content, but video is catching up too.