Project Gutenberg puts 5,000 audiobooks online for free using synthetic speech
Reading Time: 3 minutesOpen book repository Project Gutenberg has turned thousands of its titles into audiobooks practically overnight using synthetic speech, available now for download or streaming on multiple services. The selection is a bit idiosyncratic (as indeed the archive’s is generally) but it is nevertheless a powerful demonstration of accessibility in literature.
Making an audiobook via traditional narration naturally takes quite a long time even in the best case, and of course the reader must be paid for their time and there is the matter of editing and publishing. For many titles it doesn’t make sense financially to produce an audiobook, meaning many older and more obscure titles remain difficult for people who prefer that format to consume.
Project Gutenberg is, of course, dedicated to promulgating public domain literature in as many formats as possible, and filling this gap has likely been on their to-do list for years. But it was only when they teamed up with MIT and Microsoft that they were able to perform the kind of code magic necessary to use AI-generated speech to bring these books to life.
The problem with PG’s archive, as valuable as it is, is that the files are not uniformly formatted. They come from various sources, often error-ridden optical character recognition processes, and often are imperfectly edited and corrected by volunteers. Even if they were flawless, it does not follow that the format would be easily read by a machine: you would end up narration of page numbers, footnotes, and other ephemera.
‘Each one of the e-books in Project Gutenberg is in its own idiosyncratic html format with lots of text you wouldn’t want to hear read aloud like tables, contents, indices, page numbers etc. The hardest part of the project was extracting the good text to read aloud.’ explained project co-lead Mark Hamilton, affiliated with Microsoft and MIT.
To solve this, they designed a system that worked through the archive and identified book files that were formatted similarly, then figured out which of those clusters were the best suited to being automatically read out.
This first batch, being somewhat constrained in its selection, is a little idiosyncratic: for instance, there is only one Dickens book (the unfinished ‘Edwin Drood’ at that) but a dozen volumes along the lines of ‘Notes and Queries, Number 176, March 12, 1853 A Medium of Inter-communication for Literary Men, Artists, Antiquaries, Genealogists, etc.’
‘We picked the books for the first batch based on what we felt the automated parser could do reasonably well,’ Hamilton continued. ‘Nevertheless, some key good ones fell through the cracks. Now that we have the first batch out, we’re working to generalize the system to get closer to the full 60k books in a future release.’
As for the narration itself, the team has put together multiple machine learning and synthetic speech tools that have improved and become more accessible over the last few years. A few years ago it was obvious that automated audiobook production would soon arrive, and that is has — and at scale.
Here’s how the paper on the project describes their approach to making a generated audiobook engaging:
The first 5,000 or so books are available to listen to for free on Spotify, Apple Podcasts, and the Internet Archive, and the code used to create them is being documented at GitHub.
Ref: techcrunch
MediaDownloader.net -> Free Online Video Downloader, Download Any Video From YouTube, VK, Vimeo, Twitter, Twitch, Tumblr, Tiktok, Telegram, TED, Streamable, Soundcloud, Snapchat, Share, Rumble, Reddit, PuhuTV, Pinterest, Periscope, Ok.ru, MxTakatak, Mixcloud, Mashable, LinkedIn, Likee, Kwai, Izlesene, Instagram, Imgur, IMDB, Ifunny, Gaana, Flickr, Febspot, Facebook, ESPN, Douyin, Dailymotion, Buzzfeed, BluTV, Blogger, Bitchute, Bilibili, Bandcamp, Akıllı, 9GAG