Making a News Podcast Generator

Ron Reiter
4 min read · Jul 27, 2019


APIs are ubiquitous. Today you can do almost anything with an API, which is pretty awesome: problems that used to require technology only a few people had are now within anyone's reach. I decided to solve an annoying problem I have every morning on my drive to work: I want someone to summarize the front page of Hacker News for me while I drive.

Generating speech with a computer always seemed like a terrible idea to me, because the result hurt my ears. However, as it turns out, Google’s newly released WaveNet-based Text-to-Speech technology is good enough to listen to for 15 minutes. And if that’s the case, then listening to a summary of the top links can actually be practical and even enjoyable.

To do this, I wrote a Python script that does the following:

  1. Scrapes all of the daily Hacker News URLs using their open API.
  2. Summarizes them using an article extraction API (I used Aylien, which I had not heard of until I googled for an article extraction and summarization API)
  3. Uses Google’s Text-to-Speech engine on the title and summary
  4. Stitches all results into one mp3 file
  5. Uploads it to Google Cloud Storage
  6. Creates a Podcast RSS feed

So, let’s dig into how it works:

Getting the news

We start out by getting the data we want to listen to — a headline and a summary for each news item.

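    import datetime
    import json
    import os
    today = datetime.date.today().isoformat()  # assumption: original expression was lost; any date string works
    news_file = 'news_data/news_data_%s.json' % today
    print('getting news data...')  # assumption: the original call name was lost; print() stands in
    if not os.path.exists(news_file):  # fetch and summarize only once per day
        news_data = get_news_data(get_best_hn_urls(NUMBER_ARTICLES))
        with open(news_file, "w") as f:
            json.dump(news_data, f)
    with open(news_file) as f:
        news_data = json.load(f)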
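
The once-per-day caching above can also be factored into a small helper. This is only a sketch: `load_or_build` and `daily_news_path` are hypothetical names that are not part of the original script, and the file layout simply mirrors the `news_data/news_data_<date>.json` pattern used above.

```python
import datetime
import json
import os


def daily_news_path(day, directory="news_data"):
    """Build a per-day file name such as news_data/news_data_2019-07-27.json."""
    return os.path.join(directory, "news_data_%s.json" % day.isoformat())


def load_or_build(path, build):
    """Return the JSON stored at `path`; if the file is missing, call `build()` and save its result first."""
    if not os.path.exists(path):
        # Build before opening the file, so a failed build does not leave an empty cache file behind.
        data = build()
        directory = os.path.dirname(path)
        if directory:
            os.makedirs(directory, exist_ok=True)
        with open(path, "w") as f:
            json.dump(data, f)
    with open(path) as f:
        return json.load(f)
```

With this in place, the snippet above would reduce to something like `news_data = load_or_build(daily_news_path(datetime.date.today()), lambda: get_news_data(get_best_hn_urls(NUMBER_ARTICLES)))`. Re-running the script on the same day then skips both the Hacker News requests and the summarization calls.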

Getting the URLs we want to scrape is done using the Hacker News API, which does not require any authentication:

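    import requests

    def get_best_hn_urls(num=10):
        # Fetch the IDs of the current best stories, then look up each story's details.
        top_items = requests.get(BEST_STORIES_API).json()
        links = []
        for item in top_items[:num]:
            item_data = requests.get(STORY_API % item).json()
            # Text-only posts (e.g. Ask HN) have no 'url' field, so skip them.
            if item_data and 'url' in item_data:
                links.append(item_data['url'])  # reconstructed: this line was missing from the source
        return links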
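
For reference, here is a self-contained sketch of the same lookup that runs without the rest of the script. The two endpoint constants are my assumption about what `BEST_STORIES_API` and `STORY_API` point to (they are the public Hacker News Firebase endpoints). The `fetch` parameter and the names `fetch_json` and `best_story_urls` are hypothetical additions that let the function be exercised without network access.

```python
import json
import urllib.request

# Assumed values: the public Hacker News API endpoints (no authentication needed).
BEST_STORIES_API = "https://hacker-news.firebaseio.com/v0/beststories.json"
STORY_API = "https://hacker-news.firebaseio.com/v0/item/%s.json"


def fetch_json(url, timeout=10):
    """Download a URL and decode its body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def best_story_urls(num=10, fetch=fetch_json):
    """Return the external links among the first `num` best stories."""
    story_ids = fetch(BEST_STORIES_API)
    urls = []
    for story_id in story_ids[:num]:
        item = fetch(STORY_API % story_id)
        # Text-only posts have no 'url'; an item may also come back as null.
        if item and "url" in item:
            urls.append(item["url"])
    return urls
```

Note that, like the function above, this looks at the first `num` story IDs and then filters, so it can return fewer than `num` links when some of those stories are text-only posts.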



