Building a Telegram News Bot in Python

In this article, we are going to build a fully working Telegram bot that searches news from thousands of sources across the web.

Vasyl Teliman
Our new Telegram bot

Introduction

Telegram is a popular messaging app. It is well-known for its security and efficiency. Apart from being able to send messages to each other, its users can also create bots to automate certain routine tasks.

In this tutorial, we will use Telegram's Bot API to create a news bot backed by the Datanews API. We will write it in Python, since Python has a library that provides convenient abstractions over the Bot API.

Telegram Bot API overview

We will use the python-telegram-bot wrapper around the official API. This library significantly simplifies writing a bot. It is always easier to learn something new by reading through an example, so here is one:

from telegram.ext import Updater, CommandHandler


USAGE = '/greet <name> - Greet me!'


def start(update, context):
  update.message.reply_text(USAGE)


def greet_command(update, context):
  update.message.reply_text(f'Hello {context.args[0]}!')


def main():
  updater = Updater("TOKEN", use_context=True)
  dp = updater.dispatcher 

  # on different commands - answer in Telegram 
  dp.add_handler(CommandHandler("start", start)) 
  dp.add_handler(CommandHandler("greet", greet_command)) 

  # Start the Bot 
  updater.start_polling() 
  updater.idle() 


if __name__ == '__main__': 
  main()

This small piece of code creates a bot that recognizes two commands:

  1. /start - the bot responds with the usage message.
  2. /greet - takes an argument (e.g. Datanews) and responds with Hello Datanews!.

Let's go through each line in detail and discuss what this code does.

We start with the main function:

def main():
  updater = Updater("TOKEN", use_context=True)
  dp = updater.dispatcher

  dp.add_handler(CommandHandler("start", start))
  dp.add_handler(CommandHandler("greet", greet_command))

  updater.start_polling()
  updater.idle()

This function sets up all the machinery our bot needs. In particular, it creates an instance of the Updater class. Note that you need a bot token to use the Telegram Bot API; the "TOKEN" string in the code is a placeholder for it. Telegram's official guide explains how to create a bot and obtain its token.
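Hard-coding the token is fine for a quick experiment, but it is easy to leak it that way. Here is a minimal sketch of one alternative, assuming the token is stored in an environment variable. The variable name TELEGRAM_BOT_TOKEN and the helper load_token are made up for this illustration and are not part of the library.

```python
import os


def load_token(env_var="TELEGRAM_BOT_TOKEN"):
    """Read the bot token from the environment (the variable name is an assumption)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your bot token")
    return token
```

With this in place, main could call Updater(load_token(), use_context=True) instead of passing a literal string.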

Back to the code! The purpose of the Updater is to deliver updates (e.g. messages sent by users) to the Dispatcher. When the Dispatcher receives an update, it looks for a registered handler that accepts the update and invokes the callback attached to that handler.

You can think of a handler as a function that handles an update but is only executed when some condition is met. The condition depends on the kind of handler and on how the programmer configures it. In our case, we have two instances of the CommandHandler class.

dp.add_handler(CommandHandler("start", start))
dp.add_handler(CommandHandler("greet", greet_command))

Each of them handles a particular command supported by our bot: /start and /greet, respectively.
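To make the "condition plus callback" idea concrete, here is a toy model of dispatching written with plain Python. It is only a sketch of the concept, not how python-telegram-bot is implemented; make_command_handler and dispatch are hypothetical names.

```python
def make_command_handler(command, callback):
    """Pair a condition (the text starts with /command) with a callback."""
    def check(text):
        parts = text.split()
        return bool(parts) and parts[0] == "/" + command
    return check, callback


def dispatch(handlers, text):
    """Run the callback of the first handler whose condition accepts the text."""
    for check, callback in handlers:
        if check(text):
            args = text.split()[1:]  # everything after the command itself
            return callback(args)
    return None  # no handler matched, so the update is ignored


handlers = [
    make_command_handler("start", lambda args: "usage"),
    make_command_handler("greet", lambda args: f"Hello {args[0]}!"),
]
print(dispatch(handlers, "/greet Datanews"))  # prints "Hello Datanews!"
```

The order of the list matters in this model: the first handler that accepts the text wins, and the remaining ones are not consulted.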

Then we call the start_polling method.

updater.start_polling()

This makes our bot periodically fetch updates from the Telegram server. Internally, the method starts two threads: one polls the Telegram server for updates, and the other is used by the dispatcher to handle them.
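The polling side can be pictured with a small offline sketch. It assumes only what the Bot API documents for getUpdates: every update carries an update_id, and asking for a higher offset acknowledges everything before it. The function poll_once and the fake server are illustrative stand-ins, not library code.

```python
def poll_once(get_updates, handle, offset=0):
    """Fetch one batch of updates, handle each, and return the next offset.

    get_updates stands in for a call to Telegram's getUpdates method and
    must return a list of dicts that each carry an "update_id".
    """
    for update in get_updates(offset):
        handle(update)
        # Ask only for later ids next time, which acknowledges this update.
        offset = max(offset, update["update_id"] + 1)
    return offset


# A fake server holding three pending updates, for illustration only.
pending = [
    {"update_id": 7, "text": "/start"},
    {"update_id": 8, "text": "/greet Datanews"},
    {"update_id": 9, "text": "help"},
]


def fake_get_updates(offset):
    return [u for u in pending if u["update_id"] >= offset]


seen = []
offset = poll_once(fake_get_updates, seen.append)
print(offset, len(seen))  # prints "10 3"
```

A real bot repeats this step in a loop on a background thread, while a second thread takes the collected updates and runs the matching handlers.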

The next line makes sure our bot correctly handles various interruption signals (e.g. SIGINT).

updater.idle()

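The call blocks the main thread until one of those signals arrives and then stops the updater cleanly. A clean shutdown matters in particular when we want our bot to have persistent state. You can learn more about it and other cool library features in the python-telegram-bot wiki.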
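The general pattern behind such a call can be sketched with the standard library alone. This is a simplified model under the assumption that "idling" means waiting for a stop signal; the names request_stop, install_handlers, wait_for_stop and idle below are hypothetical, and the library's own implementation differs in detail.

```python
import signal
import threading

stop_requested = threading.Event()


def request_stop(signum, frame):
    """Signal handler: remember that a shutdown was requested."""
    stop_requested.set()


def install_handlers(signals=(signal.SIGINT, signal.SIGTERM)):
    """Route the given signals to request_stop (must run in the main thread)."""
    for sig in signals:
        signal.signal(sig, request_stop)


def wait_for_stop(poll_interval=1.0):
    """Block until request_stop has been called."""
    while not stop_requested.wait(poll_interval):
        pass


def idle():
    """Keep the main thread alive until Ctrl+C or a termination signal arrives."""
    install_handlers()
    wait_for_stop()
    print("Shutting down cleanly")
```

Calling idle() at the end of a script keeps background threads running until the process is asked to stop, at which point cleanup code can run.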

Let's now discuss the two callback functions that handle the bot's commands:

def start(update, context):
  update.message.reply_text(USAGE)


def greet_command(update, context):
  update.message.reply_text(f'Hello {context.args[0]}!')

Each of these functions takes two arguments:

  1. update - an update received by our bot from the Telegram servers.
  2. context - carries additional information and helpers. For example, context.args holds the arguments passed to a command, and the user_data dictionary can store per-user information.

Additionally, each of these functions sends a text message back to the user.
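One detail worth noting: greet_command reads context.args[0], which raises an IndexError when the user sends /greet without a name. A minimal sketch of a more forgiving version follows. The helper build_greeting is a name introduced here, and the SimpleNamespace objects are stand-ins for the real update and context so the snippet can be run without a bot.

```python
from types import SimpleNamespace

USAGE = '/greet <name> - Greet me!'


def build_greeting(args):
    """Return the reply text for /greet given the list of command arguments."""
    if not args:
        return USAGE  # no name given: show the usage line instead of crashing
    return f"Hello {' '.join(args)}!"


def greet_command(update, context):
    update.message.reply_text(build_greeting(context.args))


# Stand-ins for the real update/context objects, for a quick offline check.
replies = []
fake_update = SimpleNamespace(message=SimpleNamespace(reply_text=replies.append))
greet_command(fake_update, SimpleNamespace(args=[]))
greet_command(fake_update, SimpleNamespace(args=["Data", "News"]))
print(replies)
```

Keeping the text-building logic in a plain function also makes it easy to test without talking to Telegram.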

You can check out more elaborate examples of Telegram bots in the library's official repo.

Let's now move on to the main topic of our discussion.

Datanews API overview

Datanews provides an API for retrieving and monitoring news from more than a thousand different newspapers, news aggregators and other websites. We collect and process more than 100k news articles a day. Naturally, we provide a flexible and easy-to-use API for querying those articles. For our small project, though, we only need a small part of that API. In particular, we want our bot to be able to:

  1. Retrieve articles based on a query string, sent by the user.
  2. Retrieve articles from some particular publisher.

These use cases can be handled by a single endpoint: /headlines. You can learn more about the provided API in the official documentation.

Now we can go straight to the implementation of our bot.

Implementation

First of all, let's define a callback that handles the /start command.

def get_usage():
  return '''This bot allows you to query news articles from Datanews API.

Available commands:
/help, /start - show this help message.
/search <query> - retrieve news articles containing <query>.
  Example: "/search covid"
/publisher <domain> - retrieve newest articles by publisher.
  Example: "/publisher techcrunch.com"'''

def help_command(update, context):
  update.message.reply_markdown(get_usage())

As you can see, the implementation closely resembles our bot example above: we simply return the help text to the user. Note that our bot supports four commands. The help_command function implements the first two of them, /help and /start. Let's now discuss the other two.

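import datanews  # official Datanews client; set up your API key as its docs describe


def search_command(update, context):
  def fetcher(query):
    # Full-text search: the query string is the first argument
    return datanews.headlines(query, size=10, sortBy='date', page=0, language='en')
  _fetch_data(update, context, fetcher)


def publisher_command(update, context):
  def fetcher(query):
    # Filter by publisher instead of searching the article text
    return datanews.headlines(source=query, size=10, sortBy='date', page=0, language='en')
  _fetch_data(update, context, fetcher)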

These functions look very similar. Both use the /headlines API endpoint discussed earlier (through the official Datanews library for Python), and both delegate their work to the _fetch_data helper. The only difference is in the arguments we pass to the Datanews API: search_command retrieves articles matching a query, whereas publisher_command fetches articles published by a specific source. Note, however, that in both cases we only request the 10 most recent articles.
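Since the two calls share most of their arguments, one optional refactoring is to keep those arguments in a single place. The sketch below uses only the parameter names that appear in the code above; DEFAULT_PARAMS and headline_params are names invented for this illustration.

```python
DEFAULT_PARAMS = {"size": 10, "sortBy": "date", "page": 0, "language": "en"}


def headline_params(**overrides):
    """Return the shared /headlines arguments, optionally overridden."""
    return {**DEFAULT_PARAMS, **overrides}


# How the two fetchers could use it (not executed here):
#   search_command:    datanews.headlines(query, **headline_params())
#   publisher_command: datanews.headlines(**headline_params(source=query))
print(headline_params(source="techcrunch.com"))
```

Changing the page size or the language for both commands then becomes a one-line edit.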

Let's now take a look at the helper that does all the work.

def _fetch_data(update, context, fetcher):
  if not context.args:
    help_command(update, context)
    return

  query = '"' + ' '.join(context.args) + '"'
  result = fetcher(query)

  if not result['hits']:
    update.message.reply_text('No news is good news')
    return

  last_message = update.message
  for article in reversed(result['hits']):
    text = article['title'] + ': ' + article['url']
    last_message = last_message.reply_text(text)

This function checks that the user has supplied arguments to the command, fetches the data from the Datanews API, and sends the results to the user in reverse order. A couple of comments here:

  1. We surround the query with double quotes (") so that Datanews returns articles containing the complete query and not just a single word from it. You can learn more about the query syntax in the documentation.
  2. We also handle the case when no articles were found: staying silent in this situation would leave the user guessing.
  3. We send the articles in reverse order so that the most recent one arrives last.
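The two text transformations inside _fetch_data can be tried out in isolation. The helper names build_query and format_articles, as well as the sample data, are made up for this sketch; the dictionary keys title and url are the ones the code above relies on.

```python
def build_query(args):
    """Join the command arguments and wrap them in double quotes."""
    return '"' + ' '.join(args) + '"'


def format_articles(hits):
    """Return one "title: url" line per article, in reverse of the given order."""
    return [article['title'] + ': ' + article['url'] for article in reversed(hits)]


# Made-up sample data, newest first, as a date-sorted response would be.
sample_hits = [
    {'title': 'Newest story', 'url': 'https://example.com/2'},
    {'title': 'Older story', 'url': 'https://example.com/1'},
]

print(build_query(['covid', 'vaccine']))  # prints "covid vaccine" including the quotes
for line in format_articles(sample_hits):
    print(line)
```

The loop prints the older story first and the newest story last, which is the order in which the bot sends its messages.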

With this out of the way, let's take a look at the main function.

import re

from telegram.ext import Updater, CommandHandler, MessageHandler, Filters


def main():
  # "TOKEN" is a placeholder for your bot token
  updater = Updater(token='TOKEN', use_context=True)

  updater.dispatcher.add_handler(CommandHandler('start', help_command))
  updater.dispatcher.add_handler(CommandHandler('help', help_command))
  updater.dispatcher.add_handler(CommandHandler('search', search_command))
  updater.dispatcher.add_handler(CommandHandler('publisher', publisher_command))

  updater.dispatcher.add_handler(
    MessageHandler(
      Filters.text & Filters.regex(pattern=re.compile('help', re.IGNORECASE)),
      help_command
    )
  )

  updater.start_polling()
  updater.idle()

This function is very similar to the one from the example. The only major difference is in the following lines:

updater.dispatcher.add_handler(
  MessageHandler(
    Filters.text & Filters.regex(pattern=re.compile('help', re.IGNORECASE)),
    help_command
  )
)

The MessageHandler is used to catch messages sent by the user. You can think of it as a CommandHandler on steroids: it processes any message that satisfies the specified filter. In our case, we want to send the help text every time the user sends a text message containing the word help, in any letter case.
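It helps to check what the pattern actually matches. The sketch below uses only the standard re module and assumes the filter searches for the pattern anywhere in the message text, the way re.search does. WHOLE_WORD_PATTERN is a stricter alternative shown for comparison and is not used by the bot above.

```python
import re

HELP_PATTERN = re.compile('help', re.IGNORECASE)
WHOLE_WORD_PATTERN = re.compile(r'\bhelp\b', re.IGNORECASE)


def matches(pattern, text):
    """Return True if the pattern occurs anywhere in the text."""
    return pattern.search(text) is not None


for text in ['Help me', 'I need HELP', 'That was helpful', 'hello']:
    print(text, matches(HELP_PATTERN, text), matches(WHOLE_WORD_PATTERN, text))
```

Note that the plain pattern also matches "helpful" because it looks for a substring. If that is too eager, a word-boundary pattern such as the second one restricts the match to the standalone word.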

That's it. Now you have a fully functional news bot.

Bot demo

Conclusion

Well, that was fun! We discussed the Telegram Bot API and its Python wrapper. We also gave a brief overview of the Datanews API and built our own news bot on top of it. However, this is only the tip of the iceberg: we could add support for news monitoring and many other cool features to our bot just as easily. Hopefully, I managed to convince you that using the Datanews API is no harder than using the Telegram Bot API from Python.

You can check out the source code here and the working example at https://t.me/realDatanewsBot.
