ralphm.net

ralphm's blog

Sunday, 5 October 2003

Publish/Subscribe for RSS

News. Instantly.

On Jabber Architecture hildjj talks about Publish/Subscribe for RSS, brought to his attention by Bryan and Jeremy.

He argues that using Publish/Subscribe over Jabber would be far better than existing SOAP or HTTP solutions for getting from pull to push. I couldn't agree more. One of the reasons I built Mimír is that I don't like polling. In this case I was polling news sites myself to see if anything new was happening. I wanted to be notified of new news.

So I tried several different Jabber-enabled RSS aggregators, but I didn't like that they always send me messages, even when I am not online. But that is beside the point. These Jabber-enabled RSS aggregators still poll news sites, albeit using RSS files. All this polling by everyone and his mother can't be efficient. Ever.

The editor of a news item could instead publish his item to all interested parties. Because an editor can't be bothered with who wants the current item, we need to adjust this idea a bit. Fortunately there is a nice variant on the publish/subscribe design pattern. This involves having an entity that keeps a list of subscriptions to a certain topic and, for each publish, sends off a notification to every subscriber. Yes, this sounds a lot like a moderated mailing list.
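To make the pattern concrete, here is a minimal in-memory sketch of such an entity. It is not how Mimír or any Jabber component is implemented; the class name `PubSubService` and its methods are made up for illustration, and there is no networking, access control or persistence.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# A subscriber is any callable taking (topic, item). Hypothetical type alias.
Callback = Callable[[str, str], None]


class PubSubService:
    """Keeps a list of subscriptions per topic and notifies on each publish."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        # Ignore duplicate subscriptions so nobody is notified twice.
        if callback not in self._subscribers[topic]:
            self._subscribers[topic].append(callback)

    def unsubscribe(self, topic: str, callback: Callback) -> None:
        if callback in self._subscribers.get(topic, []):
            self._subscribers[topic].remove(callback)

    def publish(self, topic: str, item: str) -> int:
        """Send item to every subscriber of topic; return how many were notified."""
        # Copy the list so a callback may unsubscribe while we iterate.
        subscribers = list(self._subscribers.get(topic, []))
        for callback in subscribers:
            callback(topic, item)
        return len(subscribers)


if __name__ == "__main__":
    service = PubSubService()
    service.subscribe("blog", lambda topic, item: print(topic, "->", item))
    service.publish("blog", "Publish/Subscribe for RSS")
```

The publisher only ever talks to the service. It never needs to know who is subscribed, which is exactly the adjustment described above.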

JEP-0060 caters for a generic framework for doing publish/subscribe over Jabber and has very nice properties to do this. It allows you to set up new topics (called nodes), let people (un)subscribe to them, appoint publishers, let people query to collect recent items, let people configure their own subscriptions, etc. Check it out!

Back to the original topic: having news pushed to you. Mimír doesn't use pubsub to send news to end users (yet), but uses it internally. Most news channels are still polled by an aggregator (see the Architecture of Mimír). This aggregator sends its new items via pubsub to a news bot that processes them further. But this is really for backwards compatibility.

Ideally, news sources publish their news themselves using pubsub over Jabber and have the pubsub component take care of the rest. This blog has a small Python tool that publishes the most recent blog item to a pubsub node. Interested parties then receive the summary of my blog entry instantly. The news bot is really a filtering proxy of incoming pubsub notifications and sends them as regular messages to users or stores them for later reading when the subscriber is not online.
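As a rough idea of what such a publishing tool sends, the sketch below builds a publish request in the shape I understand JEP-0060 to define: an `iq` of type `set` carrying a `pubsub` element with a `publish` child naming the node. It is not the actual tool behind this blog. The service address `pubsub.example.org`, the node name `blog`, the `id` values and the `entry` payload are all placeholders, and the stanza is only built as a string, not sent over a Jabber connection.

```python
import xml.etree.ElementTree as ET

# Namespace of the pubsub protocol as defined by JEP-0060.
PUBSUB_NS = "http://jabber.org/protocol/pubsub"


def build_publish_iq(service: str, node: str, item_id: str,
                     title: str, summary: str) -> str:
    """Return a publish request for one news item as an XML string."""
    iq = ET.Element("iq", {"type": "set", "to": service, "id": "publish1"})
    pubsub = ET.SubElement(iq, "{%s}pubsub" % PUBSUB_NS)
    publish = ET.SubElement(pubsub, "{%s}publish" % PUBSUB_NS, {"node": node})
    item = ET.SubElement(publish, "{%s}item" % PUBSUB_NS, {"id": item_id})

    # Hypothetical payload: a real tool would pick its own format
    # (an RSS item, for example) and give it a proper namespace.
    entry = ET.SubElement(item, "entry")
    ET.SubElement(entry, "title").text = title
    ET.SubElement(entry, "summary").text = summary

    return ET.tostring(iq, encoding="unicode")


if __name__ == "__main__":
    print(build_publish_iq("pubsub.example.org", "blog", "2003-10-05",
                           "Publish/Subscribe for RSS", "News. Instantly."))
```

The pubsub component then takes care of turning this one request into a notification for every subscriber of the node.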

My next plan is to make a desktop client for GNOME that can process pubsub messages directly. The news bot then just passes on the pubsub message instead of converting it to a normal Jabber message. When more news sources publish their news in this way, everyone using the client gets his news instantly.

For efficiency, a web of pubsub repeaters could be created that proxy the notifications. Think of mirroring. Each administrative domain could set up its own local repeater to dramatically reduce bandwidth. Also, using the pubsub protocol, people could set up their preferences for each subscription, like when, how and where they want to be notified.
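The bandwidth argument can be illustrated with a toy model. In this sketch a repeater holds a single subscription at the remote source and fans each notification out to its local subscribers, so the source sends one notification per domain instead of one per reader. The `Source` and `Repeater` classes are invented for this example and do not correspond to any existing component.

```python
from typing import Callable, List

# A handler receives one published item. Hypothetical type alias.
Handler = Callable[[str], None]


class Source:
    """Stands in for a remote pubsub node and counts what it sends out."""

    def __init__(self) -> None:
        self.subscribers: List[Handler] = []
        self.deliveries = 0

    def subscribe(self, handler: Handler) -> None:
        self.subscribers.append(handler)

    def publish(self, item: str) -> None:
        for handler in self.subscribers:
            self.deliveries += 1  # one notification crossing the network
            handler(item)


class Repeater:
    """Subscribes once upstream and relays every item to local subscribers."""

    def __init__(self, upstream) -> None:
        # upstream may be a Source or another Repeater: both offer subscribe().
        self.local: List[Handler] = []
        upstream.subscribe(self.relay)

    def subscribe(self, handler: Handler) -> None:
        self.local.append(handler)

    def relay(self, item: str) -> None:
        for handler in self.local:
            handler(item)


if __name__ == "__main__":
    source = Source()
    repeater = Repeater(source)
    inboxes = [[], [], []]
    for inbox in inboxes:
        repeater.subscribe(inbox.append)
    source.publish("News. Instantly.")
    print(source.deliveries, inboxes)
```

Because a repeater exposes the same `subscribe` call as the source, repeaters can be chained, which is the web of mirrors suggested above.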

Bryan, Jeremy and Joe. Is this what you'd want?