Here’s a question. Having seen Google’s new “Priority Inbox” feature and John Graham-Cumming’s POPFile application, both ways of using a Bayesian classifier to guess which e-mail you will want to read first and to file it automatically, I was wondering if anyone had applied the same idea to RSS. I’ve recently started to add new blogs to my reader again, and it struck me that reading them took up enough time that it might be useful to prioritise and classify them automatically. It might even be yet another project I probably won’t find the time to finish.
Searching the web, though, I was surprised to find quite a lot of similar projects that didn’t seem to have many users or, for that matter, to be in active development. It looks as if this is one of those problems that almost every developer feels the need to tackle at one point or another, but nobody’s made it stick. Somebody even had their RSS feeds delivered by e-mail and used POPFile itself, but that’s silly. I can think of a couple of reasons why nothing has caught on. One is that the use case might be fundamentally flawed. If it weren’t for surprises, the blogosphere would be pretty dull, and you might as well just read Martin Kettle’s column or watch TV. If you could have a feed of blog posts that you were guaranteed to read, would you want to read them? Of course, you could introduce some sort of random element, perhaps promoting some proportion of the posts least likely to be read, but that would defeat the point.
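For what it’s worth, the random element is simple enough to sketch. This is only an illustration: it assumes some classifier has already given each post a score between 0 and 1, and the function name `reading_list` and the parameters `slots` and `surprise_share` are made up for the example, not taken from any of the projects I found.

```python
import random


def reading_list(scored_posts, slots=10, surprise_share=0.2, rng=None):
    """Pick the top-scored posts, reserving some slots for low-scored ones.

    scored_posts is a list of (title, score) pairs; the score is assumed
    to come from a classifier and to mean "likelihood of being read".
    """
    rng = rng or random.Random()
    ranked = sorted(scored_posts, key=lambda post: post[1], reverse=True)
    slots = min(slots, len(ranked))
    n_surprise = int(slots * surprise_share)

    # The bulk of the list is simply the highest-scoring posts.
    top = ranked[: slots - n_surprise]
    rest = ranked[slots - n_surprise:]

    # Draw the surprises at random from the least likely half of the rest.
    pool = rest[len(rest) // 2:]
    surprises = rng.sample(pool, min(n_surprise, len(pool)))
    return top + surprises


if __name__ == "__main__":
    posts = [(f"post{i}", i / 20) for i in range(20)]
    for title, score in reading_list(posts, rng=random.Random(0)):
        print(f"{score:.2f}  {title}")
```

With `surprise_share=0.2` and ten slots, eight places go to the posts you are most likely to read and two to posts from the bottom of the pile. Whether that is a feature or just noise is exactly the problem.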
One feature which I didn’t see anywhere was a social element. I could certainly see a use for an application that classified RSS items into groups, and let multiple users contribute to the same group. I mark some of the items as “Telco 2.0”, and therefore train the classifier to filter things relevant to the company into that bucket. But other T2 people have opinions about what is relevant to the company, and they might benefit from mine as well. Obviously, if we use the same classification profile we’ll get the same results – interestingly, we’ll get the same results in some sense even if we’re not all reading the same blogs. So I’d like to be able to have shared group filters.
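To make the shared-filter idea a bit more concrete, here is a minimal sketch of a naive Bayes classifier whose word counts are trained by several users at once. It is not how POPFile or any existing reader works; the class name `SharedFilter`, its methods, and the sample users and headlines are all invented for illustration, and a real version would need persistence and a way to sync the counts between people.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class SharedFilter:
    """A hypothetical group profile: one set of word counts, many trainers."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> training items seen
        self.contributors = defaultdict(set)     # label -> users who trained it

    def train(self, user, label, text):
        """Record that `user` filed an item with this text under `label`."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1
        self.contributors[label].add(user)

    def classify(self, text):
        """Return the most probable label (multinomial naive Bayes)."""
        vocab = {w for counts in self.word_counts.values() for w in counts}
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior for the label, then add-one smoothed word likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


if __name__ == "__main__":
    group = SharedFilter()
    group.train("alice", "Telco 2.0", "mobile operators roll out broadband network pricing")
    group.train("bob", "Telco 2.0", "operator revenue from voice and data network")
    group.train("alice", "other", "recipe for lemon cake and tea")
    group.train("bob", "other", "football match result and league table")
    print(group.classify("new network pricing from mobile operators"))
```

Because the profile is just counts of words per bucket, two people reading entirely different blogs can still train, and benefit from, the same filter.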
Does anyone know of an application that does this, preferably without letting some random website see everything I read? Points for integration with other RSS readers, notably either Akregator or Firefox/Sage. I’d be OK with a web page served on localhost (or on a server I control). At the moment, this is in the lead, but it strikes me as being rather more heavyweight than is ideal.