Storing Zeitgeist data in desktopcouch

A fun game with Seif Lotfy of the Zeitgeist project today, to answer the question: how hard would it be to have my Zeitgeist event log be in desktopcouch? That makes the logs from all my computers available on all my computers. This is increasingly important as more stuff starts going into Zeitgeist — for example, we’re going to start storing lots of Ubuntu One events in there to make what’s happening with Ubuntu One on your machine more transparent and obvious, if you want to see it. So, after some conversation where Seif told me it was easy and I scoffed and bet him a beer that it wasn’t…it turned out it wasn’t that hard after all.

Zeitgeist has extensions. These aren’t brilliantly documented yet, but you can drop a Python file into .local/share/zeitgeist/extensions and if it’s got the right sort of class in it then that class will get run as a part of Zeitgeist. Extensions are great for doing things like running some code every time there’s an event which goes into Zeitgeist. /usr/share/zeitgeist/_zeitgeist/engine/extensions/blacklist.py is an example extension.

However, we didn’t do it quite that way. Because we’re taking an action on every event in zeitgeist, we don’t want to slow the core down. So instead of actually being an extension that’s built into the main Zeitgeist process, we’re an extension which launches a separate subprocess. The subprocess uses ZeitgeistClient to get notified whenever any event happens, and then serialises that event into desktopcouch. Basically, it’s an event-driven loop driven by ZeitgeistClient.install_monitor, so every time a new event happens our function gets called, and that function serialises the event into a desktopcouch Record and saves it.
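The serialising step might look roughly like the sketch below. To stay self-contained it uses plain stand-in classes rather than the real Zeitgeist event class or a desktopcouch Record. The field names, the RECORD_TYPE placeholder and both function names are assumptions for illustration, not what desktopcouch_gateway.py actually does.

```python
from dataclasses import dataclass, field, asdict
from typing import List

# Hypothetical record type identifier; a placeholder, not a registered type.
RECORD_TYPE = "http://example.com/zeitgeist-event"


@dataclass
class Subject:
    """Stand-in for a Zeitgeist subject (the thing an event is about)."""
    uri: str
    mimetype: str = ""
    text: str = ""


@dataclass
class Event:
    """Stand-in for a Zeitgeist event; only a few fields are modelled."""
    timestamp: int
    interpretation: str
    manifestation: str
    actor: str
    subjects: List[Subject] = field(default_factory=list)


def event_to_record(event: Event) -> dict:
    """Flatten an event into a JSON-friendly dict for a document store."""
    data = asdict(event)  # also converts the nested Subject dataclasses
    data["record_type"] = RECORD_TYPE
    return data


def record_to_event(record: dict) -> Event:
    """Rebuild an Event from a dict produced by event_to_record."""
    subjects = [Subject(**s) for s in record.get("subjects", [])]
    return Event(
        timestamp=record["timestamp"],
        interpretation=record["interpretation"],
        manifestation=record["manifestation"],
        actor=record["actor"],
        subjects=subjects,
    )
```

In the real subprocess, something like event_to_record would be called from the handler passed to ZeitgeistClient.install_monitor, and the resulting dict would be wrapped in a desktopcouch Record and saved.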

The other half of the equation is getting events from desktopcouch. Obviously, if you’ve got more than one machine, then sometimes events that happened on the other machine will arrive here, and you need to pull those new records out of desktopcouch, turn them back into event objects, and push them into the Zeitgeist engine. The way to do this efficiently is by monitoring desktopcouch’s changes feed. The changes feed is a core part of CouchDB itself; the way it works is that you open an HTTP connection to it and that connection lives forever; whenever a record is added to, changed in, or deleted from the database you’re monitoring, a line (actually, a JSON description) about the change is printed to that HTTP connection. So you just watch that feed forever, and whenever you get told “this record has changed”, you go fetch that record from desktopcouch in the normal way and then do whatever you want with it. Nicely event-driven; no polling at all, no wakeups if you don’t need them.
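To make the "one JSON line per change" idea concrete, here is a minimal sketch of the parsing side only. It assumes the lines have already been read off the HTTP connection (the OAuth and connection handling are left out), and that each one looks like CouchDB's continuous feed output, for example {"seq":1,"id":"abc","changes":[{"rev":"1-x"}]}. The function name iter_changes is made up for this example.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_changes(lines: Iterable[str]) -> Iterator[Tuple[str, bool]]:
    """Yield (doc_id, deleted) for each change line in a continuous feed."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank keep-alive line, nothing to do
        change = json.loads(line)
        if "id" not in change:
            continue  # not a document change, e.g. a {"last_seq": ...} line
        yield change["id"], bool(change.get("deleted", False))
```

For each id that comes out, you would then fetch the record from desktopcouch in the normal way, unless it was flagged as deleted.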

Getting at the changes feed from a desktopcouch database is a little more complex than getting at it from a server CouchDB, but it’s doable, and one of the things we plan to do in the Ubuntu 11.04 development cycle is make this trivial to do: you’ll just call databaseobject.glib_callback_for_changes(my_callback_function) and your callback will be called every time there’s a change in the database. (The code below contains a load of complex OAuth stuff to derive a validated URL for the _changes feed; that’s what we’re going to wrap up in that one line.)

I was pretty pleased to see how simple it is to interact with Zeitgeist, and I plan for us to work more with the Zeitgeist team. Thanks especially to Seif who talked me through a lot of this, and to whom I owe a pint or something.

The code is desktopcouch_gateway.py; drop it in .local/share/zeitgeist/extensions, and then restart the Zeitgeist server with zeitgeist-daemon --replace.
