A date with data.gov.uk
By Carrie Bishop • Oct 2nd, 2009 • Category: Features
I might have known Dominic would be responsible for the surprise email I got the other day asking me along to a blind date with the new data.gov.uk site.
The data site is a place where all the government’s data sets will eventually live, accessible to developers and others who want to use the data, in standard formats. There are about 1100 data sets on there at the moment - not all of them are formatted for use (most of them are still in Excel spreadsheets or similar) but at this early stage about 20 data sets have been converted to useable formats, including XML and JSON. It apparently takes about two or three days to go through this conversion process, though I imagine that the more it’s done the quicker it will get.
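To give a feel for what that conversion involves, here is a minimal sketch in Python, assuming the spreadsheet has already been exported as CSV. The sample rows, the column names and the `dataset`/`record` tag names are all made up for illustration, and this is not the process the data.gov.uk team actually uses.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for a spreadsheet exported as CSV.
# The columns and figures are invented for illustration only.
SAMPLE_CSV = """region,year,schools
North East,2008,1200
London,2008,3100
"""


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def to_json(rows):
    """Serialise the rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)


def to_xml(rows, root_tag="dataset", row_tag="record"):
    """Serialise the rows as XML, one element per row and per column.

    Assumes every column name is a valid XML tag name (no spaces etc.).
    """
    root = ET.Element(root_tag)
    for row in rows:
        record = ET.SubElement(root, row_tag)
        for column, value in row.items():
            ET.SubElement(record, column).text = value
    return ET.tostring(root, encoding="unicode")


rows = read_rows(SAMPLE_CSV)
json_text = to_json(rows)
xml_text = to_xml(rows)
```

Real spreadsheets tend to be messier than this (merged cells, footnotes, units buried in headers), which is probably where most of those two or three days go.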
What’s mind-blowing is that no one has any idea how much data there is. There might be tens or hundreds of thousands of sets, or there might not be that much - after all, Andrew Stott pointed out that some obvious stuff doesn’t even exist, such as the locations of every post box. On the other hand there is potentially loads of data made selectively available to specific groups in secure systems, all of which needs to be opened more widely, under Crown Copyright.
It’s not exactly clear in what order the data will be made public on the site, or how its release will be prioritised. If developers request specific data sets their requests will be passed on to the relevant departments, but there’s no indication yet of how this will be managed. If a bunch of developers are clamouring for something specific will this be prioritised, or is it first come first served? Or will it just be the easiest stuff that gets published first? Or will data be released in alignment with the policy agenda du jour? I think some clarity over this would be helpful, and I’m in favour of a UserVoice type approach, in which developers can vote data sets up or down the priority list.
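The voting idea is simple enough to sketch. The snippet below is only an illustration of how up and down votes could be turned into a priority list; the `prioritise` function and the data set names are hypothetical, and nothing like this has been announced for the site.

```python
from collections import Counter


def prioritise(votes):
    """Rank requested data sets by net votes, highest first.

    `votes` is an iterable of (dataset_name, vote) pairs, where vote is
    +1 (vote up) or -1 (vote down). Ties are broken alphabetically.
    """
    scores = Counter()
    for name, vote in votes:
        if vote not in (1, -1):
            raise ValueError("vote must be +1 or -1")
        scores[name] += vote
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical requests from developers.
votes = [
    ("post box locations", 1),
    ("bus timetables", 1),
    ("post box locations", 1),
    ("school performance", -1),
    ("bus timetables", -1),
]
priority = prioritise(votes)
```

With these made-up votes the post box locations come out on top with a net score of 2. A real system would also need some way of weighing demand against how hard each data set is to release.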
The impetus for all this seems to have come from Tim Berners-Lee (though any one of us could have made the suggestion if only Gordon had asked us) and from the government’s perspective there are four main reasons for doing it:
- Making the government transparent and accountable
- ‘Citizen empowerment’ (a concept I hate)
- The economic and social value of free data
- The UK leading the world in the next phase of web development - the ‘web of data’
Tim Berners-Lee’s remit is to help the government open up its data and then make recommendations as to how the rest of the public sector, including local government, can do the same. One of the difficulties with opening up local authority data is that all the councils would have to agree on some data standards, and getting that agreement might be hard.
There was a lot of talk about ‘culture’ (i.e. behaviour) and the difficulties of moving from a convention of keeping data secret to the exact opposite as the default position. The reasons departments have given for keeping data under wraps have been many and varied - from worries about job security (if a civil servant has time to convert data formats then they obviously aren’t busy enough) to worries about what the data will reveal, through to worries that data will be taken out of context if it can be used by people who aren’t statisticians.
These are all the same arguments we’ve heard in relation to adopting the social web, and they are symptomatic of the way hierarchical bureaucracies work. Fortunately the team at the Cabinet Office is training civil servants in the conversion of the data, is working with the National Archives and the COI, and is bringing in expertise from outside to spread the skills and behaviour needed to make open the default, which sounds promising.
What’s needed next, to push the behaviour change further, is to show that developers are starting to use the data to do cool stuff that meets a need. Examples of the great applications that can be built using the available data will be crucial for showing civil servants that releasing data is a Good Thing. Departments are also being encouraged to showcase any good applications on their department websites and they will host the data that appears on data.gov.uk themselves so that they retain ultimate control of it.
It’s very early days and I was impressed with the soft-launch approach of the whole project. It seems to be starting small, concentrating on the easier wins with a view to building up some evidence and examples of successes to help the project gather pace over time. Involving the developer community is smart and as long as views are genuinely taken on board it will create a firm foundation.
There are plenty of potential sticky bits up ahead - not least of which is scalability. Then there are the processes around keeping data reliable, plus the prioritisation of data release, and making sure departments know the implications of releasing data. I’m also a bit worried about what will happen when it goes wrong. Inevitably someone will do something clever with data that MPs don’t like (maybe some sort of real-time mapping of their locations, or some uncovering of conflicts of interest) and the true test will be whether the whole thing gets shut down in a predictably knee-jerk reaction. It would be awful if that happened because this has the makings of something really cool.
A final note on the data, which I just couldn’t leave unsaid: while it’s great that government data is being released, I want more. I want MY data - the stuff the government has on me. I want everything - tax payments, benefit receipts, library books borrowed, passport checks, and more besides, all in a useable format, solely available to me to use as I wish. I can see we’re miles away from that kind of thing at the moment, but I would love to see it built in as an aspiration for data.gov.uk.