I’m very jealous of London’s transport data. Specifically, I’d like to have the following datasets for Birmingham, Leeds, Manchester, Sheffield, Newcastle, Bristol and Liverpool.
There’s lots of even cooler data that you can get from TfL, like the 5% cut of all Oyster card journeys dataset, but I’m a reasonable person. I’ll settle for the basics for now.
Me and lots of other people in cities across England would like to have this data so that we can do things like,
I don’t have any of the above data for any of the cities that I’ve listed. The data doesn’t even exist. We have some estimates in some places, the result of small samples and surveys. But I’m not aware of anything comparable to the data that exists for London. Nottingham may be the exception, but it’s a small city.
This isn’t because England’s cities don’t care or are incompetent. They do care and they have good people. But UK national law makes it almost impossible to collect and release this data outside of London, which enjoys a special exemption from the Transport Act 1985.
For the past two years, at the same time as I’ve been moaning online, I’ve been trying to do something about this. I can’t tell you who I’ve been working with, or what our difficulties have been — that would put the project at risk. But I can tell you that we’re getting quite close to releasing a lot of interesting data.
Please don’t get too excited: the data won’t be anywhere near as good as what’s available for London. For example, we can only share details about bus journeys, and only bus journeys that are free, and we can’t tell you exactly when or where a person gets on a bus, or what direction the bus is going. Oh, and the bus operator names and the bus route names might be different to what you’re used to.
But, barring a sudden legal challenge, there will be data and it will let you ask interesting questions. Questions like,
You should be able to see how the answers to those questions have changed over the past three years too.
The problem we’ve faced is that in a deregulated bus market the majority of journey data is commercially sensitive. That means that it can’t be shared without a huge legal battle. So we’ve had to jump through lots of hoops to make the data worse so that it can be released. And the slice that we’re releasing isn’t representative of all bus use, so we can’t use it for much.
But what we’ve done proves that data of a similar quality to what’s available for London might be possible to release in the future, especially if our cities are able to regulate their public transport. And it proves that we have the processes and the skills to release such data in the North of England.