When, where, and which buses do people use?
Knowing when, where, and which buses people use is important. This is especially true in the UK where, somewhat unusually for a rich country, most journeys on public transport in our big cities are by bus.
Without data on bus use we cannot understand how our cities work. And without an understanding of public transport in our hubs of sustainable economic activity, we cannot support and build for sustainable economic growth.
At ODILeeds I've been working on bus data for a long time. I write open source software that converts Great Britain's bus timetables from the UK standard TransXChange format into the de facto global standard GTFS. Together with the equivalent converter for GB rail we build a full journey planner and journey analysis system for Great Britain. This lets us model the impact of investments such as Northern Powerhouse Rail, new bus lanes, a tram system, a congestion charge, or anything else.
This fits in well with my work tracking every bus in Birmingham to see where and when delays occur, so that we can understand the effective size of the city by public transport at peak times.
The passengers are what matters.
But there is a key piece of data that we're missing. How many people use which buses when?
In all UK cities except London we don't know. The UK government releases a spreadsheet of bus use by year and by local authority or integrated travel area, but there is not enough detail here to understand who is using buses, when, and where.
We've worked hard in Leeds to prove that bus use data released under an open license is easy to analyse and make valuable. Our free bus pass use analysis tool makes it easy to see when people use free bus passes in West Yorkshire, and on which bus services. With additional analysis we can see approximately when and where these boardings took place too.
Beyond free bus journeys.
But this data only describes journeys made using free bus passes, a very unrepresentative slice of all bus journeys. We can share this data because free bus journeys are paid for by local government, and as a result local government can share the details of those journeys. But all other journeys are made by private individuals on buses run by private companies, and those companies have no incentive to share their detailed passenger data. So they don't.
But what if we could see all bus journeys?
In London we can. This is because Transport for London (uniquely within the UK) regulates all bus use in the city and is free to share this data. In October 2015 TfL took a snapshot of all Oyster card use on their network, anonymised it, and released it as open data. We combine this with the UK's national database of public transport access nodes (NaPTAN) and a lookup table provided by TfL to link the bus stop codes used in London with the bus stop codes used nationally. Loading that data into Power BI lets us graph and map it.
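The linking step can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual pipeline: the field names (`tfl_stop_code`, `route`, `hour`), the stop codes, and the coordinates are all made up, and the real TfL and NaPTAN files use their own column names.

```python
# Hypothetical sample rows standing in for the anonymised Oyster boardings.
boardings = [
    {"tfl_stop_code": "1001", "route": "25", "hour": 8},
    {"tfl_stop_code": "1001", "route": "25", "hour": 9},
    {"tfl_stop_code": "1002", "route": "73", "hour": 8},
    {"tfl_stop_code": "9999", "route": "N8", "hour": 2},  # not in the lookup
]

# Hypothetical lookup table: London stop code -> national stop code.
tfl_to_national = {"1001": "NAT-A", "1002": "NAT-B"}

# Hypothetical NaPTAN extract: national stop code -> name and location.
naptan_stops = {
    "NAT-A": {"name": "Stop A", "lat": 51.50, "lon": -0.10},
    "NAT-B": {"name": "Stop B", "lat": 51.52, "lon": -0.08},
}


def locate_boardings(boardings, tfl_to_national, naptan_stops):
    """Attach a national stop code and location to each boarding.

    Returns (located, unmatched) so that rows with no lookup entry
    are kept visible rather than silently dropped.
    """
    located, unmatched = [], []
    for row in boardings:
        national_code = tfl_to_national.get(row["tfl_stop_code"])
        stop = naptan_stops.get(national_code) if national_code else None
        if stop is None:
            unmatched.append(row)
            continue
        located.append({**row, "national_code": national_code, **stop})
    return located, unmatched


located, unmatched = locate_boardings(boardings, tfl_to_national, naptan_stops)
print(len(located), "boardings located;", len(unmatched), "unmatched")
```

Keeping the unmatched rows separate makes it easy to check how much of the data the lookup table actually covers before mapping anything.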
For a two-week period we can see a representative sample of when, where, and which buses people boarded.
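As a rough sketch of the kind of counting behind those graphs, here is how boardings could be tallied by route and hour in plain Python. The records below are invented for illustration; the real dataset has its own layout and far more rows.

```python
from collections import Counter

# Hypothetical boarding records: (route, day of week, hour of day).
boardings = [
    ("25", "Mon", 8),
    ("25", "Mon", 8),
    ("25", "Mon", 17),
    ("73", "Mon", 8),
    ("73", "Tue", 8),
]

# Which buses, and when: boardings per (route, hour).
by_route_hour = Counter((route, hour) for route, _day, hour in boardings)

# When: boardings per hour across all routes.
by_hour = Counter(hour for _route, _day, hour in boardings)

busiest_hour, busiest_count = by_hour.most_common(1)[0]
print(f"Busiest hour: {busiest_hour}:00 with {busiest_count} boardings")
```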
You can download the raw data (which is licensed by TfL and the UK government under the Open Government Licence) along with a Power BI interface for exploring it in a single zip file here.
There is much more that we can do with this data on where people get on buses. We are already putting the locations into open tools such as Open Audience to analyse the demographics of bus users. The great thing about open data, released for anyone to use for any purpose without requiring permission, is that so many more valuable things can be done than I can even begin to think of.