Vancouver City Hall's Open Data Experiment
The transparency push is a year old and working, but you can't call it wide open, yet.
When Arthur Dent, of The Hitch-hiker's Guide to the Galaxy, searched for the notice that his house was to be demolished to make way for a new bypass, he eventually found it on the bottom of a locked filing cabinet in a disused lavatory with a sign on the door saying, 'Beware of the Leopard.'
Governments, large and small, generate vast amounts of information. Finding the relevant piece of information in a timely fashion is a challenge, and that's assuming that the government has decided to share it with the public at all.
Over the past year, Vancouver's city government has launched a program to make large amounts of information available to the public. These data sets, posted online at data.vancouver.ca, include garbage pickup schedules, drinking fountains and motorcycle parking, in a wide variety of formats.
Now, Mayor Robertson has gone even further, making a motion in city council that the city manager release the mayor's and council's budget and expenses, beyond the relatively scant information already available in financial disclosures.
After a year of openness, has the city lived up to its promise?
Creating new 'data maps' for the city
In May 2009, Coun. Andrea Reimer made a motion in city council to share as much data as possible with outside parties, to adopt open data formats and to put open source software on an equal footing with commercial applications. Reimer says, "I have a kind of framework in my head around engagement and how that all works, and making sure that we're all working off of... a level playing field of information is really the first place to start." This motion passed and later that year the city launched data.vancouver.ca.
One of the biggest challenges in putting this into practice is changing the flow of information within government so that, instead of a single static data set on the web, the public data set is updated as the information changes. Reimer says, "In the case of the public drinking fountains, for example, if a new one got added or one got subtracted, somebody [in government] would know that but the public wouldn't necessarily, because there was no way to get that information from the person from the system to the site without someone remembering to go update it. In technical language, there seems to be a lack of a data map in this city."
The current version of the data site, which has been operating since January, is better integrated into the flow of information. "Now, we've not only mapped the data but every new data set that goes up there has a structure behind it, so that we know who the person is who's doing public drinking fountains. They have a system so that every time a new one comes in or an old one goes out, and they update that information, it immediately goes live on the site as well."
The idea is that people in the private sector, both companies and private citizens, can use the public data sets on the site to create new services. This has already sparked a number of ideas and experiments in making use of the open data.
David Eaves, a policy professional and executive member of Vision Vancouver, sees open data as a way of getting people outside of government involved in solving public policy problems, by making it much easier to get the information. "Often there's a lot of so-called amateurs out there who know a tremendous amount, or professionals who do things in fields that are parallel to what you are working on. When you give them information, they often can come with very interesting analysis or a different perspective that can dramatically enhance a debate that's going on in a human issue."
For example, Bing Thom Architects used open data to study how rising sea levels would affect Vancouver's shorelines. "This isn't saying that this analysis would have been impossible before, but it would have required them to go call the city and find out who the right person is and ask if they could get the data sets, which maybe they would have gotten, but maybe they wouldn't have. Whereas now, they're working on this report and they can just download the data sets and they can create a report that helps advance this debate both for councillors and for ordinary citizens. We've dramatically lowered the transaction cost to getting a piece of work like this done."
VanTrash.ca: digitizing garbage
One practical application is VanTrash.ca, a free website that provides reminders of garbage pickups by email, based on open data from the city's website. Currently, there are about 1,600 subscribers.
VanTrash was based on an idea by Eaves posted on his blog, and picked up by two software developers, Luke Closs and Kevin Jones. Closs and Jones put together a working prototype in about 15 hours of spare time, and had the completed service operating after another 50, over two or three months of evenings and weekends. "Really not that much work," says Closs.
"The thing that the city did is they released a bunch of data and they enabled citizens to innovate. When you're trying to start a project in a large enterprise, there's so much inertia. Either you don't have time to do things or it's not quite in the clearly defined mission of the company or organization. Those small, easy, cheap, innovative things, it's hard for them to happen. It's classically called the innovator's dilemma. Whereas in a really small organization -- Kevin and I didn't have an organization other than our friendship, so we just said, 'Let's do this.'"
At the moment, VanTrash is a free service, with a request for donations. Closs estimates the site has received about $40 in donations, and he pays the roughly $300 per year in web hosting out of his own pocket. He views VanTrash as an experiment, not a business. "If this server goes down, and I'm out on a family vacation, no one's paying me to keep this up. Is there going to be 200 super angry city residents because the server went down and their email reminders didn't go out and they all forgot their garbage? It's in this weird grey area where it's a service but there's no guarantee about the level of service."
He's also considered making an iPhone or Android app so that the service would work on mobile devices, but decided against it "because we're not getting paid to do any of this, and there's lots of other exciting projects to work on, we're not really in any rush to build iPhone apps and struggle with that process."
This makes VanTrash neither a public service operated by the government, nor a commercial enterprise run by a company, but a private citizen's favour to the public.
Another issue in open municipal data is how the data is licensed and formatted. A financial statement can be released in PDF format, but the numbers would need to be manually entered into another spreadsheet or database to be analysed, which does increase the transaction costs of using municipal data.
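To illustrate the difference in transaction cost, here is a minimal Python sketch of what a machine-readable release allows. It assumes a hypothetical expense file published as comma-delimited text; the column names and figures are invented for illustration and do not come from any city data set.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data standing in for a downloaded CSV file.
# The columns (office, category, amount) are assumptions for this sketch.
SAMPLE_CSV = """office,category,amount
Mayor,Travel,1200.50
Mayor,Communications,300.00
Council,Travel,450.25
"""


def total_by_office(csv_text):
    """Sum the 'amount' column for each distinct value of 'office'."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["office"]] += float(row["amount"])
    return dict(totals)


if __name__ == "__main__":
    print(total_by_office(SAMPLE_CSV))
```

The same figures locked in a PDF would have to be retyped or scraped before even this simple total could be computed.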
'Can't watch city council on my own computer': councillor
The choice of data formats also influences what can be done with other media. Videos of city council sessions are posted online in streaming Flash video format, instead of as downloadable files that can be posted to video sharing sites like YouTube or edited into other video presentations. Reimer, a Firefox user, says that she can't watch the video feeds of city council meetings without switching to Internet Explorer. "It drives me totally nuts. You can't capture it. You can't put a debate up on YouTube. I can't watch the video on my own computer at work. I finally figured out that we don't have the Firefox codec and I have a hard time using Explorer." This was one of the examples she used in her initial motion.
Formats, however, are only part of the issue of transparency. The other part of the story is the terms under which the data is released.
A blog posting by open source advocate Richard Weait criticizes the data licensing used by Vancouver, Edmonton and Toronto. His criticisms include a lack of version numbers, differences between the licenses that make it legally impossible to combine data from different cities or with data sets released under other licenses, and terms that expose users to liability if the city is sued. These licenses, Weait argues, make cities' data legally unusable for projects like OpenStreetMap.org, an open alternative to Google and Bing maps.
Weait says, via email, that while he supports the open data projects of Vancouver, Edmonton and Toronto, "Municipalities shouldn't be in the Open Data License business." He argues that the data should be released under the Public Domain Dedication and License, created by legal experts in database law, which places few or no restrictions on any use by anyone. "Make municipal license problems go away and get that data into the hands of as many potential users as possible."
"There may be another Open Data community with the Next Big Thing in data visualization, ready to make a splash with the Toronto data. Don't lock them out with a homegrown license with unintended restrictions."
What does 'open' mean?
City hall's program doesn't necessarily adhere to a strict legal definition of open data, preferring to discuss it in terms of utility to the public. Reimer says, "There is quite a debate about whether one means 'open' as in 'non-proprietary,' or 'open' as in 'commonly used and available.' A perfect example is Microsoft in the case of spreadsheets. The most commonly used standard really is Excel, so one could argue that that's an open standard. However, it is proprietary. Then you need to make it available in other formats too. What we've done is, in most of these programs you press a button and you can convert from all sorts of standards and format. If you go onto the site you'll see that it'll be available in Microsoft Excel, and it'll also be available in comma-delimited, and it might also be available in PDF. The concept was, as accessible as possible by the most number of people as possible."
Closs, as a developer, takes a pragmatic view of open data, preferring the proprietary KML format owned by Google over the open ESRI shapefile format for geographical data, because it is easier to work with.
"There's open in terms of legality but there's also open in terms of, 'Can I just look at it and inspect it? Is it a black box where I can't look inside of it, or I don't know what any of it means?' Shapefiles are binary blobs and unless you have extensive tools you can't really use them... Whereas with KML format, legally that format is controlled by Google, I believe, but I can open it up. It's just XML. I can read it with my eyes. I can copy and paste that into other things."
He adds that "'Open' is kind of a term like 'green,' where it doesn't have a solid legal definition. People use it to mean what they want it to mean."
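Closs's point that KML is "just XML" can be shown with a short Python sketch that uses only the standard library. The sample document and the fountain names are hypothetical, not taken from the city's data; the namespace is the one used by KML 2.2, and KML lists coordinates as longitude, latitude, altitude.

```python
import xml.etree.ElementTree as ET

# Namespace used by KML 2.2 documents.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Hypothetical sample file; the names and coordinates are invented.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Fountain A</name>
      <Point><coordinates>-123.1139,49.2609,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>Fountain B</name>
      <Point><coordinates>-123.1200,49.2800,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>
"""


def placemarks(kml_text):
    """Return a list of (name, latitude, longitude) for each point Placemark."""
    root = ET.fromstring(kml_text)
    result = []
    for pm in root.iterfind(".//kml:Placemark", KML_NS):
        name = pm.findtext("kml:name", default="", namespaces=KML_NS)
        coords = pm.findtext(
            "kml:Point/kml:coordinates", default="", namespaces=KML_NS
        )
        # KML stores coordinates as longitude,latitude[,altitude].
        lon, lat = (float(v) for v in coords.strip().split(",")[:2])
        result.append((name, lat, lon))
    return result


if __name__ == "__main__":
    for name, lat, lon in placemarks(SAMPLE_KML):
        print(name, lat, lon)
```

A shapefile, by contrast, is a set of binary files and generally needs a dedicated library or GIS tool to read.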
The explosion of digital information available to the public shows that there is a significant difference between having access to something and actually controlling it. Witness CBC's use of iCopyright to require people to submit to licenses and even to pay for the right to repost some or all of CBC's articles on blogs and other sites. Government data is created with taxpayer money and is supposedly in the public domain, and yet the wrong license may keep it from being used by the general public who funded its creation.