This February, Statistics Canada will make some of its vast stores of information, including census data and CANSIM, available to the public -- not just free of charge, but largely free to do with as the public sees fit.
Embassy magazine reported on Nov. 24 that the agency will no longer charge for standard online data products.
While tight-lipped about the future, StatsCan did release a statement from Gabrielle Beaudoin, director of communications:
"On February 1, 2012, self-serve standard products available on the Statistics Canada website -- including CANSIM and census data products -- will become free of charge. This will make Statistics Canada data more accessible to Canadians, organizations and businesses... Licensing restrictions for the use of Statistics Canada data products will be removed."
Media representative Peter Frayne later clarified that the data will be released under an open-license agreement. The information has few restrictions for users, and gives them intellectual rights to any value-added products they create based on StatsCan data. It bears a strong resemblance to the federal government's open-data license agreement used in the federal open-data portal pilot project, though the latter includes version information to indicate revisions.
Why you should care
Putting Statistics Canada's information under a wide-open license is a great step forward for open-data advocates. Some open-data projects, such as the City of Vancouver's, have faltered when their data was released at no cost but under restrictive licensing that severely limited what users could do with it.
"Things are getting better everywhere on the licensing front," says David Eaves, an open-data activist and executive member of Vision Vancouver. "When the federal government launched its open-data portal about a year ago, they had a license that was quite restrictive. It had all sorts of weird clauses in it. They've slowly been chipping away at it, and now they have a license that actually is quite reasonable and measures up well against other licenses that other governments use with their data. This data will all get released under that license, so it's pretty much on par with what you'd find in the U.K., which has a good license."
He adds that other useful datasets are still prohibitively expensive to small groups or individuals, like the $10,000 database of postal code information owned by Canada Post.
Making portions of Statistics Canada's data free, as in beer and as in speech, will affect the public-sector, private-sector and non-governmental organizations that use it in different ways.
Michael Buda, director of policy and research for the Federation of Canadian Municipalities (FCM), says the move is welcome among local government officials. "They see this as part of a positive trend where StatsCan is increasingly seeing their role as facilitating access and use of data by Canadians and by other governments and agencies to make better planning and policy decisions. Municipalities see it as a positive trend. But in the near term, it is unlikely it will have a huge impact on the costs."
The municipalities' reservations concern the details of the data being made free. They need data at the level of cities or individual neighbourhoods.
"Right now, in some cases, although the data technically exists, it's not really structured in a way to make it easy for StatsCan or for anyone to look at it at a municipal level," says Buda. Some StatsCan data is organized by census metropolitan areas, or CMAs, which are "an economic construct essentially. They don't actually correspond to government boundaries," he adds. Municipalities may need to continue ordering custom data from Statistics Canada.
Buda is also concerned about the money. By his estimate, the data Statistics Canada intends to make free brings in $15 million annually, and it isn't clear how the agency will make up that lost revenue, whether from federal government funding or from charging for other services. "Will they [Statistics Canada] be reducing the data analysis expenditures? Will other data quality be reduced or will they increase the cost of buying other data sets that you have to pay for now anyway?"
Nonetheless, Buda sees this as a positive step towards greater openness in government data. He says the FCM and a group of municipalities are working with the data agency and a number of other federal agencies to "start providing that kind of municipal and even neighbourhood-level data available at a lower cost and much more freely available."
More relevant data, delivered faster
Private sector organizations also use government information.
One example is the Nova Scotia-based company Viewpoint, which uses government information (though nothing from Statistics Canada) to provide interactive maps and information for real estate purposes.
"When people looked at a neighbourhood they were interested in buying a house in, the primary consideration usually wouldn't be demographics," says Viewpoint's CEO Bill McMullin. "But would we like to tell them what the demographics are of a neighbourhood? Sure we would. If the data was readily available in a machine readable format from Statistics Canada, we would certainly download it and use it and make it available to the public."
"There's no excuse for institutions like Statistics Canada not releasing the data, because their purpose is not to collect data for sale. Their purpose is to collect data to allow the government, the public, everybody, to better predict a sense of the future. To understand what our demands are going to be on infrastructure, by looking up population growth, et cetera. Their job is not to compile data to sell it for profit. That's the job of the private sector. It's a very dangerous line when a government institution comes close to acting like a business."
Perhaps the greatest impact of Statistics Canada's open-data policy is in the non-profit sector, where money for research can be scarce. According to Al Hatton, CEO of United Way Canada, his organization often has to borrow important research data from other organizations, instead of directly purchasing it from Statistics Canada.
"We don't have resources set aside to purchase that sort of raw material. We would get it more through other entities, think tanks and things like that, after they produce things, and we take their stats, many of which came from StatsCan, and then we would use them that way. That's not very efficient, but that's all we could afford. And if we couldn't -- probably one of the largest charities in the country and certainly one of the best endowed, and one of most independent -- if this is a struggle for us, you can only imagine what it is like for one of the other 83,000 charities in this country," says Hatton.
Getting research data faster is also important. "What we used to do was get Stats Canada data, usually vicariously through other sources, three or four years after it was available to those that were paying for it. That's not very good in terms of planning, and it takes long enough to analyze it anyway. If you have it a few months after it is collected, then actually it's much more relevant."
Research data on employment, housing, poverty and other indicators is important for the United Way, both for planning services and for demonstrating to donors the work the organization does.
As Hatton puts it, "Donors want to know, what difference did you actually make? If we can show that in a neighbourhood, we have more services, we have more capacity, we have less poor people, we have more response mechanisms and more services that are actually at the disposition of the population, then we can say to donors, 'This investment you made by giving a donation to us has actually produced these results.' It's not good enough to just say, 'We helped this many people,' or 'We have these many programs,' or 'We raised this much money.' That used to work."
Statistics Canada's new policy means that, apart from the commercial opportunities, everyone will have new information resources.
"I really hope that a lot more people will use StatsCan data," Eaves says. "The great thing that will happen, at least I think, is that more people will develop tools to use this data. My hope is that we'll have more non-profits using it, and more students, more everyday people who are able to use it."