Like many people, I am in favour of better data on the higher education sector. But while this call unites a lot of people, there is remarkably little thinking that goes into the question of how to achieve it. This is a problem, because unless we arrive at a better common understanding of both the cost and the utility of different kinds of data, we are going to remain stuck in our current position.
First, we need to ask ourselves what data we need versus what data it would merely be nice to have. This is, of course, not a value-free debate: people can have legitimate differences about what data is needed and what is not. But I think a simple way to at least address this problem is to ask of any proposed data collection: i) what questions does this data answer? And ii) what would we do differently if we knew the answers to those questions? If the answer to either question is vague, maybe we should put less emphasis on the data.
In fact, I’d argue that most of the data that institutions and government are keen to push out is pretty low on the “what would we do differently” scale. Enrolments (broken down in various ways), funding, etc. – those are all inputs. We have all that data, and it’s important, but it doesn’t tell you much about what’s going right or going wrong in the system. It tells you what kind of car you’re driving, but not your speed or direction.
What’s the data we need? Outputs. Completion rates. Transition rates. At the program, institutional and system level. Also outcomes: what happens to graduates? How quickly do they transition to permanent careers? And do they feel their educational career was a help or a hindrance to getting the career they wanted? And yeah, by institution and (within reason) by program. We have some of this data, in some parts of the country (BC is by far the best at this), but even here we rely far too heavily on some fairly clumsy quantitative indicators and not enough on qualitative information like: “what do graduates three years out think the most/least beneficial part of their program was?”
Same thing on research. We need better data on PhD outcomes. We need a better sense of the pros and cons of more/smaller grants versus fewer/larger ones. We need a better sense of how knowledge is actually transferred from institutions to firms, and what firms do with that knowledge in terms of turning it into product or process innovations. Arguably, we also need better data on community impact (though there I’m not completely convinced we even know yet what the right questions are).
Very few of these questions can be answered through big national statistical datasets about higher education. Even when it comes to questions like access to education, it’s probably far more important that we have more data on people who do not go to PSE than that we have better or speedier access to enrolment data. And yet we have nothing on this, and haven’t had since the Youth in Transition Survey ended. A national survey of that kind is expensive, though, and could cost upwards of ten million dollars. Smaller scale, local studies could be done for a fraction of the cost – if someone were willing to fund them.
There are actually enormous data resources available at the provincial and institutional level to work from. Want to find individuals who finished high school but didn’t attend PSE? Most provinces now have individual student numbers which could be used to identify these individuals and bring them into a study. Want to look at program completion rates? All institutions have the necessary data: they just choose not to release it.
All of which is to say: we could have a data revolution in this country. But it’s not going to come primarily from better national data sets run by Statistics Canada. It’s going to come from lots of little studies creating a more data-rich environment for decision-making. It’s going to come from governments and institutions changing their data mindsets from one where hoarding data is the default to one where publishing is the default. It’s going to come from switching focus from inputs to outputs.
Even more succinctly: we are all part of the problem. We are all part of the solution. Stop waiting for someone else to come along and fix it.
I would add that our approach to data publication, not just collection, is in dire need of improvement.
I imagine you chose not to mention that because ‘first things first,’ but I think broadening the discussion about the value of data to new audiences helps build the case to do better on the collection side. Better data are essential for better policy-making (and maybe that should be reason enough to improve things), but good data are also of enormous interest to prospective students and their families; I’m certain your hypothetical “what do graduates three years out think the most/least beneficial part of their program was” data would be extraordinarily helpful for someone trying to plan their educational and career pathway.
Much of the data we collect aren’t made public, or are effectively available only to those with the wherewithal and time to dig for them, parse out all the various bits and pieces from different sources, then put it all back together in a format that can give a cohesive, comparative picture. Centralized, standardized, and public reporting of the data we collect is something we desperately need. This is the conversation we’ve been having with universities and decision-makers this year.
tl;dr information is useful to everyone if it’s made accessible, and that’s all the more reason we should do better at collecting it.