April 05

Student/Graduate Survey Data

This is my last thought on data for a while, I promise. But I want to talk a little bit today about the increasing misuse of student and graduate surveys.

About fifteen years ago, the relevant technology for email surveys became sufficiently cheap and ubiquitous that everyone started using them. I mean, everyone. So what has happened over the last decade and a half has been a proliferation of surveys and with it – surprise, surprise – a steady decline in survey response rates. We know that these low-participation surveys (nearly all are below 50%, and most are below 35%) are reliable, in the sense that they give us similar results year after year. But we have no idea whether they are accurate, because we have no way of dealing with response bias.
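To make the reliability-versus-accuracy point concrete, here’s a back-of-the-envelope sketch in Python. Every number in it is invented: suppose the true satisfaction rate is 70%, but satisfied students answer at twice the rate of dissatisfied ones, and that differential holds every year.

    # Invented numbers: how a survey can be "reliable" (same result every
    # year) yet inaccurate, because differential response rates bias the
    # estimate the same way every time.
    true_satisfied = 0.70     # hypothetical true share of satisfied students
    resp_satisfied = 0.40     # response rate among the satisfied (assumed)
    resp_dissatisfied = 0.20  # response rate among the dissatisfied (assumed)

    n_sat = true_satisfied * resp_satisfied           # 0.28 of student body
    n_dis = (1 - true_satisfied) * resp_dissatisfied  # 0.06 of student body

    print(f"Overall response rate: {n_sat + n_dis:.0%}")            # 34%
    print(f"Observed satisfaction: {n_sat / (n_sat + n_dis):.0%}")  # 82%
    print(f"True satisfaction:     {true_satisfied:.0%}")           # 70%

The survey comes back around 82% every single year – beautifully “reliable” – while the truth sits at 70% the whole time, and nothing in the data itself tells you so.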

Now, every once in a while you get someone with the cockamamie idea that the way to deal with low response rates is to expand the sample. Remember how we all laughed at Tony Clement when he claimed the (voluntary) National Household Survey would be better than the (mandatory) Long-Form Census because the sample size would be larger? Fun times. But this is effectively what governments do when they decide – as the Ontario government did in the case of its sexual assault survey – to carry out what amounts to a (voluntary) student census.

So we have a problem: even as we want to make policy on a more data-informed basis, the quality of student data is declining (this also goes for graduate surveys, but I’ll come back to those in a second). Fortunately, there is an answer to this problem: interview fewer students, but pay them.

What every institution should do – and frankly what every government should do as well – is create a balanced, stratified panel of about 1000 students.   And it should pay them maybe $10/survey to complete surveys throughout the year.  That way, you’d have good response rates from a panel that actually represented the student body well, as opposed to the crapshoot which currently reigns.  Want accurate data on student satisfaction, library/IT usage, incidence of sexual assault/harassment?  This is the way to do it.  And you’d also be doing the rest of your student body a favour by not spamming them with questionnaires they don’t want.

(Costly?  Yes.  Good data ain’t free.  Institutions that care about good data will suck it up).
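For the statistically curious, here’s a minimal sketch of what the sampling step might look like. The strata (faculty by year of study), headcounts, and registry format are entirely hypothetical; a real design would also worry about oversampling small groups, attrition, and replacement.

    import random

    PANEL_SIZE = 1000

    def build_panel(registry, panel_size=PANEL_SIZE, seed=1):
        """registry maps a stratum label to a list of student IDs."""
        rng = random.Random(seed)
        total = sum(len(ids) for ids in registry.values())
        panel = {}
        for stratum, ids in registry.items():
            # Allocate seats in proportion to the stratum's share of the
            # student body (rounding can drift the total a seat or two).
            seats = round(panel_size * len(ids) / total)
            panel[stratum] = rng.sample(ids, min(seats, len(ids)))
        return panel

    # Entirely hypothetical strata and headcounts:
    registry = {
        ("Arts", 1): [f"A1-{i}" for i in range(4000)],
        ("Arts", 2): [f"A2-{i}" for i in range(3500)],
        ("Science", 1): [f"S1-{i}" for i in range(3000)],
        ("Science", 2): [f"S2-{i}" for i in range(2500)],
    }

    for stratum, members in build_panel(registry).items():
        print(stratum, len(members))

Because seats are allocated proportionally before sampling, every stratum is represented in line with its actual share of enrolment – which is precisely what the current spray-and-pray approach cannot guarantee.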

It’s a slightly different story for graduate surveys. Here, you also have a problem of response rates, but with the caveat that at least as far as employment and income data are concerned, we aren’t going to have that problem for much longer. You may be aware of Ross Finnie’s work linking student data to tax data to work out long-term income paths. An increasing number of institutions are now doing this, as indeed is Statistics Canada for future versions of its National Graduate Survey (I give Statscan hell, deservedly, but for this they deserve kudos).

So now that we’re going to have excellent, up-to-date data on employment and income, we can re-orient our whole approach to graduate surveys. We can move away from attempted censuses with a couple of not-totally-convincing questions about employment and re-shape them into what they should be: much more qualitative explorations of graduate pathways. Give me a stratified sample of 2,000 graduates explaining in detail how they went from being a student to having a career (or not) three years later, rather than 50,000 graduates answering a closed-ended question about whether their job is “related” to their education, every day of the week. The latter is a boring box-checking exercise; the former offers the potential for real understanding and improvement.

(And yeah, again: pay your survey respondents for their time. The U.S. Department of Education does this on its surveys, and it gets great data.)

Bottom line: We need to get serious about ending the Tony Clement-icization of student/graduate data. That means getting serious about constructing better samples, incentivizing participation, and asking better questions (particularly of graduates).  And there’s no time like the present. If anyone wants to get serious about this discussion, let me know: I’d be overjoyed to help.

April 04

How to Think about “Better Higher Education Data”

Like many people, I am in favour of better data on the higher education sector.  But while this call unites a lot of people, there is remarkably little thinking that goes into the question of how to achieve it.  This is a problem, because unless we arrive at a better common understanding of both the cost and the utility of different kinds of data, we are going to remain stuck in our current position.

First, we need to ask ourselves what data we need to know versus what kinds of data it would be nice to know. This is, of course, not a value-free debate: people can have legitimate differences about what data is needed and what is not. But I think a simple way to at least address this problem is to ask of any proposed data collection: i) what questions does this data answer? And ii) what would we do differently if we knew the answer to that question? If the answer to either question is vague, maybe we should put less emphasis on the data.

In fact, I’d argue that most of the data that institutions and governments are keen to push out are pretty low on the “what would we do differently” scale. Enrolments (broken down in various ways), funding, etc. – those are all inputs. We have all that data, and they’re important: but they don’t tell you much about what’s going right or going wrong in the system. They tell you what kind of car you’re driving, but not your speed or direction.

What’s the data we need? Outputs. Completion rates. Transition rates. At the program, institutional and system level. Also outcomes: what happens to graduates? How quickly do they transition to permanent careers? And do they feel their educational career was a help or a hindrance to getting the career they wanted? And yeah, by institution and (within reason) by program. We have some of this data, in some parts of the country (BC is by far the best at this), but even there we rely far too heavily on some fairly clumsy quantitative indicators and not enough on qualitative information like: “what do graduates three years out think the most/least beneficial part of their program was?”

Same thing on research. We need better data on PhD outcomes. We need a better sense of the pros and cons of more/smaller grants versus fewer/larger ones. We need a better sense of how knowledge is actually transferred from institutions to firms, and what firms do with that knowledge in terms of turning it into product or process innovations. Or, arguably, on community impact (though there I’m not completely convinced we even know yet what the right questions are).

Very few of these questions can be answered through big national statistical datasets about higher education. Even when it comes to questions like access to education, it’s probably far more important that we have more data on people who do not go to PSE than to have better or speedier access to enrolment data. And yet we have nothing on this, and haven’t had since the Youth in Transition Survey ended. But a national survey like that is expensive – it could cost upwards of ten million dollars. Smaller-scale, local studies could be done for a fraction of the cost – if someone were willing to fund them.

There are actually enormous data resources available at the provincial and institutional level to work from.  Want to find individuals who finished high school but didn’t attend PSE?  Most provinces now have individual student numbers which could be used to identify these individuals and bring them into a study.  Want to look at program completion rates?  All institutions have the necessary data: they just choose not to release it.
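To give a flavour of how simple the mechanics could be: assuming two provincial files that share a student-number field (all column names and records below are invented), finding high-school completers with no PSE enrolment is a basic anti-join.

    import pandas as pd

    # Invented records; the point is only the linkage mechanics.
    hs_grads = pd.DataFrame({
        "student_number": ["ON-001", "ON-002", "ON-003", "ON-004"],
        "hs_completion_year": [2015, 2015, 2016, 2016],
    })
    pse_enrolments = pd.DataFrame({
        "student_number": ["ON-001", "ON-003"],
        "institution": ["University X", "College Y"],
    })

    # Anti-join: keep completers with no matching enrolment record.
    merged = hs_grads.merge(pse_enrolments, on="student_number",
                            how="left", indicator=True)
    non_attenders = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
    print(non_attenders)  # the people you'd want to recruit into a study

The hard part, in other words, is not the computation: it’s the willingness to link the files and publish what they show.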

All of which is to say: we could have a data revolution in this country.  But it’s not going to come primarily from better national data sets run by Statistics Canada.  It’s going to come from lots of little studies creating a more data-rich environment for decision-making.  It’s going to come from governments and institutions changing their data mindsets from one where hoarding data is the default to one where publishing is the default.  It’s going to come from switching focus from inputs to outputs.

Even more succinctly: we are all part of the problem.  We are all part of the solution.  Stop waiting for someone else to come along and fix it.

April 03

Data on Race/Ethnicity

A couple of weeks ago, CBC decided to make a big deal about how terrible Canadian universities were for not collecting data on race (see “Why so many Canadian universities know so little about their own racial diversity”). As you all know, I’m a big proponent of better data in higher education. But the effort involved in getting new data has to be in some way proportional to the benefit derived from that data. And I’m pretty sure this doesn’t meet that test.

In higher education, there are only two points where it is easy to collect data from students: at the point of application, and at the point of enrolment. But here’s what the Ontario Human Rights Code has to say about collecting data on race/ethnicity in application forms:

Section 23(2) of the Code prohibits the use of any application form or written or oral inquiry that directly or indirectly classifies an applicant as being a member of a group that is protected from discrimination. Application forms should not have questions that ask directly or indirectly about race, ancestry, place of origin, colour, ethnic origin, citizenship, creed, sex, sexual orientation, record of offences, age, marital status, family status or disability.

In other words, it’s 100% verboten. Somehow, CBC seems to have missed this bit. Similar provisions apply to data collected at the time of enrolment – a school still needs to prove that there is a bona fide reason related to one’s schooling in order to require a student to answer the question. So generally speaking, no one asks a question at that point either.

Now, if institutions can’t collect relevant data via administrative means, what they have to do to get data on race/ethnicity is move to a voluntary survey. Which in fact they do, regularly. Some do a voluntary follow-up survey of applicants through Academica, others attach race/ethnicity questions to the Canadian Undergraduate Survey Consortium (CUSC) surveys, others attach them to NSSE. Response rates on these surveys are not great: NSSE sometimes gets 50%, but that’s the highest rate available. And, broadly speaking, they get high-level data about their student body. The data isn’t great quality because the response rate isn’t fabulous, and the small numbers mean that you can’t really subdivide ethnicity very much (don’t expect good numbers on Sikhs v. Tamils), but one can know at a rough order of magnitude what percentage of the student body is visible minority, what percentage self-identifies as Aboriginal, etc. I showed this data at a national level back here.
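A quick margin-of-error calculation shows why the subdividing fails. The numbers below are hypothetical, and I’m assuming simple random sampling (real surveys have design effects and response bias that make things worse).

    import math

    def moe_95(p, n):
        """95% margin of error for a proportion (simple random sampling)."""
        return 1.96 * math.sqrt(p * (1 - p) / n)

    n = 1500           # hypothetical usable responses at one institution
    p_vismin = 0.30    # share identifying as visible minority (illustrative)
    p_subgroup = 0.02  # share in one small ethnic subgroup (illustrative)

    print(f"Visible minority overall: {p_vismin:.0%} +/- {moe_95(p_vismin, n):.1%}")
    print(f"One small subgroup:       {p_subgroup:.0%} +/- {moe_95(p_subgroup, n):.1%}")

The overall figure comes out at 30% ± 2.3% – a perfectly serviceable order-of-magnitude estimate – but the subgroup lands at 2% ± 0.7%, i.e. anywhere from 1.3% to 2.7%, a relative error of more than a third. Hence: rough totals, yes; Sikhs versus Tamils, no.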

Is it possible to get better data? It’s hard to imagine, frankly. On the whole, students aren’t crazy about being surveyed all the time. NSSE has the highest response rate of any survey out there, and CUSC isn’t terrible either (though it tends to work with a smaller sample size). Maybe we could ask slightly better questions about ethnicities; maybe we could harmonize the questions across the two surveys. That could get you data covering 90% of institutions in English Canada (at least).

Why would we want more than that? We already put so much effort into these surveys: why go to all kinds of trouble to do a separate data collection activity which in all likelihood would have worse response rates than what we already have?

It would be one thing, I think, if we thought Canadian universities had a real problem in not admitting minority students. But the evidence at the moment suggests the opposite: visible minority students in fact attend at a rate substantially higher than their share of the population. It’s possible, of course, that some sub-sections of the population are not doing as well (the last time I looked at this data closely was a decade ago, but youth from the Caribbean were not doing well at the time). But spending untold dollars and effort to get at that problem in institutions across the country, when really the Caribbean community in Canada is clustered in just two cities (three, if you count the African Nova Scotians in Halifax)? I can’t see it.

Basically, this is one of those cases where people are playing data “gotcha”. We actually do know (more or less) where we are doing well or poorly at a national level. On the whole, where visible minorities are concerned, we are doing well. Indigenous students? Caribbean students? That’s a different story. But we probably don’t need detailed institutional data collection to tell us that. If that’s really what the issue is, let’s just deal with it. Whinging about data collection is just a distraction.