I am going to do something today which I expect will not put me in good stead with one of my biggest clients. But the Government of Ontario is considering something unwise and I feel it best to speak up.
As many of you know, the current Liberal government is very concerned about sexual harassment and sexual assault on campus, and has devoted no small amount of time and political capital to getting institutions to adopt new rules and regulations around said issues. One can doubt the likely effectiveness of such policies, but not the sincerity of the motive behind them.
One of the tools the Government of Ontario wishes to use in this fight is more public disclosure about sexual assault. I imagine they have been influenced by how the US federal government collects and publishes statistics on campus crime, including statistics on sexual assaults. If you want to hold institutions accountable for making campuses safer, you want to be able to measure incidents and show change over time, right?
Well, sort of. This is tricky stuff.
Let’s assume you had perfect data on sexual assaults by campus. What would that show? It would depend in part on the definitions used. Are we counting sexual assaults/harassment which occur on campus? Or are we counting sexual assaults/harassment experienced by students? Those are two completely different figures. If the purpose of these figures is accountability and giving prospective students the “right to know” (personal safety is, after all, a significant concern for prospective students), how useful is that first number? To what extent does it make sense for institutions to be held accountable for things which do not occur on their property?
And that’s assuming perfect data, which really doesn’t exist. The problems multiply exponentially when you decide to rely on sub-standard data. And according to a recent Request for Proposals placed on the government tenders website MERX, the Government of Ontario is planning to rely on some truly awful data for its future work on this file.
Here’s the scoop: the Ministry of Advanced Education and Skills Development is planning to do two surveys: one in 2018 and one in 2024. They plan to obtain email contact lists for every single student in the system – at all 20 public universities, 24 colleges and 417 private institutions – and hand them over to a contractor so it can run a survey. (This is insane from a privacy perspective – the much safer way to do this is to have institutions send students an email with a link to the survey, so the contractor never sees any names without students’ consent.) Then they are going to email all those students – close to 700,000 in total – offering $5 per head to answer a survey.
It’s not clear what Ontario plans to do with this data. But the fact that they insist that *every* student at *every* institution be sent the survey suggests to me that they want the option to analyze, and perhaps publish, the data from this anonymous voluntary survey on a campus-by-campus basis.
Now, one might argue: so what? Pretty much every student survey works this way. You send out a message to as many students as you can, offer an inducement and hope for the best in terms of response rate. Absent institutional follow-up emails, this approach probably gets you a response rate between 10 and 15% (a $5 incentive won’t move that many students). Serious methodologists grind their teeth over those kinds of low numbers, but increasingly this is the way of the world. Phone polls don’t get much better than this. The surveys we used to do for the Globe and Mail’s Canadian University Report were in that range. The Canadian University Survey Consortium does a bit better than that because of multiple follow-ups and strong institutional engagement. But hell, even StatsCan is down to a 50% response rate on the National Graduates Survey.
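The scale involved is worth spelling out. A back-of-envelope sketch (the population and incentive figures come from the RFP as described above; the response-rate band is my own assumption):

```python
# Back-of-envelope arithmetic for the proposed survey.
# 700,000 students and the $5 incentive are from the RFP as described;
# the 10-15% response-rate band is an assumption, not a known figure.
population = 700_000
low_rate, high_rate = 0.10, 0.15
incentive = 5  # dollars per completed survey

low_n = int(population * low_rate)    # respondents at the low end
high_n = int(population * high_rate)  # respondents at the high end
print(f"Expected respondents: {low_n:,} to {high_n:,}")
print(f"Incentive cost: ${low_n * incentive:,} to ${high_n * incentive:,}")
```

Even at those response rates, the government would be paying out several hundred thousand dollars in incentives for a sample whose representativeness is unknown.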
Is there non-response bias? Sure. And we have no idea what it is. No one’s ever checked. But these surveys are super-reliable even if they’re not completely valid. Year after year we see stable patterns of responses, and there’s no reason to suspect that the non-response bias is different across institutions. So if we see differences in satisfaction of ten or fifteen percent from one institution to another, most of us in the field are content to accept that finding.
So why is the Ministry’s approach so crazy when it’s just using the same one as everyone else? First of all, the stakes are completely different. It’s one thing to be named an institution with low levels of student satisfaction. It’s something completely different to be called the sexual assault capital of Ontario. So accuracy matters a lot more.
Second, the differences between institutions are likely to be tiny. We have no reason to believe a priori that rates differ much by institutions. Therefore small biases in response patterns might alter the league table (and let’s be honest, even if Ontario doesn’t publish this as a league table, it will take the Star and the Globe about 30 seconds to turn it into one). But we have no idea what the response biases might be and the government’s methodology makes no attempt to work that out.
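To see how a small response bias alone could separate two otherwise identical campuses in a league table, here is a hedged illustration. Every number is invented; the calculation just computes what share of *respondents* report an assault when victims and non-victims answer at different rates:

```python
def measured_rate(prevalence, victim_rr, other_rr):
    """Share of respondents reporting assault, given the true prevalence
    and separate response rates for victims vs. everyone else."""
    victims = prevalence * victim_rr
    others = (1 - prevalence) * other_rr
    return victims / (victims + others)

# Two hypothetical campuses with the SAME true prevalence (7%).
# The only difference: victims at campus B are slightly more
# willing to respond (15% vs 12%); non-victims respond at 10% at both.
campus_a = measured_rate(0.07, 0.12, 0.10)
campus_b = measured_rate(0.07, 0.15, 0.10)
print(f"Campus A measured: {campus_a:.1%}")
print(f"Campus B measured: {campus_b:.1%}")
```

A three-point difference in one subgroup’s willingness to respond produces a visible gap in the published numbers even though the underlying rates are identical — exactly the kind of artifact a league table would misread as a real difference.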
Might people who have been assaulted be more likely to answer than those who have not? If so, you’re going to get inflated numbers. Might people have reasons to distort the results? Might a Men’s Rights group encourage all its members to indicate they’d been assaulted, to show that assault isn’t really a women’s issue? With low response rates, it wouldn’t take many respondents to make that tactic work.
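A rough illustration of how little it would take, using a hypothetical mid-sized campus with entirely invented numbers:

```python
# Hypothetical campus; all figures invented for illustration.
students = 3_000
response_rate = 0.10      # assumed low voluntary response rate
true_prevalence = 0.10    # assumed true rate among students

respondents = int(students * response_rate)       # genuine responses
genuine_yes = int(respondents * true_prevalence)  # genuine "yes" answers

# Suppose a coordinated group submits 30 false "yes" responses.
fake_yes = 30
reported = (genuine_yes + fake_yes) / (respondents + fake_yes)
print(f"True rate: {true_prevalence:.1%}, reported rate: {reported:.1%}")
```

In this sketch, 30 coordinated respondents are enough to nearly double the campus’s reported rate — precisely because the genuine respondent pool is so small.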
The Government is never going to get accurate overall response rates from this approach. They might, after repeated tries, start to see patterns in the data: sexual assault is more prevalent in institutions in large communities than in small ones, maybe; or it might happen more often to students in certain fields of study than others. That might be valuable. But if the first time the data is published all that makes the papers is a rank order of places where students are assaulted, we will have absolutely no way to contextualize the data, no way to assess its reliability or validity.
At best, if it is reported system-wide, the data will be weak. A better alternative would be to go with a smaller random sample and better incentives so as to obtain higher response rates. But if it remains a voluntary survey *and* there is some intention to publish on a campus-by-campus basis, then it will be garbage. And garbage data is a terrible way to support good policy objectives.
Someone – preferably with a better understanding of survey methodology – needs to put a stop to this idea. Now.