Sunday, April 3, 2016

How to train undergraduate psychologists to be post hoc BS generators

Teaching undergraduate psychology is difficult for a variety of reasons. Students come in with preconceived notions about what psychological research is and are sometimes disappointed with the mismatch between their preconceptions and reality. Much of what psychologists do is highly specialized and requires skills that are difficult to teach, and psychologists-in-training can't offer much research-wise until they have years of experience. The assignments we ask undergraduates to complete are meant to train their critical thinking skills to prepare them for a more substantive contribution to research. Sometimes, however, they do exactly the opposite; instead, assignments can reward post hoc BS generation rather than actual critical thinking.

If the recent so-called crisis in psychology has highlighted anything, it is the prevalence and danger of post hoc narratives. Although statistical practices (e.g., use of significance testing) have gotten much of the blame -- at least in my corner of the research world -- the main problems are actually a level or two above that. Combining ill- or flexibly-defined theoretical concepts, post hoc reasoning, and publication bias yields a potent mixture that I would argue is responsible for the crisis.

I have been thinking about this in the context of assignments that we ask our undergraduates to do, and how we actually train post hoc reasoning early on. Here I'll offer examples of two undergraduate assignments that I think reward scientists-in-training for their BS generation skills. I'll also elaborate on what I think we can do about this.
The "Texas Sharpshooter" paints the target around his bullet holes.

Example assignment: Critique a peer-reviewed article

The assignment: Students are assigned an article from a peer-reviewed psychological journal and asked to critique it. Ideally, they choose a few critiques and argue for them in their essay.

The basic problem with this assignment is that students are not particularly well-versed in any particular psychological topic, nor in psychological research methods. Peer-reviewed articles, on the other hand, have been reviewed by people who are, which means that whatever problems remain with the research have evaded skilled reviewers. This is not to say that peer-reviewed research does not have major problems, but it does mean that students who have had only a few basic courses, and who do not have much experience reading peer-reviewed research, are unlikely to find good-quality critiques on their own.

Upon reading such an article and having difficulty finding a critique, a student is in an awkward position: they must write an essay. So what do they do? They come up with whatever critiques come to mind, which are likely to be low-quality ones. I suspect readers of this blog have seen these sorts of critiques in student assignments: maybe there are cultural differences? The sample seems small. Are these really the best stimuli to use? Students must choose a number of these arguments and argue for them, in spite of the fact that they don't have sufficient knowledge on which to base such a critique. We're training them in the fine art of bullshit.

This is not to say that these problems don't occur in some studies. But forming a good argument for why they matter takes specialized knowledge that students don't yet have, so we get back noise. And who gets the best marks for such an assignment? Students who can write clearly about things of which they have little actual understanding.

We have to ask ourselves: is it any wonder that we have a replication crisis?

Example assignment: Do an experiment and interpret the results

The assignment: Students are asked to perform a simple experiment (often in groups), analyze the data, and report the results. They must interpret the results in light of the research they've read (often primarily the textbook).

Experience doing simple experiments and analyzing the results is critical to a psychologist-in-training. But how the assignment is framed and marked is critical to whether we are training the skills we want. Students in chemistry, biology, and physics all perform easy experiments and report the results; this is as it should be.

What is different between interpreting the results of a typical psychology experiment and those of a chemistry experiment is that there are very strong reasons to expect something specific to happen in the chemistry experiment. If the psychology experiment doesn't come out as the textbook predicts, though, students must describe why that might be. There are, of course, a hundred possible reasons, including the possibility that the original study was wrong, statistical noise, and sloppiness in their experimental procedure.

But these explanations are not the ones they will explore. We require students to show creativity and independent reading and thought. In an assignment like this, students know that the best way to get a good mark is to find a paper whose logic might predict the results obtained and include a cogent argument for why this might have caused the differences. The students turn in the paper and, of course, will never test their hypothesis; the argument is simply thrown in for a better mark. The students who do the most independent reading and form the best-sounding argument will get the best mark.

This should all sound eerily familiar: we are training them in the time-honored tradition of post hoc arguments for "hidden moderators".

Fixing the problems

If we want to train good psychologists, we must be very sensitive to the skills we're actually teaching, as opposed to those we think we are teaching. The practices in the field will be a reflection of what students are taught. How might we use assignments to train critical thinking, without teaching bad practice?

Critiquing pop science

The problem with the critique of the peer-reviewed article is that students are unlikely to be able to spot the real problems with the article. This is somewhat like asking first-year sports therapists to critique a professional sports player's technique; the imperfections are simply too fine, because the professionals have been honing their craft with help for years. It would be better to ask them to critique amateur sports players' techniques, because they will have more glaring problems.

Unfortunately, there is no "amateur" peer-reviewed research. There is, however, a lot of very bad non-peer-reviewed pop science. Psychologists-in-training would benefit from assessing bad popular science (not just popular psychology): spotting, for instance, spurious claims of causation (versus correlation), overblown effect sizes, and mismatches between what a pop article claims about a study and what was actually done. Critiquing popular science develops similar skills to critiquing a peer-reviewed article, without the unfortunate side effect of asking students to BS their way to a good mark.

Separating critiques of method from critiques of results

Critiquing methods alongside the results leads to an unfortunate asymmetry: if an experiment yields the expected result, the methods are not critiqued, whereas if it doesn't, students are encouraged to generate BS reasons why it might not have worked, with no expectation of testing those reasons. If students were asked to critique methods by themselves, they would not be rewarded for such post hoc reasoning. Moreover, in an essay of typical length, this leaves more room to discuss why the methods are problematic; for instance, if the sample size is questionable, a methods-only critique would allow space for a power analysis. In a methods-plus-results critique, I often see critiques of sample sizes with no corresponding argument for why the sample size is a problem.
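To make the point concrete, here is a minimal sketch (in Python) of the kind of power analysis a methods-only critique leaves room for. The effect size is a hypothetical choice for illustration, and the formula is the standard normal approximation for a two-sided, two-sample comparison, which slightly underestimates the exact t-based answer for small samples.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size: float,
                          alpha: float = 0.05,
                          power: float = 0.80) -> int:
    """Approximate n per group for a two-sided, two-sample test.

    Uses n ~= 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is Cohen's d (standardized mean difference).
    """
    z = NormalDist()                      # standard normal
    z_alpha = z.inv_cdf(1 - alpha / 2)    # ~1.96 for alpha = .05
    z_power = z.inv_cdf(power)            # ~0.84 for power = .80
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# A "medium" effect of d = 0.5 at alpha = .05 and 80% power:
print(sample_size_per_group(0.5))  # -> 63 per group
```

A student armed with this calculation can say something substantive -- "detecting d = 0.5 at 80% power needs roughly 63 participants per group, and this study ran 20" -- instead of the vague "the sample seems small."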

Being specific about potential critiques

In whatever assignments we give to undergraduates, we should be specific about what sorts of critiques we are expecting, preferably giving a short list of possible critiques. The students will still have to read the target article, but instead of taking a shot in the dark and being forced to argue for it, they will be forced to ask, for instance, "Does this research suffer from a confound with X?", "Is this experiment sufficiently powered to detect an effect size of Z?", or "Does this DV represent a good operationalization of W?"

Perhaps, for instance, power is not a problem; they would then be in a position to argue that, yes, the experiment is sufficiently powered, instead of always (vaguely) attacking an article. Always asking for critique teaches the students that critical thinking is about dreaming up as many ways to attack an article as possible, and then forming a plausible-seeming argument around them. In contrast, being very specific about possible critiques -- that may not, in fact, turn out to be problems -- will develop critical thinking and argumentation skills better.

Wrap up

If we believe psychology is in crisis, we should look at the way we train undergraduates to see if part of the problem lies there. I think the crisis in psychology is reflected in some ways in our training. Doing better is not just about better statistical training or better open science training; it is also about ensuring a match between what we think we are teaching and what we actually teach. 


  1. By definition, if you pick a random article from the literature then the remaining problems are those that the reviewers didn't spot, but one might expect those errors to have some sort of distribution, and there will be some fun to be had in the right-hand tail of that distribution. So I don't see that there is necessarily a problem with critiquing an article, as long as it is carefully chosen to be a bad one with an interesting selection of deficiencies, some more obvious than others.

    Perhaps a bigger problem is the Dodo bird approach to marking such assignments, in which the student who actually nails the critical problem will probably not get a much better mark than the one who merely asks vague questions about culture or sample size. But I'm not sure that this is much of a problem, because (it seems to me, but perhaps I'm just being hopeful) the majority of those in the latter category are not going on to be researchers anyway.

    1. I agree that there are articles out there that might be used for this purpose; I think, though, that these articles are hard-ish to find because they have to be really bad. If you've got to find a new one every year, that's going to be a tall order for a single lecturer.

      Even if you do find such an article, for undergraduate pedagogy, it pays to be specific about what sorts of problems you might be looking for (and to throw some issues into the mix that are not problematic, so they'll learn not to attack everything in sight).

    2. Students can't be expected to learn from critiquing real articles because it's too hard, because they don't know how to critique real articles? ;)

      I agree with most of the elements here, especially specificity, but feel it's kind of getting it backwards. There's no such thing as an article so apt for learning methodology that you can just throw it at students with an essay instruction and expect more than a small fraction to successfully re-invent the methodological wheel, or even to make meaningful progress. And even if there was, it wouldn't stay that way for very long.

      It's not the material. Students need clear expectations, and they need to see how it's done first. Not perfectly, but good enough, before they're ready to start attempting for themselves. Then, they need active and specific feedback (not "four weeks after" feedback). People who do internships get those things, so they learn to DO science, not just ABOUT science. Many professors, in my experience, deny students that basic, critical scaffold during their classes, partly due to time constraints, but also deliberately in pursuit of some vague ideal of teaching them about "thinking for yourself." (It's never quite clear exactly what is meant by this, but by golly, its status as a universal good is beyond reproach and I am offended you'd even think to question it.)

      This is a massive mistake. I sure know I had to learn from listening directly to and interacting with people like you pick apart papers, concepts and ideas, and try to understand your reasoning, before I was ready to try for myself. Although, of course, the quality of my learning is arguable.

  2. Thank you for this! I would have loved to see this post as an undergraduate, speaking as a rather good 'can write clearly about things of which they have little actual understanding' type of student, or at least a discussion on the subject during my studies. One of the benefits of these 'critique' assignments has, for my generation, been promoting creative writing, and some of my colleagues actually landing journalist(ic) jobs. Fixing problems was never a big part of the curricula, which is odd, and let me explain why.

      I'm a graduate student at a university where you can get a 'psychology' degree (MA), meaning there are no more or less scientific-focused, or practice-focused programs. Most of my colleagues, and any of the university's critics would probably call for more practical knowledge in terms of work-related practice, especially since the statistical and methodological subjects represent a fair amount of the program, which seems pretty 'scientific'. But here's the catch: important issues like these are never discussed with most of the students, since it's expected that only a few (if so) will want or get to the doctorate program, and eventually probably stay as employees. These issues are tackled at the doctorate level, and the rest of us only have to 'survive' writing our one thesis, with most of us ending up in areas we were never interested in - and I have to say - with work bringing little to none value to the 'real' scientific body of knowledge.

      With the risk of sounding like a victim here, it seems that this is just another demotivating mechanism for us 99% of non-doctorate students. I have to agree with Chistian H. on learning by listening to people like you, and the internships part, only in my case it's the stay-as-TA's people. I'm fortunate in having a mentor for my thesis with similar interests, so I do get great and prompt feedback, plus he teaches a couple of classes which are basically online discussion forums, so there are some improvements. Still, it was clear from day one - I was never going to stay as a TA, so there's really no point in bothering me with all that extra knowledge.

  3. This comment has been removed by a blog administrator.


  5. I once had the "luck" of being made co-author on a very bad paper, and for reasons beyond the scope of this reply I decided not to retract my authorship. This enabled me to give students that paper, with the mission of finding at least 3 glaring errors, and teaching them to do so even if the teacher is one of the co-authors.
