 Introductory computer science is hard. It’s not a course most students would take as a light elective, and failure rates are high (two large studies put the average at around 35% of students failing). Yet, at the same time, introductory computer science is apparently quite easy. At many institutions, the most common passing grade is an A. For instructors, this is a troubling state of affairs, which manifests as a bimodal grade distribution — a plot of students’ grades forms a valley rather than the usual peak of a normal distribution.
For most of the last forty years, the dominant explanation for this pattern has been the existence of some hidden factor separating those who can learn to program computers from those who cannot. Recently this large body of work has become known as the “Programmer Gene” hypothesis, although most of the studies focus less on actual genetic or natural advantages than on demographics, prior education levels, standardized test scores, or past programming experience. Surprisingly, despite dozens of studies over more than forty years, some considering thirty or forty factors simultaneously, no conclusive predictor of programming aptitude has been found, and the most prominent recent paper advancing such a test was ultimately retracted.
The failure of the “Programmer Gene” hypothesis to produce a working description of why students fail has led to the development of other explanations. One recently proposed approach is the Learning Edge Momentum (LEM) hypothesis, by Robins (2010). Robins proposes that no programmer gene can be found because the two populations are identical, or nearly so. Instead of attributing the problem to the students, Robins argues that it is the content of the course that causes bimodal grade distributions to emerge, and that the content of introductory computer science classes is especially prone to such problems.
At the core of the LEM hypothesis is the idea that courses are composed of units of content, which are presented to students one after another in sequence. In some disciplines, content is only loosely related, and students who fail to learn one module can still easily understand subsequent topics. For example, a student taking an introductory history class will not have much more difficulty learning about Napoleon after failing to learn about Charlemagne. The topics are similar, but are not dependent. All topics lie close to the edge of students’ prior knowledge. In other disciplines, however, early topics within a course are practically prerequisites for later topics, and the course rapidly moves away from the edges of students’ knowledge, into areas that are wholly foreign to them. The more early topics students master, the easier the later ones become. Conversely, the more early topics students fail to acquire, the harder it is to learn later topics at all. This effect is dubbed “momentum.”
Robins argues that introductory computer science is an especially momentum-heavy area. A student who fails to learn conditionals will probably be unable to learn recursion or loops. A student who fails to grasp core concepts like functions or the idea of a program state will likely struggle for the entire course. Robins argues that success on early topics within the needed time period (before the course moves on) is largely random, and shows via simulation that, even if students all start with identical aptitude for a subject, a strong enough momentum effect will produce bimodal grade distributions. However, no empirical validation of the hypothesis was provided, and no subsequent attempts at validation have been able to confirm this model. The main difficulty faced in evaluating the LEM hypothesis is that the predictions it makes are very similar to those of the “Programmer Gene” hypothesis. Both theories predict that students who do well early in a course will do well later on. The difference is that the LEM hypothesis attributes this early success mostly to chance, while the “Programmer Gene” hypothesis attributes it to the students’ skill.
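The momentum effect is easy to illustrate with a toy simulation. The sketch below is not Robins' actual model, and every name and parameter in it (`simulate_scores`, the starting probability of 0.5, the 0.05 to 0.95 clamp, the step size `momentum`) is an assumption chosen for illustration. Every simulated student starts with the same chance of learning a topic; each success nudges that chance up for the next topic, and each failure nudges it down.

```python
import random
from collections import Counter


def simulate_scores(n_students, n_topics, momentum, seed=0):
    """Return the number of topics learned by each simulated student.

    Illustrative assumption: all students start with the same 0.5 chance of
    learning a topic. Each success raises that chance by `momentum` for the
    next topic, each failure lowers it by the same amount.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_students):
        p_learn = 0.5  # identical starting aptitude for everyone
        learned = 0
        for _ in range(n_topics):
            if rng.random() < p_learn:
                learned += 1
                p_learn += momentum
            else:
                p_learn -= momentum
            # Keep the probability away from 0 and 1 so recovery stays possible.
            p_learn = min(0.95, max(0.05, p_learn))
        scores.append(learned)
    return scores


def tail_and_middle_share(scores, n_topics):
    """Fraction of students in the outer quarters vs. the central fifth."""
    n = len(scores)
    quarter = n_topics // 4
    tails = sum(1 for s in scores if s <= quarter or s >= n_topics - quarter)
    middle = sum(1 for s in scores if 2 * n_topics // 5 <= s <= 3 * n_topics // 5)
    return tails / n, middle / n


def text_histogram(scores, n_topics, width=40):
    """Render a plain-text histogram of scores, one row per possible score."""
    counts = Counter(scores)
    peak = max(counts.values())
    rows = []
    for s in range(n_topics + 1):
        bar = "#" * round(width * counts[s] / peak)
        rows.append(f"{s:2d} | {bar}")
    return "\n".join(rows)


if __name__ == "__main__":
    for momentum in (0.0, 0.15):
        scores = simulate_scores(2000, 20, momentum, seed=1)
        tails, middle = tail_and_middle_share(scores, 20)
        print(f"momentum={momentum}: tails={tails:.2f}, middle={middle:.2f}")
        print(text_histogram(scores, 20))
```

With `momentum=0.0` the topics are independent and scores pile up in the middle, as in the history example. With a larger step, most students drift toward one extreme or the other, even though nothing distinguishes them at the start.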
In my research project for the Certificate in University Teaching (CUT), I proposed a new method of evaluating the LEM hypothesis by examining the performance of remedial students (students who retake introductory computer science classes after failing them). The LEM hypothesis predicts that remedial classes should also have bimodal grade distributions, because student success on initial topics is largely random. Students taking the course for the second time should be just as likely to learn those topics as students taking the course for the first time. In contrast, the “Programmer Gene” hypothesis predicts that remedial courses should have normally distributed grades, with a low mean. This is because remedial students lack the supposed “gene”, and so will not be able to learn topics much more effectively the second time than they did the first time.
To test these competing predictions, I acquired anonymized data from four offerings of an introductory computer science course: two with a high proportion of remedial students, and two with a very low proportion. I found weak evidence in support of the LEM hypothesis, as all four grade distributions were bimodal when students who withdrew were counted as failing. However, when students who withdrew were removed entirely, only one non-remedial offering was bimodal, a result predicted by neither theory.
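The two treatments of withdrawals can change the shape of a distribution considerably. The sketch below shows the idea on made-up data; it is not the analysis or the data from the study. The letter-grade scale, the `"W"` marker for withdrawals, the function names, and the crude "interior valley" check are all assumptions introduced for illustration, and a real analysis would use a proper statistical test for multimodality.

```python
from collections import Counter

# Hypothetical grade scale, ordered from lowest to highest.
GRADE_ORDER = ["F", "D", "C", "B", "A"]


def grade_counts(records, withdrawals="fail"):
    """Count grades per bin, in GRADE_ORDER.

    records: list of letter grades, with "W" marking a withdrawal.
    withdrawals: "fail" counts each W as an F; "drop" discards W entirely.
    """
    if withdrawals not in ("fail", "drop"):
        raise ValueError("withdrawals must be 'fail' or 'drop'")
    counts = Counter()
    for grade in records:
        if grade == "W":
            if withdrawals == "drop":
                continue
            grade = "F"
        counts[grade] += 1
    return [counts[g] for g in GRADE_ORDER]


def has_interior_valley(counts):
    """Crude two-peak check: some interior bin is strictly lower than a bin
    somewhere to its left and a bin somewhere to its right."""
    for i in range(1, len(counts) - 1):
        if max(counts[:i]) > counts[i] and max(counts[i + 1:]) > counts[i]:
            return True
    return False


if __name__ == "__main__":
    # Made-up class of 42 students, for illustration only.
    toy_class = ["A"] * 12 + ["B"] * 8 + ["C"] * 6 + ["D"] * 4 + ["F"] * 3 + ["W"] * 9
    for mode in ("fail", "drop"):
        counts = grade_counts(toy_class, withdrawals=mode)
        print(mode, dict(zip(GRADE_ORDER, counts)), has_interior_valley(counts))
```

In this toy class, counting withdrawals as failures creates a second peak at F, while dropping them leaves a single rising slope toward A.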
Although my empirical results were ultimately inconclusive, my research provides a clear way forward in evaluating different hypotheses for high failure rates in introductory computer science. A follow-up study, conducted with data from a university that offers only remedial sections in the spring term (removing the confounding effects of out-of-stream students in the same class), may be able to put the question to rest for good, and facilitate the design of future curricula.
References:
Robins, A. (2010). Learning edge momentum: A new account of outcomes in CS1. Computer Science Education, 20(1), 37-71.
The author of this blog post, John Doucette, recently completed CTE’s Certificate in University Teaching (CUT) program. He is currently a Doctoral Candidate in the Cheriton School of Computer Science.