Artificial Teaching Assistants

The “draughtsman” automaton created by Henri Maillardet around 1800.

The dream of creating a device that can replicate human behaviour is longstanding: 2500 years ago, the ancient Greeks devised the story of Talos, a bronze automaton that protected the island of Crete from pirates; in the early thirteenth century, Al-Jazari designed and described human automata in his Book of Knowledge and Ingenious Mechanical Devices; around 1800, the clockmaker Henri Maillardet invented a “mechanical lady” that wrote letters and sketched pictures; and in 2016, Ashok Goel, a computer science instructor at Georgia Tech, created a teaching assistant called Jill Watson who isn’t a human – she’s an algorithm.

Goel named his artificial teaching assistant after Watson, the computer program developed by IBM with an ability to answer questions that are posed in ordinary language. IBM’s Watson is best known for its 2011 victory over two former champions on the game show Jeopardy! In Goel’s computer science class, Jill Watson’s job was to respond to questions that students asked in Piazza, an online discussion forum. Admittedly, the questions to which Jill Watson responded were fairly routine:

Student: Should we be aiming for 1000 words or 2000 words? I know, it’s variable, but that is a big difference.

Jill Watson: There isn’t a word limit, but we will grade on both depth and succinctness. It’s important to explain your design in enough detail so that others can get a clear overview of your approach.

Goel’s students weren’t told until the end of the term that one of their online teaching assistants wasn’t human – nor did many of them suspect. Jill Watson’s responses were sufficiently helpful and “natural” that to most students she seemed as human as the other teaching assistants.

Over time – and quickly, no doubt – the ability of Jill Watson and other artificial interlocutors to answer more complex and nuanced questions will improve. But even if those abilities were to remain as they are, the potential impact of such computer programs on teaching and learning is significant. After all, in a typical course how much time is spent by teaching assistants or the instructor responding to the same routine questions (or slight variations of them) that are asked over and over? In Goel’s course, for example, he reports that his students typically post 10,000 questions per term – and he adds that Jill Watson, with just a few more tweaks, should be able to answer approximately 40% of them. That’s 4000 questions that the teaching assistants and instructor don’t have to answer. That frees up a lot of their time to provide more in-depth responses to the truly substantive questions about course content.

More time to give better answers: that sounds like a good thing. But there are also potential concerns.

It’s conceivable, for example, that using Watson might not result in better answers but in fewer jobs for teaching assistants. Universities are increasingly keen to save money, and if one Watson costs less than two or three teaching assistants, then choosing Watson would seem to be a sound financial decision. This reasoning has far broader implications than its impact on teaching assistants. According to a recent survey, 60% of the members of the British Science Association believe that within a decade, artificial intelligence will result in fewer jobs in a large number of workplace sectors, and 27% of them believe that the job losses will be significant.

Additionally, what impact might it have on students to know that they are being taught, in part, by a sophisticated chatbot – that is, by a computer program that has been designed to seem human? Maybe they won’t care: perhaps it’s not the source of an answer that matters to them, but its quality. And speaking for myself, I do love the convenience of using my iPhone to ask Siri what the population of Uzbekistan is – I don’t feel that doing so affects my sense of personal identity. On the other hand, I do find it a bit creepy when I phone a help desk and a ridiculously cheery, computerized voice insists on asking me a series of questions before connecting me to a human. If you don’t share this sense of unease, then see how you feel after watching 15 seconds of this video, featuring an even creepier encounter with artificial intelligence.

Meaningful Conversations in Minutes – Mylynh Nguyen

With constant media stimulation, increasing competitiveness, and stress overload, “Is it possible to slow down” (1)? Our culture can be self-driven and individualistic, so it is no surprise that for many, time is a finite resource that is draining away. As a result, we try to do as much as we can in a very short time period. Our minds are filled with constant distraction, thus limiting opportunities for self-reflection to ask oneself “Am I well or am I happy?” (1).

We’d like to believe that we have been a good friend, partner, or child at various points in our lives. However, when you think of a significant person in your life, do you know, or have you ever asked, what the moments were when they were happiest? The times when they cried tears of joy, or felt most accomplished? Surprisingly, many of us are unaware of the stories that ultimately define who that individual has become today. We mindlessly pass every day without pondering the conversations we had or the connections that were made. We can change this simply by being mindful of the questions that we pose: more specifically, “questions that people have been waiting their whole lives to be asked … because everybody in their lives is waiting for people to ask them questions, so they can be truthful about who they are and how they became what they are,” as beautifully said by Marc Pachter (2).

So what is the action plan?

1. Invite people to tell stories rather than giving answers. Instead of “How are you?”, try:

  • What’s the most interesting thing that happened today?
  • What was the best part of your weekend?
  • What are you looking forward to this week? (3).

2. Enter a conversation with the willingness to learn something new

  • Celeste Headlee, in her TED Talk 10 Ways to Have a Better Conversation, describes how she frequently talks to people whom she doesn’t like and with whom she deeply disagrees, yet is still able to have great, engaging conversations. She is able to do this because she is always prepared to be amazed, and because she seeks to understand rather than simply waiting to state her own opinions and thoughts.

3. Lastly, as a friend once said, “being cognizant of [your] impact is already the first step toward change. It really does start at the individual level” (5).

  • Brené Brown, in her Power of Vulnerability talk, said, “Many pretend like what we’re doing doesn’t have a huge impact on other people.” But we’d be surprised by what we are capable of when we allow ourselves to be vulnerable, as this “can be the birthplace of joy, of creativity, of belonging, of love… the willingness to say, ‘I love you,’ the willingness to do something where there are no guarantees” (6).

That being said, you don’t have to be the most intellectual or outspoken person in the room; what is key is the willingness to be open and the questions that are posed. There are many simple things that can easily be integrated into our daily lives: by being more mindful of the questions that we ask, we can ultimately have more memorable and enriching conversations. In the end, the goal is better connections, new understanding, and the awareness to savour the moment.

At CTE, Microteaching Sessions are offered where you can choose from various topics to conduct an interactive teaching lesson. For my first topic I will be talking about the importance of communication. All participants not only give feedback to each other but also receive constructive feedback and suggestions for improvement from knowledgeable facilitators. It’s a safe environment where you have the chance to present to fellow graduate students from various departments. Many have found these sessions beneficial because you are working on skills relevant to your work, your field of study, or your own personal growth. I am excited and nervous for this opportunity to talk about something I am passionate about, and I hope I can successfully engage others and deliver the content well. To help participants formulate an effective teaching plan, the Centre for Teaching Excellence website provides many resources, such as well-written guidelines and lesson plan outlines, and facilitators review your lesson before you present.

As a follow-up post: I had the chance to facilitate an hour-long session for an AIESEC conference, with participants from various universities such as Toronto, Waterloo, Laurier, and York, who had recently returned from their international exchanges. There was lots of discussion, so thank you to the Graduate Instructor Developers, Charis Enns and Dave Guyadeen, and Instructional Developer Stephanie White, for their great feedback and for helping me make this session more successful!

Sources:

Assessing Group Work Contribution – Monika Soczewinski

During my post-secondary education I always had mixed feelings when I would find out that there was group work in a course I was taking. On the one hand, I was excited at the prospect of learning with and from my peers. On the other hand – as anyone who has had a poor group experience in the past will understand – I worried that some members of my group might not be as committed and would not put effort into the project.

Group work in the classroom has many learning benefits. Students get an opportunity to work on more generic skills, such as teamwork, collaboration, leadership, organization, and time management, among others. These are the kinds of skills that are valued by employers, and as entry into many professions becomes more competitive, it is becoming increasingly important to teach them in university.

Despite these positive points, many students (and some instructors) have mixed feelings about group work, just as I did in my classes. One major concern in group work is that some students will not contribute equally to the work within their group – a behaviour called free-riding. According to studies, free-riding was identified as one of the greatest concerns students had about group work, across faculties and disciplines (Gottschall & Garcia-Bayonas, 2008; Hall & Buzwell, 2013). Since most of the work is done in a setting where the instructor cannot observe the group dynamics, instructors might have similar concerns about free-riders. The fairness of the assessment process might be compromised if students do not contribute equally but receive the same group mark.

One solution to determine how much individual students contributed to the group project is to ask group members to assess each other, in a process of peer assessment. In this situation, peers are providing feedback on their group members’ contribution levels to the project, not assessing the actual project itself. This is a popular technique because group members are in a position where they clearly see how their peers have contributed. Students are also able to decide what kinds of contributions were valuable in their unique group setting. This can include the forms of contribution that are more difficult to quantify, such as attitude, receptivity, insightfulness, organization, etc. Each student’s final grade is then a reflection of both the whole group project, as graded by the instructor, plus the peer assessment of their contribution. Each student will then come out with a unique grade.

Final Grade = group project (marked by instructor)

+/- individual contribution level (rated by peers)
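To make the arithmetic concrete, here is a minimal sketch of how such a combined grade might be computed. The function name, the rating scale, and the size of the adjustment are illustrative assumptions, not a prescribed scheme:

```python
def final_grade(group_mark, peer_ratings, max_adjustment=10):
    """Adjust a shared group mark by a peer-rated contribution level.

    peer_ratings: each teammate's rating of this student's contribution,
    from -1.0 (contributed far less than expected) to +1.0 (contributed
    far more). The average rating shifts the group mark by at most
    max_adjustment points; the result is clamped to the 0-100 range.
    """
    avg = sum(peer_ratings) / len(peer_ratings)
    adjusted = group_mark + avg * max_adjustment
    return max(0.0, min(100.0, adjusted))

# A student whose three peers rate them slightly above expectations:
print(final_grade(78, [0.5, 0.25, 0.25]))  # ≈ 81.3
```

Each student thus receives a unique grade anchored to the instructor's mark for the shared project.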

Some considerations for peer assessment of group work contribution include:

  1. Set the expectations for group work: Start off the group projects with a class discussion about the expectations for each student, and why the peer assessment of contribution is important. Students will have a better understanding of their responsibilities in the group, and will know that contribution is an important factor in their grade.
  2. Criteria of the peer assessment: The best practice is to provide students with at least some guidance or criteria to help them rate their peers (Goldfinch & Raeside, 1990; Wagar & Carroll, 2012). Depending on the type of project, the instructor can ask students to rate members based on their contribution to each project task, ask for ratings on generic skills such as level of enthusiasm and organization, or use a combination of the two. Whatever criteria the instructor selects, it is beneficial to involve the class in the decision.
  3. Open peer assessment versus private peer assessment: Should students have an open discussion about group member contributions, or should they rate each other anonymously? According to Wagar and Carroll (2012), students show a preference for confidential peer assessment. Having an open peer assessment can detract from the sense of collaboration, and students might be afraid of openly criticizing and offending their peers.
  4. Timing of the peer assessment: Ideally, students should be given the criteria of the peer assessment at the start of the project, and fill it in once the project is completed. This allows students to understand from the start how they will be assessed, especially if they divide the work in unconventional ways that might not fit into the criteria. Students can also pay closer attention to contributions throughout the project, and make more accurate assessments (Goldfinch & Raeside, 1990).

Visit the CTE Teaching Tips to read more about methods for assessing group work, and other group work resources.

 

References:

Goldfinch, J., & Raeside, R. (1990). Development of a peer assessment technique for obtaining individual marks on a group project. Assessment & Evaluation in Higher Education, 15(3), 210-231.

Gottschall, H., & Garcia-Bayonas, M. (2008). Student attitudes towards group work among undergraduates in business administration, education and mathematics. Educational Research Quarterly, 32(1), 3-29.

Hall, D., & Buzwell, S. (2013). The problem of free-riding in group projects: Looking beyond social loafing as reason for non-contribution. Active Learning in Higher Education, 14(1), 37-49.

Wagar, T. H., & Carroll, W. R. (2012). Examining student preferences of group work evaluation approaches: Evidence from business management undergraduate students. Journal of Education for Business, 87(6), 358-362.

Notes from the Music Studio — Christine Zaza

When I reflect on teaching and learning in higher education I realize that much of what I learned, I learned when I was a music student. Here are some of the highlights from the music studio that are just as applicable to university teaching and learning:

Practice, practice, practice. Actually, this would more aptly be phrased Practice-Feedback, Practice-Feedback, Practice-Feedback, but the rhythm just isn’t as good. I wouldn’t expect anyone to become a professional violinist without regular lessons with a qualified teacher. Regular feedback is critical to guiding students as they develop new skills. Without regular feedback, bad habits can become ingrained and difficult to correct. In university, students learn a number of new skills and new ways of thinking, and they need multiple opportunities to practice these skills with regular feedback. To ensure that students focus on the feedback and not just the grade, instructors can give a follow-up assignment in which students make revisions, highlighting how they have incorporated the feedback that they received on their first submission.

Practice the performance. When preparing for a recital or audition (a summative test), music students are advised to practice performing in front of friends, family – teddy bears if need be – several times before the actual performance. Preparing for a performance is different from preparing for weekly lessons. Good performance preparation is crucial because in a performance you get one shot at the piece. There are no do-overs on stage. Similarly, when writing music theory or history exams, practicing the exam is an expected part of exam preparation. To facilitate this preparation, the Royal Conservatory of Music sells booklets of past exams. The Conservatory also returns graded exams so that students can see exactly where they earned and lost marks: considering that the Royal Conservatory of Music administers thousands of exams, three times a year, across the globe, this is a huge undertaking. At university, we know that self-testing is an effective study strategy, and some instructors do provide practice exam questions in their courses. However, due to academic integrity concerns, the common practice is to deny students access to past exams as well as to their own completed exams. I wonder if academic misconduct would be less of an issue if students were allowed to use past exams as practice tools. Amassing a large enough pool of past exam questions should address the concern that students will just memorize answers to questions that they’ve seen in advance.

Explicit instruction is key. It’s not very helpful to just tell a novice piano student to go home and practice. In the name of practicing, a novice student will, more than likely, play his or her piece over a few times, from bar 1 straight to the end, no matter what happens in between, and think that he or she has “practiced.” I know. I’ve heard it hundreds of times, and if you have a child in music lessons, I’ll bet you’ve heard it too. Explicit instruction means addressing many basic questions that an expert takes for granted: What does practicing look like? How many times a week should you practice? For how long should you practice? How do you know if you have practiced enough? How do you know if you have practiced well? Similarly, not all first-year students arrive at university knowing how to study. Many students would benefit from explicit instructions about learning and studying (e.g., What does studying look like? How do you know when you’ve studied enough? I’ve gone over my notes a few times – is that studying?).

Know that students can’t learn it all at once. A good violin teacher knows that you can’t correct a student’s bow arm while you’re adjusting the left hand position, improving intonation, working on rhythm, teaching new notes, and refining dynamics. In any given lesson, the violin teacher chooses to let some things go while focusing on one particular aspect of playing; otherwise the student will become too overwhelmed to take in any information at all. Suzuki teachers know that you always start by pointing out something positive about the student’s playing and that you can’t focus only on the errors. Students need encouragement. I think this is true at university as well. Becoming a good writer takes years, and novice writers will likely continue to make several mistakes even while improving one or two specific aspects of their writing. When giving feedback on written assignments, it’s important to acknowledge the positive aspects – that’s more encouraging than facing a sea of red that highlights only the errors.

Even if you didn’t take piano lessons as a child, and even if you have registered your six-year-old for hockey rather than violin lessons, I hope you’ll find these lessons from the music studio applicable to the university classroom.

Photo provided by Samuel Cuenca under a Creative Commons license.

The first year is critical – Jane Holbrook

Students leaving campus
Who will stay?

Coming onto campus on Monday morning was a shock, but a nice one. We don’t get a lot of downtime on our campus, but the last two weeks of August and the days leading up to Labour Day are usually pretty sleepy; many folks are on vacation and it’s hard to even find a coffee shop open. The throng that I biked into at the main gates Monday morning at 8:15 was a bit disorderly, but the excitement in the air was electric. And it’s the first-year students – all fresh-faced and enthusiastic, frantically looking for their classrooms, full of high expectations – who generate the most excitement.

The first couple of weeks of term are exciting, but then, of course, the realities of a five-course load, weekly assignments (lab reports, readings …) and then midterms set in, and those first-year students are often challenged just to make it through first term. Our IAP statistics show that our first-year retention rate (the percentage of students who return for second year after first year) is close to 92% (UWaterloo IAP), well above the reported retention rate of 80% for four-year public US institutions (see the National Student Clearinghouse report) and higher than at most other Ontario universities, where retention rates hover around 87% (CUDO – Common University Data Ontario). This isn’t the old case of “look to your right, look to your left, one of you won’t be here next year” that we were admonished with as students in years gone by, but if 1 in 10 students do not return after first year, that is a definite loss to the university community and a setback for that young person.

Universities have recognized that students face a number of challenges in their first year and provide orientation programs, peer mentoring, study skills sessions and other supports to help new students handle the emotional and educational transitions that they will be experiencing. However, even with these programs in place, our instructors who teach first year courses have a critically important job ahead of them. Studies show that although a student’s personal situation (family background, economic stresses, etc.) and prior academic performance in high school affect first year retention, student engagement in this critical first year is also a major contributor to student retention (Kuh et al., 2008). Creating rich and engaging classroom experiences for first year students in large classes when students are coming in with a wide range of skills is a challenge, but by integrating active learning into large classes (CTE tip sheet – Activities for Large Classes), considering student motivation (CTE tip sheet – Motivating Our Students) and providing frequent, formative feedback to students, instructors across campus are helping to keep students engaged and successful.

Welcome first year students, and kudos to those great first year instructors who work hard to keep them here!

Kuh, G.D, Cruce, T.M., Shoup, R., Kinzie, J. & Gonyea, R.M. (2008) Unmasking the Effects of Student Engagement on First-Year College Grades and Persistence. The Journal of Higher Education, 79 (5), 540-563.

High Failure Rates in Introductory Computer Science Courses: Assessing the Learning Edge Momentum Hypothesis – John Doucette, CUT student  

Introductory computer science is hard. It’s not a course most students would take as a light elective, and failure rates are high (two large studies put the average at around 35% of students failing). Yet, at the same time, introductory computer science is apparently quite easy. At many institutions, the most common passing grade is an A. For instructors, this is a troubling state of affairs, which manifests as a bimodal grade distribution — a plot of students’ grades forms a valley rather than the usual peak of a normal distribution.

For most of the last forty years, the dominant hypothesis has been the existence of some hidden factor separating those who can learn to program computers from those who cannot. Recently this large body of work has become known as the “Programmer Gene” hypothesis, although most of the studies do not focus on actual genetic or natural advantages, so much as on demographics, prior education levels, standardized test scores, or past programming experience. Surprisingly, despite dozens of studies taking place over more than forty years, some involving simultaneous consideration of thirty or forty factors, no conclusive predictor of programming aptitude has been found, and the most prominent recent paper advancing such a test was ultimately retracted.

The failure of the “Programmer Gene” hypothesis to produce a working description of why students fail has led to the development of other explanations. One recently proposed approach is the Learning Edge Momentum (LEM) hypothesis, by Robins (2010). Robins proposes that the reason no programmer gene can be found is because the populations are identical, or nearly so. Instead of attributing the problem to the students, Robins argues that it is the content of the course that causes bimodal grade distributions to emerge, and that the content of introductory computer science classes is especially prone to such problems.

At the core of the LEM hypothesis is the idea that courses are composed of units of content, which are presented to students one after another in sequence. In some disciplines, content is only loosely related, and students who fail to learn one module can still easily understand subsequent topics. For example, a student taking an introductory history class will not have much more difficulty learning about Napoleon after failing to learn about Charlemagne. The topics are similar, but are not dependent. All topics lie close to the edge of students’ prior knowledge. In other disciplines, however, early topics within a course are practically prerequisites for later topics, and the course rapidly moves away from the edges of students’ knowledge, into areas that are wholly foreign to them. The more early topics students master, the easier the later ones become. Conversely, the more early topics that students fail to acquire, the harder it is to learn later topics at all. This effect is dubbed “momentum.”

Robins argues that introductory computer science is an especially momentum-heavy area. A student who fails to learn conditionals will probably be unable to learn recursion or loops. A student who fails to grasp core concepts like functions or the idea of a program state will likely struggle for the entire course. Robins argues that success on early topics within the needed time period (before the course moves on) is largely random, and shows via simulation that, even if students all start with identical aptitude for a subject, if the momentum effect is increased enough, bimodal grade distributions will follow. However, no empirical validation of the hypothesis was provided, and no subsequent attempts at validation have been able to confirm this model. The main difficulty in evaluating the LEM hypothesis is that the predictions it makes are actually very similar to those of the “Programmer Gene” hypothesis. Both theories predict that students who do well early in a course will do well later on. The difference is that the LEM hypothesis says this is mostly down to chance, while the “Programmer Gene” hypothesis says it is due to the students’ skill.
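Robins’s simulation is more elaborate, but the core mechanism can be sketched with a toy model. Everything below (the urn-style update rule, the parameter values, the grade thresholds) is my own illustrative assumption, not taken from the paper: every student has identical starting aptitude, with even odds on the first topic, yet reinforcement of early success or failure pushes final grades toward the extremes.

```python
import random

def simulate_student(n_topics=20, c=0.3, rng=random):
    """One student's grade: the fraction of topics mastered.

    Mastery of each topic is reinforced by earlier successes
    (a Polya-urn-style momentum effect): starting from even odds,
    each mastered topic raises the chance of mastering the next,
    and each failure lowers it.
    """
    mastered = 0
    for i in range(n_topics):
        p = (mastered + c) / (i + 2 * c)  # momentum: success breeds success
        if rng.random() < p:
            mastered += 1
    return mastered / n_topics

random.seed(1)
grades = [simulate_student() for _ in range(2000)]
low = sum(g < 0.3 for g in grades) / len(grades)
high = sum(g > 0.7 for g in grades) / len(grades)
middle = 1 - low - high
# With strong momentum, grades pile up in the tails, not the middle:
print(f"low: {low:.2f}  middle: {middle:.2f}  high: {high:.2f}")
```

Despite identical students, the resulting grade distribution is U-shaped: a valley in the middle and peaks at the failing and A-range extremes, just as the LEM hypothesis predicts.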

In my research project for the Certificate in University Teaching (CUT), I proposed a new method of evaluating the LEM hypothesis by examining the performance of remedial students — students who retake introductory computer science classes after failing them. The LEM hypothesis predicts that remedial classes should also have bimodal grade distributions, because student success on initial topics is largely random. Students taking the course for the second time should be just as likely to learn them as students taking the course the first time round. In contrast, the “Programmer Gene” hypothesis predicts that remedial courses should have normally distributed grades, with a low mean. This is because remedial students lack the supposed “gene”, and so will not be able to learn topics much more effectively the second time than they were the first time.

To evaluate this hypothesis, I acquired anonymized data from four offerings of an introductory computer science course: two with a high proportion of remedial students, and two with a very low proportion. I found weak evidence in support of the LEM hypothesis, as all grade distributions were bimodal when withdrawing students were counted as failing. However, when withdrawing students were removed entirely, only one non-remedial offering was bimodal, a result predicted by neither theory.

Although my empirical results were ultimately inconclusive, my research provides a clear way forward in evaluating different hypotheses for high failure rates in introductory computer science. A follow-up study, conducted with data from a university that offers only remedial sections in the spring term (removing the confounding effect of out-of-stream students in the same class), may be able to put the question to rest for good, and facilitate the design of future curricula.

References:

Robins, A. (2010). Learning edge momentum: A new account of outcomes in CS1. Computer Science Education, 20 (1), 37-71.

The author of this blog post, John Doucette, recently completed CTE’s Certificate in University Teaching (CUT) program. He is currently a Doctoral Candidate in the Cheriton School of Computer Science.

Peer Grading in Higher Education – Hadi Hosseini


Most educators regard peer assessment and peer grading as a powerful pedagogical tool to engage students in the process of evaluating and grading their peers while saving instructors’ time. This process helps improve students’ understanding of the subject matter and provides an opportunity for deeper reflection on it, engaging the higher levels of Bloom’s taxonomy of thinking.

Designing and distributing tasks and assignments for peer assessment should be as easy as assigning a few papers to each student and waiting for the magic to happen, right? …. Not really!

As instructors, we care about the fairness of our evaluation methods and about providing effective feedback. Yet throwing this crucial responsibility onto the shoulders of novice students who (hopefully) have just learned the new topic seems awfully risky. There are two major concerns when it comes to peer grading: inflated (or deflated) grades and poor-quality feedback. Both of these issues seem to originate from the same sources: insincere graders and the limited effort each student invests in grading their peers [2]. To address these issues, researchers have recently raised the question of whether we can design a peer-grading mechanism that incentivizes sincere grading and discourages any type of secret student collusion.

The simplest possible design is to evaluate the quality of the peer reviews, or, simply put, to “review the reviews” [1, 3, 4, 5]. This procedure is bulletproof, since no student can get away with poor-quality feedback or with deliberately assigning insincere grades. However, even though this technique may help us achieve the goal of involving students in the higher levels of learning, in most situations this mechanism is either costly in terms of TA and instructor time or simply impossible in large classes. In fact, this fully supervised approach defeats one of the main purposes of using peer grading by doubling or tripling the required grading effort: each marked assignment has to be reviewed by one or two TAs for quality.

As a partially automated solution, the system may randomly send a subset of graded papers to the Teaching Assistants (TAs) to perform a sanity check (instead of doing this for every single paper). In contrast, fully automated systems provide a meta-review procedure in which students evaluate the reviews by rating the feedback they have received [1, 3, 5] or by computing a consensus grade for assignments that are initially graded by at least two or three peer graders [5, 6].
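As a sketch of the consensus idea, the simplest aggregation rule is a robust average of the peer grades, with wide disagreement flagged for human review. The median rule and the disagreement threshold below are my own illustrative assumptions; the systems cited in [5, 6] use more sophisticated statistical models:

```python
import statistics

def consensus_grade(peer_grades, spread_threshold=15):
    """Aggregate several peer grades into one consensus grade.

    Uses the median, which is robust to a single inflated or
    deflated grade. If the peers disagree too widely, return None
    to flag the paper for a TA/instructor sanity check instead.
    """
    if max(peer_grades) - min(peer_grades) > spread_threshold:
        return None  # too much disagreement: send to a TA
    return statistics.median(peer_grades)

print(consensus_grade([72, 75, 74]))  # 74
print(consensus_grade([55, 90, 70]))  # None: needs human review
```

The threshold trades off automation against oversight: a tighter threshold sends more papers to TAs, while a looser one trusts the peer consensus more often.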

In a different approach, students are treated as potential graders throughout the term, and only those who pass certain criteria take on the role of independent graders [6]. The premise is that once an individual reaches a certain level of understanding, he or she can act as a pseudo-expert and participate in the assessment procedure. Of course, to ensure fair grading the system randomly chooses a subset of graded papers to be reviewed by the instructor.

Peer assessment is still in its infancy; nevertheless, a number of researchers in various disciplines are developing new techniques to address the critical issues of efficiency, fairness, and incentives. Each of the above methods (and many others that exist in the peer-grading literature) could potentially be adopted depending on course characteristics and intended outcomes. I do believe that such characteristics, at the very least, must include the following:

  • Skill/knowledge transferability: Do marking skills and the knowledge of a previous topic automatically transfer to the next topic? If so, are they sufficient?
    For example, an essay-based course may use similar marking guidelines in all its assignments, so training students once could be sufficient to turn them into effective peer graders.
  • Course material and structure: How do the topics covered in the course depend on one another? Is the course introducing various semi-independent topics, or do the topics all contribute to building a single overarching subject?

What do you think? Have you ever used peer-assessment in your classes?

 

References

  1. Cho, K., & Schunn, C. D. (2007). Scaffolded writing and rewriting in the discipline: A web-based reciprocal peer review system. Computers & Education, 48(3), 409-426.
  2. Carbonara, A., Datta, A., Sinha, A., & Zick, Y. (2015). Incentivizing peer grading in MOOCs: An audit game approach. IJCAI.
  3. Gehringer, E. F. (2001). Electronic peer review and peer grading in computer-science courses. ACM SIGCSE Bulletin, 33(1), 139-143.
  4. Paré, D. E., & Joordens, S. (2008). Peering into large lectures: Examining peer and expert mark agreement using peerScholar, an online peer assessment tool. Journal of Computer Assisted Learning, 24(6), 526-540.
  5. Robinson, R. (2001). Calibrated Peer Review™: An application to increase student reading & writing skills. The American Biology Teacher, 63(7), 474-480.
  6. Wright, J. R., Thornton, C., & Leyton-Brown, K. (2015). Mechanical TA: Partially automated high-stakes peer grading. In Proceedings of the 46th ACM Technical Symposium on Computer Science Education (pp. 96-101). ACM.