What should we be teaching CS students about collaboration?

If you read almost any academic blog, or Rate Your Students, or really any site that academics frequent, you'll encounter discussion, debate, and general bitching about students' lack of ability to (a) properly cite sources and (b) avoid plagiarism. Discussions with my academic friends in more writing-intensive disciplines bear out what cyberspace illustrates: students don't, or can't, or won't, properly cite sources.

This carries over into computer science too, and it's been on my mind lately: How do we teach students to properly "cite", and avoid plagiarism, within computer science? And how do we reconcile this with having students learn to work together in teams?

As computer science educators, we want our students to be able to demonstrate that they, independently, can design and implement an algorithm, apply and modify a solution, write and debug code. At the same time, we need to equip them with the tools they need to survive in the "real world": learning to program in teams, learning to ask for help, learning how to evaluate experts and sources of information. As an educator, I can see how these two concepts can peacefully coexist: you can have students program in teams, for instance, on larger assignments, and use tests or smaller assignments to evaluate students individually. In the best case scenarios, this works well. Best case, here, means that students know what constitutes "appropriate" levels of collaboration when working in teams and as individuals (i.e., how much "help" am I allowed to receive from a classmate when working on an individual assignment?).

I'm increasingly finding that students have a hard time seeing the line between acceptable and unacceptable collaboration... even when I feel I've defined that line fairly explicitly.

For instance, I often have my students work on assignments in teams of 2-3 (an arrangement called pair programming). With pair programming, there are strict rules in terms of how much time a student can spend at the keyboard, what the person not typing should be doing (debugging, suggesting lines of code/algorithmic tweaks), and on how work should be done (always together, never separately). This works pretty well most of the time: students seem to understand and adapt to the rules very well, and even the skeptics come to appreciate and enjoy the experience.

However, once the students move to individual assignments, some of them seem to really have problems figuring out the boundaries. Can they ask their old partner for help? Can they "share" code that they got working with a friend who's struggling? Should the friend, or the code-giver, mention this? Again, I feel like I'm pretty explicit as to what's acceptable (consulting with a classmate over a specific problem in one's code, or in formulating the algorithm) and what's not ("I'll do Part A and you do Part B and we'll combine them after the fact"), and I even give instructions and guidance in terms of how and when to "cite" your sources in your programs. Yet in almost every class, I have at least one student who skirts, or crosses over, that line. And in some of these cases, I do believe the student when s/he says that s/he wasn't aware of crossing that line.
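As an illustration only, here is a minimal sketch of what "citing" in a program could look like. The header layout, the names, and the `merge` function are all hypothetical, not the actual guidelines from my syllabus; the point is just that a plain comment saying who helped with what is enough.

```python
# Individual assignment (hypothetical example, not a real course handout)
# Author: A. Student
#
# Sources and collaboration (format is an assumption; any clear note works):
#   - Talked through the overall merging idea with a classmate, B. Classmate.
#     All code below was written by me.
#   - A lab assistant pointed out that I forgot to copy the leftover
#     elements after the main loop; I wrote the fix myself.


def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller front element from either list.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # Copy whatever remains (this is the step the lab assistant flagged).
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

A note like this distinguishes the acceptable case (talking over the algorithm or a specific bug) from the unacceptable one (splitting the work and combining it afterward), because it states who wrote the code.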

Clearly, banning all collaboration and consulting of outside sources is not a tenable solution, not to mention impractical and stupid. And clearly, it's impossible to lay out all possible scenarios for the students in advance. (If there's anything I've learned on this job, it's that students will interpret things in ways that I never intended and could never have anticipated.) So where's the middle ground? What expectations should computer science educators hold for their students in terms of knowing/reasoning about acceptable collaborative behavior? Should we be more explicit in our expectations? And if so, how explicit should we be, without veering into excessive hand-holding?

Just some food for thought today.

This is definite food for thought, and comes up in chemistry with lab work (and computer work), too. One idea (which I'll admit I've not used, but have colleagues who have) is to ask students to write, for each exercise, a one-paragraph description of how they approached the project, including their collaborations. It's not so much a policing exercise as it is a way to have a set of conversations with students about the ways we can and should work with each other AND appropriately acknowledge each other's work.

Thanks for sharing the "pair programming" guidelines...I have students code in my physical chemistry course and have been using a variant of these - but these are much better! (and I promise to acknowledge my sources ;-) ).

As a student, I believe that a large portion of the citation issue is that we never get a really straightforward guide to making citations. I swear that when I wrote History papers I would spend as much time trying to figure out just how to actually cite something as reading primary and secondary sources (there are at least 5 styles, some very different and some that look the same except for comma placement!). The only style I could consistently work with was MLA, since citations were easy and done in the text, without spending twenty minutes to format a footnote or endnote that my computer would likely whine about or that would cause formatting issues, etc. etc.

That said, I (and others I've met) have no issue "citing" things in code or when writing mathematics proofs. Case in point, while struggling with one proof in a set theory course, a friend of mine had discovered a very elegant method, and so I used that and explicitly wrote out that I had taken his method. There were no problems with that. It also didn't require esoteric formatting.

The usual solution for CS courses here seems to be only attacking obviously plagiarized code, and letting exams be the "sink or swim" factor for individual work. Not sure this is necessarily a "good" approach however. The department here is generally accepted to be rather incompetent as far as teaching goes.

By Numerical Thief (not verified) on 29 Apr 2008 #permalink

I have very successfully used a method that I learned from Lynn Andrea Stein, professor of CS at Olin College.

She focuses on the *process* of getting to the result, not on the result itself. Students submit reports (including time needed and reflections on the process) that describe what happened: I borrowed this bit from Mike and that bit from Anna and George and Sylvia and I spoke about this and that. Then I did this tiny delta, and guess what, it worked!

I don't focus on how many whistles and bells they implement, but how far they got and on the process description. It is very difficult (but not impossible, it has happened but was very easy to detect) to plagiarize a process lab report. And they are learning to give credit (who cares about footnote format, the important thing is that one is giving credit!). And learning each other's names.

The guys who have learned programming before bitch a lot about this (CS people don't like having to write complete sentences). But when they come back from industry for a visit they thank me for having taught them how to collaborate and how to write, which is very nice. And they make a lot of effort in their final thesis to quote everything correctly - pictures, code, text.

Of course, it may help that I do a lot of research in the area of plagiarism detection :), but I just caught one (a frosher) last week, so it is not a cure-all.

By Debora Weber-Wulff (not verified) on 29 Apr 2008 #permalink

This is an interesting question for me, for I believe that it's impossible to know everything and that we should be teaching students how to look things up and how to find answers to problems. Yes, students need to have some basic, fundamental skills, but the truth is that they will probably never have to do some of the things that we require in class such as implementing a doubly linked list. They will, however, have to know how to find the code of someone who has implemented it and then properly use it (with citations I guess). So, it's a challenge to figure out just how to structure assignments and how to teach students the way to properly use someone else's information.

By NewGradStudent (not verified) on 29 Apr 2008 #permalink

Debora,

They plagiarize in your classes?!?! Since the fact that you are an expert in this regard is not exactly a secret, these must really be the dumbest cheaters ever!

I like the idea of a report or paragraph describing the solution process---thanks for the idea, Michelle and Debora!

I also agree with and understand Numerical Thief's point about not knowing how/when to cite being part of the problem---I'm sure some (most? all?) of my students don't want to appear "dumb" for citing the "wrong" way or citing something they don't have to cite. I do provide a few examples on my syllabus of how to cite, and a few situations in which citing is appropriate. Perhaps I need to add language that explicitly says "when in doubt, cite!" Honestly, though, I don't care what format it takes, as long as it's done somehow.

NGS, learning how to reuse code---both writing code that can be used by others, and finding/using code written by others, in appropriate contexts---is an important part of CS. In the lower-level (intro and required-for-the-major) courses, the balance is definitely skewed towards "write your own, but write it in a reusable way" (to the extent that they can). Although I do often give my lower-level students "starter" code so that they do get in the habit early of using/modifying others' code. I suspect we/I don't do a great job of talking about code reuse in the upper-level courses, and/or giving students good practice scenarios and situations around that.