ASET-HERDSA 2000: Freeman and McKenzie - Self and peer assessment of student teamwork


Self and peer assessment of student teamwork: Designing, implementing and evaluating SPARK, a confidential, web based system

Mark Freeman
Faculty of Business, University of Technology, Sydney
Jo McKenzie
Centre for Learning and Teaching, University of Technology, Sydney

Students often enjoy learning in teams and developing teamwork skills, but criticise team assessment as unfair if team members are equally rewarded for unequal contributions. This paper describes the design, implementation and evaluation of a confidential, web based system for self and peer assessment of contributions to team tasks and team management roles, which enables shared team marks to be moderated to reflect individual contributions. The web based approach has several advantages compared with paper based approaches. For students, it enables the use of multiple assessment criteria which better reflect team contributions, improves familiarity with the assessment criteria and improves confidentiality as they can access the system and change their ratings as often as they wish until a pre-determined cutoff date. For staff, it has potential to improve student learning from teamwork tasks, and saves time by automating the process of calculating self and peer adjustments of assessment grades, enabling the system to be used in subjects with large enrolments.
In 1999, the system was trialed and evaluated in several different subjects with different kinds of assessment tasks. Evaluation findings suggested that most students appreciated the confidentiality of the system and felt that the system was a fair way of assessing team contributions. However there was considerable variation across the different subjects. This paper presents four brief case studies which describe the different uses of the system, the responses of staff and students and the lessons learned. The differences point to the potential usefulness of the system. More critically however, they provide evidence for the range of factors that teachers need to consider when planning to use SPARK and integrating it successfully into the subjects.

Introduction

Many courses aim to develop students' ability to work as part of a team and include team assessment tasks such as presentations, projects, case studies, reports and debates. Students often enjoy learning in teams and developing teamwork skills, but criticise team assessment as unfair if team members are equally rewarded for unequal contributions. This paper describes the design, implementation and evaluation of the Self and Peer Assessment Resource Kit (SPARK), a web based system which aims to improve learning from team assessment tasks and make the assessment fairer for students. SPARK enables students to rate confidentially their own and their peers' contributions on a range of criteria related to team tasks and maintenance, so that shared team marks can be adjusted to acknowledge individuals' contributions. If it is thoughtfully implemented in a subject, SPARK can also encourage students to learn more about effective teamwork. The aims of this paper are to illustrate the potential benefits of the system, but also to share the lessons learned with teachers seeking to implement similar approaches. Our hope is that colleagues designing learning environments incorporating teamwork will be encouraged to use self and peer assessment of team contributions. We wish to promote any method, including SPARK, which encourages students to develop their capacity to work in a team and engage productively in teamwork tasks.

Group and team work are commonly used in higher education to facilitate peer learning and encourage students to develop their capacity to work as part of a team. There seems little argument about the value of teamwork, but its assessment has proved considerably more problematic (Conway, Kember, Sivan & Wu, 1993; Lejk, Wyvill & Farrow, 1996). One author has likened group assessment to a game, maintaining that the rules of the game advantage some students and disadvantage others, and that factors such as teamwork and contribution to a team are "essentially impossible to assess fairly" (Pitt, 2000, p. 240). However, assessment strongly influences students' learning (Ramsden, 1992; Biggs, 1999). If our courses include objectives about students' capacity to work as part of a team, and we value peer learning and collaboration, then we need some means of assessing teamwork in a fair and meaningful way which promotes peer collaboration (Sampson, Cohen, Boud & Anderson, 1999).

Peer and self assessment should be a justifiable way of assessing team contributions, as it gives team members the responsibility for negotiating and managing the balance of contributions and then assessing whether the balance has been achieved. Peer assessment of individuals' contributions to assessed teamwork is not a new idea, although the addition of self assessment is relatively innovative. While there is some debate about the inclusion of self assessment (Lejk et al., 1996), we believe it encourages students to reflect on their own contributions and capabilities. In fact, Boud, Cohen and Sampson (1999) favour self assessment informed by peer feedback on specific criteria, in preference to peer assessment per se.

The SPARK system uses both self and peer assessment and was adapted from a well-designed and evaluated paper based peer assessment system in which students rated each other's contributions and the lecturer used the ratings to calculate adjustments to individual marks (Goldfinch 1994). Goldfinch's approach uses multiple assessment criteria to encourage students to consider a range of different components of the task and team process in making their assessments. The approach was a simplification of an earlier approach taken by Goldfinch and Raeside (1990) which involved a two part assessment form where students were prompted to identify peers who made the greatest contribution to particular task elements, then used these promptings to give peer ratings. A related, simplified approach was used by Conway et al (1993) and found to be reasonably well accepted and regarded as fair by their students. Cheng and Warren (2000) also used peer assessment against multiple criteria to moderate group marks, noting that the approach "facilitates the benefits of group work while providing opportunities for peer assessment" (p. 253).

While these related methods have all been reasonably effective in adjusting team marks to reflect individual contributions, they have all been paper based and involved a series of time consuming calculations to generate adjustment factors. This creates a disincentive for lecturers and delays the provision of feedback to students. The Goldfinch (1994) and Conway et al (1993) simplifications of the original Goldfinch and Raeside (1990) scheme were attempts to reduce this problem, but proved unmanageable with class sizes in excess of 500 students. The SPARK system deals with this problem by automating the processes of collecting the student assessments and completing the calculations. This is a major efficiency benefit, but efficiency was not the only rationale for designing the system.

SPARK also aims to improve students' learning from team tasks in several ways. Firstly, provided valid assessment criteria are chosen, the system should encourage students to negotiate the way they will work in the team to achieve the best task result with equal contributions by all students. Secondly, using self and peer assessment encourages students to develop the capacity to reflect on and evaluate their own and others' contributions and develop awareness of their own strengths and needs as a team member. Another major benefit of SPARK is that it is a relatively generic template which can be easily adapted to any learning context where group or team work and/or self and peer assessment are used.

The next section of this paper describes the SPARK system and its typical implementation in a subject. This is followed by a description and analysis of the iterative process of designing, developing, implementing and evaluating SPARK, first in one subject and then in three further subjects in very different contexts. The final sections discuss the lessons learned, the implications for future use of SPARK, and the wider implications of the process for project developers who seek to disseminate generic templates across teachers, subjects, disciplines and institutions.

Description of SPARK and implementation in a typical context

SPARK is intended for use in subjects where the objectives include developing students' capacity to work as part of a team and to reflect on their own teamwork skills, where there is assessed teamwork, and where other forms of web enabled learning are embedded in the subject. The system contains separate interfaces for "instructors" and students, both of which include help information and frequently asked questions accessible from the login screen. Figure 1 shows the login screen for the instructor system.

Figure 1: Staff/Instructor login screen

Figure 2 shows a typical screen within the system, showing the menu bar in the top panel and help information in the left hand panel.

Figure 2: SPARK screen for instructors to enter or import student details

Teachers begin using SPARK by entering subject and student details. Student details can be batch imported. Assessment criteria need to be decided, either by the teacher or by the teacher in negotiation with students. Teachers need to decide which criteria will be prompting only, and which will contribute to the final assessment. The next, and very important, stage is helping students to understand how group assessment aligns with the subject objectives, the rationale for using SPARK, how they can use it to assist their teamwork and how the self and peer assessment will affect their marks. How groups are formed and facilitated are important issues but are not dealt with in this paper. If teams are formed by the academic then they can be batch imported. Otherwise the students can register their own teams. Students can modify self chosen team memberships until a close off date. (SPARK does not allow any overlap between the team registration period and the subsequent rating period.)
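The rule that the registration and rating periods must not overlap can be sketched as a simple date check. This is an illustrative sketch only, not SPARK's own code: the function name `check_periods`, its parameters and the example dates are hypothetical, and it assumes the rating period must open strictly after team registration closes.

```python
from datetime import date


def check_periods(registration_close: date, rating_open: date, rating_close: date) -> None:
    """Raise ValueError if the rating period overlaps team registration.

    ASSUMPTION: the rating period must open strictly after the day on which
    team registration closes; the paper only says the periods cannot overlap.
    """
    if rating_open <= registration_close:
        raise ValueError("rating period must start after team registration closes")
    if rating_close < rating_open:
        raise ValueError("rating period must not end before it opens")


# Hypothetical dates: registration closes in April, ratings run for a week in June.
check_periods(date(1999, 4, 30), date(1999, 6, 1), date(1999, 6, 8))
```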

Once students and teams are defined and assessment criteria are chosen, students can access the system as often as they wish to view and discuss the assessment criteria with their team members. After the group task is over, a defined rating period allows students to confidentially rate each member of their group. Figure 3 shows an excerpt from a typical self and peer assessment form, showing some sample criteria and ratings where 0=no contribution to the team for that aspect, 1=below average, 2=average and 3=above average contribution.

After students have done their assessment, teachers can then use the system to calculate various self and peer assessment factors, based on the formulae published in Goldfinch (1994). These factors can be exported to a spreadsheet for calculating individual marks (if the purpose is summative assessment) or used as a source of feedback to students (if the purpose is formative assessment).
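The kind of calculation involved can be illustrated with a short sketch. The formulae from Goldfinch (1994) are not reproduced in this paper, so the example below assumes a plain ratio: each member's total received rating divided by the team's average total. The function names, student names and ratings are all hypothetical, and the sketch should not be read as SPARK's actual algorithm.

```python
from statistics import mean


def adjustment_factors(ratings):
    """Compute one adjustment factor per team member.

    ratings[rater][ratee] is a list of criterion scores on the 0-3 scale
    used on the rating form, and includes each member's self rating.

    ASSUMPTION: factor = member's total received rating / team average total.
    This is a simple illustrative ratio, not necessarily Goldfinch's formula.
    """
    members = sorted(ratings)
    totals = {
        ratee: sum(sum(ratings[rater][ratee]) for rater in members)
        for ratee in members
    }
    team_average = mean(list(totals.values()))
    if team_average == 0:
        raise ValueError("no contributions were recorded for this team")
    return {member: totals[member] / team_average for member in members}


def individual_marks(team_mark, factors):
    """Scale the shared team mark by each member's factor (summative use)."""
    return {member: team_mark * factor for member, factor in factors.items()}


# Hypothetical team of three, rated on two criteria.
ratings = {
    "ana": {"ana": [2, 2], "ben": [2, 2], "cy": [1, 1]},
    "ben": {"ana": [3, 2], "ben": [2, 2], "cy": [1, 0]},
    "cy":  {"ana": [2, 3], "ben": [2, 2], "cy": [2, 1]},
}
factors = adjustment_factors(ratings)
marks = individual_marks(70, factors)
```

In this made-up team the received totals are 14, 12 and 6, so the factors are above 1 for "ana" and "ben" and below 1 for "cy". Under this assumed ratio the factors always average to 1, which keeps the team's mean mark equal to the shared team mark.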

Figure 3: Excerpt from a typical SPARK self and peer assessment form

While the front end which students and staff see is written in HTML for the web (and not some proprietary network or program), the back end database, operating system and programming approach have changed several times as technologies have advanced in the last 5 years. The current version runs on a Windows NT server with an Access database. In 1998, queries to the Access database were written in Java and JavaScript. More recent programming has been aided through the use of ServletExec servlets. The system is still being refined.

Case studies of development, implementation and evaluation

Development of the current generic system has involved an iterative process of developing, trialing, evaluating and refining SPARK over the past eighteen months, based on prototype development over several years. Funding for the development of the generic version was provided through the Committee for University Teaching and Staff Development, an Australian government funding body. The development team comprised academics from five discipline areas across two universities, a programmer and academics from specialist academic development units. One of the disciplinary academics has particular expertise in graphic and interface design. Ongoing quantitative and qualitative feedback from students, academic users, experts and participants in workshop seminars has resulted in continuous improvements to the program.

The team have utilised a web supported conferencing program for project management, asynchronous discussion of issues, brainstorming and as a repository of key materials. Educational specialists in self and peer assessment have been consulted for expert reference during the development period at conferences, by visits and by email. Although students are central to any educational context, our evaluation attempted to consider other stakeholders such as the effects on staff, academic departments, the institution, and the wider context. This holistic approach reflects our perspective that unless all parties and issues are considered, a disjointed impression can be given. Details of the findings and broader issues will be published in a later paper. Issues of particular relevance to students and academics are woven through the following four cases.

Subject A: Large first year subject where the prototype was developed and evaluated

Development of SPARK prototypes began in March 1996 with the aim of improving learning from teamwork assessment, reducing student complaints about free riding and improving administrative efficiency in a first year subject with a very large enrolment (Subject A). Assessable group work was introduced in the subject in 1992, and involved a case study task worth 30%. The assessment task reflected a real world problem that students would face on graduation. The case study was submitted in three parts over the semester, with feedback on each part being built into the next stage. End of semester feedback from students indicated satisfaction with the nature of the case study but substantial dissatisfaction with group assessment, due to perceptions of unequal contributions. From 1993 to 1995, various approaches were used to adjust team marks to individual marks. These included student activity diaries, one line summaries of self and peer assessment and asking groups to divide up the mark. Each of these yielded less than satisfactory results.

By 1996, the number of students enrolled in the subject had risen to 850 and a series of "homebrew" web pages and simple applets was used to facilitate staff-student and student-student interaction and provide access to additional materials. The first prototype of SPARK was developed and trialed in the web environment. Assessment criteria were developed using a focus group of previous students, which identified 16 sub-tasks involved in the completion of the case study. These task criteria were supplemented by a further six criteria related to team maintenance and leadership roles. Students registered their team and had access to the criteria before the case study began. Following submission of the third stage of the case study, they submitted ratings of their own and their peers' contributions.

A number of benefits for students arose from using SPARK. Students perceived the process was fairer because the rating items reflected a range of aspects of the team task and maintenance roles, and the self and peer assessment could be done confidentially and changed as many times as they wanted until the deadline. The latter enabled students to change their ratings privately if others had publicly coerced them. Students also used the obvious nature of the items and method of calculation to manage team effort during the process and even affect their choice of group membership. Open ended responses on the end of semester student feedback survey showed a dramatic reduction in complaints about the team assessment.

The main relevant cost to students was that of access to SPARK. Less than 10% had external access to a web connected PC and students were heavily competing for lab facilities around the time of the deadline. This problem has largely disappeared as external web access amongst these students is now closer to 90%. The other access problem arose because of occasional bugs in the software. When it comes to assessment, even occasional bugs can be very frustrating and stressful.

Staff experienced a number of benefits from incorporating SPARK. Firstly, it was possible to retain the case study as a valuable learning task. Without a solution to the free-rider complaint, it was likely that the task would have been dropped. Secondly, staff felt satisfied that the process of assessment was fair. This was ethically important as well as increasing student satisfaction. Student comment about the case study being valuable for learning in the subject re-emerged. Thirdly, staff felt a sense of satisfaction that progress was made in a conscious attempt to develop students' ability to work in a team, a capability they knew was important in the profession. Without such easy data collection on multiple rating items, this would not have been possible. Another indicator of success was that despite the growing number of teams, the number of team problems needing staff intervention had reduced substantially.

There were several costs to staff. Firstly, having to reprogram the system for several different platforms was time consuming. Technical bugs were bound to occur in such an environment, as they do in any development mode. But students are very unforgiving when it comes to technical bugs and assessment. Because student feedback was both anonymous and of a public nature on the web discussion forum, even if only a few experienced problems, their comments could be loud and strong. While this was very discouraging during the developmental phase, it was a strong incentive to improve. Secondly, in the first three years, the calculation of the factors was very time consuming, taking up to 3 days on Excel because there could be between 250 and 300 groups. The latter was rectified in the generic version.

The lessons learned in this large class provided valuable insights into self and peer assessment and group work for the wider university community and also helped the subsequent development of the generic version.

Subject B: mid degree undergraduate with one team task

Subject B was one of the first subjects to trial the initial generic version of SPARK, while it was being developed. It is a mid-degree subject in a different discipline from subject A. Typical enrolments are around 50-80 students with day and evening classes both taught by the same lecturer. Students use a web enabled learning system within the subject to access announcements, interact with the lecturer and with each other and access a range of additional subject materials. The assessment in the subject includes a group case study.

SPARK was trialed in the subject in first semester 1999. Because of the development timeline, students first gained access to the system in the middle of the semester, at around the time they started the case study. The assessment criteria were taken directly from those used in Subject A, rather than being customised. The lecturer perceived that they were sufficiently appropriate as both assessment tasks involved a case study. As in Subject A, students chose their own teams.

Evaluation of SPARK involved a questionnaire followed by a focus group with both the day and evening classes, and a reflective diary kept by the lecturer. The student questionnaire included rating and open ended questions. It asked questions about useability, reactions to the system, and perceptions of learning from the self and peer assessment process. Table 1 shows student responses to some of the rating questions.

Table 1: Student responses to SPARK in Subject B (n=48)

Statement                                                      SA/Agree   SD/Disagree
The system was accessible                                      79%        8%
The system was easy to use                                     70%        13%
The process helped me learn more about teamwork                40%        24%
Identified aspects of teamwork I hadn't thought about before   41%        27%
Items were appropriate for assessing contributions             69%        9%
Encouraged greater effort                                      40%        33%
Able to give an honest assessment                              78%        11%
Fair way of assessing team contributions                       69%        18%

The percentage of students who reported that the process had helped them to learn more about teamwork was encouraging, considering that most students had encountered team tasks in previous subjects. It was also interesting that 40% felt it encouraged them to make more effort whereas 33% disagreed, the latter often commenting that they were self motivated to contribute or wanted to do well and did not need the external incentive to make an effort.

Responses to the open questions and the focus group suggested that many students perceived the purpose of the system as encouraging equal contributions by team members, or controlling free-riders. While most students perceived that the system was fair, some disagreed, particularly if they had worked in groups of three rather than four. Some students clearly did not understand exactly how the self and peer assessment ratings would affect their marks. These perceptions appeared to reflect the way that the lecturer introduced and explained the system. The lecturer perceived the main benefit to be reducing free riding.

Disadvantages for both the lecturer and the students focused on the useability of the system, in particular bugs and other technical problems which happened during the development. Discussions of useability resulted in some changes to the system, including simplifying the password system to make it the same as that used in the web based learning system, and providing feedback messages to confirm that students' ratings had been submitted successfully.

Subject C: mid-degree undergraduate with two team tasks

Subject C is an intermediate stage subject in a different discipline again and is taught by a team of several lecturers. The subject was offered for the first time in the semester when SPARK was trialed and had around 200 students enrolled. Students in the subject work on two major assignments in cross-disciplinary teams. Team membership is allocated by the subject lecturers to ensure that a range of disciplinary majors is represented in each team. Students participated in team development activities in the tutorials before they commenced work on their assignments.

SPARK was introduced and used for formative feedback to teams at the end of the first team assignment and summative assessment at the end of the second task. This was an innovative use and some of the teaching team perceived that it should encourage teams to discuss the way they worked and work more effectively for the second task. The assessment criteria were the same as those used in Subject A. Students gained access to SPARK shortly before the end of the first task. Evaluation included rating and open ended questions on SPARK as part of a standard subject evaluation questionnaire, a student focus group, lecturer reflection and a focus group with the teaching team. Students were asked fewer questions than in Subject B, because the teaching team wanted to ask many questions about other aspects of the new subject. Table 2 shows some responses.

Table 2: Some student responses in Subject C (n=187)

Statement                                                                                                           SA/Agree   SD/Disagree
Self and peer assessment feedback after assignment 1 helped the team to work more effectively on assignment 2      20%        47%
Items were appropriate for assessing contributions to the team assignments                                         42%        37%
Team development tutorial activities helped me learn more about teamwork                                           36%        35%

Introduction of SPARK in this subject had more disadvantages than benefits for students and lecturers, resulting in some valuable lessons learned for the development team. Formative use of the system created breakdowns in some teams when team members who perceived themselves to have contributed equally ended up with different peer assessment ratings. Teaching team members reported more of these team problems than they usually experienced in similar subjects. Several factors appeared to contribute to this. The assessment criteria were taken from subject A rather than specifically chosen to reflect the team tasks that students had to do. While some items on the "subject A" form might be said to be generic qualities of teamwork (see Figure 3), others were not. Only 42% of students agreed that the items were appropriate for assessing contributions, compared with 37% who disagreed. In subject B, 69% had agreed and 9% disagreed.

Further analysis of the open ended responses and discussion in the focus group yielded other reasons. Students did not fully understand how the system worked, and in particular how ratings on each of the assessment criteria would affect the overall self and peer ratings. They also felt that they had spent tutorial time in discussing how their teams would work, and their discussions were not reflected in the criteria used in SPARK. Formative feedback was given only in the form of the overall self and peer adjustment factor, rather than as a profile of contributions which could be discussed in a team.

Despite the problems however, students generally perceived that SPARK had a useful purpose if it were appropriately implemented, as illustrated in the following quotes:

"made you think about how much each member and yourself contributed to different aspects of the assignments"
"to evaluate contributions from each team member by team members to get a fair distribution of marks. It still didn't work."

Students also described difficulties with accessing SPARK, and complained about the time taken to access the web and complete the process in a subject where web enabled learning resources were not otherwise integrated. Teaching team members also complained about technical problems and difficulties in calculating the required factors for formative purposes. While some teaching team members sought to improve SPARK's use and maintain it in the subject, others sought to drop it entirely. For the development team there were some significant lessons learned, which are discussed more fully later in this paper.

Subject D: postgraduate with integrated flexible learning

Subject D is a postgraduate subject taught jointly by two lecturers from different faculties. About 30 students take the subject, which has been significantly adapted from the typical on campus 13 week semester mode. The 'weekly lecture and tutorial' format has been replaced by 4.5 Saturdays of face to face contact, about one day per month. In between the block sessions, students complete a range of learning activities by themselves and in groups, with most activities involving interaction using a web enabled learning system.

The subject has eight objectives - two related to knowledge development, four related to the development of capabilities for using that knowledge (eg. critically evaluate problems and alternative solutions; effectively use analytical tools; competently use technology; communicate effectively to develop and maintain personal and professional relationships) and two objectives related to values (ie. able to work self critically in a group or autonomously, respecting different cultures, ethical and disciplinary approaches). Assessment is aligned carefully with learning activities to ensure subject objectives are achieved. 50% of the grade is allocated to individual work. The remaining 50% of the grade is based on four group assessment tasks. A team presentation worth 20% and 3 team tests where the average of their best two is worth 10%, are conducted in class but require significant preparation out of class. A team debate worth 10% and a team topic tracking exercise worth 10%, are completed out of class time but submitted online.
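The group half of the grade can be worked through as a small arithmetic sketch. The weights (20%, 10%, 10%, 10%) and the "average of the best two of three tests" rule come from the description above. The function name, the assumption that every task is scored as a percentage, and the example scores are hypothetical.

```python
def group_component(presentation, tests, debate, tracking):
    """Return the group marks earned, out of the 50 available.

    All inputs are percentages (0-100); `tests` holds the three team test
    scores, of which only the best two are averaged.
    ASSUMPTION: each task is scored as a percentage before weighting.
    """
    if len(tests) != 3:
        raise ValueError("expected exactly three team test scores")
    best_two = sorted(tests, reverse=True)[:2]
    tests_average = sum(best_two) / 2
    return (
        0.20 * presentation    # team presentation, worth 20%
        + 0.10 * tests_average  # best two of three team tests, worth 10%
        + 0.10 * debate         # team debate, worth 10%
        + 0.10 * tracking       # topic tracking exercise, worth 10%
    )


# Hypothetical team: the weakest test (60) is dropped, so the tests average 80.
# Weighted parts: 16 + 8 + 7.5 + 6.5 = 38 marks out of 50.
team_total = group_component(presentation=80, tests=[60, 90, 70], debate=75, tracking=65)
```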

With 50% of the grade comprising group assessment tasks, students need to deal seriously with their own and others' abilities to work in a group. Not only do groups face the common problem of free-riders, but the potential for dysfunction in groups is higher because of language and cultural differences. Up to 70-80% of the student cohort are international students, coming from a wide variety of countries where English is not the native language. To optimise the potential benefit of working in a group, the membership is static for the duration of the semester.

Following completion of the final team assessment task, students rated each team member. In 1998 the self and peer assessment process was completed on paper at the final face to face session and then manually entered by staff into an Excel spreadsheet which calculated the self and peer adjustment factor identified by Goldfinch (1994). In 1999, SPARK was used for data entry when the students rated each other online over a one week period, and staff used it for the subsequent calculation of the self and peer assessment adjustment factors. Sixteen 'prompting' criteria were specifically chosen for the four team tasks. For example, students evaluated their own and their peers' contributions on two aspects of the topic tracking task (ie. 'quality of postings' and 'quantity of postings'), five aspects of the debate, six of the presentation, and three of the tests. This was followed by six 'final' criteria relating to an effective team. Evaluation of the process was carried out using student questionnaires, a structured phone interview with almost all students and reflective journals kept by the two lecturers who jointly taught the subject.

In the student feedback survey, only 8% did not feel that their ability to work in a team had been enhanced through the team exercises. This is a very positive result given that most are mature learners who would have had ample opportunities to develop this skill in their work experience prior to enrolling. The phone interviews revealed that most students felt the rating items were appropriate and that the process was a fair and honest solution for encouraging teamwork overall. Only some 10-14% disagreed with each of these statements. Most students thought SPARK should be implemented wherever group work is used. Interestingly, some 40% said that they did not contribute a greater effort because self and peer assessment was used. Combined with the previous data, this is a positive outcome since it means that free riding was discouraged without pressuring already committed students to do more work.

Staff found SPARK saved them considerable time previously spent on data entry and calculation. They also felt satisfied that the process had encouraged students to achieve the subject objectives, including the development of their ability to work in a team, in the more flexible learning mode.

Lessons learned and future potential

Clearly there have been different experiences for both students and staff in using SPARK in these different subjects. These differences pointed to a range of important lessons for the development team. Some are specific to SPARK and others have implications for the use of generic templates more widely. We will firstly discuss those specific to the use of SPARK.

SPARK works best when students can see the valid reasons for having a team task in the subject, and for using self and peer assessment of team contributions. It is only one approach for ascertaining students' contributions to teamwork, and, like all approaches, needs to be educationally justifiable. Teachers wanting to use SPARK need to align its use as a learning activity and assessment tool with the learning objectives for the subject. This alignment is important in any subject, as it focuses students' learning towards desired outcomes (Biggs, 1999). Teachers need to consider what they hope students will learn from teamwork tasks and represent this in their subject objectives. Criteria for the self and peer assessment processes in SPARK then need to be chosen to reflect the objectives and the task and team management contributions necessary to complete the team task. Relevant criteria are crucial to the success of SPARK, as illustrated in the differences between subject C and the others where SPARK was trialed. Criteria are important in any form of assessment, but even more so when assessment involves processes which may be unfamiliar to many students. Involving current and/or past students in selecting the criteria can greatly enhance students' perception of relevance and fairness of the self and peer assessment process.

Once criteria are decided, the teacher then needs to decide which items will be used in any calculation. If SPARK is purely used to assess teamwork processes, some criteria may simply be used to prompt students' memories of the task activities that the group undertook rather than affect a self and peer assessment adjustment factor (cf Goldfinch and Raeside, 1990). On the other hand, criteria relating to team task components may be used in the final calculations. Whatever approach is chosen, students need to be fully informed about how each of the criteria affects the self and peer ratings which are used to adjust their marks. This is a critical point for increasing students' perceptions of the fairness of the system.
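As a rough illustration of the choice described above, the following Python sketch separates criteria that count towards an adjustment factor from those used only as prompts. The criterion names, the rating values, the helper functions (`adjustment_factors`, `adjusted_mark`, `uniform_ratings`) and the simple ratio formula (a member's total counted rating divided by the team mean) are all assumptions made for illustration. They are not taken from SPARK, whose actual calculation may differ.

```python
from statistics import mean

# Hypothetical criteria. True = counts towards the adjustment factor;
# False = shown only to prompt students' memories of team activities.
CRITERIA = {
    "attended team meetings": False,
    "contributed ideas": True,
    "completed agreed tasks": True,
}


def adjustment_factors(ratings, criteria=CRITERIA):
    """Return {member: factor} from ratings[rater][ratee][criterion].

    Assumed formula (illustrative only): a member's total rating on the
    counted criteria, summed over all raters including self, divided by
    the mean of those totals across the team.
    """
    counted = [name for name, used in criteria.items() if used]
    members = list(ratings)
    totals = {
        ratee: sum(ratings[rater][ratee][c] for rater in members for c in counted)
        for ratee in members
    }
    team_mean = mean(totals.values())
    if team_mean == 0:
        raise ValueError("all counted ratings are zero; factor is undefined")
    return {member: totals[member] / team_mean for member in members}


def adjusted_mark(team_mark, factor, max_mark=100.0):
    """Scale the shared team mark by the factor, capped at max_mark."""
    return min(max_mark, team_mark * factor)


def uniform_ratings(levels, criteria=CRITERIA):
    """Build example data in which every rater gives each ratee the same
    rating (levels[ratee]) on every criterion."""
    return {
        rater: {ratee: {c: level for c in criteria} for ratee, level in levels.items()}
        for rater in levels
    }


if __name__ == "__main__":
    ratings = uniform_ratings({"Ana": 3, "Ben": 2, "Cai": 1})
    for member, factor in adjustment_factors(ratings).items():
        print(member, factor, adjusted_mark(60, factor))
```

Because the prompt only criterion is excluded from the totals, changing a rating on it leaves every factor unchanged. Showing students a small worked case of this kind is one way of making clear which criteria affect their marks.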

As with any assessment, some students may query their result. Self and peer assessment is no different and some group members may dispute the outcomes of the self and peer assessment process. Such disputes can be minimised by clear articulation of the process before the group task begins. Examples demonstrating the range of outcomes arising from different levels of contribution and ratings can be very illuminating for students, and are incorporated in the student interface. Other preventative measures or resolution mechanisms can help. For example, students can be required to keep an individual and/or group diary of effort and events. This would be the first resource in the event of a dispute. The message from this is that SPARK is not a hands off tool that the teacher puts into the subject to manage teamwork. Teachers are still required to think critically about its usefulness, make the process as transparent and open as possible for students and maintain hands on processes for communicating with teams and resolving conflicts.

The above issues point to the need for teachers to think carefully about how SPARK is integrated into the assessment for the subject and how it is made clear for students. Another set of lessons relates to the fact that SPARK is a web enabled template. In subject C, SPARK was the main, if not the only, reason why students needed to access the web in the subject, whereas in subject D many of the students' learning and assessment activities took place in a web enabled learning environment. If SPARK is the only subject activity which requires access to the web, students tend to regard it as an add on and see access as much more of a problem. Our recommendation is that SPARK only be used in subjects where web enabled learning is already an integrated part of the learning environment.

A further major point relates to the context of trialing and evaluating a system while it is still in development. This has some major benefits for progressively improving the system, but also some disadvantages if development work does not keep to a planned timeframe. It is critical for any assessment related system to be accessible to students as early as possible in the semester and to remain accessible and easy to use throughout the assessment process. Downtime and bugs in SPARK were frustrating for students and stressful for staff if they were unable to gain instant solutions. With a small project under development and one programmer providing technical support it was not possible to provide 24 hours a day, seven days a week support for students and staff. This is an increasing expectation when systems are available via the web. Students and staff need to be aware of this, and staff need to have clear alternatives available if students find that they cannot gain access to systems at critical periods during the assessment.

In summary, the following factors were identified as characteristic of subject environments where SPARK was more successfully implemented:

Assessment criteria were designed specifically for the team task. Task and team management roles were identified, preferably in negotiation with students;
The system and assessment criteria were available from the beginning of the team task and could be accessed by students as often as desired;
Staff were convinced of SPARK's usefulness;
Staff helped students to gain a clear understanding of how SPARK worked, why it was introduced, how it could be used and the effects it would have on marks;
Students regularly used other web resources for learning in the subject and doing team tasks, so that web access was not just required for SPARK;
The system was reliable and accessible throughout the semester;
Staff felt well supported at all levels by their colleagues and academic departments and by technical support personnel.

All of these points are applicable to the design and implementation of any educational technology tool aimed at improving either learning or efficiency (Alexander and McKenzie, 1998). Most, such as appropriate criteria and clear communication with students, simply relate to overall good teaching practice (Ramsden, 1992; Biggs, 1999).

Future directions for the development and use of SPARK include developing ways of providing students with formative profiles of their self assessments and the combined peer assessments against each of the individual criteria. This may enable students to see the differences between their own and their peers' perceptions of their contributions, and discuss these in their teams. This would be considerably more informative and hopefully more constructive than the approach used in subject C of giving students the numbers only. A second potential direction is the use of SPARK for self and peer assessment, against specified task criteria, of tasks which are not initially done in teams. This use has not yet been trialed, and careful thinking will be needed before moving in this direction.
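The formative profile described above as a first direction could be assembled along the following lines. This is a minimal sketch only: the nested `ratings` structure, the `formative_profile` function and the criterion names are hypothetical and do not describe an existing SPARK feature. Peer ratings are averaged in this sketch so that individual raters are not identified.

```python
from statistics import mean


def formative_profile(ratings, member):
    """Compare a member's self ratings with the combined peer ratings.

    ratings[rater][ratee][criterion] holds a numeric rating (a hypothetical
    structure). Returns {criterion: (self_rating, mean_peer_rating, gap)},
    where gap = self_rating - mean_peer_rating.
    """
    peers = [rater for rater in ratings if rater != member]
    if not peers:
        raise ValueError("at least one peer rating is needed")
    profile = {}
    for criterion, self_rating in ratings[member][member].items():
        peer_mean = mean(ratings[peer][member][criterion] for peer in peers)
        profile[criterion] = (self_rating, peer_mean, self_rating - peer_mean)
    return profile


if __name__ == "__main__":
    example = {
        "Ana": {"Ana": {"contributed ideas": 3, "completed agreed tasks": 2}},
        "Ben": {"Ana": {"contributed ideas": 1, "completed agreed tasks": 2}},
        "Cai": {"Ana": {"contributed ideas": 2, "completed agreed tasks": 2}},
    }
    for criterion, row in formative_profile(example, "Ana").items():
        print(criterion, row)
```

A positive gap on a criterion indicates that the student rated their own contribution above the combined view of their peers, which is the kind of difference a team discussion could then explore.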

We believe that SPARK has considerable potential as a "generic" template for improving group based assessment and students' capacity to work as part of a team, and has possible uses in other areas. However the above points need to be addressed in all contexts where it is implemented, and teachers who do implement it need to continue to evaluate its benefits and downsides.

Implications for the development and implementation of generic templates for technology enabled learning

The process we used in implementing and evaluating SPARK in a series of subjects also yielded some important insights for those involved in developing other small scale generic tools. We learned that there is considerable value in implementing and evaluating across multiple subjects and disciplines and with multiple teachers. This served to raise awareness of some taken for granted assumptions made in the innovation context, particularly those which relate to the intentions of SPARK, its integration into a subject and communication with students.

Our preliminary recommendations for those involved in developing and implementing a "generic" template in their subject include the following:

Staff need to understand and value the learning principles underpinning the template design, and communicate these to students. In the case of SPARK, this meant seeing the template as a tool which could be used to help students to learn about successful teamwork in the subject, rather than merely seeing it as an efficiency device or way of controlling team free-riders. We know that there is variation in the conceptions of teaching that university teachers are aware of in relation to particular teaching contexts (Prosser and Trigwell, 1999; Prosser, Trigwell and Taylor, 1994; Samuelowicz and Bain, 1992). A template which is designed by innovators using complex, student focused conceptions may not be implemented effectively by teachers working with teacher focused conceptions in their subjects.
Clearly the best approach to this issue would be to encourage teachers to adopt student focused conceptions and approaches to implementation, and this would be our advice to staff developers seeking to assist with implementation, but the learning design of the template also needs to be clear rather than hidden. Template designers therefore should make the learning intentions of the template very explicit, and make explicit links between these intentions and useful strategies for implementation.
Staff need to see the differences between their own context of use and the template design context, to enable them to see whether the template could be appropriate and how it might need to be adapted. This suggests the desirability of accompanying templates with case studies highlighting the critical integration features which teachers could compare with their own context.
Staff need to adapt the template to integrate it into the learning context in their own subjects - templates are rarely likely to be so generic that they can simply be picked up and used, and the way that they are integrated is critical for student acceptance and learning. Staff need to consider all aspects of the learning context, including subject objectives, assessment, learning resources, students' prior experiences and expectations and the expectations of other staff teaching in the subject.

By providing a range of examples of implementation and evaluation, both successful and problematic, we hope to encourage teachers to think through the issues that they may encounter in introducing SPARK or other generic tools into the learning environments they design for students.

References

Alexander, S. and McKenzie, J. (1998). An evaluation of information technology projects for university learning. Canberra: AGPS.

Biggs, J. (1999). Teaching for quality learning at university. Buckingham: SRHE and Open University Press.

Cheng, W. and Warren, M. (2000). Making a difference: using peers to assess individual students' contributions to a group project. Teaching in Higher Education, 5(2).

Conway, R., Kember, D., Sivan, A. and Wu, M. (1993). Peer assessment of an individual's contribution to a group work project. Assessment and Evaluation in Higher Education, 18(1), 45-56.

Freeman, M. (1995). Peer assessment by groups of group work. Assessment and Evaluation in Higher Education, 20(3), 289-300.

Goldfinch, J. and Raeside, R. (1990). Development of a peer assessment technique for obtaining individual marks on a group project. Assessment and Evaluation in Higher Education, 15(3), 21-31.

Goldfinch, J. (1994). Further developments in peer assessment of group projects. Assessment and Evaluation in Higher Education, 19(1), 29-35.

Lejk, M., Wyvill, M. and Farrow, S. (1996). A survey of methods of deriving individual grades from group assessments. Assessment and Evaluation in Higher Education, 21(3), 267-280.

Pitt, M. J. (2000). The application of games theory to group project assessment. Teaching in Higher Education, 5(2), 233-241.

Prosser, M. and Trigwell, K. (1999). Understanding learning and teaching: The experience in higher education. Buckingham: SRHE and Open University Press.

Prosser, M., Trigwell, K. and Taylor, P. (1994). A phenomenographic study of academics' conceptions of science learning and teaching. Learning and Instruction, 4, 217-231.

Ramsden, P. (1992). Learning to teach in higher education. USA: Routledge.

Sampson, J., Cohen, R., Boud, D. and Anderson, G. (1999). Reciprocal peer learning: A guide for staff and students. Sydney: University of Technology, Sydney.

Samuelowicz, K. and Bain, J. D. (1992). Conceptions of teaching held by academic teachers. Higher Education, 24, 93-112.

Authors: Mark Freeman, Faculty of Business, University of Technology, Sydney
Mark.Freeman@uts.edu.au

Jo McKenzie, Centre for Learning and Teaching, University of Technology, Sydney.
Jo.McKenzie@uts.edu.au

Please cite as: Freeman, M. and McKenzie, J. (2000). Self and peer assessment of student teamwork: Designing, implementing and evaluating SPARK, a confidential, web based system. In Flexible Learning for a Flexible Society, Proceedings of ASET-HERDSA 2000 Conference. Toowoomba, Qld, 2-5 July. ASET and HERDSA. http://www.aset.org.au/confs/aset-herdsa2000/procs/freeman.html

Created 20 July 2000. Last revised: 29 Mar 2003. HTML: Roger Atkinson
This URL: http://www.aset.org.au/confs/aset-herdsa2000/procs/freeman.html
Previous URL 20 Jul 2000 to 30 Sep 2002: http://cleo.murdoch.edu.au/gen/aset/confs/aset-herdsa2000/procs/freeman.html

Statement	Strongly agree/Agree	Strongly disagree/Disagree
The system was accessible	79%	8%
The system was easy to use	70%	13%
The process helped me learn more about teamwork	40%	24%
Identified aspects of teamwork I hadn't thought about before	41%	27%
Items were appropriate for assessing contributions	69%	9%
Encouraged greater effort	40%	33%
Able to give an honest assessment	78%	11%
Fair way of assessing team contributions	69%	18%

Statement	Strongly agree/Agree	Strongly disagree/Disagree
Self and peer assessment feedback after assignment 1 helped the team to work more effectively on assignment 2	20%	47%
Items were appropriate for assessing contributions to the team assignments	42%	37%
Team development tutorial activities helped me learn more about teamwork	36%	35%