This paper reports a study of the outcomes of students' work in self and peer marking of an examination in mechanical engineering design at the University of New South Wales. In addition to teachers' marking, students marked their own script, and another randomly assigned script, using a detailed marking protocol. Data are presented on the accuracy and reliability of students' assessments, and on survey information concerning their learning experiences from the exercise. These data are analysed and discussed in relation to the learning benefits which accrue, and to the aim of developing skills of self evaluation relevant to professional practice.
In engineering and other fields of professional training there has been a concern for developing students' ability to evaluate their own work in ways which are applicable to their future professional work in the discipline. According to Boud & Lublin (1983):
One of the most important processes that can occur in undergraduate education is the growth in students of the ability to be realistic judges of their own performance and the ability to monitor their own learning... If students are to be able to continue learning effectively after graduation and make a significant contribution to their own professional work, they must develop these skills of appraising their own achievements and that the foundation for this should occur at the undergraduate level, if not earlier (p.3).

This belief in the importance of developing skills of self evaluation during undergraduate education is also shared by graduates. In a survey of 1842 graduates from the University of New South Wales (Midgley & Petty, 1983), respondents were asked to rate the importance of the acquisition of different skills as part of undergraduate study, and the extent to which their university education contributed to this. From a list of nine different skills, 'evaluating one's own work' was listed as the second most important skill (after problem solving). However, only 20% thought that their course of study had made a 'considerable contribution', and a further 53% claimed 'some contribution' had been made to the acquisition of this skill.
It is our view that the ability to assess one's own work assumes particular importance for the specific task of developing student competence in engineering design work. This view is taken as a result of considering the complexities inherent in making judgements on the comparative worth of different solutions to a specific design problem.
At a basic level, design solutions can usually be checked through ensuring that supporting calculations are free from error, and through ascertaining that the design meets (or at least takes adequate account of) specified criteria. However, where judgements are required on the comparative worth of different approaches to solution, and on the different designs actually produced, the assessment task needs to encompass more complex elements such as creativity, economy and utility.
Although it may be argued that skill in design appraisal develops over time through professional practice, it is our belief that undergraduate study should provide opportunities for students to begin developing these skills of self appraisal. In particular, they should have the opportunity to assess critically their own design work and that of their peers, and to become conversant with, and able to apply, the methods and standards of assessment employed by their instructors.
A further consideration arises through academics' concerns about assessment. Orpen (1982) found that (with the exception of tenure and salary) university lecturers indicated more concern about marking and grading than about any other aspect of their job. Their concerns related to doubts about the reliability of assessments; the inordinate amount of time spent on assessment tasks; and the belief that typically students learnt little from their marks, other than how they stood in relation to their peers.
The present study is based on the experiences of students who sat the 1987 Session 1 examination in Mechanical Engineering Design 2. Of the 140 who sat the exam, 87 participated in the self and peer marking exercise during tutorial classes. Self assessment methods have been employed in this subject for several years and were reported earlier (Boud et al, 1986), but the procedures used in 1987 departed substantially from those of previous years. Previously, students were required to undertake a design project for the duration of Session 2 (e.g. design of a machine assembly), which was then submitted for assessment at the end of session. In addition, students were given the tasks of using a protocol which listed a set of factors (with mark weightings) providing criteria for marking their individual projects, and of making judgements on the extent to which their design had successfully applied each of these factors.
The assessment received for the subject comprised the mark determined by the lecturer marking the projects, plus an additional 4% for completion of the student's self assessment protocol. Details of the outcomes of three years' experience with this procedure are contained in Boud et al (1986).
In 1987 there were several major departures. First, in addition to students marking their own papers, peer marking was introduced. Second, these marking procedures were applied to students' formal Session 1 examination scripts in the subject, not to a project as in previous years. Finally, the marking scheme differed in character from that of earlier years. Previously, the variety of approaches used in projects necessitated a marking protocol which was essentially qualitative in form, although detailed in description.
For 1987, the marking schedule applied to the examination scripts was able to specify the particular steps of analysis and the design details required for resolution of the design problem.
Instructions were given for applying the marking schedule: for the calculations section, ten elements were identified, each assigned a maximum of 4 marks; for the drawing section, nine elements of varying weight totalled 60 marks. In a number of these elements student markers were required to exercise a considerable degree of judgement. For example, design proposals which incorporated different shapes required different calculations to determine strength; the location of critical cross sections and the supporting calculations also depended to some extent on the design chosen. In these instances there were departures from the model solution, requiring the marker to assess whether the calculations presented were equivalent to those of the provided solution. For other elements in the schedule, the marking task was one of identifying the relevant work in the exam script and conscientiously applying the marking instructions. For example, 4 marks were allocated if the drawing included 'machine specifications on the bore, top, spigot and base (one mark each)'.
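As an illustration only, the structure of such a marking schedule can be sketched in code. The element names below are hypothetical stand-ins (the full schedule is not reproduced here); only the totals, ten calculation elements worth 4 marks each and nine drawing elements totalling 60 marks, follow the description above.

```python
# Illustrative sketch of the 1987 marking schedule structure.
# Element names are hypothetical; only the mark totals (10 x 4 = 40 for
# calculations, nine drawing elements summing to 60) follow the paper.

CALCULATION_ELEMENTS = {f"calculation_step_{i}": 4 for i in range(1, 11)}  # 40 marks

DRAWING_ELEMENTS = {  # nine elements of varying weight, totalling 60 marks
    "general_arrangement": 10,
    "critical_cross_sections": 8,
    "dimensioning": 8,
    "machining_specifications": 4,   # e.g. bore, top, spigot, base (1 mark each)
    "tolerances_and_fits": 6,
    "material_selection": 6,
    "assembly_detail": 8,
    "drawing_standards": 5,
    "overall_practicability": 5,
}

def total_mark(awarded: dict) -> int:
    """Sum awarded marks, capping each element at its schedule maximum."""
    schedule = {**CALCULATION_ELEMENTS, **DRAWING_ELEMENTS}
    return sum(min(awarded.get(name, 0), maximum)
               for name, maximum in schedule.items())

assert sum(CALCULATION_ELEMENTS.values()) == 40
assert sum(DRAWING_ELEMENTS.values()) == 60
```

Capping each awarded element at its scheduled maximum mirrors the conscientious application of marking instructions described above.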
After students had completed the marking of their own scripts ('self assessment'), they were each given another student's script to mark ('peer assessment'). At each stage, care was taken to ensure that no notation (ticks, question marks etc) was made on any examination paper. At the time of carrying out this task students were not aware of the mark which had been determined by staff marking.
During second session, after they had received the results of their Session 1 examination (and were able to compare marks derived from lecturer, self and peer assessments), students were asked to complete a questionnaire which sought information on their experience of conducting self and peer assessment of examination scripts.
Within the current literature on self and peer assessment there is considerable concern for determining the extent to which these forms of assessment can be utilised as reliable and valid indicators of student performance, and the conditions under which such assessments can be suitably employed. The first issue on which results are presented is therefore the accuracy and reliability of students' marks when compared with those of the lecturer.
The second issue on which results are presented is whether the procedures used are acceptable to students and provide learning benefits which justify their use. Results derived from the questionnaire analysis are used to address this issue.
Table 1: Lecturer, self and peer marks: means and standard deviations

| | N | Mean | Standard deviation |
|---|---|---|---|
| Lecturer mark | 87 | 54.6 | 15.0 |
| Student self mark | 85 | 54.1 | 12.1 |
| Peer mark | 81 | 49.3 | 13.3 |
Further inspection of results indicated that the restriction in the range of marks awarded for self assessments resulted from a tendency for students with high lecturer marks to award themselves a lower self mark, and for students with low lecturer marks to award themselves a higher self mark. This phenomenon is illustrated in table 2, in which students are categorised into quartiles based on the scores obtained from lecturer marking.
Table 2: Mean lecturer and self marks by quartile group (based on lecturer marks)

| Quartile group | Number in quartile | Lecturer mark (mean) | Self mark (mean) | Difference of means (L-S) |
|---|---|---|---|---|
| 19-41 marks | 21 | 34.4 | 39.8 | -5.7 |
| 42-56 marks | 20 | 49.8 | 51.4 | -1.6 |
| 57-65 marks | 22 | 60.5 | 59.0 | +1.5 |
| 66-86 marks | 21 | 73.2 | 64.3 | +8.9 |
The group in the lowest quartile (receiving lecturer marks between 19 and 41) provided self marks which were, on average, 5.7 marks above those awarded by the lecturer. Those in the highest quartile (receiving lecturer marks between 66 and 86) produced self assessments which averaged 8.9 marks below those awarded by the lecturer.
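For illustration, the quartile comparison underlying table 2 can be reproduced with a short script. The paired marks below are invented examples, not the study's data; only the quartile boundaries follow the table.

```python
# Illustrative quartile analysis as in table 2: bin students by lecturer
# mark and compare mean lecturer and self marks within each bin.
# The paired marks are invented; the boundaries follow the table above.

from statistics import fmean

QUARTILES = [(19, 41), (42, 56), (57, 65), (66, 86)]  # lecturer-mark ranges

pairs = [(34, 40), (48, 50), (61, 59), (73, 65),
         (55, 54), (42, 47), (67, 63), (29, 38)]  # (lecturer, self)

for low, high in QUARTILES:
    group = [(l, s) for l, s in pairs if low <= l <= high]
    if not group:
        continue
    mean_l = fmean(l for l, _ in group)
    mean_s = fmean(s for _, s in group)
    print(f"{low}-{high}: n={len(group)}, "
          f"lecturer={mean_l:.1f}, self={mean_s:.1f}, L-S={mean_l - mean_s:+.1f}")
```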
Table 3: Agreement between lecturer, self and peer marks

| | Product moment correlation | Mean of absolute difference of scores |
|---|---|---|
| Lecturer - Self | 0.79 | 7.3 |
| Lecturer - Peer | 0.81 | 8.2 |
| Lecturer - (Self + Peer)* | 0.86 | 7.0 |
| Self - Peer | 0.78 | 7.6 |

* 'Self + Peer' refers to the score obtained by averaging the self and peer marks.
Using the lecturer's score as a benchmark, the accuracy of student self and peer marks can be gauged from the mean absolute difference between the marks awarded by self (and peer) assessors and those awarded by the lecturer. As illustrated in table 3, the average difference between individual self and lecturer scores was 7.3 marks, and between peer and lecturer scores 8.2 marks. It may be noted that although there is a slightly higher correlation between lecturer and peer marks, peer marks are less 'accurate' because, as indicated in table 1, peers applied a harder marking standard than the lecturer or self assessors.
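Again for illustration only, the two agreement statistics reported in table 3, the product moment (Pearson) correlation and the mean absolute difference, can be computed as follows. The mark lists are the same invented examples as above, not the study's data.

```python
# Illustrative computation of the two agreement statistics in table 3:
# Pearson product-moment correlation and mean absolute difference.
# The mark lists below are invented examples, not the study's data.

from statistics import fmean, pstdev

lecturer = [34, 48, 61, 73, 55, 42, 67, 29]
self_mark = [40, 50, 59, 65, 54, 47, 63, 38]

def pearson_r(xs, ys):
    """Product-moment correlation: mean of z-score products."""
    mx, my = fmean(xs), fmean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return fmean((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

def mean_abs_diff(xs, ys):
    """Mean of |x - y| over paired marks."""
    return fmean(abs(x - y) for x, y in zip(xs, ys))

print(f"r = {pearson_r(lecturer, self_mark):.2f}")
print(f"mean |difference| = {mean_abs_diff(lecturer, self_mark):.1f}")
```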
Use of the peer mark alone would also result in a moderately similar ordering of individual performance (r = 0.81), but the actual scores would tend to be substantially lower. A compression of the range of scores, similar to that observed for self marking, would also result. The effects of using self or peer marks as the basis for determining pass or fail status in the examination are illustrated in table 4. The criterion for a pass is taken as 50%.
Table 4: Pass/fail classifications from self, peer and combined marks compared with lecturer assessment (pass criterion 50%)

| | | Lecturer pass | Lecturer fail |
|---|---|---|---|
| Self assessment | Pass | 49 | 8 |
| | Fail | 4 | 24 |
| Peer assessment | Pass | 36 | 4 |
| | Fail | 15 | 26 |
| Self + Peer | Pass | 40 | 4 |
| | Fail | 12 | 24 |
If self marks alone were to be used as the determinant of exam results, then 8 students who had received a lecturer grading of less than 50% would have passed; and 4 students who received lecturer gradings exceeding 50% would have failed.
If peer marks alone were to be used, 4 students would pass whom the lecturer failed, and 15 would fail whom the lecturer passed.
A combined mark, averaging the peer and self marks for each individual, would result in 4 passing whom the lecturer failed, and 12 failing whom the lecturer passed. The combined mark was also more highly correlated with the lecturer mark (r = 0.86) than either the peer or self marks alone.
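The pass/fail cross-tabulations of table 4 can be derived in the same illustrative fashion, applying the 50% criterion to each pair of marks. As before, the marks are invented examples.

```python
# Cross-tabulating pass/fail decisions against the lecturer's, as in table 4.
# The paired marks are invented; the pass criterion is 50% as in the study.

from collections import Counter

PASS_MARK = 50

def crosstab(lecturer_marks, student_marks):
    """Count (student decision, lecturer decision) pairs."""
    cells = Counter()
    for lect, stud in zip(lecturer_marks, student_marks):
        stud_result = "pass" if stud >= PASS_MARK else "fail"
        lect_result = "pass" if lect >= PASS_MARK else "fail"
        cells[(stud_result, lect_result)] += 1
    return cells

lecturer = [34, 48, 61, 73, 55, 42, 67, 29]
self_mark = [40, 50, 59, 65, 54, 47, 63, 38]
print(crosstab(lecturer, self_mark))
# Counter({('pass', 'pass'): 4, ('fail', 'fail'): 3, ('pass', 'fail'): 1})
```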
Table 5: Students' attitudes to self and peer assessment

| Statement | 'Agree' |
|---|---|
| The ability to assess my own work is very important | 91% |
| The idea of self assessment is a good one | 82% |
| We should have more opportunities for self assessment | 65% |
| Students should be more involved in assessing other students | 50% |
There is strong endorsement of the importance of being able to assess one's own work, and of the idea of using self assessment. Half of the students also supported greater involvement in assessing other students.
The endorsements by 73% of students that participation had 'assisted in making a realistic assessment of my own abilities in the subject', and by 84% that it 'had made me more aware of what I need to know in the subject' (table 6), raise the issue of what was learnt, in what detail, and how students benefited through engagement in the exercise.
Table 6: Students' evaluations of the marking exercise

| Statement | 'Agree' |
|---|---|
| I found assessing my own work to be valuable | 83% |
| I found assessing another student's work valuable | 64% |
| This exercise assisted me in making a realistic assessment of my own abilities in this subject | 73% |
| This exercise has made me more aware of what I need to know in this subject | 84% |
| I would like to see some changes made in the procedures used in the exercise | 35% |
| I found it difficult to use the marking scheme | 13% |
| I found it difficult to follow the model answers | 18% |
| I don't think the rewards were sufficient for the amount of time I spent | 29% |
| The whole exercise of self and peer marking was a waste of time | 4% |
Table 7: Perceived benefits of the exercise

| | Substantial benefit | Some benefit | Little or no benefit |
|---|---|---|---|
| Improving my understanding of the subject matter | 24% | 53% | 18% |
| Improving exam performance through developing an understanding of what examiners look for in answers | 64% | 31% | 2% |
| Developing my ability to critically assess my own work later as a practising engineer | 28% | 51% | 18% |
| Developing my ability to assess the work of a colleague when I become a practising engineer | 20% | 53% | 21% |
Although the majority of students indicated some level of benefit on each of the four aspects included in table 7, the aspect most frequently rated as providing 'substantial benefit' (64%) was 'improving exam performance through developing an understanding of what examiners look for in answers'.
Of the 91 questionnaire responses, 73 (80%) included written answers to the open-ended question asking what had been learnt from the exercise. These were content analysed and grouped into four main categories.
Content specific: The most frequent response category related to learning which was specific to the content of the examination itself. Examples of students' descriptions include:
'I didn't look at the bending of the beam; didn't properly assess the mechanical requirements of the structure, such as wall thickness etc'

'A serious mistake was disregarding the residual component of force caused by the weight of the rotating body. My design was too bulky and should have been lighter'

In this category 36 students described learning in terms which were specific to the content of the examination problem, although the majority (21 students) also described learning that had occurred at a more general level.
Approach: 28 students described their learning in terms of identifying deficiencies in their approach to solving design problems of the kind set in the examination.
'Learnt about deficiencies in making assumptions for the calculations and made inefficient approaches for calculation of the design'

Design skills/problem solving skills: 19 students identified deficiencies in different skill areas related to analysis or drawing. Some examples include:

'I was lacking in knowledge of why I drew that certain design. Looking at the overall picture and planning to satisfy all the requirements'
'I should have spent more time doing scale sketches to have a better idea of what would be a better design'
'Not daring enough, suppressed thoughts'
'I need to do more work, examples, in determining what forces and stresses are involved in certain design problems'

A further 9 students also mentioned that they had developed an appreciation of the need to reallocate the time spent on different aspects of the design solution and drawing:

'I found problems in spacing my drawings, locating circles, and obtaining even firmness of lines'
Expectations: 26 students mentioned that the exercise had resulted in their developing a better understanding of what is expected by examiners in order to achieve high marks.
'I found that the material that I provided in the examination was not the material that the examiners wanted to see'

'I have a better appreciation of the detail needed and what the examiners are looking for with respect to ways of dimensioning and practicability'
Within the context of the situation analysed in the current study, it is apparent that considerable agreement exists between the three sets of marking procedures. The correlation of 0.86 between lecturer marks and the combined average of peer and self marks is quite high, and perhaps higher than would normally be obtained from repeat marking of examinations by academic staff. In large part, the nature of the exercise and the employment of a detailed marking protocol have provided a context in which a high level of confidence can be placed in the validity of students' assessments. This is seen as a particularly significant finding, since its applicability to assessments of the kind reported here - of design problems of an open-ended quality - represents an extension of what has previously been reported.
This opens up the prospect of extending the application of self and peer marking procedures to provide a complete basis for assessment, thereby obviating the need for staff to mark every script. However, since in the present study the students' assessments did not count formally as part of their examination result, some safeguards may need to be built into the procedures to ensure that the self marking system maintains its reliability and accuracy. The procedures and checks built into the self and peer marking scheme used by Boud & Holmes (1981) in Electrical Engineering on our own campus indicate a practical mechanism for achieving this.
The descriptions by students provide clear testimony that most students had learnt in ways which reach beyond knowledge of the solution to the examination question and identification of the sources of their own errors. Indeed, a majority of respondents had identified either deficiencies in particular skills relevant to producing engineering designs, or ways in which their approach to problem solving in design needed to be altered. A substantial number had also come to a clearer appreciation of what was expected in order to achieve satisfactory or superior performance in examinations.
The information we have on what students have learnt from the exercise can indicate no more than a step, and we believe an important first step, towards the development of self evaluation skills important to professional practice. In our opinion, effective development would require the systematic use of self and peer assessment throughout the students' whole course of study; its regular incorporation into formal assessment; and active collaboration between students and staff in determining the criteria for different assessments. Nonetheless, a successful first step has been taken.
Although the mark analyses indicate that a combination of self and peer assessments can provide a reliable alternative to lecturer marked scripts, we believe such an innovation would need cautious development supported by further studies. The introduction of self and peer assessment was premised on the conception of the importance of students developing the ability to assess their own design work. It was our belief that the experience of engagement in self and peer assessment would provide a modest, but nonetheless important, contribution to this. The results obtained support this belief.
Boud, D. & Holmes, W. H. (1981). Self and peer marking in an undergraduate engineering course. IEEE Transactions on Education, E-24, 4, 267-274.
Boud, D. & Lublin, J. (1983). Self Assessment in Professional Education: A Report to the Commonwealth Research and Development Committee. Tertiary Education Research Centre (UNSW).
Davis, J. & Rand, D. (1980). Self grading versus instructor grading. Journal of Educational Research, 73, 4, 207-211.
Falchikov, N. (1986). Product comparisons and process benefits of collaborative peer group and self assessments. Assessment and Evaluation in Higher Education, 11, 2, 146-166.
Midgley, D. & Petty, M. (1983). Final Report on the Alumni Association 1982 Survey of Graduate Opinion on General Education. UNSW Alumni Association, Kensington.
Orpen, C. (1982). Student versus lecturer assessment of learning: A research note. Higher Education, 11, 567-572.
Mr Douglas Magin is Senior Education Officer within the Tertiary Education Research Centre at the University of New South Wales. His major work has been within curriculum innovation and evaluation in higher education.
Dr Alex Churches is Senior Lecturer in the School of Mechanical and Industrial Engineering at the University of New South Wales. He has published widely in engineering education, with special interest in innovations in engineering design courses.

Please cite as: Magin, D. J. and Churches, A. E. (1988). What do students learn from self and peer assessment? In J. Steele and J. G. Hedberg (Eds), Designing for Learning in Industry and Education, 224-233. Proceedings of EdTech'88. Canberra: AJET Publications. http://www.aset.org.au/confs/edtech88/magin.html