This paper explores possibilities for utilising student assignment feedback, written by academics and stored in an online marking and results system called OMAR at the University of Melbourne, for quality assurance of feedback and to provide professional development for academics. The paper draws upon the results of a survey of staff and students using OMAR in 2003, which suggested that the quality of feedback is related more to the individual academic's commitment to the task than to any assistive technology (McKenzie, 2003b); while this is intuitive, it has yet to be tested. The mere thought of assessing the quality of feedback given to students via data and text mining of online marking databases has already taken some academics beyond their traditional comfort zones for performance appraisal. The possibilities raise legal and ethical issues in relation to privacy, consent, accountability and appropriate uses of the data that might be generated. On the other hand, analysing such databases could provide a better means of giving feedback to academics on one of their most common tasks, to the benefit of both academics and students.
New academics, whether they are first time lecturers or tutors, may never have written assessment feedback before. Learning to write useful and appropriate feedback for students has traditionally been an apprenticeship process, with staff learning from more experienced teachers in their discipline. This can be a hit and miss process, as training aimed specifically at workshopping written feedback is generally unavailable. Increasingly, quality assurance of assessment, including feedback, is important for universities. This raises the question, "How do we know if the feedback provided by academics to students is effective and helps them learn?"
This paper explores some possibilities for utilising student assignment feedback, written by academics and stored in OMAR, an online marking and results system developed at the University of Melbourne, for quality assurance of feedback and to provide feedback and professional development to academics.
This can sound like a tall order for academics, in light of increasing administrative burdens and demands on their time, especially when it has been suggested recently that students do not read or assimilate all the feedback provided by academics on their work (Graham Gibbs cited in Mills, 2004). Kennedy and Judd's (2004) analysis of audit trails in multimedia software also suggests that students do not use feedback as one would expect. All of this suggests there is a fine balance to be found between too much, too little or poorly focused feedback, and it is worthwhile, not just for the student but also for the academic, to get this balance right.
Since the first prototype was developed in late 2000, OMAR has been used in 80 subjects in the Arts and Veterinary Science faculties, by 179 staff and just over 4,500 students across 111 semester cohorts. The latest daily usage statistics are available from http://www.omar.unimelb.edu.au/docs/usage.html. At the time of writing, OMAR stored just over 12,000 assessments for 239 separate tasks assigned to students.
The ideology behind the author's development of OMAR was that "machines can do the work, so that people have time to think" (B(if)Tek, 2000). In this sense, "work" meant stapling feedback sheets to assignments, calculating grade distributions, alphabetising stacks of essays and returning the assignments to students, while "thinking" concerned providing effective feedback to students. If the routine administration could be streamlined, more time would be available to consider the students' work, and this has been the author's experience of the system thus far.
OMAR provides academics with many different ways to give feedback to students through their marking templates: defining criteria and assigning either marks or Likert-scale ratings against them, writing comments, and returning files to students, whether these are additional resources for the whole class or a student's electronic submission marked up using other software, such as a word processor. Further details can be found in the online User Guide, which is available publicly at http://www.omar.unimelb.edu.au/docs/.
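To make the structure of such a marking template concrete, the following is a minimal sketch in Python of how criteria, marks, Likert ratings, comments and returned files might be represented. The class and field names are illustrative assumptions only; they are not OMAR's actual data model.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical, simplified representation of an OMAR-style marking template.
# All names here are illustrative; they do not reflect OMAR's real schema.

@dataclass
class Criterion:
    name: str                               # e.g. "Argument structure"
    max_mark: Optional[float] = None        # used when numeric marks are assigned
    likert_labels: Optional[list] = None    # used when a Likert scale is preferred

@dataclass
class Assessment:
    student_id: str
    marks: dict = field(default_factory=dict)           # criterion name -> mark or Likert label
    comments: str = ""                                   # free-text feedback
    returned_files: list = field(default_factory=list)  # e.g. a marked-up submission

# Example: a template with one marked criterion and one Likert criterion
criteria = [
    Criterion("Argument structure", max_mark=10),
    Criterion("Referencing", likert_labels=["poor", "fair", "good", "excellent"]),
]

assessment = Assessment(
    student_id="s1234567",
    marks={"Argument structure": 7.5, "Referencing": "good"},
    comments="Clear thesis; referencing needs attention in section 2.",
    returned_files=["essay_marked.doc"],
)
```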
OMAR was also designed so that examiners can view the feedback written by their colleagues earlier than in the traditional marking process. Traditionally, examiners meet to cross mark only after they have completed all their marking, at which point adjusting comments or grades means double handling, and so not all student assessments are cross marked. OMAR promotes earlier cross marking and consistency, as examiners can access the assessments written by others teaching on their subject as soon as they are entered into the system, and can generate on-the-fly statistics about their grade distribution. This also provides an avenue for new academics to learn from their peers about how to write appropriate and useful feedback.
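The kind of on-the-fly grade distribution an examiner might compare against the wider cohort could look something like the following sketch. The grade bands and marks are invented for illustration and do not reflect any particular faculty's grading scheme.

```python
from collections import Counter
from statistics import mean, stdev

# Sketch of an on-the-fly grade distribution of the sort an examiner might
# generate while cross marking. Grade bands and marks are illustrative only.

def grade_band(mark, bands=((80, "H1"), (75, "H2A"), (70, "H2B"), (65, "H3"), (50, "P"), (0, "N"))):
    for cutoff, label in bands:
        if mark >= cutoff:
            return label
    return "N"

def distribution(marks):
    bands = Counter(grade_band(m) for m in marks)
    return {
        "count": len(marks),
        "mean": round(mean(marks), 1),
        "stdev": round(stdev(marks), 1) if len(marks) > 1 else 0.0,
        "bands": dict(bands),
    }

# Example: one examiner's marks compared against the whole cohort
my_marks = [78, 64, 71, 55, 82]
cohort_marks = my_marks + [68, 74, 59, 88, 47, 66]
print("mine:  ", distribution(my_marks))
print("cohort:", distribution(cohort_marks))
```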
The 2003 survey of staff and students using OMAR suggested that the quality of feedback is related more to the individual academic's commitment to the task than to any assistive technology (McKenzie, 2003b). This would seem to be common sense. However, is it an accurate observation? Does using an online marking system make no difference to the quality of feedback provided to students? Examiners need to fill in a marking template, which might provide and enforce a structure not used previously in their marking. This by itself might promote improved consistency in the type and quality of feedback provided to students. There is also the potential that online marking could sever the direct link between comments and student work if the system is not used sensibly, making feedback less useful to students (McKenzie, 2003a).
In the 2003 survey, some students indicated they had received more feedback in the assessments they received through OMAR:
Excellent - same or more feedback than usual... the amount of feedback OMAR have provided is adequate (marking with fair, poor, etc) so that we know where the assignment's lacking off, and we can still have comments from our tutor. This is really good since the manual feedback often not in such great detail
I got more feedback in my OMAR subject than in my other subjects.
However, others complained they had received less feedback:
... feedback is poor compared with comments written on the essay. much more difficult to relate the comments to the essay, particularly wrt grammer or punctuation or even general organisation. overall comments using omar tend to be briefer and less helpful. it surely must be easier to mark the essay itself rather than take the additional step of entering assessment into omar.
Again, on the amount of feedback provided, it was noted that:
Unfortunately it is dependant on the staff who wish to really use the opportunity to give complete feedback that the system does work. Some do not make use of returning the essay with full comments which I find would be a great advantage to student who wish to improve. Perhaps it could be suggested to staff to always utilise this facility and not just the comments section that appears with the marks page.
Staff experiences of online marking were positive overall:
I like being able to read other's comments. It gives me ideas for how to provide tactful feedback. It also gives me an idea of how I am travelling in comparison with other markers. (Tutor)
A great means of cross checking for consistency (Full time academic)
The benefits to staff extended to easier identification of students with learning difficulties:
Greatly improved identifying students with learning problems by making other markers opinions accessible to the group as a whole. (Full time academic)
Given the sizeable database of student assessments stored in OMAR, there is an opportunity to test this observation empirically, in comparison with feedback sheets not written using OMAR. What are the characteristics of the feedback provided using OMAR? Do staff using OMAR provide more or less feedback, of better or worse quality? Are their comments more or less developmental or judgemental? Does the feedback make links to the course material and learning objectives? Or are the comments generic? The following sections discuss the possibilities of undertaking such empirical analysis of the OMAR database to answer these questions.
Ideally, in the long term, analysis of feedback to students would be automated, given the large numbers of students and assignments per subject; the aim is not to create more work for academics. One possibility would be to use knowledge discovery and data mining techniques on the database to analyse marking template characteristics (Brankovic and Estivill-Castro, 1999). The classification process in data mining works on the principle of inputting a training set, that is, a set of example cases and their classes, and outputting a classifier that will assign classes to new cases (Brankovic and Estivill-Castro, 1999). Once data has been classified, it can be clustered and have predictive modelling performed on it. How could a training set be constructed for the OMAR database? Judgements would need to be made either about existing feedback or about dummy assessments, and a panel of senior academics or educational experts could make these judgements.
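A minimal sketch of this classification process is given below, assuming a panel has already labelled a handful of feedback comments (with invented labels such as "developmental" and "generic") and using scikit-learn as one possible off-the-shelf toolkit. In practice a far larger labelled training set would be required, and nothing here reflects functionality that currently exists in OMAR.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Minimal sketch: a panel labels a small training set of feedback comments;
# a classifier then assigns those labels to new comments drawn from the
# database. Comments and labels below are invented for illustration.

training_comments = [
    "You need to engage more closely with the set readings and develop your argument.",
    "Excellent structure; consider extending the analysis in section two.",
    "Good.",
    "See me.",
    "Your referencing does not follow the required style; revise before the next essay.",
    "Well argued throughout, with clear links to the lecture material.",
]
panel_labels = [
    "developmental", "developmental", "generic",
    "generic", "developmental", "judgemental",
]

# Bag-of-words features feeding a simple Naive Bayes classifier
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(training_comments, panel_labels)

# Classify unseen feedback comments
new_comments = ["Nice work.", "Link your conclusion back to the essay question."]
print(model.predict(new_comments))
```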
The feedback written to students in the comment items on the templates could perhaps be analysed using text mining. The Wikipedia Encyclopaedia defines text mining, also known as intelligent text analysis, text data mining and knowledge discovery in text, as:
... the process of extracting interesting and nontrivial information and knowledge from unstructured text. Text mining is a young interdisciplinary field, which draws on information retrieval, data mining, machine learning, statistics, and computational linguistics. As most information (over 80 percent) is stored as text, text mining is believed to have high commercial potential value.
Some options available now include SAS Text Miner and Predictive Text Analytics from SPSS. The promise is that text mining can deal with unstructured data and will generate knowledge the academic did not think to ask for initially. However, these tools rely upon skilled expert analysis of the findings, due to the ambiguous nature of language and the discipline specific information that could be contained in the texts (Robb, 2004). For the purposes suggested in this paper, the text mining process would need to be incorporated as background functionality of the OMAR software.
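Even without a commercial tool, one of the simplest text mining primitives, term frequency across a subject's comment items, can be sketched in a few lines. The stop word list, comments and output below are invented; a real implementation would draw comments from the OMAR database and apply far more sophisticated linguistic processing.

```python
import re
from collections import Counter

# A rough sketch of one text mining primitive: extracting frequent terms from
# the free-text comments stored against a subject, after removing common stop
# words. Even simple term frequencies can surface patterns, such as heavy
# reliance on generic praise or recurring problem areas across a cohort.

STOP_WORDS = {"the", "a", "an", "and", "to", "of", "in", "your", "you", "is", "with", "for", "but"}

def frequent_terms(comments, top_n=10):
    words = []
    for comment in comments:
        words.extend(re.findall(r"[a-z']+", comment.lower()))
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

comments = [
    "Good work, but your referencing needs attention.",
    "Good argument; referencing is still inconsistent.",
    "Excellent referencing and a good, clear structure.",
]
print(frequent_terms(comments))
# e.g. [('good', 3), ('referencing', 3), ...] - 'referencing' recurs across the cohort
```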
Another possibility for achieving more timely, systematic feedback for the academic would be to facilitate student rating of the feedback they receive for an assignment through OMAR, using a poll of the kind increasingly found at the end of commercial software support web documents, which ask, "How useful was this document in answering your question?" Students could be asked, "How useful was this feedback to improving your learning?" or similar. Consideration might need to be given to diverting the student's attention from their actual grade so that they focus on the quality of the feedback itself.
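Aggregating such ratings by examiner would be straightforward, as the sketch below suggests. The examiner identifiers, tasks and ratings are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Sketch of aggregating hypothetical student poll responses to the question
# "How useful was this feedback to improving your learning?" on a 1-5 scale.
# Ratings are grouped by examiner so each academic sees their own average.

responses = [
    # (examiner, task, rating 1-5) - invented data
    ("tutor_a", "essay_1", 4),
    ("tutor_a", "essay_1", 5),
    ("tutor_a", "essay_2", 3),
    ("tutor_b", "essay_1", 2),
    ("tutor_b", "essay_2", 3),
]

by_examiner = defaultdict(list)
for examiner, _task, rating in responses:
    by_examiner[examiner].append(rating)

for examiner, ratings in sorted(by_examiner.items()):
    print(f"{examiner}: mean usefulness {mean(ratings):.1f} from {len(ratings)} responses")
```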
Individualised feedback about their online marking could then be provided to the academic through the OMAR interface, describing their marking patterns and, where needed, directing them to advice on best practice assessment techniques, such as the teaching and learning resources cited above and guidance on constructing better marking templates.
Universities need to be accountable for their processes of assessment. Individual academics who coordinate subjects bear the everyday responsibility for the assessment of their students, regardless of whether they delegate marking to tutors. Quality assurance of feedback to students needs to occur more regularly than just for new staff, as suggested by the opening scenario of this paper. The potential benefits of analysing the OMAR database include feedback to academics, allowing them either to seek professional development or to gain a sense of satisfaction from knowing their feedback is of good quality, and improved student learning and hence student satisfaction with their courses, which could translate into increased enrolments and revenue for universities.
Fox (2001: 11) argues that the public regards the current level of surveillance and dataveillance as essentially benign due to its fragmented, decentralised and distributed nature across the public and private sectors; the fear of a Big Brother behind it all seems unfounded. However, workplace monitoring of performance is a hot issue in the Australian private sector, and this paper is exploring a form of it for academics. Resistance to over-surveillance is generally framed in terms of privacy issues (Fox, 2001), and the counter argument is based upon legitimacy. Brankovic and Estivill-Castro (1999) identified several threats from mining databases for knowledge discovery that should be considered, such as stereotyping, the generation of misinformation, and breaches of privacy through disclosure or the inappropriate combination of results with other results. They suggest that researchers undertaking data mining analyses should have need-to-know access to individual data (Brankovic and Estivill-Castro, 1999: 94).
Universities might be entitled to use and analyse assessments written by staff, depending upon their intellectual property regulations. To be ethical, it is suggested that each staff member give informed consent before analysis takes place. Further, under the Information Privacy Act 2000 (Vic) and the National Privacy Principles, information such as student grades and feedback would be considered personal information, although not sensitive information for the purposes of this legislation. The analysis of such information by a university for the purpose of quality assurance and professional development would most likely constitute a related secondary purpose, and so be permissible.
Analysing the feedback simply in terms of the frequency of positive or negative comments is insufficient, as the need to point out faults with the students' work will depend upon the quality of that work. One might be tempted to frame this issue as whether the quality of an academic's comments reflects the quality of the student's work. From the author's experience of reading his own and others' feedback sheets, there is sometimes a tendency to provide more feedback for students who have failed than for students who have performed well. In one sense, Christiansen (2004) identified this as academics covering themselves against comebacks by disgruntled students; however, it could also point to difficulties in providing suggestions for improvement to high achievers. High achieving students often need feedback as much as those who have performed less well, and there is also the need to justify higher grades.
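One crude check on this pattern would be to compare the volume of comments across grade bands, as in the sketch below. The records are invented, and comment length in words is only a rough proxy for the amount of feedback, not its quality.

```python
from collections import defaultdict
from statistics import mean

# Sketch of a simple check: do lower grade bands attract longer comments than
# higher ones? Records are invented; word count is a crude volume proxy.

records = [
    # (grade band, feedback comment)
    ("H1", "Excellent essay."),
    ("H1", "Well argued; consider publishing a revised version."),
    ("P",  "Your argument loses focus in the middle sections; see the guide on essay structure."),
    ("N",  "The essay does not address the question. Revisit the lecture on research design "
           "and the assessment criteria, and see me about resubmission options."),
]

words_by_band = defaultdict(list)
for band, comment in records:
    words_by_band[band].append(len(comment.split()))

for band, lengths in sorted(words_by_band.items()):
    print(f"{band}: average {mean(lengths):.0f} words of feedback ({len(lengths)} assessments)")
```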
Whether and how the results of such analyses should be used in relation to the performance appraisal of academics is controversial. At the author's university, quality of teaching surveys of students are not supposed to be used for performance appraisal, due to the subjective nature of the anonymous feedback. Would analysis of academic feedback to students by text mining be more objective? Software designers would need to be accountable for any software processes used to make these analyses, similar to the tests increasingly used by the courts to judge the validity of computer forensic software and digital evidence (Casey, 2001). Would we, as users of this technology, know how the information was generated and be able to verify its accuracy?
Quality of teaching results are often communicated to students, but only in aggregate, statistical form. Should the results of analysing academic feedback be communicated to students? Again, probably only in aggregate form to protect the privacy of students, if they were to be disclosed at all. It would be unethical to classify extracts from the database and then quote these to other staff and students as examples of either good or poor practice, where this would identify the author or the student involved.
B(if)Tek (2000). Machines work, 2020, CD recording. Melbourne: Murmur Records. MATTCD105.
Brankovic, L. and Estivill-Castro, V. (1999). Privacy issues in knowledge discovery and data mining. In C. Simpson (Ed), Flow on effects of Information Technology on quality of life and the environment. Proceedings of the first Australian Institute of Computer Ethics conference. [verified 21 Oct 2004] http://crpit.com/confpapers/CRPITV1Wahlstrom.pdf
Casey, E. (2001). Handbook of Computer Crime Investigation, New York: Elsevier.
Christiansen, R. (2004). Critical discourse analysis and academic literacies: My encounters with student writing. The Writing Instructor. [verified 21 Oct 2004] http://www.writinginstructor.com/essays/christiansen-all.html
Fox, R.G., (2001). Someone to watch over us: Back to the Panopticon?, Criminal Justice, 1(3), 251-277. [verified 21 Oct 2004] http://crj.sagepub.com/cgi/framedreprint/1/3/251
Gibbs, G. (1999). Improving teaching, learning and assessment, Journal of Geography in Higher Education, 23(2) July, 147-155.
James, R. (1994). Assessment. http://www.cshe.unimelb.edu.au/downloads/assessment_rev2.pdf
James, R., McInnis, C. and Devlin, M. (2002). Assessing Learning in Australian Universities: Ideas strategies and resources for quality in student assessment. Centre for the Study of Higher Education, University of Melbourne. [verified 21 Oct 2004] http://www.cshe.unimelb.edu.au/assessinglearning/
Kennedy, G.E. & Judd, T.S. (2004). Making sense of audit trail data. Australasian Journal of Educational Technology, 20(1), 18-32. http://www.ascilite.org.au/ajet/ajet20/kennedy.html
McKenzie, S. (2003a). OMAR: Improving the Online Marking Process. Invited paper and poster presented to the Multimedia & Educational Technologies for Teaching and Learning Enhancement (METTLE) conference, University of Melbourne, 5 November.
McKenzie, S. (2003b). Results from the Feedback survey of Staff and Student Users of OMAR 2003. [verified 21 Oct 2004] http://www.omar.unimelb.edu.au/docs/surveyresults.html
Mills, R. (2004). Learner support: Developments in open and distance education and their implications for traditional educational institutions. Paper presented to The Open and Distance Learning Association of Australia's Professional Development Seminar series, University of Melbourne, 2 July.
Robb, D. (2004). Taming Text. Computerworld, June 21, 40-41.
Sommers, N. (1982). Responding to student writing. College Composition and Communication, 33, 148-56.
Straub, R. and Lunsford, R.F. (1995). Twelve Readers Reading: Responding to College Student Writing. Cresskill, NJ: Hampton Press.
Wikipedia Encyclopaedia (nd). Text Mining. [verified 21 Oct 2004] http://en.wikipedia.org/wiki/Text_mining
Author: Mr Shane McKenzie, Lecturer, Department of Criminology, The University of Melbourne VIC 3010 Email: shaneem@unimelb.edu.au Web: www.crim.unimelb.edu.au/staff/shaneem.html
Please cite as: McKenzie, S. (2004). Assessing quality of feedback in online marking databases: An opportunity for academic professional development or just Big Brother? In R. Atkinson, C. McBeath, D. Jonas-Dwyer & R. Phillips (Eds), Beyond the comfort zone: Proceedings of the 21st ASCILITE Conference (pp. 623-628). Perth, 5-8 December. http://www.ascilite.org.au/conferences/perth04/procs/mckenzie.html
© 2004 Shane McKenzie
The author assigns to ASCILITE and educational non-profit institutions a non-exclusive licence to use this document for personal use and in courses of instruction provided that the article is used in full and this copyright statement is reproduced. The author also grants a non-exclusive licence to ASCILITE to publish this document on the ASCILITE web site (including any mirror or archival sites that may be developed) and in printed form within the ASCILITE 2004 Conference Proceedings. Any other usage is prohibited without the express permission of the author.