The automated assessment of authentic Information Technology skills relies on testing the candidate's output for conformity with the operations which should have been performed during the test. Since each test is different, its test specifications are also unique. This paper describes transformations which can be applied to the instructions given to the candidate in order to generate the test specifications which drive automated IT skills assessors.
There are many different levels of assessment of IT skills, from the most basic function tests, for example testing that the candidate knows how to use single functions of a particular IT tool, such as selecting a menu item, to authentic assessment (Fletcher, 1992), where the candidate undergoes a test which mimics a typical task that might be undertaken in the workplace, such as generating a memo or querying a database. Function tests are relatively simple to automate and are frequently used for formative assessment in training packages, for example (SkillCheck, 2002). Authentic assessment tasks are much more difficult to assess, mainly because the tasks are at a higher level and so allow the candidate far more flexibility in producing the answer. In fact, the major problem in generating an assessor for authentic assessment is coping with all possible errors a candidate may make; that is, the assessor must be able to 'correctly' assess all possible candidate input. Here 'correctly' means producing the same result as an 'ideal' human examiner would produce.
The model of assessment used here (Dowsing et al., 1999) consists of three phases.
The candidate is asked to insert the text "red" before "kangaroo" in the sentence "One fine day Jane took her kangaroo for a walk."
The model answer would be "One fine day Jane took her red kangaroo for a walk."
In the matching phase, phase 1, the candidate attempt would be matched to this string. Assume that the candidate string was "One fine day Jane took her grey kangaroo for a walk". The matching process would show "red" unmatched in the model and "grey" unmatched in the candidate attempt.
In phase 2, the errors found in phase 1 are classified. Assume one of the assessment criteria is "Failure to insert text as specified is counted as 1 text error". In order to apply this rule, the position of the text to be inserted in the model answer has to be noted, together with the text itself. The matching process generates the correspondence between items in the model and candidate answers, thus allowing phase 2 to relate positions in the model to positions in the candidate answer. The classification test for this example is to apply the test for the correct insertion of the word "red" at the position immediately before "kangaroo" in the candidate answer. This is coded, for example, as [Apply criteria test n] [position x] [text], which is generic so that different examination papers can test different sets of tasks.
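To make the two phases concrete, the following sketch (in Python, with illustrative function names rather than the assessor's actual code) aligns the model and candidate answers at the word level and then applies the insertion criterion for this example.

    import difflib

    def match(model_words, candidate_words):
        """Phase 1: align the two word sequences and report unmatched words."""
        sm = difflib.SequenceMatcher(a=model_words, b=candidate_words)
        unmatched_model, unmatched_candidate = [], []
        for tag, i1, i2, j1, j2 in sm.get_opcodes():
            if tag != "equal":
                unmatched_model.extend(model_words[i1:i2])
                unmatched_candidate.extend(candidate_words[j1:j2])
        return unmatched_model, unmatched_candidate

    def insertion_test(candidate_words, position_word, text):
        """Phase 2 criterion test: was `text` inserted immediately before `position_word`?"""
        if position_word not in candidate_words:
            return False
        pos = candidate_words.index(position_word)
        return pos > 0 and candidate_words[pos - 1] == text

    model = "One fine day Jane took her red kangaroo for a walk.".split()
    candidate = "One fine day Jane took her grey kangaroo for a walk.".split()

    print(match(model, candidate))                       # (['red'], ['grey'])
    print(insertion_test(candidate, "kangaroo", "red"))  # False -> 1 text error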
One of the inputs to phase 2 of the assessment consists of a set of specifications detailing what criteria tests to apply to the candidate answer corresponding to given regions of the model answer. Traditionally, such information is generated by a human being - the examiner - and used by human markers to assess candidate attempts. However, such generation is error prone and so some means of automating the generation of this information would help reduce assessment errors.
These transformations are required irrespective of the assessment model used. The addressing of regions given in the instructions to the candidate refers to the current state of the document, whereas the test specifications refer to the saved state of the document. Thus, if a stage of an examination contains insert or delete instructions, instructions before these operations will need to have their region addresses adjusted in the corresponding test specifications.
Example: Consider a fragment of a database test in which the candidate is asked to edit record 15 and a later instruction deletes an earlier record before the database is saved.
The saved file will contain the edited record as record 14 and thus the assessment must be performed on this record rather than on record 15 as given in the instruction.
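A minimal sketch of the corresponding address adjustment, assuming that the instruction asked for record 15 to be edited and that a single earlier record is deleted before the save; the helper below is hypothetical.

    def saved_address(instruction_record, deleted_records):
        """Map a record number used in an instruction to its number in the saved file,
        assuming the listed records are deleted before the file is saved."""
        return instruction_record - sum(1 for d in deleted_records if d < instruction_record)

    # "Edit record 15", with one earlier record deleted before the save:
    print(saved_address(15, deleted_records=[7]))   # 14 - the record the assessor must test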
These transformations depend on the model of assessment employed. Some assessment schemes assess each instruction as late as possible in a multi-stage test, so that the candidate has the greatest opportunity to correct any mistakes. This means that the ordering of the test specifications can be completely different from that of the instructions.
Example: Consider the fragment of instructions below:
Place the number 2 into spreadsheet location A4
Save the state to file A
Place the string "abcd" in spreadsheet location B3
Save the state to file B

In this example file B will contain the desired values in cells A4 and B3 and thus the value in both of these cells can be tested in file B, that is, the order of operations in testing is different to that in the original exercise. In fact, in this case, the saving to file A is not required since all the information is available in file B. The rule for moving an operation across a save operation is that an operation can be moved provided the result of the operation is visible at the next save operation. Since IT tools use overwriting semantics, this rule means that as long as there is no operation on the same region in the following stage, the operation can be moved after the next save operation. This rule can actually be refined since some operations on a region conflict - overwrite a property or attribute of the item - whereas others do not. For example, the instructions referring to A4 in

Place the number 2 into cell A4
Save the state to file A
Format the contents of cell A4 as bold
Save the state to file B

do not conflict since operations on values and format are orthogonal. Hence the testing of the value and formatting can both be performed on file B.
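The rule can be sketched as a simple conflict check; the Operation record and the property names below are illustrative assumptions rather than the actual implementation.

    from dataclasses import dataclass

    @dataclass
    class Operation:
        region: str      # e.g. "A4"
        property: str    # e.g. "value" or "format"

    def can_move_past_save(op, next_stage_ops):
        """True if `op` can be assessed at the following save point, i.e. nothing in
        the next stage overwrites the same property of the same region."""
        return not any(later.region == op.region and later.property == op.property
                       for later in next_stage_ops)

    next_stage = [Operation("A4", "format"), Operation("B3", "value")]
    print(can_move_past_save(Operation("A4", "value"), next_stage))  # True: value and format are orthogonal
    print(can_move_past_save(Operation("B3", "value"), next_stage))  # False: B3's value is overwritten later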
Another model dependent transformation required in the particular tests investigated concerns replicate instructions. Such instructions ask the candidate to insert a formula into a cell and then to replicate that formula across the row or down the column. The test for replication has to be performed before the test for the original formula because of the technique used to remove knock-on or dependency errors (Blaney, 1995). A formula in a cell may give an incorrect value if any of the cells it depends on contain incorrect values. The normal assessment rule is that errors should only be penalised once, that is, knock-on errors should not be penalised. One method of ensuring this is to assess independent cells first and to change incorrect values to correct values after assessment. By this means dependent cells always see correct values in the cells they depend on, and any errors in a dependent cell are due to an incorrect formula. For the replicate action, the replication must be tested before the formula insertion: if the formula were assessed first and found to be incorrect, it would be replaced by the correct value, and the replication test would then no longer be comparing against the formula which the candidate actually replicated. Thus any replicate operations have to be reordered in the test specifications.
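The required ordering can be sketched as follows; the dictionary representation of the sheet and the relocate helper are hypothetical, but the sketch shows the replicated cells being compared against the candidate's own source formula before that formula is assessed and, if wrong, replaced by the correct one.

    import re

    def relocate(formula, row_offset):
        """Hypothetical helper: shift row references, e.g. =SUM(B2:C2) -> =SUM(B3:C3).
        Only simple relative references shifted down a column are handled."""
        return re.sub(r"([A-Z]+)(\d+)",
                      lambda m: m.group(1) + str(int(m.group(2)) + row_offset),
                      formula)

    def assess_replicate_then_formula(candidate, model, source, targets):
        """Replication is tested against the candidate's own (unrepaired) source formula;
        only then is the source tested against the model and repaired."""
        errors = 0
        for offset, target in enumerate(targets, start=1):   # targets assumed to be the rows below source
            if candidate[target] != relocate(candidate[source], offset):
                errors += 1
        if candidate[source] != model[source]:
            errors += 1
        candidate[source] = model[source]                     # repair to avoid knock-on penalties
        return errors

    sheet = {"D2": "=SUM(B2:D2)", "D3": "=SUM(B3:D3)", "D4": "=SUM(B4:D4)"}
    model = {"D2": "=SUM(B2:C2)"}
    print(assess_replicate_then_formula(sheet, model, "D2", ["D3", "D4"]))  # 1: wrong formula, correctly replicated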
In addition to the assessor independent transformations outlined above there may be transformations required which are assessor specific. For example, specific data needed for the assessor checks can be included in the test specifications; alternatively, the assessor can obtain that information from the model answer. The instructions in the exercise contain the specific data, so if the assessor uses the information in the model answer then this data has to be removed from the test specification.
Examples of other assessor specific transformations include formatting of the specifications and adding parameterless tests which are always required, for example, tests for visibility of all information on a spreadsheet by checking column widths.
The individual instructions state the operations to perform on a range of cells in the spreadsheet. Discovering whether there is any overlap between the range in one instruction and the range in another involves performing set intersection on the two ranges. Whilst this is conceptually simple, the implementation can become complicated since the two ranges may not only fully overlap or be disjoint but may also partially overlap in many different ways. It is easier to convert each instruction on a range of cells into a set of operations on single cells; the regions referenced by the resulting specifications then either coincide exactly or are disjoint. This representation is used to perform the model dependent transformations described above.
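A sketch of this conversion is given below, assuming single-letter column names and the simple A1:D1 style of range used in the examples.

    import re

    def expand_range(rng):
        """'D3:D4' -> ['D3', 'D4']; a single cell is returned unchanged.
        Single-letter column names are assumed."""
        if ":" not in rng:
            return [rng]
        (c1, r1), (c2, r2) = (re.match(r"([A-Z]+)(\d+)", part).groups()
                              for part in rng.split(":"))
        return [chr(col) + str(row)
                for col in range(ord(c1), ord(c2) + 1)
                for row in range(int(r1), int(r2) + 1)]

    print(expand_range("D3:D4"))   # ['D3', 'D4']
    print(expand_range("A1:D1"))   # ['A1', 'B1', 'C1', 'D1']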
In the first phase of the transformation each instruction on a range of cells is converted into a set of specifications on single cells. At the same time the position of insert and delete instructions is determined and these instructions are decoded to discover whether they apply to rows or columns and to which row or column. This information is used in the second pass through the data to adjust cell addresses as the delete and insert instructions are moved to fixed positions in the test specification stream.
In the second phase, a copy of the insert test is added to the beginning of the first save section since the assessment of this examination penalises candidates who perform the insert too early. The insert specification in the second section is moved to the front of the section and the delete specification is moved to the end with consequent adjustment to the addresses of all the instructions in this section. The reason for this is that in this configuration the addresses of cells in section 1 and section 2 are directly comparable, that is, information will reside at the same cell address in both sections and hence overlap comparison is straightforward.
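The accompanying address adjustment can be sketched as a simple column shift, again assuming single-letter column names; for example, a reference to D2 becomes E2 once a new column D has been inserted.

    def shift_for_column_insert(cell, inserted_col):
        """References at or beyond the inserted column move one column to the right.
        Single-letter column names are assumed."""
        col, row = cell[0], cell[1:]
        if col >= inserted_col:
            col = chr(ord(col) + 1)
        return col + row

    print(shift_for_column_insert("D2", "D"))   # 'E2' - shifted by the new column D
    print(shift_for_column_insert("B3", "D"))   # 'B3' - unaffected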
In the third phase of the algorithm each individual test specification in section 2 is compared with every specification in section 1. If the section 2 specification can overwrite a section 1 specification then the section 1 specification is marked as unmoveable, otherwise it is marked as moveable.
In phase four the individual specifications are moved to their required position. Firstly, those section 1 specifications which are concerned with format are moved to the front of section 1, followed by the rest of the specifications which cannot be moved from section 1. All the other specifications go in section 2. The format specifications which can be moved from section 1 and those from section 2 are moved to the front of section 2. Next the replicate specifications are moved after the format specifications and lastly the remainder of the specifications which have not been dealt with are moved to the end of section 2.
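This reordering amounts to a stable sort of the section 2 specifications by category; the category tags and tuple representation below are illustrative, and the insert and delete specifications are positioned separately as described in phase 2 and in the final step below.

    ORDER = {"format": 0, "replicate": 1, "other": 2}

    def reorder_section2(specs):
        """specs: list of (category, cell, criterion) tuples; Python's sort is stable,
        so specifications keep their relative order within each category."""
        return sorted(specs, key=lambda spec: ORDER[spec[0]])

    section2 = [("other", "D1", "2a"), ("replicate", "E3", "3b"), ("format", "A1", "4a")]
    print(reorder_section2(section2))
    # [('format', 'A1', '4a'), ('replicate', 'E3', '3b'), ('other', 'D1', '2a')]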
As a final operation, the delete specification is moved from the end of section 2 to the front of section 2, with the appropriate adjustments being made to the ranges of the specifications traversed. The reason for this movement is that the output saved in section 2 has had the delete operation performed, so specifications in section 2 need to assume that the delete operation is performed first.
Spreadsheet Examination Paper

NAME | MAY INCOME | JUNE INCOME | TOTAL INCOME   [Assessment Objective 2a]
Margaret Jones | 1047 | 245 |
Philip Long | 234 | 165 |
John Smith | 1256 | 2322 |

[Assessment Objective 3a]
[Assessment Objective 3b]
[Assessment Objective 4a]
[Assessment Objective 2c]

New figures:
For Margaret Jones | 1673
For John Smith | 231

Ensure that the formulae for TOTAL INCOME are updated to include the new figures. [Assessment Objectives 2b, 2a, 3c]
Coded Paper: Results of transformation to testing specification

Coded paper, Section 1:
"A1" "2a" "NAME"
"B1" "2a" "MAY INCOME"
"C1" "2a" "JUNE INCOME"
"D1" "2a" "TOTAL INCOME"
"A2" "2a" "Margaret Jones"
"A3" "2a" "Philip Long"
"A4" "2a" "John Smith"
"B2" "2a" 1047
"B3" "2a" 234
"B4" "2a" 1256
"C2" "2a" 245
"C3" "2a" 165
"C4" "2a" 2322
"D2" "3a" "=SUM(B2:C2)"
"D3:D4" "3b" "D2"

Coded paper, Section 2:
"A1:D1" "4a" 0
"A3" "2c"
"D1" "2b"
"D1" "2a" "JULY INCOME"
"D2" "2a" 1673
"D3" "2a" 231
"E2" "3a" "=SUM(B2:D2)"
"E3" "3b" "E2"

Test specification, Section 1:
"D1","2b"
"A3","2c"
"E3","3b"
"E4","3b"
"A3","2a"
"B3","2a"
"C3","2a"
"E2","3a"
"E4","3b"
"E2"
"A1","2a"
"B1","2a"
"C1","2a"
"E1","2a"

Test specification, Section 2:
"D1","2b"
"A3","2c"
"A1","4a"
"B1","4a"
"C1","4a"
"E1","4a"
"E3","3b"
"E2"
"A2","2a"
"A3","2a"
"B2","2a"
"B3","2a"
"C2","2a"
"C3","2a"
"D1","2a"
"D2","2a"
"D3","2a"
"E2","3a"
CLAIT (1998). Tutor's Handbook and Syllabus, 3rd Edition, L706, OCR, Coventry, October.
Dowsing, R.D., Long, S. and Sleep, M.R. (1999). Assessing word processing skills by computer. Information Services and Use, IOS Press, Amsterdam, 15-24.
Dowsing, R.D., Long, S. & Craven, P. (2000). Electronic delivery and authentic assessment of IT skills across the Internet. International Conference on Advances in the Infrastructure for e-business, Science and Education on the Internet, SSGRR, L'Aquila, Proceedings on CD-ROM.
Fletcher, S. (1992). Competence-based Assessment Techniques. Kogan Page, London.
Kennedy, G.J. (1999). Automated scoring of practical tests in an introductory course in information technology. Computers and Advanced Technology in Education (CATE99), Cherry Hill, New Jersey, USA, IASTED/Acta Press.
SkillCheck (2002). HR Press Software. http://www.individualsoftware.com/ [verified 13 Aug 2002]
Authors: Roy D. Dowsing and S. Long, School of Information Systems, University of East Anglia, Norwich NR4 7TJ, UK. Email: rdd@sys.uea.ac.uk
Dr Roy Dowsing has been a Senior Lecturer in the School of Information Systems at the University of East Anglia since 1979. In the last ten years his research interests have been almost exclusively in the field of automated assessment of IT skills, funded by the Higher Education Funding Councils and a major UK Examinations Board.