SMART, an Explorapaedia of Statistical and Mathematical Advanced Research Techniques.

G.R. Bishop

Statistics Department,

University of Adelaide,

Australia 5005

E-mail: gbishop@stats.adelaide.edu.au

M.Talbot

Biomathematics and Statistics Scotland,

JCMB, King's Buildings, Edinburgh, UK

Email: mike@bioss.sari.ac.uk

Keywords
World Wide Web, statistics, training, multimedia, communications

Abstract

SMART is a means of providing timely and cost-effective training to users of statistics and mathematics, such as postgraduate students in the biological and social sciences. We have developed a framework for the World Wide Web that enables the user to explore a library of modern quantitative methods. Each module in the library describes one method, with an application, real data and an opportunity to experiment with an analysis using appropriate software. Built-in comment and E-mail forms facilitate connections between users and module authors. We are making further connections by encouraging other module authors to use our framework.

1. Introduction

Our experiences with statistical consulting and providing statistical training courses for researchers in the biological sciences indicate that there is a need for summary information about recent statistical developments.

Biological and social scientists, who may want to use the latest statistical methods, need information about them, but not usually at the detailed mathematical level that appears in statistical journals. The scientist may have seen mention of a method of data analysis in the scientific literature and want to know more about it.

The options available to the scientist are to consult a statistician or to attend a training course. A statistical consultant who is well versed in the topic may not be available. Attending a course requires time, often at an inconvenient moment; money for the course fee, often substantial; and a match between the scientist's requirements and the course material.

It would be more convenient if the scientist had somewhere to go for a first look at the method, with just the right amount of detail and graded references.

Our aim has been to develop a framework within which an overview of modern advanced quantitative methods can be produced and delivered in a timely and cost-effective manner.

This paper describes SMART, an explorapaedia of advanced statistical and mathematical techniques for researchers. In section 2 we discuss the underlying reasons for our approach and what we mean by an explorapaedia. Section 3 deals with the features of SMART while in section 4 we look in detail at a module. In section 5 we briefly discuss evaluation and finally, in section 6, we address the conference theme of making connections in the light of the future we envisage for SMART.

2. Technology and Methodology

Most scientific researchers nowadays have ready access to the World Wide Web, and it is a natural first port of call when information is required.

Web browsers, such as Netscape, that are used for accessing information on the Web offer many of the features that are required for computer-based learning (CBL). The recent addition to the Web of facilities for running application programs across the Internet means that it is also possible to provide for interaction between the user and the application programs.

The important advantages of Web technology over conventional CBL authoring software, however, are that it is freely available and that it encourages cooperative development, which is essential if a critical mass of training material is to become widely available.

One of the current limitations of the World Wide Web for training purposes is the relative slowness with which information, in particular sound and video, can be accessed across the Internet. To overcome this deficiency, we envisage that the training material and the associated delivery programs that we produce will be downloaded onto local CD-ROMs.

In setting up SMART we have been aware of the need to maximise learning outcomes. There is a growing body of literature on the teaching of introductory and service statistics courses; see for example Wild (1995) and Cobb (1993). There is less discussion about help for the more advanced user of statistics such as postgraduate students and inexperienced scientific researchers, although Bisgaard (1991) addresses this issue.

Some of the principles discussed in the papers on introductory courses can also be applied to the development of training material.

Bisgaard (1991) says that consultants giving training courses "mostly need (as a minimum) to know what [it is that] the client does not know, but what is essential as a catalyst for solving problems." This principle is incorporated into SMART in two ways. First, the structure described in sections 3 and 4 allows the client to delve as deeply or as superficially into a topic as his or her needs require. This may range from skimming the titles to following up suggested references and trying examples.

Second, we envisage that modules about particular topics will be written by authors who are experienced at explaining the topic, so that they are aware of the pitfalls for novices.

While developing the SMART infrastructure and topic modules we have concentrated on several points that we considered either necessary or desirable in a successful training programme.

• The need for teaching and training to be contextualised. Moore (1993) pointed out that there was some educational advantage in providing a context for statistical techniques by way of video clips.

• The importance of providing an overview to get the larger picture before going off to more detailed information.

• The desirability of using several senses. Ferris and Hardaway (1994) discuss the advantages of giving students a complete picture of statistics using multimedia. We have incorporated pictures, graphics and sound to reinforce material and to act as reminders.

• Providing an opportunity for the trainee to experiment with appropriate software under supervision.

• The material must be readily available.

3. The Structure of SMART

The SMART home page contains a few introductory comments and the main index of links to useful information about the explorapaedia. Items in the index include how to use SMART, the aims of the modules, hardware and software requirements, application software used in various modules, guidelines for developing modules, contact information and available modules.

3.1 Exploring

The final item on the main index provides a link to an index of available modules. Most users will go straight to this index once they have become familiar with SMART, usually after the first visit. We intend to provide three methods of searching, or exploring, the modules.

The first method is to browse the list of modules and choose one. Currently there are only six modules available, so this is the only exploration method possible at present. The second method, to be added when there is a sizeable bank of modules, is browsing by application discipline, such as crop experimentation, plant ecology, food and nutrition, and forestry. Users will be able to browse modules that are particularly applicable to their line of research. For instance, those involved in crop experimentation or forestry may be interested in resolvable designs for field experiments (Williams and Matheson, 1994), while those engaged in plant ecology would be more interested in distance sampling techniques (Buckland et al., 1993).

The final method of exploration will be by methodology such as spatial statistics, time series, experimental design and analysis, linear and non-linear modelling and others. These methods are interdisciplinary so that spatial statistics might be of interest in disciplines as diverse as field experimentation, geology and public health.

3.2 The Icon Bar

In order to explore, the user must be able to move easily both within and among modules. A feature of SMART is an icon bar that always appears at the top of the screen. This bar has forward and back arrows for movement within a module in a manner preordained by the author. The Begin icon enables the user to start the module again from anywhere within the module, while the Contents icon allows the user to view the table of contents and choose a point of entry.

Some slides have sound associated with them and if so, one of the bar icons will display the word "Sound". For a variety of reasons, including hardware restrictions, hearing impairment and noise pollution, some users may not be able to use the sound facility. Whenever sound is present in a module there is always an accompanying verbatim text file which may be accessed via a "Text" icon.

There are two other icons in the bar. "Other Modules" allows the user to return to the module index, while "Comments" provides a facility for users to provide feedback to the authors.

3.3 Feedback

We have incorporated forms for feedback and these can be accessed in two ways. When the user clicks on the "Comments" icon, a form is downloaded from a central Web site. In addition, at the end of a module the user is asked to complete a short questionnaire and to make comments. If the user submits a comment, this is stored at the central site together with information to identify the module. Comments can be collated and accessed by authors.
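The storage step can be pictured with a minimal Python sketch. It is an illustration only and not the SMART implementation: the class names `Comment` and `CommentStore`, the method names and the module identifiers are all hypothetical, and an in-memory dictionary stands in for the central Web site.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Comment:
    """One user comment (hypothetical record layout)."""
    module_id: str  # information that identifies the module
    text: str
    received: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class CommentStore:
    """In-memory stand-in for the central site (an assumption)."""

    def __init__(self):
        self._by_module = defaultdict(list)

    def submit(self, module_id, text):
        """Store a comment together with its module identifier."""
        text = text.strip()
        if not text:
            raise ValueError("empty comment")
        comment = Comment(module_id, text)
        self._by_module[module_id].append(comment)
        return comment

    def collate(self, module_id):
        """Return all comments for one module, in order of submission."""
        return list(self._by_module.get(module_id, []))


store = CommentStore()
store.submit("sequential-acceptance-sampling", "Slide 4 is unclear.")
store.submit("generalised-additive-models", "A helpful example.")
```

An author would then call `store.collate("sequential-acceptance-sampling")` to read only the comments on that module. Keying the store by module identifier is what allows comments to be collated per author.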

Authors may also provide their E-mail addresses if they wish to have more direct contact with users.

4. Structure of a Module

Each module addresses a particular statistical or mathematical technique, such as sequential acceptance sampling, analysis of molecular variance or generalised additive models. Here we use the sequential acceptance sampling module, which we have developed, to illustrate module structure.

Each module has a three-tiered structure. The primary level consists of a series of slides, each one approximately a screenful. This level is essentially linear and concentrates on an overview of the technique. It includes a description of the technique's usefulness and assumptions, under what circumstances its use would not be appropriate and any other major features. The technique is illustrated with a motivating example including a clear description of the aims, results obtained and conclusions.

For instance, sequential acceptance sampling is presented in the context of seedlot testing. There are photographs to illustrate the laboratory procedures involved in testing and the criteria for determining whether seeds are viable or not. The material is presented in point form. In addition, the user may click on the SOUND button and listen to a spoken explanation.

The primary linear structure of the module consists of a description of seed testing, a brief summary of methods used to determine whether a seed lot should be accepted, details of two types of sequential sampling, a comparison of these methods and more traditional sampling using graphs and diagrams, a summary and finally an opportunity to run the program, SEEDS.

The second level consists of loops added to the primary structure. This level gives explanations of some of the more difficult aspects, for example which checks should be used to confirm that the assumptions have been met. It might contain design constraints for a method of analysis. Instructions on how to use software to read, generate or analyse data are in this level.

Links provide the mechanism for moving from level 1 to level 2. The SMART icon bar returns the user to level 1.

Secondary loops are attached to several of the slides in the sequential acceptance module. For instance some of the early slides in the primary structure contain iconified photographs with brief descriptions of the seed testing procedure. One client may wish to skim through this while another may want more details of the context. By clicking on the iconified photograph the user moves to a secondary slide that contains a larger photograph and a spoken or text explanation of the relevant laboratory method. When finished, the client returns to the primary slide by pressing the forward button.

A special-purpose computer program, SEEDS, is available for calculating sample sizes using sequential sampling. Access to this program has been incorporated into the sequential sampling module. The last primary slide refers to the SEEDS program, providing a link to a second tier slide which allows the user to input various parameters for running the program. After setting the values, the user clicks "Run" and control passes to the software. This is an example of the third level in the module structure.
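To give a flavour of the kind of calculation involved, the following Python sketch applies Wald's sequential probability ratio test to a seed lot. It is a generic textbook scheme and does not reproduce SEEDS, whose algorithm is not described here. The function name `sprt_decision` and all parameter values are assumptions chosen for illustration: `p0` is an acceptable proportion of non-viable seeds, `p1` an unacceptable one, and `alpha` and `beta` are the nominal error rates.

```python
import math


def sprt_decision(n_tested, n_nonviable, p0, p1, alpha=0.05, beta=0.10):
    """Wald's SPRT for a binomial proportion (illustrative, not SEEDS).

    Returns "accept", "reject" or "continue" after n_tested seeds,
    of which n_nonviable were found non-viable.
    """
    if not 0 < p0 < p1 < 1:
        raise ValueError("require 0 < p0 < p1 < 1")
    if not 0 <= n_nonviable <= n_tested:
        raise ValueError("require 0 <= n_nonviable <= n_tested")

    # Log likelihood ratio of p1 against p0 for the data seen so far.
    llr = (n_nonviable * math.log(p1 / p0)
           + (n_tested - n_nonviable) * math.log((1 - p1) / (1 - p0)))

    lower = math.log(beta / (1 - alpha))   # accept the lot at or below this
    upper = math.log((1 - beta) / alpha)   # reject the lot at or above this

    if llr <= lower:
        return "accept"
    if llr >= upper:
        return "reject"
    return "continue"


# Hypothetical lot: 5% non-viable is acceptable, 15% is not.
decision = sprt_decision(n_tested=30, n_nonviable=0, p0=0.05, p1=0.15)
```

The appeal of a sequential scheme is that testing stops as soon as the evidence is strong enough in either direction, so a clearly good or clearly bad lot needs fewer seeds than a fixed-size sample would.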

The third level is outside the main structure of the module. It consists of such things as a list of references, a glossary of terms and pages accessed by other modules. This level is accessed by links in the first or second levels, and the user returns to the higher levels with the browser's Back button. Other examples of the third level are a list of useful reference publications classified as introductory, theoretical or applied, and a glossary that explains the terms used in the presentation.
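The three-tiered structure can be summarised as a small data structure. The Python sketch below is illustrative only and is not how SMART is implemented: the class and field names are hypothetical, the slide titles paraphrase the outline of the sequential acceptance sampling module given above, and the placement of the reference list and glossary on a particular slide is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Slide:
    """One primary (level 1) slide, roughly a screenful."""
    title: str
    loops: list = field(default_factory=list)      # level 2: secondary slides
    resources: list = field(default_factory=list)  # level 3: references, glossary, software


@dataclass
class Module:
    name: str
    slides: list  # level 1, in the order fixed by the author

    def forward(self, position):
        """Forward arrow: next primary slide, stopping at the last one."""
        return min(position + 1, len(self.slides) - 1)

    def back(self, position):
        """Back arrow: previous primary slide, stopping at the first one."""
        return max(position - 1, 0)

    def begin(self):
        """Begin icon: restart the module from the first slide."""
        return 0

    def contents(self):
        """Contents icon: titles from which to choose a point of entry."""
        return [slide.title for slide in self.slides]


seed_module = Module(
    name="Sequential acceptance sampling",
    slides=[
        Slide("Seed testing", loops=["Laboratory method (larger photograph)"]),
        Slide("Methods for accepting a seed lot"),
        Slide("Two types of sequential sampling"),
        Slide("Comparison with traditional sampling"),
        Slide("Summary", resources=["References", "Glossary"]),
        Slide("Running SEEDS", loops=["Parameter input"],
              resources=["SEEDS program"]),
    ],
)
```

Keeping level 1 as an ordered list reflects its linear design, while the optional `loops` and `resources` attached to each slide let one client skim and another delve more deeply.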

5. Evaluation

Before being included in SMART, a module is assessed by two people. Preferably one should be very familiar with the topic in order to make constructive criticism about the content. The other assessor should be less familiar with the topic to detect difficulties with explanations.

One of the advantages of using the Web is that modules may be continually improved and the latest version is always available. Once the module is included in SMART, clients may use the comment facility to ask for clarification of any problems with the material. Clients are also requested to answer a short questionnaire at the end of each module. The comments and questionnaire answers are submitted to a central site from where they are distributed to authors.

6. The Future

Modules that have been developed so far include antedependence sampling, analysis of molecular variance, growth curve modelling, generalised additive models, partial least squares regression and sequential acceptance sampling. Modules in experimental design are planned.

The SMART explorapaedia has already led to connections between the authors of this paper and another author from the University of Ioannina in Greece. We are currently looking in Europe and Australasia for other collaborators to contribute modules in their areas of expertise. Translations of the introductory pages to Spanish, Portuguese, German, French and Dutch are either in preparation or have been completed.

We have restricted our attention to the requirements of biological scientists because that is where most of our experience lies but would welcome collaborators from other spheres.

We foresee further connections being established between authors and users as young scientists become aware of the facilities offered by SMART.

7. References

Bisgaard, S. (1991) Teaching Statistics to engineers. The American Statistician, 45, 274–283.

Buckland, S.T., Anderson, D.R., Burnham, K.P. and Laake, J.L. (1993) Distance sampling: estimating abundance of biological populations. Chapman and Hall, London.

Cobb, G.W. (1993) Reconsidering Statistics education: a National Science Foundation conference. Journal of Statistics Education v1, n1.

Ferris, M. and Hardaway, D. (1994) Teacher 2000: a new tool for multimedia teaching of introductory Business Statistics. Journal of Statistics Education v2, n1.

Moore, D.S. (1993) The place of video in new styles of teaching and learning Statistics. The American Statistician, 47, 172–176.

Wild, C.J. (1995) Continuous improvement of teaching: a case study in a large statistics course. International Statistical Review, 63(1), 49–68.

Williams, E.R. and Matheson, A.C. (1994) Experimental design and analysis for use in tree improvement. CSIRO Information Services.

Copyright

G.R.Bishop and M.Talbot (c) 1996.

The authors assign to ASCILITE and educational and non-profit institutions a non-exclusive licence to use this document for personal use and in courses of instruction provided that the article is used in full and this copyright statement is reproduced. The authors also grant a non-exclusive licence to ASCILITE to publish this document in full on the World Wide Web and on CD-ROM and in printed form with the ASCILITE 96 conference papers, and for the documents to be published on mirrors on the World Wide Web. Any other usage is prohibited without the express permission of the authors.